Obtaining SE from the hessian matrix

4 messages · Thomas Lumley, Spencer Graves, Timur Elzhov

#
Dear R experts,

In R-intro, under 'Nonlinear least squares and maximum likelihood
models', there are two examples of how to use the 'nlm' function.
In 'Least squares' the standard errors are obtained as follows:

    After the fitting, out$minimum is the SSE, and out$estimate are the
    least squares estimates of the parameters. To obtain the approximate
    standard errors (SE) of the estimates we do:

    > sqrt(diag(2*out$minimum/(length(y) - 2) * solve(out$hessian)))

But under the 'Maximum likelihood' section I read:

    After the fitting, out$minimum is the negative log-likelihood, and
    out$estimate are the maximum likelihood estimates of the parameters.
    To obtain the approximate SEs of the estimates we do:

    > sqrt(diag(solve(out$hessian)))

I use the MINPACK Fortran library for NLS fitting in R, and there
I also get the Hessian matrix. Which formula should I use in _this_ case?
Some time ago I had a look at GSL, the GNU Scientific Library. It
also uses a converted-to-C MINPACK for NLS fitting, and in the example in
the GSL reference manual
    http://www.gnu.org/software/gsl/manual/html_node/gsl-ref_36.html#SEC475
the SE is calculated as
    #define ERR(i) sqrt(gsl_matrix_get(covar,i,i))

where covar = (J^T * J)^-1 (i.e. as in the second formula above).
So, what is the _right_ way of obtaining the SE? Why do the two formulas
above differ?

Thank you!

--
WBR,
Timur.
#
On Thu, 19 Feb 2004, Timur Elzhov wrote:
If you are maximising a likelihood then the covariance matrix of the
estimates is (asymptotically) the inverse of the negative of the Hessian.

The standard errors are the square roots of the diagonal elements of the
covariance.

So if you have the Hessian you need to invert it, if you have the
covariance matrix, you don't.

	-thomas
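
A minimal sketch of the maximum likelihood case described above. The
simulated exponential sample, the log(rate) parameterisation and the names
negloglik, se_hessian and se_exact are assumptions made for illustration;
they do not come from the thread. Because nlm minimises, the function passed
to it is the NEGATIVE log-likelihood, so out$hessian is already the negative
of the log-likelihood Hessian and can be inverted directly.

```r
# Simulated data (an assumption for illustration): exponential sample, rate 2.
set.seed(1)
x <- rexp(200, rate = 2)

# Negative log-likelihood, parameterised by theta = log(rate) so that the
# optimiser cannot step to a negative rate.
negloglik <- function(theta) -sum(dexp(x, rate = exp(theta), log = TRUE))

out <- nlm(negloglik, p = 0, hessian = TRUE)

# nlm minimises, so out$hessian is the observed information
# (minus the Hessian of the log-likelihood); invert it as it stands.
se_hessian <- sqrt(diag(solve(out$hessian)))

# Closed form for this model: the observed information for log(rate)
# at the MLE equals n, so the SE should be close to 1 / sqrt(n).
se_exact <- 1 / sqrt(length(x))

print(c(hessian = se_hessian, exact = se_exact))
```

The two numbers should agree up to the accuracy of the finite-difference
Hessian that nlm computes.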
#
Minor correction:  Most likely, Prof. Lumley's statement is 
correct.  However, as I'm sure he knows, it depends on what you are 
maximizing or minimizing:  If you are maximizing the log(likelihood), 
then the NEGATIVE of the hessian is the "observed information".  This 
latter should be positive semi-definite, and if nonsingular, its inverse 
will be the covariance matrix of the standard normal approximation.  
Alternatively, if you MINIMIZE a "deviance" = (-2)*log(likelihood), then 
HALF of the hessian is the observed information.  In the unlikely 
event that you are maximizing the likelihood itself, you need to divide 
the negative of the hessian by the likelihood to get the observed 
information. 

      hope this helps.  spencer graves
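
A small numerical check of the scaling described above, offered as a sketch.
The simulated normal sample with known sd = 1 and the names negloglik and
deviance_fn are assumptions made for illustration. The same data are fitted
twice with nlm, once by minimising the negative log-likelihood and once by
minimising the deviance; the Hessian is used as is in the first case and
halved in the second.

```r
# Simulated data (an assumption for illustration): normal sample, sd known.
set.seed(2)
x <- rnorm(100, mean = 5, sd = 1)

# Two objective functions for the same model; only the scaling differs.
negloglik   <- function(mu) -sum(dnorm(x, mean = mu, sd = 1, log = TRUE))
deviance_fn <- function(mu) 2 * negloglik(mu)

fit_nll <- nlm(negloglik,   p = 0, hessian = TRUE)
fit_dev <- nlm(deviance_fn, p = 0, hessian = TRUE)

# Minimising -log(likelihood): the Hessian is the observed information.
se_nll <- sqrt(diag(solve(fit_nll$hessian)))

# Minimising the deviance: HALF of the Hessian is the observed information.
se_dev <- sqrt(diag(solve(fit_dev$hessian / 2)))

# Both should be close to sd / sqrt(n) = 0.1 for this sample.
print(c(negloglik = se_nll, deviance = se_dev))
```

Forgetting the factor of one half in the deviance case would shrink the
reported SE by a factor of sqrt(2).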
#
On Thu, Feb 19, 2004 at 09:22:09AM -0800, Thomas Lumley wrote:

            
Yes, the covariance matrix is the inverse of the Hessian, that's clear.
But my question is why, in the first example:

    > sqrt(diag(2*out$minimum/(length(y) - 2) * solve(out$hessian)))
	      
    The 2 in the line above represents the number of parameters. A 95%
    confidence interval would be the parameter estimate +/- 1.96 SE. We
    can superimpose the least squares fit on a new plot:

- we do _not_ simply use 'sqrt(diag(solve(out$hessian)))', as in the
second example, but also include in some way the "number of parameters" == 2?
What does the '2*out$minimum/(length(y) - 2)' multiplier mean?

Thanks!

--
WBR,
Timur.
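
One reading of the multiplier asked about above, given as a sketch and not as
the R-intro authors' stated derivation. If S(p) is the sum of squared errors
and J is the Jacobian of the fitted values, the Hessian of S near the minimum
is approximately 2 * t(J) %*% J. The usual least squares covariance is
s2 * (J^T J)^-1 with s2 = SSE/(n - k), which becomes 2 * s2 * solve(H) once it
is written in terms of H. On this reading the leading 2 comes from
differentiating the squares, and the 2 in 'length(y) - 2' is the number of
parameters k. Put differently, the SSE is not a negative log-likelihood until
it is divided by 2*sigma^2, and the multiplier undoes that missing scaling.
The code uses the treated rows of R's built-in Puromycin data as a stand-in
data set; the names f, sse, se_hess and se_gn are illustrative.

```r
# Data: treated rows of the built-in Puromycin set (a stand-in example).
d <- subset(Puromycin, state == "treated")
x <- d$conc
y <- d$rate

# Michaelis-Menten mean function and its sum of squared errors.
f   <- function(p) p[1] * x / (p[2] + x)
sse <- function(p) sum((y - f(p))^2)

out <- nlm(sse, p = c(200, 0.1), hessian = TRUE)

n  <- length(y)
k  <- length(out$estimate)
s2 <- out$minimum / (n - k)          # residual variance estimate, SSE/(n - k)

# Hessian form: the Hessian of the SSE is roughly 2 * t(J) %*% J.
se_hess <- sqrt(diag(2 * s2 * solve(out$hessian)))

# Gauss-Newton form: s2 * (J^T J)^{-1}, using the analytic Jacobian of f.
p <- out$estimate
J <- cbind(x / (p[2] + x), -p[1] * x / (p[2] + x)^2)
se_gn <- sqrt(diag(s2 * solve(crossprod(J))))

print(rbind(se_hess, se_gn))
```

The two rows should be of similar size but not identical: the exact Hessian
also contains a term involving the residuals and the second derivatives of f,
which the (J^T J)^-1 form drops. A covariance reported as (J^T J)^-1 alone,
without the s2 factor, would correspond to unit error variance.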