Dear R experts,
In R-intro, under 'Nonlinear least squares and maximum likelihood
models', there are two examples of how to use the 'nlm' function.
In 'Least squares' the standard errors are obtained as follows:
After the fitting, out$minimum is the SSE, and out$estimate are the
least squares estimates of the parameters. To obtain the approximate
standard errors (SE) of the estimates we do:
> sqrt(diag(2*out$minimum/(length(y) - 2) * solve(out$hessian)))
But under 'Maximum likelihood' section I've read:
After the fitting, out$minimum is the negative log-likelihood, and
out$estimate are the maximum likelihood estimates of the parameters.
To obtain the approximate SEs of the estimates we do:
> sqrt(diag(solve(out$hessian)))
As for me, I use the MINPACK Fortran library for NLS fitting in R, and there
I also get the Hessian matrix. What formula should I use in _this_ case?
Some time ago I had a look at GSL, the GNU Scientific Library. It
also uses a C port of MINPACK for NLS fitting. In the GSL reference manual
example
http://www.gnu.org/software/gsl/manual/html_node/gsl-ref_36.html#SEC475
the SE is calculated as
#define ERR(i) sqrt(gsl_matrix_get(covar,i,i))
where covar = (J^T * J)^-1 (i.e. as in the second formula above).
So, what is the _right_ way of obtaining the SE? Why do the two formulas
above differ?
Thank you!
--
WBR,
Timur.
Obtaining SE from the hessian matrix
4 messages · Thomas Lumley, Spencer Graves, Timur Elzhov
On Thu, 19 Feb 2004, Timur Elzhov wrote:
So, what is the _right_ way of obtaining the SE? Why do the two formulas above differ?
If you are maximising a likelihood then the covariance matrix of the estimates is (asymptotically) the inverse of the negative of the Hessian. The standard errors are the square roots of the diagonal elements of the covariance. So if you have the Hessian you need to invert it, if you have the covariance matrix, you don't. -thomas
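A minimal sketch of this recipe with 'nlm', for illustration only. The exponential model, the simulated data and the log(rate) parameterisation are assumptions made for the example and are not part of the original question. Because 'nlm' minimises, the function passed to it is the negative log-likelihood, so out$hessian is already the negative of the log-likelihood Hessian and is inverted as it stands.

```r
# Simulated data; the exponential model and the rate 2 are illustrative assumptions.
set.seed(1)
x <- rexp(200, rate = 2)

# Negative log-likelihood, parameterised by theta = log(rate) so that the
# optimiser works on an unconstrained scale.
negloglik <- function(theta) -sum(dexp(x, rate = exp(theta), log = TRUE))

out <- nlm(negloglik, p = 0, hessian = TRUE)

# out$hessian is the Hessian of the NEGATIVE log-likelihood (the observed
# information), so the covariance matrix is simply its inverse.
se <- sqrt(diag(solve(out$hessian)))

# Analytic check: for theta = log(rate) the observed information at the MLE
# equals n, so the standard error should be close to 1/sqrt(n).
print(c(numeric = se, analytic = 1 / sqrt(length(x))))
```

The two printed numbers should agree closely; any small difference comes from the finite-difference Hessian that 'nlm' computes.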
Minor correction: Most likely, Prof. Lumley's statement is
correct. However, as I'm sure he knows, it depends on what you are
maximizing or minimizing: If you are maximizing the log(likelihood),
then the NEGATIVE of the hessian is the "observed information". This
latter should be positive semi-definite, and if nonsingular, its inverse
will be the covariance matrix of the usual (asymptotic) normal approximation.
Alternatively, if you MINIMIZE a "deviance" = (-2)*log(likelihood), then
HALF of the Hessian is the observed information. In the unlikely
event that you are maximizing the likelihood itself, you need to divide
the negative of the hessian by the likelihood to get the observed
information.
hope this helps. spencer graves
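The scaling rules above can be checked numerically. The following is a sketch under assumed conditions (simulated exponential data and a log(rate) parameterisation, neither taken from the thread): the same model is fitted once by minimising -log(likelihood) and once by minimising the deviance, and the Hessian is rescaled accordingly before inversion.

```r
# Simulated data; model and parameter values are illustrative assumptions.
set.seed(1)
x <- rexp(200, rate = 2)
loglik <- function(theta) sum(dexp(x, rate = exp(theta), log = TRUE))

# Same model, two different objective functions handed to the minimiser.
fit_nll <- nlm(function(th) -loglik(th),     p = 0, hessian = TRUE)
fit_dev <- nlm(function(th) -2 * loglik(th), p = 0, hessian = TRUE)

info_nll <- fit_nll$hessian      # Hessian of -log L is the observed information
info_dev <- fit_dev$hessian / 2  # Hessian of -2 log L has to be halved

se_nll <- sqrt(diag(solve(info_nll)))
se_dev <- sqrt(diag(solve(info_dev)))
print(c(se_nll = se_nll, se_dev = se_dev))
```

Both standard errors should come out essentially equal. Forgetting to halve the deviance Hessian would make the second one too small by a factor of sqrt(2).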
On Thu, Feb 19, 2004 at 09:22:09AM -0800, Thomas Lumley wrote:
So, what is the _right_ way of obtaining the SE? Why do the two formulas above differ?
If you are maximising a likelihood then the covariance matrix of the estimates is (asymptotically) the inverse of the negative of the Hessian. The standard errors are the square roots of the diagonal elements of the covariance. So if you have the Hessian you need to invert it, if you have the covariance matrix, you don't.
Yes, the covariance matrix is the inverse of the Hessian, that's clear.
But my question is why, in the first example:
> sqrt(diag(2*out$minimum/(length(y) - 2) * solve(out$hessian)))
The 2 in the line above represents the number of parameters. A 95%
confidence interval would be the parameter estimate +/- 1.96 SE. We
can superimpose the least squares fit on a new plot:
- we do _not_ simply use 'sqrt(diag(solve(out$hessian)))', as in the
second example, but also include in some way the "number of parameters" == 2?
What does the '2*out$minimum/(length(y) - 2)' multiplier mean?
Thanks!
--
WBR,
Timur.
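
For what it is worth, here is a sketch of where the multiplier comes from, under the usual assumption of independent errors with a common unknown variance sigma^2. If the objective is the sum of squares S = sum(r_i^2), its Hessian near the minimum is approximately 2 * J^T J (exactly so for a linear model), where J is the Jacobian of the fitted values. The least squares covariance is sigma^2 * (J^T J)^-1 = 2 * sigma^2 * solve(H), and sigma^2 is estimated by SSE/(n - p). So the 2 in the numerator comes from differentiating the square, and the 2 in the denominator is p, the number of parameters. The straight-line model and the simulated data below are assumptions chosen so that the result can be compared with 'lm'.

```r
# Simulated straight-line data; model and values are illustrative assumptions.
set.seed(42)
x <- seq(1, 10, length.out = 50)
y <- 3 + 0.5 * x + rnorm(length(x), sd = 0.3)

# Objective: plain sum of squared residuals, as in the R-intro example.
sse <- function(par) sum((y - (par[1] + par[2] * x))^2)
out <- nlm(sse, p = c(1, 1), hessian = TRUE)

n      <- length(y)
npar   <- length(out$estimate)
sigma2 <- out$minimum / (n - npar)        # residual variance estimate

# Route 1: Hessian of the SSE, which is about 2 * t(J) %*% J, hence the factor 2.
se_hessian <- sqrt(diag(2 * sigma2 * solve(out$hessian)))

# Route 2: the Jacobian, as a MINPACK-style fit would supply it.
J <- cbind(1, x)                          # d(fitted)/d(par) for this model
se_jacobian <- unname(sqrt(diag(sigma2 * solve(crossprod(J)))))

# Reference: standard errors reported by lm() for the same model.
se_lm <- unname(coef(summary(lm(y ~ x)))[, "Std. Error"])
print(rbind(se_hessian, se_jacobian, se_lm))
```

If the fitting routine returns J^T J (or its inverse) rather than the Hessian of the SSE, the factor 2 drops out but the sigma^2 factor stays. The bare (J^T J)^-1 is a covariance matrix only when each residual has already been divided by a known measurement standard deviation, so that sigma^2 = 1; that appears to be the situation in the GSL example, though this is an assumption on my part.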