lme4/glmer convergence warnings
On 14-04-10 04:28 PM, W Robert Long wrote:
Hi Ben,

Could you clarify what you mean by

params <- getME(fitted_model,"theta")
## or unlist(getME(fitted_model,c("theta","fixef"))) for GLMMs
grad(dd,params)

This is a logistic model, so I assume I need to use unlist. However, "fixef" does not seem to be a valid name for getME() on fitted_model, because I get:

Error in match.arg(name) : 'arg' should be one of "X", "Z", "Zt", "Ztlist", "y", "mu", "u", "b", "Gp", "Tp", "L", "Lambda", "Lambdat", "Lind", "A", "RX", "RZX", "sigma", "flist", "beta", "theta", "ST", "REML", "is_REML", "n_rtrms", "n_rfacs", "cnms", "devcomp", "offset", "lower"

Also, grad() does not seem to be an available function.

Please advise. Thanks,
Rob
Oops, too much of a hurry.
* We just added 'fixef' as a synonym for 'beta' in getME, but it's only
on github. unlist(getME(fitted_model,c("theta","beta"))) should work ...
* grad() is in the numDeriv package -- install/load it first, or use
numDeriv::grad() if it's installed and you don't feel like loading it.
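Putting both points together, a minimal sketch of the whole recipe (using lme4's built-in cbpp data as a stand-in for the real fitted model; the object names are just illustrative):

```r
library(lme4)      ## model fitting
library(numDeriv)  ## provides grad()

## toy GLMM standing in for the real fitted_model
fitted_model <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
                      data = cbpp, family = binomial)

## deviance as a function of the full parameter vector (theta, then beta)
dd <- update(fitted_model, devFunOnly = TRUE)
params <- unlist(getME(fitted_model, c("theta", "beta")))

## Richardson-extrapolated gradient at the estimated parameters
grad(dd, params)
```

Note that "beta" (rather than "fixef") works in the released version, as it already appears in the match.arg() list above.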
cheers
Ben Bolker
On 10/04/2014 20:08, Ben Bolker wrote:
On 14-04-10 04:38 AM, W Robert Long wrote:
Hi Ben,

For my model, I get

max(abs(relgrad))
[1] 1.081706

Does this help? Thanks,
Rob
Thanks.
Unfortunately it means there's still something we don't understand.
To sum up:
* we calculate the finite-difference gradients and Hessians with
respect to the model parameters at the estimated MLE (or restricted MLE).
- these might be poor estimates; we are using our own hand-rolled,
naive central finite difference code rather than the better tested code
in the numDeriv package, which uses the more expensive but more accurate
Richardson method. To see if this is the problem, compare
fitted_model@optinfo$derivs$gradient
to
dd <- update(fitted_model,devFunOnly=TRUE)
params <- getME(fitted_model,"theta")
## or unlist(getME(fitted_model,c("theta","beta"))) for GLMMs
grad(dd,params)
and see if they differ significantly. I mostly have *not* seen a
discrepancy between these, although Rune Haubo has reported some
discrepancies (though I think they were in the Hessian rather than the
gradient ... ?)
It might make sense to compare the Hessian in @optinfo with
hessian(dd,params) from numDeriv as well.
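Concretely, that comparison might look like the following (again with a toy model standing in for the real one; the all.equal() tolerances are arbitrary):

```r
library(lme4)
library(numDeriv)

## toy GLMM standing in for the real fitted model
fm <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
            data = cbpp, family = binomial)

dd <- update(fm, devFunOnly = TRUE)
pars <- unlist(getME(fm, c("theta", "beta")))

## lme4's internal (naive central finite-difference) derivatives ...
internal <- fm@optinfo$derivs

## ... versus numDeriv's Richardson-method versions
g <- numDeriv::grad(dd, pars)
H <- numDeriv::hessian(dd, pars)

all.equal(internal$gradient, g, tolerance = 1e-4)  # arbitrary tolerance
all.equal(internal$Hessian,  H, tolerance = 1e-4)
```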
* we then check those gradients, or the relative gradients from
with(fitted_model@optinfo$derivs,solve(Hessian,gradient))
against some tolerance. It makes sense that large gradients at the
estimated parameters (where they should be zero) are a problem, but it
remains unclear to me exactly on what scale they should be small, and
how we should make this comparison ...
It might make sense to see if John Nash's optfntools package (on
r-forge) has any useful tools or ideas, although as I recall (vaguely)
those implementations of (e.g.) the KKT criteria
<http://en.wikipedia.org/wiki/Karush%E2%80%93Kuhn%E2%80%93Tucker_conditions>
were pretty sensitive ...
* as a crude test of whether the actual value you got is a false
convergence (I know in this case you don't think it is, but including
this for completeness) the things I know to do are (1) try re-fitting
with different optimizers, either starting from the putative best fit or
from scratch (see
<https://github.com/lme4/lme4/blob/master/misc/issues/allFit.R>); (2)
try re-starting the _same_ optimizer from the putative best fit; (3)
explore the likelihood surface (e.g. with profiling or bbmle::slice2D).
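A rough sketch of checks (1)-(3) on a toy model (allFit() itself lives in the misc/issues/allFit.R script linked above, so plain update() calls are shown instead; profiling can be slow):

```r
library(lme4)

## toy GLMM standing in for the real fitted model
fm <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
            data = cbpp, family = binomial)

## putative best fit, reused as starting values below
start_best <- list(theta = getME(fm, "theta"), fixef = fixef(fm))

## (1) re-fit with a different optimizer, starting from the putative best fit
fm_bobyqa <- update(fm, start = start_best,
                    control = glmerControl(optimizer = "bobyqa"))

## (2) re-start the *same* optimizer from the putative best fit
fm_restart <- update(fm, start = start_best)

## (3) explore the likelihood surface around the estimate (slow)
pp <- profile(fm)
```

If the re-fits all land on (essentially) the same deviance and parameter values, false convergence is less likely.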
Ben Bolker
On 10/04/2014 03:33, Ben Bolker wrote:
Ben Bolker <bbolker at ...> writes:
On 14-04-06 04:31 AM, Tibor Kiss wrote:
Hi, being somewhat nonplussed by similar messages, I also applied
Ben's recent suggestion to one of my models to get:
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.343e-05 3.530e-05 5.756e-05 7.631e-05 9.841e-05 1.932e-04
So, following up on Rob's message: what does it mean?

With kind regards,
Tibor
It means that, on the scale of the _standard deviations_ of the
parameters, the estimated gradients at the MLE (or restricted MLE) are
not large. In Rob's case I was surprised that these scaled gradients
were not smaller: much smaller than without the scaling, but not
small enough to make me think I really understand what's going on.
To recapitulate: the appearance of all of these new messages in the
latest version of lme4 is **not** due to a degradation or change in the
optimization or fitting procedure -- it's due to a new set of
convergence tests that we implemented, which we think are giving a lot
of false positives. You can easily shut them off yourself, or raise the
tolerance for the warnings (see ?lmerControl/?glmerControl). As
developers, we're a bit stuck now, because we don't want to turn the
warnings _off_ until we better understand the circumstances that are
triggering them, and that takes more time and effort than we've been
able to muster so far.
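For reference, shutting the check off or loosening it looks roughly like this (control-argument names as in current lme4; check ?glmerControl for your version's options and defaults):

```r
library(lme4)

## turn the max|grad| convergence check off entirely ...
ctrl_off <- glmerControl(check.conv.grad = "ignore")

## ... or keep it as a warning but raise its tolerance
ctrl_loose <- glmerControl(check.conv.grad = .makeCC("warning", tol = 0.01))

fm <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
            data = cbpp, family = binomial, control = ctrl_loose)
```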
[much context snipped]
Just to follow up on this: more technical discussion is going on at
https://github.com/lme4/lme4/issues/120 ... at present, it is looking
like scaling the gradient by the hessian is going to solve a lot of
problems. If you are experiencing convergence warnings about
max|grad| that you suspect are false positives, it would be a great
help if you could try
relgrad <- with(fitted_model@optinfo$derivs, solve(Hessian, gradient))
max(abs(relgrad))
check if the result is a small number (e.g. <0.001) and report **one
way or the other** on this list, or at the Github url above, or
(least preferred) by e-mailing lme4-authors at lists.r-forge.r-project.org
We also hope that this test *will* pick up the cases where people have
reported problems with Nelder-Mead not working properly ...
Ben Bolker
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models