
Wrong convergence warnings with glmer in lme4 version 1.1-6??

3 messages · Tom Davis, Ben Bolker

Quick answer: these are probably false positives, driven by some
combination of the following:

 * we are using derivative-free optimizers (bobyqa, Nelder-Mead) rather
than derivative-based (or approximate-derivative-based) methods (nlminb,
L-BFGS-B) to optimize over the parameters.  Therefore, while using
approximate derivatives to assess convergence is a good idea, we have no
formal guarantee of how small the derivatives should be at the reported
optimum (the stopping conditions for the derivative-free methods are not
based explicitly on derivatives, although they should of course be
related to the local derivatives), nor do we know how large a gradient
we should actually be worried about.
 * it's possible that the finite-difference approximations to the
gradients are themselves inaccurate
 * one flaw of our current approach is that we don't take singular fits
into account: that is, we include even the gradient elements for which
the parameters are up against a boundary (zero variances, or perfect
correlations). (That said, I've seen at least a few cases of relatively
large max-abs-grad values where the fit was *not* singular.)
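One way to see whether a boundary (singular) fit is contaminating the
gradient check is to compare the variance-covariance parameters against
their lower bounds.  A minimal sketch, assuming a fitted glmer model
'm1' and an (arbitrary) tolerance of 1e-4:

```r
## Sketch: check for a (near-)singular fit in a fitted glmer model 'm1'.
## getME(m1, "lower") gives the lower bounds for the "theta" parameters:
## 0 for standard-deviation terms, -Inf for correlation terms.
theta <- getME(m1, "theta")
lower <- getME(m1, "lower")
## TRUE if some standard-deviation parameter is (effectively) at zero,
## i.e. the fit is on the boundary and those gradient elements are suspect
any(theta[lower == 0] < 1e-4)
```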

  We (the lme4 maintainers) are sorry for any inconvenience or worry,
and are working to resolve these issues.  However, we'd rather try to
understand what's going on here and convince ourselves that there is
*not* something worrisome going on, than just increase the tolerances
for the tests and make everything quiet again ...

  When I run your example (thanks for the reproducible example!) I get

 m1@optinfo$derivs$gradient

 [1]  0.339026660 -0.008029951  0.145529325  0.093438612  0.123937398
 [6]  0.138252908  0.113161827  0.050304411  0.172043387  0.061015458
[11]  0.122968431  0.159325086  0.117684399  0.072351372  0.134278462
[16]  0.052197579 -0.001719254  0.109679304  0.199987785  0.129893883
[21]  0.138903688  0.024782721  0.047988965  0.028119425  0.096921221

We can double-check the internal gradient calculation with the
(slightly more expensive, and non-boundary-respecting) numDeriv::grad:

library(numDeriv)
dd <- update(m1, devFunOnly = TRUE)
grad(dd, unlist(getME(m1, c("theta", "beta"))))

 [1]  0.33902876 -0.00802679  0.14553678  0.09346970  0.12393419  0.13824924
 [7]  0.11335159  0.05030383  0.17205489  0.06102088  0.12296410  0.15933547
[13]  0.11768797  0.07235296  0.13426145  0.05219938 -0.00172568  0.10990842
[19]  0.19998764  0.12991478  0.13889906  0.02478346  0.04799346  0.02810442
[25]  0.09692275

  Looks sensible.

  I don't have more time to devote to this right now, but my next steps
would/will be:

 * restart the optimization from the fitted optimum (possibly? using
refit(), or update(m1,start=getME(m1,c("theta","beta")))) and see if
that makes the max grad smaller
 * try a range of optimizers, especially nlminb, as described here:
http://stackoverflow.com/questions/21344555/convergence-error-for-development-version-of-lme4/21370041#21370041
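
For concreteness, those two steps might look like this (a sketch,
assuming the fitted glmer model 'm1'; note that at present the
fixed-effect component of the 'start' list has to be named "fixef"):

```r
## Sketch: restart the optimization from the previously fitted optimum
## (the fixed-effect component of 'start' must be named "fixef")
pars <- getME(m1, c("theta", "fixef"))
m2 <- update(m1, start = pars)
max(abs(m2@optinfo$derivs$gradient))   ## hopefully smaller than before

## Sketch: refit with a different optimizer, e.g. the built-in bobyqa
m3 <- update(m1, control = glmerControl(optimizer = "bobyqa"))
```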

  Ben Bolker
On 14-03-26 02:19 PM, Tom Davis wrote:
Ben Bolker <bbolker at ...> writes:

PS ...
Steve Walker tried this, and it does 'work' -- successive refits
don't change the answer much, and the final gradient gets
progressively smaller (although it takes two refits to get below the
default test tolerance).  (The code above doesn't work as written
because the fixed effect component of the starting parameter list has
to be named 'fixef' -- we should change this to allow 'beta' as well
...)
http://stackoverflow.com/questions/21344555/convergence-error-for-development-version-of-lme4/21370041#21370041
I tried this with the full range of optimizers listed at that link.
*Only* the built-in Nelder-Mead has the problem; all other optimizers
(optimx + nlminb or L-BFGS-B; nloptr + Nelder-Mead or BOBYQA; built-in
bobyqa) get max(abs(gradient)) considerably less than the tolerance --
but the actual fitted model changes very little (the log-likelihood increases
by <0.01).
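
A sketch of that comparison using the built-in glmerControl options
(again assuming the fitted model 'm1'; the nloptr-based optimizers need
the wrapper machinery described in the linked answer, so they are
omitted here):

```r
## Sketch: refit with several optimizers and compare log-likelihoods
## and max absolute gradients ('m1' as before; optimx must be installed
## for the nlminb option)
fits <- list(
  NM     = update(m1, control = glmerControl(optimizer = "Nelder_Mead")),
  bobyqa = update(m1, control = glmerControl(optimizer = "bobyqa")),
  nlminb = update(m1, control = glmerControl(optimizer = "optimx",
                                optCtrl = list(method = "nlminb")))
)
sapply(fits, logLik)   ## should agree to within ~0.01
sapply(fits, function(f) max(abs(f@optinfo$derivs$gradient)))
```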

  So this does seem to be a false positive.  It still doesn't explain
why this is happening with Nelder-Mead, or under what circumstances
it's likely to happen (although big models do look prone to it).  We
should probably switch away from Nelder-Mead as the default throughout
(this was already done for lmer models in the last release, but not
for glmer), although I would love to do some more testing before jumping
out of the frying pan ...