Convergence problem example
Dear Ben,

I understand. Help us think of ways we can help you. I wonder why you don't take the easy route: have glmer do the convergence test; if it fails, change the optimizer to bobyqa, and if that silences the convergence warning, report that result. That's what I'd do, if I knew how :)

Dear everybody else,

Let's see if we can help! Questions/ideas:

1. How often does this come up? Casual googling indicates the problem was widespread two years ago, maybe not so many posts about it now. Do you think so?

2. What do you think about making a survey for lme4 users to find out how often convergence warnings happen? We could make a place to attach files that have code and R objects. If that survey existed, could we insert its address into the lme4 convergence warning along with instructions on what to do? I'm thinking something simple like: "Help with diagnostics. Run this: saveRDS(the_data_frame, file = "your-last-name-20180425.rds") and then paste the trouble-causing function call into the following box..." Some cases have private data, I understand, but I doubt that all of them do.

3. Should we archive & categorize the reproducible examples? What do you say if we (I) make a GitHub project for convergence-failure examples? Self-contained examples. Maybe the one I had today is example 1. I suppose we need maybe 20 or 30 more, with a variety of different warning messages. I wish I had an example where the convergence diagnostic is correct and the problem is not solved in a superficial way. I'd be especially enthusiastic if Stata or SAS don't notice and report non-converged results on the same model.

I found a post that hits the high points for me:
https://rstudio-pubs-static.s3.amazonaws.com/33653_57fc7b8e5d484c909b615d8633c01d51.html
(I wish the author would put his/her name in it!)

pj
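The "easy route" above could be sketched as a small wrapper. This is only a sketch, not lme4's behavior: the function name refit_if_warned is hypothetical, but the convergence messages really are stored in the fitted object's optinfo slot, and the example model uses the cbpp data shipped with lme4.

```r
library(lme4)

## Hypothetical sketch: refit with bobyqa only when the default fit
## raises lme4 convergence messages (stored in fit@optinfo$conv$lme4).
refit_if_warned <- function(fit) {
  msgs <- fit@optinfo$conv$lme4$messages
  if (!is.null(msgs) && length(msgs) > 0) {
    message("Convergence messages found; refitting with bobyqa")
    fit <- update(fit, control = glmerControl(optimizer = "bobyqa"))
  }
  fit
}

## Example with a dataset that ships with lme4:
gm  <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             data = cbpp, family = binomial)
gm2 <- refit_if_warned(gm)
```

If the refit also warns, of course, this silently reports a fit that two optimizers disagree about, which is part of why it may not belong inside glmer itself.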
On Wed, Apr 25, 2018 at 1:47 PM, Ben Bolker <bbolker at gmail.com> wrote:
Just a quick comment. As long-time readers of this forum may know (I think I've said it here before), the convergence warnings provided by lme4 are often over-sensitive. They're based on *post hoc* attempts (after the nonlinear optimizer thinks it has converged) to evaluate the KKT criteria (in this case reducing to a zero gradient and a positive-definite Hessian) by computing finite-difference approximations of the gradient and Hessian of the negative log-likelihood. The problem is that this computation is just as subject to numeric error as the nonlinear optimization itself, so it doesn't work very well. (Some of this is stated in ?lme4::convergence.)

Doug Bates feels that attempts to fix the problem are "toast-scraping", i.e. attempts to fix something that was a bad idea in the first place: I/we should throw away the metaphorical burned toast and scrap the post-hoc convergence tests, relying on the optimizer to tell us if it thinks it has converged successfully.

The Catch-22 is that (1) I'm afraid to throw away a system that might catch a reasonable fraction of truly bad convergence cases; (2) I don't want to flip-flop (take the tests out, then decide that they were a good idea after all and reintroduce them); (3) making an *informed* decision about what to do (e.g. throwing out the tests or setting a more appropriate threshold) would require a lot of work generating test cases -- and in order to get a really good handle on the problem, we'd have to generate not just "nice" test cases but all kinds of pathological examples where there really are convergence problems. Besides being hard and tedious, this is time I don't really have. Volunteers ... ?

cheers
Ben Bolker

On 2018-04-25 11:12 AM, Phillip Alday wrote:
Changing the optimizer seems to address the convergence warnings and has no impact on the other aspects of the fit:
m1 <- update(m1, control = glmerControl(optimizer = "bobyqa"))
summary(m1)
Generalized linear mixed model fit by maximum likelihood (Adaptive
Gauss-Hermite Quadrature, nAGQ = 30) [glmerMod]
Family: binomial ( logit )
Formula: fscr ~ lncfs + ai + heifer + (1 | cow)
Data: dairy
Control: glmerControl(optimizer = "bobyqa")
AIC BIC logLik deviance df.resid
4017.8 4047.9 -2003.9 4007.8 3022
Scaled residuals:
Min 1Q Median 3Q Max
-1.8136 -0.7652 -0.6479 1.0420 1.6213
Random effects:
Groups Name Variance Std.Dev.
cow (Intercept) 0.3419 0.5847
Number of obs: 3027, groups: cow, 1575
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.55696 0.43481 -3.581 0.000343 ***
lncfs 0.52222 0.10007 5.218 1.8e-07 ***
aiai -1.09560 0.12341 -8.878 < 2e-16 ***
heiferprimiparous -0.08259 0.09718 -0.850 0.395418
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) lncfs aiai
lncfs -0.965
aiai -0.193 -0.046
heifrprmprs -0.054 0.016 -0.051
Phillip
On 04/25/2018 04:56 PM, Paul Johnson wrote:
In the book Multilevel and Longitudinal Modeling using Stata,
Rabe-Hesketh and Skrondal have a lot of exercises and over the years
I've been trying to write Stata and R code to demonstrate
similarities/differences.
I've run into an example where Stata (either with builtin methods or
the addon gllamm) seems to think it gets estimates but lme4
diagnostics say there is a convergence failure. I want to impose on
you and get your advice about it. We see convergence warnings quite
often with lme4, but they are usually the "rescale your variables"
warnings, nothing as blunt as this.
The first part retrieves "dairy.dta" from the book website
library(foreign)
library(lme4)

fn <- "dairy"
if (!file.exists(paste0(fn, ".dta12"))) {
    download.file(paste0("http://www.stata-press.com/data/mlmus3/", fn, ".dta"),
                  destfile = paste0(fn, ".dta12"))
}
dairy <- read.dta(paste0(fn, ".dta12"))
#1
m1 <- glmer(fscr ~ lncfs + ai + heifer + (1 | cow),
            family = binomial(link = "logit"), nAGQ = 30,
            data = dairy)
summary(m1)
From that, I see:
m1 <- glmer(fscr ~ lncfs + ai + heifer + (1 | cow),
+           family = binomial(link = "logit"), nAGQ = 30,
+           data = dairy)
Warning messages:
1: 'rBind' is deprecated.
 Since R version 3.2.0, base's rbind() should work fine with S4 objects
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
  Model failed to converge with max|grad| = 0.00396932 (tol = 0.001, component 1)
summary(m1)
Generalized linear mixed model fit by maximum likelihood (Adaptive
Gauss-Hermite Quadrature, nAGQ = 30) [glmerMod]
Family: binomial ( logit )
Formula: fscr ~ lncfs + ai + heifer + (1 | cow)
Data: dairy
AIC BIC logLik deviance df.resid
4017.8 4047.9 -2003.9 4007.8 3022
Scaled residuals:
Min 1Q Median 3Q Max
-1.8136 -0.7652 -0.6479 1.0420 1.6213
Random effects:
Groups Name Variance Std.Dev.
cow (Intercept) 0.3419 0.5847
Number of obs: 3027, groups: cow, 1575
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.55693 0.43481 -3.581 0.000343 ***
lncfs 0.52221 0.10007 5.218 1.81e-07 ***
aiai -1.09560 0.12341 -8.878 < 2e-16 ***
heiferprimiparous -0.08259 0.09718 -0.850 0.395410
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) lncfs aiai
lncfs -0.965
aiai -0.193 -0.046
heifrprmprs -0.054 0.016 -0.051
convergence code: 0
Model failed to converge with max|grad| = 0.00396932 (tol = 0.001, component 1)
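The max|grad| quantity in that warning is the scaled gradient from lme4's post-hoc check, and it can be recomputed by hand from the fitted object, following the approach shown in ?lme4::convergence. A sketch (using lme4's built-in cbpp data as a stand-in, since the dairy data requires a download):

```r
library(lme4)

gm <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
            data = cbpp, family = binomial)

## Finite-difference derivatives, computed after the optimizer stops
derivs <- gm@optinfo$derivs
grad   <- derivs$gradient   # should be near zero at a true optimum
hess   <- derivs$Hessian    # should be positive definite at a minimum

## Approximate scaled gradient, as in ?lme4::convergence
sc_grad <- solve(hess, grad)
max(abs(sc_grad))

## KKT-style checks on the raw derivatives
max(abs(grad))
min(eigen(hess, only.values = TRUE)$values) > 0
```

Since both the optimization and these finite-difference derivatives carry numeric error, a max|grad| slightly over the 0.001 tolerance (as here, 0.004) need not indicate a genuinely bad fit.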
Stata output for comparison purposes, using gllamm:
. gllamm fscr lncfs ai heifer, i(cow) link(logit) fam(binom) adapt
Running adaptive quadrature
Iteration 0: log likelihood = -2004.6011
Iteration 1: log likelihood = -2003.9085
Iteration 2: log likelihood = -2003.9069
Adaptive quadrature has converged, running Newton-Raphson
Iteration 0: log likelihood = -2003.9069
Iteration 1: log likelihood = -2003.9069
Iteration 2: log likelihood = -2003.9064
Iteration 3: log likelihood = -2003.9064
number of level 1 units = 3027
number of level 2 units = 1575
Condition Number = 47.123877
gllamm model
log likelihood = -2003.9064
------------------------------------------------------------------------------
fscr | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lncfs | .5222157 .1000693 5.22 0.000 .3260834 .718348
ai | -1.095598 .1234099 -8.88 0.000 -1.337477 -.8537193
heifer | -.0825878 .0971811 -0.85 0.395 -.2730593 .1078837
_cons | -1.556961 .4348008 -3.58 0.000 -2.409155 -.7047673
------------------------------------------------------------------------------
Variances and covariances of random effects
------------------------------------------------------------------------------
***level 2 (cow)
var(1): .34188062 (.1263136)
------------------------------------------------------------------------------
Using the newer meglm that is provided with Stata
. meglm fscr lncfs ai heifer || cow: , family(binom) link(logit)
Fitting fixed-effects model:
Iteration 0: log likelihood = -2011.8253
Iteration 1: log likelihood = -2009.1421
Iteration 2: log likelihood = -2009.1412
Iteration 3: log likelihood = -2009.1412
Refining starting values:
Grid node 0: log likelihood = -2015.6021
Fitting full model:
Iteration 0: log likelihood = -2015.6021
Iteration 1: log likelihood = -2006.709
Iteration 2: log likelihood = -2003.9174
Iteration 3: log likelihood = -2003.9065
Iteration 4: log likelihood = -2003.9065
Mixed-effects GLM Number of obs = 3,027
Family: binomial
Link: logit
Group variable: cow Number of groups = 1,575
Obs per group:
min = 1
avg = 1.9
max = 5
Integration method: mvaghermite Integration pts. = 7
Wald chi2(3) = 103.86
Log likelihood = -2003.9065 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
fscr | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lncfs | .5222154 .1000694 5.22 0.000 .326083 .7183479
ai | -1.095597 .1234095 -8.88 0.000 -1.337476 -.8537191
heifer | -.0825878 .0971811 -0.85 0.395 -.2730593 .1078836
_cons | -1.556961 .4348012 -3.58 0.000 -2.409155 -.7047659
-------------+----------------------------------------------------------------
cow |
var(_cons)| .3418776 .1263105 .1657237 .7052721
------------------------------------------------------------------------------
LR test vs. logistic model: chibar2(01) = 10.47 Prob >= chibar2 = 0.0006
As far as I can see, parameter estimates are the same. I did not
compare conditional modes.
If lme4's convergence warning is valid, then we have a concrete case
where Stata is reporting the wrong thing. I can see some upside there
:)
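One way to probe whether a warning like this is a false positive is to refit the same model with every available optimizer and compare log-likelihoods and estimates. A helper for this, allFit(), ships with recent lme4 versions (older setups obtained a similar function from the afex package); a sketch on lme4's built-in cbpp data:

```r
library(lme4)

## Refit one model with all available optimizers and compare results.
gm <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
            data = cbpp, family = binomial)
af <- allFit(gm)

ss <- summary(af)
ss$llik    # log-likelihoods across optimizers
ss$fixef   # fixed-effect estimates across optimizers
```

If the optimizers that finish land on essentially the same log-likelihood and estimates, as bobyqa, gllamm, and meglm do for the dairy model, the warning is likely spurious.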
R details
sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 17.10

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] lme4_1.1-15    Matrix_1.2-14  foreign_0.8-69

loaded via a namespace (and not attached):
[1] minqa_1.2.4     MASS_7.3-49     compiler_3.4.4  tools_3.4.4
[5] Rcpp_0.12.15    splines_3.4.4   nlme_3.1-137    grid_3.4.4
[9] nloptr_1.0.4    lattice_0.20-35
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Paul E. Johnson
http://pj.freefaculty.org
Director, Center for Research Methods and Data Analysis
http://crmda.ku.edu
To write to me directly, please address me at pauljohn at ku.edu.