AIC in nlmer
Helen Sofaer <helen at ...> writes:
Dear modelers,
I have been using nlmer to fit a logistic function with random effects, and I'd like to use AIC values to compare models with different random effect structures. I've noticed that I get wildly different AIC values (e.g. delta-AIC values of over 100) when I build models with the random effect on different parameters. In my dataset this occurs with both the Laplace approximation and adaptive quadrature; in the example Orange dataset it occurs most clearly with the Laplace approximation. Is this due to false convergence? Are there any known issues with AIC in nlmer, even when using nAGQ > 1?
I'm working on this, but a quick answer just to get back to you:
I suspect this is a bug in nlmer.
The Details section of ?nlmer says:
"... the ?nAGQ? argument only applies to calls to ?glmer?."
However, this doesn't in practice seem to be true -- nlmer
is doing *something* different when nAGQ>1 ... more disturbingly,
the answer from AGQ=5 agrees with nlme (which in the absence of
other evidence I would take to be correct). lme4 thinks it got a
slightly better answer than nlme, but that may be due to a difference
in the way the log-likelihoods are calculated (i.e. different additive
constants).
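For concreteness, the three fits compared below could be set up along these lines (a sketch only: the actual code used for fitlist is not shown in the thread, and the exact arguments, including whether your version of nlmer accepts nAGQ = 5, are assumptions based on the discussion above):

```r
library(nlme)  # for nlme()
library(lme4)  # for nlmer()

fitlist <- list(
  # classic nlme fit: random effect of Tree on the asymptote
  nlme = nlme(circumference ~ SSlogis(age, Asym, xmid, scal),
              fixed  = Asym + xmid + scal ~ 1,
              random = Asym ~ 1 | Tree,
              data   = Orange,
              start  = c(Asym = 200, xmid = 725, scal = 350)),
  # lme4 fit with the default Laplace approximation (nAGQ = 1)
  lme4_Laplace = nlmer(circumference ~ SSlogis(age, Asym, xmid, scal) ~ Asym | Tree,
                       data  = Orange,
                       start = c(Asym = 200, xmid = 725, scal = 350),
                       nAGQ  = 1),
  # lme4 fit with 5-point adaptive Gauss-Hermite quadrature
  lme4_AGQ5 = nlmer(circumference ~ SSlogis(age, Asym, xmid, scal) ~ Asym | Tree,
                    data  = Orange,
                    start = c(Asym = 200, xmid = 725, scal = 350),
                    nAGQ  = 5)
)
```

The log-likelihood comparison below then comes from applying logLik() across this list.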
> sapply(fitlist, logLik)
        nlme lme4_Laplace    lme4_AGQ5
   -131.5846    -945.3120    -129.8548
Coefficients shown below (extracted using some code I've been
extending from stuff in the arm package; it's the same information
as in summary(), just a bit more compact).
The fixed effect estimates are all pretty much the same
(not quite trivial differences, but all << the standard error)
If I were you, until this gets sorted out, I would definitely
use nlme (which is much better tested), unless you need something
nlmer supplies (e.g. crossed random effects, speed?).
Thanks for the report: while I will probably only be poking around
the edges of this, I will presume on behalf of Doug Bates to encourage
you to keep reporting issues, and to check back if you don't see
any improvements coming out.
In the meantime, I have two reading suggestions. On AIC:
Greven, Sonja, and Thomas Kneib. 2010. "On the Behaviour of Marginal
and Conditional Akaike Information Criteria in Linear Mixed Models."
Biometrika 97 (4): 773-789. <http://www.bepress.com/jhubiostat/paper202/>
They focus slightly more on conditional AIC (rather than the
marginal AIC that nlme/lme4 report); I think the marginal
AIC is more applicable to your case, but they give a nice discussion
of the issues with AIC in the mixed-model case.
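For reference, the marginal AIC that nlme/lme4 report is just -2*logLik + 2*npar, with npar counting the fixed effects plus the variance-covariance parameters (the conditional AIC in Greven and Kneib instead uses an effective number of parameters). A self-contained illustration with a trivial lm() fit, just to show how the pieces fit together:

```r
# Assemble the marginal AIC by hand and check it against AIC();
# lm() is used only so the example runs without any extra packages.
fit  <- lm(circumference ~ age, data = Orange)
ll   <- logLik(fit)         # log-likelihood of the fit
npar <- attr(ll, "df")      # parameter count stored on the logLik object
aic  <- -2 * as.numeric(ll) + 2 * npar
all.equal(aic, AIC(fit))    # TRUE
```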
Also, if you're interested in reading more about the particular
case of the orange trees, Madsen and Thyregod have a nice section at the
end:
Madsen, Henrik, and Poul Thyregod. 2010. Introduction to General and
Generalized Linear Models. 1st ed. CRC Press.
cheers
Ben Bolker
=============
$nlme
Estimate Std. Error
Asym 191.0501 16.154
xmid 722.5598 35.152
scal 344.1687 27.148
4 31.4826 NA
5 7.8463 NA
$lme4_Laplace
Estimate Std. Error
Asym 192.041 104.086
xmid 727.891 31.966
scal 347.968 24.416
sd.Tree.Asym 232.349 NA
sd.resid 7.271 NA
$lme4_AGQ5
Estimate Std. Error
Asym 192.059 15.585
xmid 727.934 34.443
scal 348.092 26.310
sd.Tree.Asym 31.647 NA
sd.resid 7.843 NA
The major problem I've had with these AIC values is that the relative AIC values don't always agree with the estimated RE standard deviations. For example, using the orange tree data, running a model with a random effect of tree identity on both the asymptote and on the scale parameter shows that the RE on the scale parameter does not explain any additional variance in the data. However, the AIC value for this model (AIC = 250.2) is lower than for the model with a RE only on the asymptote (AIC = 269.7). This is typical of what I have seen with my dataset, where the biological inference drawn from the estimates does not agree with the relative AIC values. I recognize that it may be a stretch to fit multiple random effects in this tiny Orange dataset, but I see the same issues using a much larger dataset. I have included the code for the orange data below.

Thank you for your input,
Helen

Helen Sofaer
PhD Candidate in Ecology
Colorado State University

Orange_Asym <- nlmer(circumference ~ SSlogis(age, Asym, xmid, scal) ~ Asym | Tree,
                     data = Orange,
                     start = c(Asym = 200, xmid = 725, scal = 350),
                     verbose = TRUE, nAGQ = 5)
summary(Orange_Asym)
## adaptive quad: AIC = 269.7
## Laplace:       AIC = 1901

Orange_scal <- nlmer(circumference ~ SSlogis(age, Asym, xmid, scal) ~ scal | Tree,
                     data = Orange,
                     start = c(Asym = 200, xmid = 725, scal = 350),
                     verbose = TRUE, nAGQ = 5)
summary(Orange_scal)
## adaptive quad: AIC = 314.4
## Laplace:       AIC = 8927

Orange_Ascal <- nlmer(circumference ~ SSlogis(age, Asym, xmid, scal) ~ (scal | Tree) + (Asym | Tree),
                      data = Orange,
                      start = c(Asym = 200, xmid = 725, scal = 350),
                      verbose = TRUE, nAGQ = 5)
summary(Orange_Ascal)
## adaptive quad: AIC = 250.2
## Laplace:       AIC = 1574
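One way to see the mismatch described above side by side is to tabulate the AIC values against the estimated random-effect standard deviations. A sketch, assuming the three models above (Orange_Asym, Orange_scal, Orange_Ascal) have already been fitted with lme4:

```r
library(lme4)

# Collect the fitted models (names are for display only)
fits <- list(Asym = Orange_Asym, scal = Orange_scal, Ascal = Orange_Ascal)

sapply(fits, AIC)      # the relative AIC values in question
lapply(fits, VarCorr)  # the RE standard deviations they should agree with
```

If the RE on scal really explains no additional variance, its estimated standard deviation in VarCorr() should be near zero even while the AIC ranking favors the two-RE model.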
_______________________________________________
R-sig-mixed-models <at> r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models