Dear list,
I apologize in advance if this question is stupid; it's just that there's
something that I do not understand and I would like to learn more. I know
that for linear mixed models, optimizing via ML provides slightly
anticonservative parameter estimates; as I understand it, this is because
ML estimates ? without taking into account uncertainty in the estimates for
? (since the two are estimated simultaneously). This uncertainty can be
taken into account by optimizing the REML criterion instead, which
integrates out the unknown ?, and hence finds estimates for ? that should
be appropriate for *any* value that ? could have had.
If this is correct, the REML criterion contains an added constant that
varies depending on the provided fixed-effects structure, and hence REML
fits with different fixed effects cannot be compared. When asking for the
anova() of two REML-fitted lme4 objects, the package thus refits the models
using ML:
Loading required package: Matrix
big <- lmer(Reaction ~ Days + (1|Subject),sleepstudy)
small <- lmer(Reaction ~ (1|Subject),sleepstudy)
anova(small,big) # refits model
refitting model(s) with ML (instead of REML)
Data: sleepstudy
Models:
small: Reaction ~ (1 | Subject)
big: Reaction ~ Days + (1 | Subject)
Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
small 3 1916.5 1926.1 -955.27 1910.5
big 4 1802.1 1814.8 -897.04 1794.1 116.46 1 < 2.2e-16 ***
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
This is what I do not understand: why do we need to re-optimize the (now
ML) objective function, when we already have the optimal values in big at theta
and small at theta? Starting from the REML criterion, we would only need to
take out the integrated fixed effects, and then we end up with an ML
likelihood, correct? But then, why can't I simply do:
big.LLML <- update(big,REML=F,devFunOnly=T)(big at theta) #compute ML
deviance based on optimal theta
small.LLML <- update(small,REML=F,devFunOnly=T)(small at theta)
small.LLML - big.LLML # same difference, without needing to re-fit!
[1] 116.4677
i.e. take the already-found optimal ? values and plug them into the ML
formula? Why do we need to search for the optimal ? values again -- even if
the values found by ML differ from those found by REML, shouldn't the REML
values be preferred?
Thank you very much for any insights,
Cesko