anova function gives (irrelevant) results on two models with different numbers of observations
On Sat, Apr 12, 2008 at 11:29 AM, Olivier Renaud
<Olivier.Renaud at pse.unige.ch> wrote:
Hi,
I have encountered what I believe is a bug or an extremely unwanted "feature" in lme4. In short, the anova function does not check that the two models are based on the same number of observations and can thus give irrelevant results, as exemplified below. Note that lme (at least on S+) correctly refuses to compare the models.
Perhaps you would like to contribute the code that performs the
appropriate checks on the models before creating the analysis of
variance table. Check out a copy of the lme4 package sources from the
R-forge repository and search for
setMethod("anova"
in the file pkg/R/lmer.R You will see that there are already checks
in that method for the models being fit to the same data argument (if
used). Patches are welcome.
I found this problem, because I wanted to understand why the LRT gave extremely different value than the Wald/t-stat index/statistics/test (choose one ;-) ), see below. It took me a while to realize that the number of observations was not the same in the two models (156 vs. 160). It happens when an explanatory variable is present in only one of the two models and contains NA's, which in not an uncommon situation. The MLdeviances then are very different but not comparable. I'm afraid it might have unnoticed by other users. I have reproduced the calls in lmer (R) and lme (S+). Olivier R version 2.6.1 (2007-11-26) Package: lme4 Version: 0.99875-9
> (Fu.mod0.lmer <- lmer(Y ~ 1 + ( 1|subjec), data = Fu,
na.action=na.omit, method="ML") )
Linear mixed-effects model fit by maximum likelihood
Formula: Y ~ 1 + (1 | subjec)
Data: Fu
AIC BIC logLik MLdeviance REMLdeviance
654.7 660.9 -325.4 650.7 651.4
Random effects:
Groups Name Variance Std.Dev.
subjec (Intercept) 0.93595 0.96745
Residual 2.96389 1.72160
number of obs: 160, groups: subjec, 16
Fixed effects:
Estimate Std. Error t value
(Intercept) 3.4875 0.2775 12.57
> (Fu.mod1.lmer <- lmer(Y ~ expvariable + ( 1|subjec), data = Fu,
na.action=na.omit, method="ML") )
Linear mixed-effects model fit by maximum likelihood
Formula: Y ~ expvariable + (1 | subjec)
Data: Fu
AIC BIC logLik MLdeviance REMLdeviance
637.8 646.9 -315.9 631.8 633.1
Random effects:
Groups Name Variance Std.Dev.
subjec (Intercept) 1.0357 1.0177
Residual 2.8796 1.6969
number of obs: 156, groups: subjec, 16
Fixed effects:
Estimate Std. Error t value
(Intercept) 3.2673 0.3145 10.39
expvariable 0.4140 0.2856 1.45
Correlation of Fixed Effects:
(Intr)
expvariable -0.398
> anova(Fu.mod0.lmer, Fu.mod1.lmer)
Data: Fu
Models:
Fu.mod0.lmer: Y ~ 1 + (1 | subjec)
Fu.mod1.lmer: Y ~ expvariable + (1 | subjec)
Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)
Fu.mod0.lmer 2 654.70 660.85 -325.35
Fu.mod1.lmer 3 637.78 646.93 -315.89 18.917 1 1.365e-05 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
######## In S+8.0
> Fu.mod0.lme <- lme(Y ~ 1, random=~1|subjec, data = Fu,
na.action=na.omit, method="ML")
> summary(Fu.mod0.lme)
Linear mixed-effects model fit by maximum likelihood
Data: Fu
AIC BIC logLik
656.7007 665.9262 -325.3503
Random effects:
Formula: ~ 1 | subjec
(Intercept) Residual
StdDev: 0.9674476 1.721595
Fixed effects: Y ~ 1
Value Std.Error DF t-value p-value
(Intercept) 3.4875 0.2783988 144 12.52699 <.0001
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-2.244476 -0.7371921 -0.2445665 0.8289158 2.608248
Number of Observations: 160
Number of Groups: 16
> Fu.mod1.lme <- lme(Y ~ expvariable, random=~1|subjec, data = Fu,
na.action=na.omit, method="ML")
> summary(Fu.mod1.lme)
Linear mixed-effects model fit by maximum likelihood
Data: Fu
AIC BIC logLik
639.7835 651.983 -315.8918
Random effects:
Formula: ~ 1 | subjec
(Intercept) Residual
StdDev: 1.017715 1.696945
Fixed effects: Y ~ expvariable
Value Std.Error DF t-value p-value
(Intercept) 3.267305 0.3164881 139 10.32363 <.0001
expvariable 0.413960 0.2874080 139 1.44032 0.152
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-2.4028 -0.6934754 -0.2075537 0.7346619 2.768138
Number of Observations: 156
Number of Groups: 16
> anova(Fu.mod0.lme, Fu.mod1.lme)
Problem in anova.lme(Fu.mod0.lme, Fu.mod1.lme): All fitted objects must use the same number of observations Use traceback() to see the call stack -- Olivier.Renaud at pse.unige.ch http://www.unige.ch/~renaud/ Methodology & Data Analysis - Psychology Dept - University of Geneva UniMail, Office 4142 - 40, Bd du Pont d'Arve - CH-1211 Geneva 4
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models