John Maindonald email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473 fax : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
On 22 Mar 2008, at 12:51 AM, Andy Fugard wrote:
I do not think it quite true that the aov model that has an
Error() term is a fixed effects model. The use of the word
"stratum" implies that a mixed effects model is lurking
somewhere. The F-tests surely assume such a model.
This is something that has been bothering me for a while. Often
it's argued that ANOVA is just regression; clearly this is not true
when it's a repeated measures ANOVA, unless "regression" is
interpreted broadly. I think Andrew Gelman argues this somewhere.
I don't see how to get aov to give me a formula, and lm doesn't fit
stuff with an Error() term, but if it could, logically I would
expect the formula to resemble closely the sort of thing you get
with a mixed effects models.
The closest analogy I can find is that doing this...
aov1 = aov(yield ~ N+P+K + Error(block), npk)
summary(aov1)
Error: block
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 5 343 69
Error: Within
Df Sum Sq Mean Sq F value Pr(>F)
N 1 189.3 189.3 11.82 0.0037 **
P 1 8.4 8.4 0.52 0.4800
K 1 95.2 95.2 5.95 0.0277 *
Residuals 15 240.2 16.0
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
[I believe the F's and p's here come from a linear construction of
nested models (i.e., N versus intercept only; N+P versus N; plus
N+P+K versus N+P).]
... is a bit like doing this with lmer...
lmer0 = lmer(yield ~ 1 + (1|block), npk)
lmer1 = lmer(yield ~ N + (1|block), npk)
lmer2 = lmer(yield ~ N+P + (1|block), npk)
lmer3 = lmer(yield ~ N+P+K + (1|block), npk)
anova(lmer0,lmer1)
Data: npk
Models:
lmer0: yield ~ 1 + (1 | block)
lmer1: yield ~ N + (1 | block)
Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)
lmer0 2 157.5 159.8 -76.7
lmer1 3 151.5 155.1 -72.8 7.93 1 0.0049 **
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
Data: npk
Models:
lmer1: yield ~ N + (1 | block)
lmer2: yield ~ N + P + (1 | block)
Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)
lmer1 3 151.5 155.1 -72.8
lmer2 4 153.0 157.8 -72.5 0.47 1 0.49
Data: npk
Models:
lmer2: yield ~ N + P + (1 | block)
lmer3: yield ~ N + P + K + (1 | block)
Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)
lmer2 4 153.0 157.8 -72.5
lmer3 5 149.0 154.9 -69.5 6.02 1 0.014 *
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
Namely, fitting the models, and making comparisons using likelihood
ratio tests. Okay, the LLR is asymptotically chisq distributed
with df the difference in parameters whereas... well F is different
- I always get confused with the df and what bits of the residuals
plug in where.
But, is this vaguely right? (I'm aware that order matters when the
design isn't balanced, and have read much about the type I errors
versus type III errors disagreements.)
Actually this does lead to one question: following through the
analogy, are repeated measures ANOVAs (those fitted with aov)
always random intercept models, i.e. with no random slopes?
Cheers,
Andy
--
Andy Fugard, Postgraduate Research Student
Psychology (Room F3), The University of Edinburgh,
7 George Square, Edinburgh EH8 9JZ, UK
Mobile: +44 (0)78 123 87190 http://www.possibly.me.uk