two way ANOVA with unequal sample sizes
julien claude <claude at isem.univ-montp2.fr> writes:
Hi,

I am trying a two-way ANOVA with unequal sample sizes, but the results are not as expected. I take the example from Applied Linear Statistical Models (Neter et al., pp. 889-897, 1996):

growth rate  gender  bone development
1.4          1       1
2.4          1       1
2.2          1       1
2.4          1       2
2.1          2       1
1.7          2       1
2.5          2       2
1.8          2       2
2            2       2
0.7          3       1
1.1          3       1
0.5          3       2
0.9          3       2
1.3          3       2

The expected results are:

source of variation  SS      df  MS      F
gender               0.12    1   0.12    0.74
bone development     4.1897  2   2.0949  12.89**
interaction          0.0754  2   0.0377  0.23
Error                1.3     8   0.1625

# I use
aov (growrate ~ gender * bonedevelopment)->m
summary(m)
                                 Df Sum Sq Mean Sq F value   Pr(>F)
as.factor(gender)                 2 4.3063  2.1531 13.2501 0.002891 **
as.factor(bonedevlopment)         1 0.0926  0.0926  0.5697 0.472022
as.factor(gender:bonedevlopment)  2 0.0754  0.0377  0.2321 0.798034
Residuals                         8 1.3000  0.1625
Ahem. Tab damage detected... and your command and output don't match up. The as.factor(gender:bonedevlopment) is playing with fire... You should calculate factor() of each term. However, it would seem that you already did manage to convert things to factors or you would have gotten something to this effect:
evalq(as.factor(gender:bone.development),d)
[1] 1
Levels: 1
Warning messages:
1: Numerical expression has 14 elements: only the first used in: gender:bone.development
2: Numerical expression has 14 elements: only the first used in: gender:bone.development
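A minimal sketch of the "factor() of each term" approach, using the data as posted. The data frame name d and the fit names fit_gb and fit_bg are made up for illustration. The column names follow the header in the post, where the column labeled gender has three levels (as the 2 Df in the posted output also implies), so the two labels look swapped relative to the expected table.

```r
# Data as posted; column names follow the post's header.
d <- data.frame(
  growrate        = c(1.4, 2.4, 2.2, 2.4, 2.1, 1.7, 2.5, 1.8, 2.0,
                      0.7, 1.1, 0.5, 0.9, 1.3),
  gender          = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3),
  bonedevelopment = c(1, 1, 1, 2, 1, 1, 2, 2, 2, 1, 1, 2, 2, 2)
)

# Convert each term separately; the formula builds the interaction itself.
d$gender          <- factor(d$gender)
d$bonedevelopment <- factor(d$bonedevelopment)

# Same model, main effects entered in the two possible orders.
fit_gb <- aov(growrate ~ gender * bonedevelopment, data = d)
fit_bg <- aov(growrate ~ bonedevelopment * gender, data = d)

print(summary(fit_gb))
print(summary(fit_bg))
```

In each table the first main effect is unadjusted and the second is adjusted for the first, so the main-effect rows change with the order, while the interaction and residual rows stay the same.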
#if I change the order of factors, results are different
aov (growrate ~ bonedevelopment * gender)->m
summary(m)
                                 Df Sum Sq Mean Sq F value   Pr(>F)
as.factor(bonedevlopment)         1 0.0029  0.0029  0.0176 0.897785
as.factor(gender)                 2 4.3960  2.1980 13.5262 0.002713 **
as.factor(gender:bonedevlopment)  2 0.0754  0.0377  0.2321 0.798034
Residuals                         8 1.3000  0.1625
# In both cases, the results for the main effects differ from those expected in Neter et al. However, the interaction and the residuals are estimated correctly.
Can anyone help? Either my formula is wrong, or there is some other problem. Is there an easy way to carry out the test as it is done in Neter et al.?
The same problem occurs with anova(lm(....))?
I don't think we're the ones with the problem... There are various boneheaded ways in which people try to assign some kind of SumSq to main effects in the presence of interaction, and they are all wrong - although maybe not very wrong if the unbalance is slight. The tests *should* depend on the order of the terms, as is most clearly seen if the predictors are highly collinear.
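The collinear case can be sketched with simulated data. Everything here is an assumption made for illustration (the seed, the sample size, the noise levels and the names x1, x2, ss_12, ss_21); it is not taken from the growth-rate example.

```r
set.seed(1)
n  <- 100
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)   # x2 is nearly a copy of x1
y  <- x1 + x2 + rnorm(n)

# Sequential sums of squares with the predictors in either order.
ss_12 <- anova(lm(y ~ x1 + x2))[["Sum Sq"]]   # rows: x1, x2, Residuals
ss_21 <- anova(lm(y ~ x2 + x1))[["Sum Sq"]]   # rows: x2, x1, Residuals

# Sum of squares credited to x1 when it enters first versus last.
print(c(x1_first = ss_12[1], x1_last = ss_21[2]))

# The residual line does not depend on the order.
print(c(resid_12 = ss_12[3], resid_21 = ss_21[3]))
```

Whichever predictor enters first is credited with nearly all of the shared variation, and the one entering last gets only what it explains beyond the other. Both tables answer sensible but different questions, which is why the order matters.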
--
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907