Message-ID: <A4E5A0B016B8CB41A485FC629B633CED4695363AEB@GOLD.corp.lgc-group.com>
Date: 2012-08-14T11:30:01Z
From: S Ellison
Subject: anova in unbalanced data
In-Reply-To: <CAGuusR_OintB+nqB33R2EWQDBY-3FEa2nodjKxQf+NL5eGjHoQ@mail.gmail.com>
> -----Original Message-----
> Say I have the following data:
>
> a<-data.frame(col1=c(rep("a",5),rep("b",7)),col2=runif(12))
>
> a_aov<-aov(a$col2~a$col1)
>
> summary(a_aov)
>
>
> Note that there are 5 observations for a and 7 for b, so the design is
> unbalanced. What would be the correct way of doing anova for this set?
>
As this is a single-factor case, the imbalance doesn't affect the interpretation. For two-way and higher models it would, and John Fox's post (and a very large literature) then applies. But here, the usual sum-of-squares variants (Type I, II and III) and contrast choices will all return the same p-value, so aov works, as does
anova(lm(col2~col1, data=a)) #note that the 'data' argument also works in aov
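
To check the claim that the variants agree for a single factor, here is a minimal sketch. It assumes the same layout as the posted data (5 'a' and 7 'b' observations); the seed and the names p_aov, p_lm, p_t and p_sum are arbitrary and only there for illustration. With two groups, the equal-variance t-test should give the same p-value as well.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
a <- data.frame(col1 = factor(c(rep("a", 5), rep("b", 7))),
                col2 = runif(12))

# One-way ANOVA via aov() and via lm(); both use the 'data' argument
p_aov <- summary(aov(col2 ~ col1, data = a))[[1]][["Pr(>F)"]][1]
p_lm  <- anova(lm(col2 ~ col1, data = a))[["Pr(>F)"]][1]

# With only two groups, the pooled-variance t-test is the same test
p_t <- t.test(col2 ~ col1, data = a, var.equal = TRUE)$p.value

# Switching from the default treatment contrasts to sum-to-zero contrasts
# changes the coefficients but not the F test for the factor
fit_sum <- lm(col2 ~ col1, data = a, contrasts = list(col1 = "contr.sum"))
p_sum <- anova(fit_sum)[["Pr(>F)"]][1]

print(c(aov = p_aov, lm = p_lm, t = p_t, contr.sum = p_sum))
```

All four values should agree to numerical precision; that agreement is specific to the one-factor case and is not expected to hold for unbalanced two-way designs.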
S