Skip to content

ANOVA 1 too few degrees of freedom

6 messages · S Ellison, Peter Dalgaard, Richard M. Heiberger +1 more

#
Yes; if you had only Plot+Day you'd have a completely balanced full
factorial ... for Plot and Day.

But I think I now see an answer to your puzzlement. 

Your original ANOVA table was 
                                                              Df  Sum
Sq    Mean Sq    F value           Pr(>F) 
    
Combined.Trt                                           1   52.80      
52.805     2.0186e+30  < 2.2e-16 *** 
as.factor(Combined.Plot)                         10  677.69    67.769  
  2.5907e+30  < 2.2e-16 *** 
as.factor(Combined.Day)                         16  2817.47  176.092  
6.7317e+30  < 2.2e-16 *** 
Combined.Trt:as.factor(Combined.Day)   16   47.82      2.989      
1.1426e+29  < 2.2e-16 *** 
as.factor(Combined.Day):Combined.Plot 160  611.21   3.820 1.4604e+29 <
2.2e-16 *** 

Residuals                                                 204    0.00  
0.000 

Now, your Day:Plot combination has 17*12=204 combinations. And you are
showing 204 residual DOF, which is exactly right for two observations
per group and 204 groups. That's exactly right for the model
Day+Plot+Day:Plot.

But you have Trt in the model as well. Where are the Treatment DoF
coming from? 
Clearly, the treatment DoF has not come out of the residual term. That
means the treatment DoF must have come from somewhere else. Since you've
lost 2 instead of 1 DoF from Plot and kept your n[Day]-1 degrees of
freedom for Day, I would guess that Plot is nested in Trt. If that is
the case, you'd 'obviously' expect n[Plot]-2 dof for Plot. And indeed i
could replicate your ANOVA table DoF with 

Trt<-gl(2,204)
Plot<-gl(12,34) #Note that Plot is NOT fully crossed with Trt; each Trt
has only 6 Plots
Day <- gl(17,2,408)
y<-rnorm(408)
anova(lm(y~Trt+Plot+Day+Trt:Day+Day:Plot))

So that would be one data structure that would explain your DoF.

But if that is indeed the structure, there are other concerns that you
may need to look out for. 
First, compare the two models
anova(lm(y~Trt+Plot+Day+Trt:Day+Day:Plot))
#and
anova(lm(y~Trt+Plot+Day+Day:Plot+Trt:Day))

Same terms in the model specification, but different DoF and Trt:Day
has completely vanished from the second ANOVA table. Something is
clearly either aliased, unbalanced or incorrectly specified. In my
presumed version of events, because the Plot is nested in Trt, Day:Plot
is also nested in Trt. To get _consistent_ results independent of model
order you would need to reflect that additional nesting in the model:

anova(lm(y~Trt+Plot+Day+Trt:Day:Plot+Trt:Day))
#vs
anova(lm(y~Trt+Plot+Day+Trt:Day+Trt:Day:Plot))

and this time, the two models give identical results for all rows in
the table. Somewhat reassuringly 
anova(lm(y~Day+Trt/Plot+Trt:Day+Trt:Day:Plot))
and even more reassuringly car's Anova does too.

But there's another more fundamental thing that may suggest this is
still not quite the right thing to do. If Plot is nested in Trt, it
matters whether plot is random or fixed when asking about the treatment
effect. I'd guess Plot is random. If plot is random and nested in Trt it
would not be appropriate to simply compare any calculated Trt MS with
the residual MS. Rather, one would compare the Trt MS with the Trt:Plot
interaction. Or perhaps resort to a mixed effects model using lme (if
Plot is the only random term) or lmer if Plot and Day are both random. 




*******************************************************************
This email and any attachments are confidential. Any use...{{dropped:8}}
#
Thanks slre,

I seem to be making some progress now.

Using a colon instead of an asterisk in the code really changes things. I
had been getting residual SS and MS of zero. Which is ridiculous. Now I get
much more plausible values. 

Also, When I used an asterisk instead of a colon It wouldn't give results
for three way interactions. With colons it will.

You are correct about plot being nested within treatment. There are six
plots in each of 2 treatments.

So, I guess I will have to perform a separate analysis to quantify the
effect of treatment.

Thanks again.

Analysis of Variance Table

Response: Combined.Rs
                                                                                
Df  Sum Sq   Mean Sq   F value          Pr(>F)    
Combined.Trt                                                             1  
52.80     52.805     96.2601       < 2.2e-16 ***
Combined.Plot                                                            10 
677.69   67.769     123.5380    < 2.2e-16 ***
as.factor(Combined.Day)                                            16 
2817.47 176.092   321.0041    < 2.2e-16 ***
Combined.Trt:as.factor(Combined.Day)                      16   47.82    
2.989       5.4487        4.048e-10 ***
Combined.Trt:Combined.Plot:as.factor(Combined.Day)80  455.42   5.693      
10.3776      < 2.2e-16 ***
Residuals                                                                   
284 155.79   0.549                       
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 


--
View this message in context: http://r.789695.n4.nabble.com/ANOVA-1-too-few-degrees-of-freedom-tp3493349p3499649.html
Sent from the R help mailing list archive at Nabble.com.
#
On May 5, 2011, at 23:30 , Rovinpiper wrote:

            
Beware that as you have highly significant effects of plot and its interaction with day, and plot being nested in treatment, you can't test for treatment or treatment:day effect with a systematic effect of plot  and plot:treatment in the model (you are only getting p values because of the sequential computation of the anova table - if you put plot before treatment, you'd get zero df). More likely, you want to make the "plot" terms random, as in ~treatment*day + Error(plot/day)

  
    
4 days later
#
Please read the first two paragraphs of Details from ?formula

Ok, I just did.

So, If I understand this properly, the term Plot*Day would include both the
main effects of a and b and their second order interactions. So it could be
written Plot + Day + Plot:Day. 

The term Plot:Day includes only the second order interactions of Plot and
Day. 

So would that make this model redundant:

Rs ~ Plot + Day + Plot:Day

Since it includes the main effects of Plot and Day twice?

Thanks.

--
View this message in context: http://r.789695.n4.nabble.com/ANOVA-1-too-few-degrees-of-freedom-tp3493349p3514828.html
Sent from the R help mailing list archive at Nabble.com.