
Anova - adjusted or sequential sums of squares?

2 messages · michael watson (IAH-C), Brian Ripley

Thanks for the response.  Answers to your questions in turn:

My null hypothesis is that there is no difference between the treatment
means.  I guess that makes my alternative that there is a difference.

I understand all about interactions, and yes, there's an interaction
term in my model.  Moreover, it is a pretty easy interaction to
understand and interpret.  In this example case, yes, the interaction
term is significant, and so I know I can and should only interpret this
term and not any of the lower-order terms.

However, I will be repeating this analysis for other response variables,
some of which inevitably will not have a significant interaction term.
What then?  I guess one answer would be to say that as it's not
significant, I could remove it from the model and perform some model
comparisons as you suggest?

Doug agrees with the person who taught me stats: I should only be
looking at the type I sequential sums of squares.  I also like that, as
it's what comes out of R.  It's just that Minitab freaked me out.

I guess what I want to know is: if I use the type I sequential SS, as
reported by R, on my unbalanced factorial ANOVA, am I doing something
horribly wrong?  I think the answer is no.

I guess I could use drop1() to get from the type I to the type III in
R...
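To make that concrete, here is a minimal sketch (the data frame, factor
levels, and seed are invented for illustration, not from this thread)
contrasting the sequential tests from anova() with the per-term tests
drop1() reports on an unbalanced two-factor fit:

```r
## Hypothetical unbalanced two-factor data, invented for illustration.
set.seed(1)
dat <- data.frame(
  A = factor(c("a1", "a1", "a1", "a2", "a2", "a2", "a2", "a2")),
  B = factor(c("b1", "b2", "b2", "b1", "b1", "b1", "b2", "b2"))
)
dat$y <- rnorm(nrow(dat))

fit <- lm(y ~ A + B, data = dat)
anova(fit)              ## sequential (type I) SS: A first, then B adjusted for A
drop1(fit, test = "F")  ## each term tested after all the others (marginal)
```

Because the design is unbalanced, the two tables generally disagree for
A; they agree for B, the last term entered, since its sequential and
marginal sums of squares coincide.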

-----Original Message-----
From: Liaw, Andy [mailto:andy_liaw at merck.com] 
Sent: 20 April 2005 15:05
To: michael watson (IAH-C); r-help at stat.math.ethz.ch
Subject: RE: [R] Anova - adjusted or sequential sums of squares?

Here we go again...  The `type I vs. type III SS' controversy has long
been debated here and elsewhere.  I'll give my personal bias, and leave
you to dig deeper if you care to.

The `types' of sums of squares are a creation of SAS.  Each type
corresponds to a different hypothesis being tested.  The short answer
to your question would be: `What are your null and alternative
hypotheses?'

One of the problems with categorizing tests like that is that it tends
to keep people from thinking about the question above, which leads to
confusion about which type to use.

The school of thought I was brought up in says you need not (and should
not) think that way.  Rather, frame your question in terms of model
comparisons.  This approach avoids the notorious problem of comparing
the full model to ones that contain the interaction but lack a main
effect that is involved in that interaction.

More practically:  Do you have an interaction in your model?  If so,
the result for the interaction term should be the same under either
`type' of test.  If that interaction term is significant, you should
find other ways to understand the effects, and _not_ test for
significance of the main effects in the presence of interaction.  If
there is no interaction term, you can assess effects by model
comparisons such as:

m.full <- lm(y ~ A + B)
m.A <- lm(y ~ A)
m.B <- lm(y ~ B)
anova(m.B, m.full)  ## test for A effect
anova(m.A, m.full)  ## test for B effect
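If it helps, checking the interaction itself fits the same
model-comparison pattern.  A self-contained sketch (the data below are
simulated stand-ins; in practice y, A and B come from your own data):

```r
## Simulated stand-in data, for illustration only.
set.seed(42)
A <- factor(rep(c("a1", "a2"), each = 6))
B <- factor(rep(c("b1", "b2"), times = 6))
y <- rnorm(12)

m.add <- lm(y ~ A + B)  # additive model
m.int <- lm(y ~ A * B)  # adds the A:B interaction
anova(m.add, m.int)     ## F test for the A:B interaction
```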

HTH,
Andy
On Wed, 20 Apr 2005, michael watson (IAH-C) wrote:

Sort of.  You really should test one hypothesis at a time.  See Bill's
examples in MASS.

Only if you respect marginality.  The quote Doug gave is based on a
longer paper available at

http://www.stats.ox.ac.uk/pub/MASS3/Exegeses.pdf

Do read it all.