
ANOVA contrast matrix vs. TukeyHSD?

2 messages · Sam Yeaman, John Fox

#
Dear Help List,

Thanks in advance for reading...I hope my questions are not too ignorant.

I have an experiment looking at the evolution of wing size [centroid] in 
fruit flies and the effect of 6 different experimental treatments 
[treatment]. I have five replicate populations [replic] in each 
treatment and have reared the flies at two different temperatures [cond] 
to assay the wing size, making measurements on males and females 
[gender]. My design can be summarized as follows:

This is my model (I think it's right, ignoring interaction terms for 
simplicity):

 > lm1 <- aov (centroid ~ gender + cond + treatment/replic, data = parents)

The treatments are:

 > levels (parents$treatment)
[1] "c"  "h"  "mc" "mh" "s"  "t"

I only care about a few of the pairwise comparisons between the levels 
of "treatment", as only certain contrasts are scientifically interesting:

c vs. h
mh vs. mc
(c + h) vs. s   [I would like to compare the mean of c and h (my 
controls) to s and to t]
(c + h) vs. t
s vs. t
h vs. mh
c vs. mc

These are two more than I can specify using "contrasts()" and they are 
not orthogonal. I can use the TukeyHSD and only look at the comparisons 
I care about, but I think this should give me much less power than 
specifying a few a priori contrasts (?). Also, I don't know how to 
combine my controls (c + h) into a single comparison using TukeyHSD.


My first problem is that when I specify the matrix shown below (the 
first 5 comparisons from above), I get a much higher p-value on some of 
the planned contrasts than I do on the TukeyHSD:

contrasts(parents$treatment) <- cbind(
  c(-1,  1,  0, 0, 0,  0),
  c(-1, -1,  0, 0, 2,  0),
  c(-1, -1,  0, 0, 0,  2),
  c( 0,  0, -1, 1, 0,  0),
  c( 0,  0,  0, 0, 1, -1))


 > contrasts(parents$treatment)
  [,1] [,2] [,3] [,4] [,5]
c    -1   -1   -1    0    0
h     1   -1   -1    0    0
mc    0    0    0   -1    0
mh    0    0    0    1    0
s     0    2    0    0    1
t     0    0    2    0   -1


#### THE OUTPUT (truncated) ####

Call:
lm(formula = centroid ~ gender + cond + treatment/replic, data = parents)

Residuals:
     Min        1Q    Median        3Q       Max
-81.58846  -4.53540   0.00803   4.76568  39.84177

Coefficients: (1 not defined because of singularities)
                    Estimate Std. Error  t value Pr(>|t|)   
(Intercept)         328.73096    0.26303 1249.770  < 2e-16 ***
genders             -37.39069    0.19661 -190.179  < 2e-16 ***
condu               -37.47740    0.19693 -190.308  < 2e-16 ***
treatment1            0.51026    0.40084    1.273 0.203079   
treatment2           -0.17333    0.23175   -0.748 0.454541   
treatment3            0.07761    0.22535    0.344 0.730566   
treatment4           -1.96020    0.38524   -5.088 3.73e-07 ***
treatment5                 NA         NA       NA       NA  

###### The TukeyHSD output (truncated) #####

 Tukey multiple comparisons of means
   95% family-wise confidence level

Fit: aov(formula = centroid ~ gender + cond + treatment/replic, data = parents)
*
*
*
$treatment
            diff          lwr        upr     p adj
h-c   -1.38085941 -2.382732615 -0.3789862 0.0012123
mc-c  -2.22026936 -3.198423972 -1.2421147 0.0000000
mh-c  -2.27157901 -3.268013478 -1.2751445 0.0000000
s-c   -1.19540471 -2.170272952 -0.2205365 0.0063382
t-c   -0.39899955 -1.374954044  0.5769549 0.8533107
mc-h  -0.83940995 -1.813993060  0.1351732 0.1378366
mh-h  -0.89071960 -1.883648319  0.1022091 0.1081954
s-h    0.18545470 -0.785829956  1.1567394 0.9943136
t-h    0.98185986  0.009484949  1.9542348 0.0462121
mh-mc -0.05130965 -1.020300865  0.9176816 0.9999892
s-mc   1.02486465  0.078064558  1.9716647 0.0249356
t-mc   1.82126980  0.873351301  2.7691883 0.0000007
s-mh   1.07617430  0.110500644  2.0418480 0.0187007
t-mh   1.87257946  0.905809220  2.8393497 0.0000005
t-s    0.79640515 -0.148121782  1.7409321 0.1550137


When I specify the c vs. h comparison, I am getting a p-value of 
0.203079, but the TukeyHSD gives the same contrast a p-value of 
0.0012123. Also, the fifth comparison gives "NA"; I assume this is due 
to it being non-orthogonal? I feel like I am either misunderstanding the 
point of contrasts() completely or I have done something wrong, so I 
would really appreciate any help.

My other question is related...just wondering why I need to limit myself 
to only orthogonal comparisons using contrasts()? This eliminates 
comparisons of scientific interest, for example if c vs. h, mc vs. c, 
and mh vs. h are all different, I have no way of knowing if mc vs. mh is 
significantly different by examining the other contrasts.
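
For illustration only, here is a minimal base-R sketch of how a set of 
non-orthogonal comparisons like the seven listed above could be 
estimated directly from a fitted model instead of through contrasts(). 
Everything in it is an assumption and not part of the original analysis: 
the data are simulated, the model is a plain one-way fit that ignores 
gender, cond and the nested replic term, the weights in K are written 
for the mean of c and h (so 0.5 each), and the Holm adjustment is just 
one possible multiplicity correction.

```r
set.seed(1)

# Simulated stand-in data (hypothetical): 20 observations per treatment
trt <- factor(rep(c("c", "h", "mc", "mh", "s", "t"), each = 20))
y   <- rnorm(length(trt), mean = 300, sd = 5)

# Cell-means coding: one coefficient per treatment level, no intercept
fit <- lm(y ~ 0 + trt)

# Each row is one comparison, as weights on the six treatment means
# (column order follows levels(trt): c, h, mc, mh, s, t)
K <- rbind("h - c"       = c(-1.0,  1.0,  0, 0,  0, 0),
           "mh - mc"     = c( 0.0,  0.0, -1, 1,  0, 0),
           "s - (c+h)/2" = c(-0.5, -0.5,  0, 0,  1, 0),
           "t - (c+h)/2" = c(-0.5, -0.5,  0, 0,  0, 1),
           "t - s"       = c( 0.0,  0.0,  0, 0, -1, 1),
           "mh - h"      = c( 0.0, -1.0,  0, 1,  0, 0),
           "mc - c"      = c(-1.0,  0.0,  1, 0,  0, 0))

# Estimate, standard error and t-test for each linear combination
est  <- drop(K %*% coef(fit))
se   <- sqrt(diag(K %*% vcov(fit) %*% t(K)))
tval <- est / se
p    <- 2 * pt(abs(tval), df = df.residual(fit), lower.tail = FALSE)

# Holm adjustment across the seven planned comparisons (an assumed choice)
res <- data.frame(estimate = est, se = se, t = tval,
                  p = p, p.holm = p.adjust(p, method = "holm"))
print(res)
```

Because each row of K is evaluated on its own, the comparisons do not 
have to be orthogonal, and there can be more of them than the 5 degrees 
of freedom for treatment; the price is that the tests are correlated and 
need some multiplicity adjustment.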

Sorry if these questions are ignorant...I have spent a long time trying 
to figure it out and haven't found the answer in either the available 
books or the help list.

Many thanks,
Sam Yeaman
#
Dear Sam,

The "basis" matrix for your contrasts is not orthogonal, and so the tests
are almost surely not what you intended. Moreover, the "basis" matrix is not
even of rank 5, as you can see from the message produced by lm(),
"Coefficients: (1 not defined because of singularities)," and the missing
coefficient.
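
The rank deficiency can be checked without the data. The following is a 
small sketch using only the matrix from the original post (the object 
names B, rank_B, dep and XtX are made up for the illustration):

```r
# Basis matrix exactly as passed to contrasts() in the original post
B <- cbind(c(-1,  1,  0, 0, 0,  0),
           c(-1, -1,  0, 0, 2,  0),
           c(-1, -1,  0, 0, 0,  2),
           c( 0,  0, -1, 1, 0,  0),
           c( 0,  0,  0, 0, 1, -1))
rownames(B) <- c("c", "h", "mc", "mh", "s", "t")

# Numerical rank of the 6 x 5 matrix
rank_B <- qr(B)$rank

# One linear dependency: column 2 minus column 3 equals twice column 5
dep <- B[, 2] - B[, 3] - 2 * B[, 5]

# Cross-products of the columns; nonzero off-diagonal entries mean
# the corresponding columns are not orthogonal
XtX <- crossprod(B)

print(rank_B)
print(dep)
print(XtX)
```

Here rank_B is 4 rather than 5 and dep is all zeros, so the fifth column 
adds nothing once columns 2 and 3 are in the model, which is why 
treatment5 is reported as NA. The nonzero off-diagonal entries of XtX 
(for example between columns 2 and 3) show the lack of orthogonality.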

Regards,
 John

------------------------------
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox