Dear Help List,
Thanks in advance for reading...I hope my questions are not too ignorant.
I have an experiment looking at evolution of wing size [centroid] in
fruitflies and the effect of 6 different experimental treatments
[treatment]. I have five replicate populations [replic] in each
treatment and have reared the flies in two different temperatures [cond]
to assay the wing size, making measurements on males and females
[gender]. My design can be summarized as follows:
This is my model (I think it's right, ignoring interaction terms for
simplicity):
> lm1 ~ aov (centroid ~ gender + cond + treatment/replic, data = parents)
The treatments are:
> levels (parents$treatment)
[1] "c" "h" "mc" "mh" "s" "t"
I only care about a few of the pairwise comparisons between the levels
of "treatment", as only certain contrasts are scientifically interesting:
c vs. h
mh vs. mc
(c + h) vs. s [I would like to compare the mean of c and h (my
controls) to s and t)
(c + h) vs. t
s vs. t
h vs. mh
c vs. mc
These are two more than I can specify using "contrasts()" and they are
not orthogonal. I can use the TukeyHSD and only look at the comparisons
I care about, but I think this should give me much less power than
specifying a few a priori contrasts (?). Also, I don't know how to
combine my controls (c + h) into a single comparison using TukeyHSD.
My first problem is that when I specify the matrix shown below (the
first 5 comparisons from above), I get a much higher p-value on some of
the planned contrasts than I do on the TukeyHSD:
contrasts (parents$treatment) <- cbind
(c(-1,1,0,0,0,0),c(-1,-1,0,0,2,0),c(-1,-1,0,0,0,2),c(0,0,-1,1,0,0),c(0,0,0,0,1,-1))
> contrasts(parents$treatment)
[,1] [,2] [,3] [,4] [,5]
c -1 -1 -1 0 0
h 1 -1 -1 0 0
mc 0 0 0 -1 0
mh 0 0 0 1 0
s 0 2 0 0 1
t 0 0 2 0 -1
#### THE OUTPUT (truncated) ####
Call:
lm(formula = centroid ~ gender + cond + treatment/replic, data = parents)
Residuals:
Min 1Q Median 3Q Max
-81.58846 -4.53540 0.00803 4.76568 39.84177
Coefficients: (1 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 328.73096 0.26303 1249.770 < 2e-16 ***
genders -37.39069 0.19661 -190.179 < 2e-16 ***
condu -37.47740 0.19693 -190.308 < 2e-16 ***
treatment1 0.51026 0.40084 1.273 0.203079
treatment2 -0.17333 0.23175 -0.748 0.454541
treatment3 0.07761 0.22535 0.344 0.730566
treatment4 -1.96020 0.38524 -5.088 3.73e-07 ***
treatment5 NA NA NA NA
###### The TukeyHSD output (truncated) #####
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = centroid ~ gender + cond + treatment/replic, data =
parents)
*
*
*
$treatment
diff lwr upr p adj
h-c -1.38085941 -2.382732615 -0.3789862 0.0012123
mc-c -2.22026936 -3.198423972 -1.2421147 0.0000000
mh-c -2.27157901 -3.268013478 -1.2751445 0.0000000
s-c -1.19540471 -2.170272952 -0.2205365 0.0063382
t-c -0.39899955 -1.374954044 0.5769549 0.8533107
mc-h -0.83940995 -1.813993060 0.1351732 0.1378366
mh-h -0.89071960 -1.883648319 0.1022091 0.1081954
s-h 0.18545470 -0.785829956 1.1567394 0.9943136
t-h 0.98185986 0.009484949 1.9542348 0.0462121
mh-mc -0.05130965 -1.020300865 0.9176816 0.9999892
s-mc 1.02486465 0.078064558 1.9716647 0.0249356
t-mc 1.82126980 0.873351301 2.7691883 0.0000007
s-mh 1.07617430 0.110500644 2.0418480 0.0187007
t-mh 1.87257946 0.905809220 2.8393497 0.0000005
t-s 0.79640515 -0.148121782 1.7409321 0.1550137
When I specify the c vs. h comparison, I am getting a p-value of
0.203079, but the TukeyHSD gives the same contrast a p-value of
0.0012123. Also, the fifth comparison gives "NA"; I assume this is due
to it being non-orthogonal? I feel like I am either misunderstanding the
point of contrasts() completely or I have done something wrong, so I
would really appreciate any help.
My other question is related...just wondering why I need to limit myself
to only orthogonal comparisons using contrasts()? This eliminates
comparisons of scientific interest, for example if c vs. h, mc vs. c,
and mh vs. h are all different, I have no way of knowing if mc vs. mh is
significantly different by examining the other contrasts.
Sorry if these questions are ignorant...I have spent a long time trying
to figure it out and haven't found the answer in either the available
books or the help list.
Many thanks,
Sam Yeaman
ANOVA contrast matrix vs. TukeyHSD?
2 messages · Sam Yeaman, John Fox
Dear Sam, The "basis" matrix for your contrasts is not orthogonal, and so the tests are almost surely not what you intended. Moreover, the "basis" matrix is not even of rank 5, as you can see from the message produced by lm(), "Coefficients: (1 not defined because of singularities)," and the missing coefficient. Regards, John ------------------------------ John Fox, Professor Department of Sociology McMaster University Hamilton, Ontario, Canada web: socserv.mcmaster.ca/jfox
-----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On
Behalf Of Sam Yeaman Sent: September-17-08 4:58 PM To: r-help at r-project.org Subject: [R] ANOVA contrast matrix vs. TukeyHSD? Dear Help List, Thanks in advance for reading...I hope my questions are not too ignorant. I have an experiment looking at evolution of wing size [centroid] in fruitflies and the effect of 6 different experimental treatments [treatment]. I have five replicate populations [replic] in each treatment and have reared the flies in two different temperatures [cond] to assay the wing size, making measurements on males and females [gender]. My design can be summarized as follows: This is my model (I think it's right, ignoring interaction terms for simplicity):
> lm1 ~ aov (centroid ~ gender + cond + treatment/replic, data = parents)
The treatments are:
> levels (parents$treatment)
[1] "c" "h" "mc" "mh" "s" "t" I only care about a few of the pairwise comparisons between the levels of "treatment", as only certain contrasts are scientifically interesting: c vs. h mh vs. mc (c + h) vs. s [I would like to compare the mean of c and h (my controls) to s and t) (c + h) vs. t s vs. t h vs. mh c vs. mc These are two more than I can specify using "contrasts()" and they are not orthogonal. I can use the TukeyHSD and only look at the comparisons I care about, but I think this should give me much less power than specifying a few a priori contrasts (?). Also, I don't know how to combine my controls (c + h) into a single comparison using TukeyHSD. My first problem is that when I specify the matrix shown below (the first 5 comparisons from above), I get a much higher p-value on some of the planned contrasts than I do on the TukeyHSD: contrasts (parents$treatment) <- cbind (c(-1,1,0,0,0,0),c(-1,-1,0,0,2,0),c(-1,-1,0,0,0,2),c(0,0,- 1,1,0,0),c(0,0,0,0,1,-1))
> contrasts(parents$treatment)
[,1] [,2] [,3] [,4] [,5]
c -1 -1 -1 0 0
h 1 -1 -1 0 0
mc 0 0 0 -1 0
mh 0 0 0 1 0
s 0 2 0 0 1
t 0 0 2 0 -1
#### THE OUTPUT (truncated) ####
Call:
lm(formula = centroid ~ gender + cond + treatment/replic, data = parents)
Residuals:
Min 1Q Median 3Q Max
-81.58846 -4.53540 0.00803 4.76568 39.84177
Coefficients: (1 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 328.73096 0.26303 1249.770 < 2e-16 ***
genders -37.39069 0.19661 -190.179 < 2e-16 ***
condu -37.47740 0.19693 -190.308 < 2e-16 ***
treatment1 0.51026 0.40084 1.273 0.203079
treatment2 -0.17333 0.23175 -0.748 0.454541
treatment3 0.07761 0.22535 0.344 0.730566
treatment4 -1.96020 0.38524 -5.088 3.73e-07 ***
treatment5 NA NA NA NA
###### The TukeyHSD output (truncated) #####
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = centroid ~ gender + cond + treatment/replic, data =
parents)
*
*
*
$treatment
diff lwr upr p adj
h-c -1.38085941 -2.382732615 -0.3789862 0.0012123
mc-c -2.22026936 -3.198423972 -1.2421147 0.0000000
mh-c -2.27157901 -3.268013478 -1.2751445 0.0000000
s-c -1.19540471 -2.170272952 -0.2205365 0.0063382
t-c -0.39899955 -1.374954044 0.5769549 0.8533107
mc-h -0.83940995 -1.813993060 0.1351732 0.1378366
mh-h -0.89071960 -1.883648319 0.1022091 0.1081954
s-h 0.18545470 -0.785829956 1.1567394 0.9943136
t-h 0.98185986 0.009484949 1.9542348 0.0462121
mh-mc -0.05130965 -1.020300865 0.9176816 0.9999892
s-mc 1.02486465 0.078064558 1.9716647 0.0249356
t-mc 1.82126980 0.873351301 2.7691883 0.0000007
s-mh 1.07617430 0.110500644 2.0418480 0.0187007
t-mh 1.87257946 0.905809220 2.8393497 0.0000005
t-s 0.79640515 -0.148121782 1.7409321 0.1550137
When I specify the c vs. h comparison, I am getting a p-value of
0.203079, but the TukeyHSD gives the same contrast a p-value of
0.0012123. Also, the fifth comparison gives "NA"; I assume this is due
to it being non-orthogonal? I feel like I am either misunderstanding the
point of contrasts() completely or I have done something wrong, so I
would really appreciate any help.
My other question is related...just wondering why I need to limit myself
to only orthogonal comparisons using contrasts()? This eliminates
comparisons of scientific interest, for example if c vs. h, mc vs. c,
and mh vs. h are all different, I have no way of knowing if mc vs. mh is
significantly different by examining the other contrasts.
Sorry if these questions are ignorant...I have spent a long time trying
to figure it out and haven't found the answer in either the available
books or the help list.
Many thanks,
Sam Yeaman
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.