How do we specify a contrast interaction matrix for an ANOVA model?
We have a two-factor, repeated measures design, with
Cue Direction (2) x Brain Hemisphere(2)
Each of these has 2 levels, 'left' and 'right', so it's a simple 2x2 design
matrix. We have 8 subjects in each cell (a balanced design) and we want to
specify the interaction contrast so that:
CueLeft>CueRght for the Right Hemisphere
CueRght>CueLeft for the Left Hemisphere.
Here is a copy of the relevant commands for R:
########################################
lh_cueL <- rowMeans( LHroi.cueL[,t1:t2] )
lh_cueR <- rowMeans( LHroi.cueR[,t1:t2] )
rh_cueL <- rowMeans( RHroi.cueL[,t1:t2] )
rh_cueR <- rowMeans( RHroi.cueR[,t1:t2] )
roiValues <- c( lh_cueL, lh_cueR, rh_cueL, rh_cueR )
cuelabels <- c("CueLeft", "CueRight")
hemlabels <- c("LH", "RH")
roiDataframe <- data.frame( roi=roiValues, Subject=gl(8,1,32,subjectlabels),
Hemisphere=gl(2,16,32,hemlabels), Cue=gl(2,8,32,cuelabels) )
roi.aov <- aov(roi ~ (Cue*Hemisphere) + Error(Subject/(Cue*Hemisphere)),
data=roiDataframe)
print(summary(roi.aov))
########################################
I've tried to create a contrast matrix like this:
cm <- contrasts(roiDataframe$Cue)
which gives the main effect contrasts for the Cue factor. I really want to
specify the interaction contrasts, so I tried this:
########################################
# c( lh_cueL, lh_cueR, rh_cueL, rh_cueR )
# CueRight>CueLeft for the Left Hemisphere.
# CueLeft>CueRight for the Right Hemisphere
cm <- c(-1, 1, 1, -1)
dim(cm) <- c(2,2)
roi.aov <- aov( roi ~ (Cue*Hemisphere) + Error(Subject/(Cue*Hemisphere)),
contrasts=cm, data=roiDataframe)
print(summary(roi.aov))
########################################
but the results of these two aov commands are identical. Is it the case that
the 2x2 design matrix is always going to give the same F values for the
interaction regardless of the contrast direction? OR, is there some way to get
a summary output for the contrasts that is not available from the print method?
contrast matrix for aov
8 messages · Brian Ripley, Peter Dalgaard, Christophe Pallier +1 more
On Wed, 9 Mar 2005, Darren Weber wrote:
How do we specify a contrast interaction matrix for an ANOVA model? We have a two-factor, repeated measures design, with
Where does `repeated measures' come into this? You appear to have repeated a 2x2 experiment in each of 8 blocks (subjects). Such a design is usually analysed with fixed effects. (Perhaps you averaged over repeats in the first few lines of your code?)
Cue Direction (2) x Brain Hemisphere(2)
Each of these has 2 levels, 'left' and 'right', so it's a simple 2x2 design
matrix. We have 8 subjects in each cell (a balanced design) and we want to
specify the interaction contrast so that:
CueLeft>CueRght for the Right Hemisphere
CueRght>CueLeft for the Left Hemisphere.
Here is a copy of the relevant commands for R:
########################################
lh_cueL <- rowMeans( LHroi.cueL[,t1:t2] )
lh_cueR <- rowMeans( LHroi.cueR[,t1:t2] )
rh_cueL <- rowMeans( RHroi.cueL[,t1:t2] )
rh_cueR <- rowMeans( RHroi.cueR[,t1:t2] )
roiValues <- c( lh_cueL, lh_cueR, rh_cueL, rh_cueR )
cuelabels <- c("CueLeft", "CueRight")
hemlabels <- c("LH", "RH")
roiDataframe <- data.frame( roi=roiValues, Subject=gl(8,1,32,subjectlabels),
Hemisphere=gl(2,16,32,hemlabels), Cue=gl(2,8,32,cuelabels) )
roi.aov <- aov(roi ~ (Cue*Hemisphere) + Error(Subject/(Cue*Hemisphere)),
data=roiDataframe)
I think the error model should be Error(Subject). In what sense are `Cue'
and `Cue:Hemisphere' random effects nested inside `Subject'?
Let me fake some `data':
set.seed(1); roiValues <- rnorm(32)
subjectlabels <- paste("V"1:8, sep = "")
options(contrasts = c("contr.helmert", "contr.poly"))
roi.aov <- aov(roi ~ Cue*Hemisphere + Error(Subject), data=roiDataframe)
roi.aov
Call:
aov(formula = roi ~ Cue * Hemisphere + Error(Subject), data = roiDataframe)
Grand Mean: 0.1165512
Stratum 1: Subject
Terms:
Residuals
Sum of Squares 4.200946
Deg. of Freedom 7
Residual standard error: 0.7746839
Stratum 2: Within
Terms:
Cue Hemisphere Cue:Hemisphere Residuals
Sum of Squares 0.216453 0.019712 0.057860 21.896872
Deg. of Freedom 1 1 1 21
Residual standard error: 1.021131
Estimated effects are balanced
Note that all the action is in one stratum, and the SSQs are the same
as
aov(roi ~ Subject + Cue * Hemisphere, data = roiDataframe)
(and also the same as for your fit).
print(summary(roi.aov))
It auto-prints, so you don't need print().
######################################## I've tried to create a contrast matrix like this: cm <- contrasts(roiDataframe$Cue) which gives the main effect contrasts for the Cue factor. I really want to specify the interaction contrasts, so I tried this: ######################################## # c( lh_cueL, lh_cueR, rh_cueL, rh_cueR ) # CueRight>CueLeft for the Left Hemisphere. # CueLeft>CueRight for the Right Hemisphere cm <- c(-1, 1, 1, -1) dim(cm) <- c(2,2)
(That is up to sign what Helmert contrasts give you.)
roi.aov <- aov( roi ~ (Cue*Hemisphere) + Error(Subject/(Cue*Hemisphere)), contrasts=cm, data=roiDataframe) print(summary(roi.aov)) ######################################## but the results of these two aov commands are identical. Is it the case that the 2x2 design matrix is always going to give the same F values for the interaction regardless of the contrast direction?
Yes, as however you code the design (via `contrasts') you are fitting the
same subspaces. Not sure what you mean by `contrast direction', though.
However, you have not specified `contrasts' correctly:
contrasts: A list of contrasts to be used for some of the factors in
the formula.
and cm is not a list, and an interaction is not a factor.
OR, is there some way to get a summary output for the contrasts that is not available from the print method?
For more than two levels, yes: see `split' under ?summary.aov. Also, see se.contrasts which allows you to find the standard error for any contrast. For the fixed-effects model you can use summary.lm:
fit <- aov(roi ~ Subject + Cue * Hemisphere, data = roiDataframe) summary(fit)
Df Sum Sq Mean Sq F value Pr(>F) Subject 7 4.2009 0.6001 0.5756 0.7677 Cue 1 0.2165 0.2165 0.2076 0.6533 Hemisphere 1 0.0197 0.0197 0.0189 0.8920 Cue:Hemisphere 1 0.0579 0.0579 0.0555 0.8161 Residuals 21 21.8969 1.0427
summary.lm(fit)
Call:
aov(formula = roi ~ Subject + Cue * Hemisphere, data = roiDataframe)
Residuals:
Min 1Q Median 3Q Max
-1.7893 -0.4197 0.1723 0.5868 1.3033
Coefficients:
Estimate Std. Error t value Pr(>|t|)
[...]
Cue1 -0.08224 0.18051 -0.456 0.653
Hemisphere1 0.02482 0.18051 0.137 0.892
Cue1:Hemisphere1 -0.04252 0.18051 -0.236 0.816
where the F values are the squares of the t values.
Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
Prof Brian Ripley <ripley at stats.ox.ac.uk> writes:
On Wed, 9 Mar 2005, Darren Weber wrote:
How do we specify a contrast interaction matrix for an ANOVA model? We have a two-factor, repeated measures design, with
Where does `repeated measures' come into this? You appear to have repeated a 2x2 experiment in each of 8 blocks (subjects). Such a design is usually analysed with fixed effects. (Perhaps you averaged over repeats in the first few lines of your code?)
Actually, that's not "usual" in SAS (and not SPSS either, I believe)
in things like
proc glm;
model y1-y4= ;
repeated row 2 col 2;
[Not that SAS/SPSS is the Gospel, but they do tend to set the
terminology in these matters.]
There you'd get the analysis split up as analyses of three contrasts
corresponding to the main effects and interaction, c(-1,-1,1,1),
c(-1,1,-1,1), and c(-1,1,1,-1) in the 2x2 case (up to scale and sign).
In the 2x2 case, this corresponds exactly to the 4-stratum model
row*col + Error(block/(row*col)).
(It is interesting to note that it is still not the optimal analysis
for arbitrary covariance patterns because dependence between contrasts
is not utilized - it is basically assumed to be absent.)
With more than two levels, you get the additional complications of
sphericity testing and GG/HF epsilons and all that.
O__ ---- Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907
Prof Brian Ripley wrote:
On Wed, 9 Mar 2005, Darren Weber wrote: We have a two-factor, repeated measures design, with
Where does `repeated measures' come into this? You appear to have repeated a 2x2 experiment in each of 8 blocks (subjects). Such a design is usually analysed with fixed effects. (Perhaps you averaged over repeats in the first few lines of your code?)
roi.aov <- aov(roi ~ (Cue*Hemisphere) + Error(Subject/(Cue*Hemisphere)), data=roiDataframe)
I think the error model should be Error(Subject). In what sense are `Cue' and `Cue:Hemisphere' random effects nested inside `Subject'?
I do not understand this, and I think I am probably not the only one. That is why I would be grateful if you could give a bit more information. My understanding is that the fixed factors Cue and Hemisphere are crossed with the random factor Subject (in other words, Cue and Hemisphere are within-subjects factors, and this is probably why Darren called it a "repeated measure" design). In this case, it seems to me from the various textbooks I read on Anova, that the appropriate MS to test the interaction Cue:Hemisphere is Subject:Cue:Hemisphere (with 7 degress of freedom, as there are 8 independent subjects). If you input Error(Subject/(Cue*Hemisphere)) in the aov formula, then the test for the interaction indeed uses the Subject:Cue:Hemisphere source of variation in demoninator. This fits with the ouput of other softwares. If you include only 'Subjet', then the test for the interaction has 21 degrees of Freedom, and I do not understand what this tests. I apologize in if my terminology is not accurate. But I hope you can clarify what is wrong with the Error(Subject/(Cue*Hemisphere)) term, or maybe just point us to the relevant textbooks. Thanks in advance, Christophe Pallier
Let me fake some `data':
set.seed(1); roiValues <- rnorm(32)
subjectlabels <- paste("V"1:8, sep = "")
options(contrasts = c("contr.helmert", "contr.poly"))
roi.aov <- aov(roi ~ Cue*Hemisphere + Error(Subject), data=roiDataframe)
roi.aov
Call:
aov(formula = roi ~ Cue * Hemisphere + Error(Subject), data =
roiDataframe)
Grand Mean: 0.1165512
Stratum 1: Subject
Terms:
Residuals
Sum of Squares 4.200946
Deg. of Freedom 7
Residual standard error: 0.7746839
Stratum 2: Within
Terms:
Cue Hemisphere Cue:Hemisphere Residuals
Sum of Squares 0.216453 0.019712 0.057860 21.896872
Deg. of Freedom 1 1 1 21
Residual standard error: 1.021131
Estimated effects are balanced
Note that all the action is in one stratum, and the SSQs are the same as
aov(roi ~ Subject + Cue * Hemisphere, data = roiDataframe)
(and also the same as for your fit).
print(summary(roi.aov))
It auto-prints, so you don't need print().
######################################## I've tried to create a contrast matrix like this: cm <- contrasts(roiDataframe$Cue) which gives the main effect contrasts for the Cue factor. I really want to specify the interaction contrasts, so I tried this: ######################################## # c( lh_cueL, lh_cueR, rh_cueL, rh_cueR ) # CueRight>CueLeft for the Left Hemisphere. # CueLeft>CueRight for the Right Hemisphere cm <- c(-1, 1, 1, -1) dim(cm) <- c(2,2)
(That is up to sign what Helmert contrasts give you.)
roi.aov <- aov( roi ~ (Cue*Hemisphere) + Error(Subject/(Cue*Hemisphere)), contrasts=cm, data=roiDataframe) print(summary(roi.aov)) ######################################## but the results of these two aov commands are identical. Is it the case that the 2x2 design matrix is always going to give the same F values for the interaction regardless of the contrast direction?
Yes, as however you code the design (via `contrasts') you are fitting
the same subspaces. Not sure what you mean by `contrast direction',
though.
However, you have not specified `contrasts' correctly:
contrasts: A list of contrasts to be used for some of the factors in
the formula.
and cm is not a list, and an interaction is not a factor.
OR, is there some way to get a summary output for the contrasts that is not available from the print method?
For more than two levels, yes: see `split' under ?summary.aov. Also, see se.contrasts which allows you to find the standard error for any contrast. For the fixed-effects model you can use summary.lm:
fit <- aov(roi ~ Subject + Cue * Hemisphere, data = roiDataframe) summary(fit)
Df Sum Sq Mean Sq F value Pr(>F)
Subject 7 4.2009 0.6001 0.5756 0.7677
Cue 1 0.2165 0.2165 0.2076 0.6533
Hemisphere 1 0.0197 0.0197 0.0189 0.8920
Cue:Hemisphere 1 0.0579 0.0579 0.0555 0.8161
Residuals 21 21.8969 1.0427
summary.lm(fit)
Call:
aov(formula = roi ~ Subject + Cue * Hemisphere, data = roiDataframe)
Residuals:
Min 1Q Median 3Q Max
-1.7893 -0.4197 0.1723 0.5868 1.3033
Coefficients:
Estimate Std. Error t value Pr(>|t|)
[...]
Cue1 -0.08224 0.18051 -0.456 0.653
Hemisphere1 0.02482 0.18051 0.137 0.892
Cue1:Hemisphere1 -0.04252 0.18051 -0.236 0.816
where the F values are the squares of the t values.
On Thu, 10 Mar 2005, Christophe Pallier wrote:
Prof Brian Ripley wrote:
On Wed, 9 Mar 2005, Darren Weber wrote: We have a two-factor, repeated measures design, with
Where does `repeated measures' come into this? You appear to have repeated a 2x2 experiment in each of 8 blocks (subjects). Such a design is usually analysed with fixed effects. (Perhaps you averaged over repeats in the first few lines of your code?)
roi.aov <- aov(roi ~ (Cue*Hemisphere) + Error(Subject/(Cue*Hemisphere)), data=roiDataframe)
I think the error model should be Error(Subject). In what sense are `Cue' and `Cue:Hemisphere' random effects nested inside `Subject'?
I do not understand this, and I think I am probably not the only one. That is why I would be grateful if you could give a bit more information. My understanding is that the fixed factors Cue and Hemisphere are crossed with the random factor Subject (in other words, Cue and Hemisphere are within-subjects factors, and this is probably why Darren called it a "repeated measure" design).
The issue is whether the variance of the error really depends on the
treatment combination, which is what the Error(Subject/(Cue*Hemisphere))
assumes. With that model
Error: Subject:Cue
Df Sum Sq Mean Sq F value Pr(>F)
Cue 1 0.2165 0.2165 0.1967 0.6708
Residuals 7 7.7041 1.1006
Error: Subject:Hemisphere
Df Sum Sq Mean Sq F value Pr(>F)
Hemisphere 1 0.0197 0.0197 0.0154 0.9047
Residuals 7 8.9561 1.2794
Error: Subject:Cue:Hemisphere
Df Sum Sq Mean Sq F value Pr(>F)
Cue:Hemisphere 1 0.0579 0.0579 0.0773 0.789
Residuals 7 5.2366 0.7481
you are assuming different variances for three contrasts.
In this case, it seems to me from the various textbooks I read on Anova, that the appropriate MS to test the interaction Cue:Hemisphere is Subject:Cue:Hemisphere (with 7 degress of freedom, as there are 8 independent subjects). If you input Error(Subject/(Cue*Hemisphere)) in the aov formula, then the test for the interaction indeed uses the Subject:Cue:Hemisphere source of variation in demoninator. This fits with the ouput of other softwares. If you include only 'Subjet', then the test for the interaction has 21 degrees of Freedom, and I do not understand what this tests.
It uses a common variance for all treatment combinations.
I apologize in if my terminology is not accurate. But I hope you can clarify what is wrong with the Error(Subject/(Cue*Hemisphere)) term, or maybe just point us to the relevant textbooks.
Nothing is `wrong' with it, it just seems discordant with the description of the experiment.
Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
On Thu, 10 Mar 2005, Peter Dalgaard wrote:
Prof Brian Ripley <ripley at stats.ox.ac.uk> writes:
On Wed, 9 Mar 2005, Darren Weber wrote:
How do we specify a contrast interaction matrix for an ANOVA model? We have a two-factor, repeated measures design, with
Where does `repeated measures' come into this? You appear to have repeated a 2x2 experiment in each of 8 blocks (subjects). Such a design is usually analysed with fixed effects. (Perhaps you averaged over repeats in the first few lines of your code?)
Actually, that's not "usual" in SAS (and not SPSS either, I believe)
in things like
proc glm;
model y1-y4= ;
repeated row 2 col 2;
[Not that SAS/SPSS is the Gospel, but they do tend to set the
terminology in these matters.]
That seems to be appropriate only if the four treatments are done in a particular order (`repeated') and one expects correlations in the responses. However, here the measurements seem to have been averages of replications. It may be "usual" to (mis?)specify experiments in SAS that way: I don't know what end users do, but it is not the only way possible in SAS.
There you'd get the analysis split up as analyses of three contrasts corresponding to the main effects and interaction, c(-1,-1,1,1), c(-1,1,-1,1), and c(-1,1,1,-1) in the 2x2 case (up to scale and sign). In the 2x2 case, this corresponds exactly to the 4-stratum model row*col + Error(block/(row*col)). (It is interesting to note that it is still not the optimal analysis for arbitrary covariance patterns because dependence between contrasts is not utilized - it is basically assumed to be absent.)
It also assumes that there is a difference between variances of the contrasts, that is there is either correlation between results or a difference in variances under different treatments. Nothing in the description led me to expect either of those, but I was asking why it was specified that way. (If the variance does differ with the mean then there are probably more appropriate analyses.)
Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
Prof Brian Ripley wrote:
On Wed, 9 Mar 2005, Darren Weber wrote:
How do we specify a contrast interaction matrix for an ANOVA model? We have a two-factor, repeated measures design, with
Where does `repeated measures' come into this? You appear to have repeated a 2x2 experiment in each of 8 blocks (subjects). Such a design is usually analysed with fixed effects. (Perhaps you averaged over repeats in the first few lines of your code?)
Each subject experiences multiple instances of a visual stimulus that cues them to orient their attention to the left or right visual field. For each instance, we acquire brain activity measures from the left and right hemisphere (to put it simply). For each condition in this 2x2 design, we actually have about 400 instances for each cell for each subject. These are averaged in several ways, so we obtain a single scalar value for each subject in the final analysis. So, at this point, it does not appear to be a repeated measures analysis. Perhaps another way to put it - each subject experiences every cell of the design, so we have "within-subjects" factors rather than "between-subjects" factors. That is, each subject experiences all experimental conditions, we do not separate and randomly allocate subjects to only one cell or another of the experiment. I hope this clarifies the first question. I will try to digest further points in this thread, if they are not too confounded by this first point of contention already.
As an R newbie (formerly SPSS), I was pleased to find some helpful notes on ANOVA here: http://personality-project.org/r/r.anova.html In my case, I believe the relevant section is: Example 4. Two-Way Within-Subjects ANOVA This is where I noted and copied the error notation. Sorry for any confusion about terms - I did mean "within-subjects" factors, rather than repeated measures (although, as noted earlier, we do have both in this experiment).
Prof Brian Ripley wrote:
On Thu, 10 Mar 2005, Christophe Pallier wrote:
Prof Brian Ripley wrote:
On Wed, 9 Mar 2005, Darren Weber wrote: We have a two-factor, repeated measures design, with
Where does `repeated measures' come into this? You appear to have repeated a 2x2 experiment in each of 8 blocks (subjects). Such a design is usually analysed with fixed effects. (Perhaps you averaged over repeats in the first few lines of your code?)
roi.aov <- aov(roi ~ (Cue*Hemisphere) + Error(Subject/(Cue*Hemisphere)), data=roiDataframe)
I think the error model should be Error(Subject). In what sense are `Cue' and `Cue:Hemisphere' random effects nested inside `Subject'?
I do not understand this, and I think I am probably not the only one. That is why I would be grateful if you could give a bit more information. My understanding is that the fixed factors Cue and Hemisphere are crossed with the random factor Subject (in other words, Cue and Hemisphere are within-subjects factors, and this is probably why Darren called it a "repeated measure" design).
The issue is whether the variance of the error really depends on the
treatment combination, which is what the
Error(Subject/(Cue*Hemisphere)) assumes. With that model
Error: Subject:Cue
Df Sum Sq Mean Sq F value Pr(>F)
Cue 1 0.2165 0.2165 0.1967 0.6708
Residuals 7 7.7041 1.1006
Error: Subject:Hemisphere
Df Sum Sq Mean Sq F value Pr(>F)
Hemisphere 1 0.0197 0.0197 0.0154 0.9047
Residuals 7 8.9561 1.2794
Error: Subject:Cue:Hemisphere
Df Sum Sq Mean Sq F value Pr(>F)
Cue:Hemisphere 1 0.0579 0.0579 0.0773 0.789
Residuals 7 5.2366 0.7481
you are assuming different variances for three contrasts.
In this case, it seems to me from the various textbooks I read on Anova, that the appropriate MS to test the interaction Cue:Hemisphere is Subject:Cue:Hemisphere (with 7 degress of freedom, as there are 8 independent subjects). If you input Error(Subject/(Cue*Hemisphere)) in the aov formula, then the test for the interaction indeed uses the Subject:Cue:Hemisphere source of variation in demoninator. This fits with the ouput of other softwares. If you include only 'Subjet', then the test for the interaction has 21 degrees of Freedom, and I do not understand what this tests.
It uses a common variance for all treatment combinations.
I apologize in if my terminology is not accurate. But I hope you can clarify what is wrong with the Error(Subject/(Cue*Hemisphere)) term, or maybe just point us to the relevant textbooks.
Nothing is `wrong' with it, it just seems discordant with the description of the experiment.