
car::Anova - Can it be used for ANCOVA with repeated-measures factors.

4 messages · Henrik Singmann, John Fox, Peter Dalgaard

Dear Henrik,

The within-subjects contrasts are constructed by Anova() to be orthogonal in the row-basis of the design, so you should be able to safely ignore the effects in which (for some reason that escapes me) you are uninterested. This would also be true (except for the estimated error) for the between-subjects design if you used "type-II" tests. It's true that the "type-III" between-subjects tests will be affected by the presence of an interaction, but for these tests to make sense at all, you have to formulate the model very carefully. For example, your type-III test for the "main effect" of treatment with the interaction in the model is for the treatment effect at age 0. Does that really make sense to you? Indeed, the type-III tests for the ANOVA (not ANCOVA) model only make sense because I was careful to use contrasts for the between-subjects factors that are orthogonal in the basis of the design:

 > contrasts(OBrienKaiser$treatment)
        [,1] [,2]
control   -2    0
A          1   -1
B          1    1
 > contrasts(OBrienKaiser$gender)
  [,1]
F    1
M   -1
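For readers who want to reproduce this setup, a minimal sketch of assigning contrasts that are orthogonal in the basis of the (balanced) design before fitting; the matrices simply mirror the output above:

```r
library(car)  # provides the OBrienKaiser data

# contrasts orthogonal in the row-basis of the balanced design;
# columns reproduce the matrices printed above
contrasts(OBrienKaiser$treatment) <- matrix(c(-2,  1, 1,
                                               0, -1, 1), ncol = 2)
contrasts(OBrienKaiser$gender)    <- matrix(c(1, -1), ncol = 1)

contrasts(OBrienKaiser$treatment)
```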

Best,
 John

On Sun, 22 Jul 2012 22:06:58 +0200
Henrik Singmann <henrik.singmann at psychologie.uni-freiburg.de> wrote:
Dear John,

indeed, you are very right. Including the covariate as is doesn't make any sense; the only correct way is to center it on its mean beforehand. So the examples in my first and second mail are actually bogus (I add a corrected example at the end) and the reported tests do not make much sense.

Let me try to explain why I want to discard the interactions of the covariate with the within-factors. The reason I want to exclude them is that I want to stay within the ANCOVA framework. I looked at the three books on experimental design I have on my desk (Winer, 1971; Kirk, 1982; Maxwell & Delaney, 2003) and they unanimously define the ANCOVA as the ANOVA on the responses controlled for the covariate only (i.e., not controlled for the covariate and the interactions with the other effects).
However, as you say, adding or removing an interaction with the orthogonal within-subjects factors does indeed not alter the remaining results (example at the end), so one could simply use the output and discard the unwanted effects, although admittedly this seems questionable when those effects are significant.

Unfortunately, my involvement with this issue has led me to another question. Winer and Kirk both discuss a split-plot ANCOVA in which one has measured a covariate for each observation, that is, a second matrix like the original data matrix; e.g., the body temperature of each person at each measurement for the OBrienKaiser dataset:

# hypothetical covariate: body temperature at each of the 15 measurements
OBK.cov <- OBrienKaiser
OBK.cov[, -(1:2)] <- runif(16 * 15, 36, 41)

Would it be possible to fit the data using this temperature matrix as a covariate with car::Anova (I thought about this but could not see how to specify the imatrix)?

Thanks a lot for the helpful responses,
Henrik


PS: Better examples:
# compare the treatment and the phase effect across models.
require(car)
set.seed(1)

# using scale for the covariate:
n.OBrienKaiser <- within(OBrienKaiser, age <- scale(sample(18:35, size = 16, replace = TRUE), scale = FALSE))

phase <- factor(rep(c("pretest", "posttest", "followup"), c(5, 5, 5)), levels=c("pretest", "posttest", "followup"))
hour <- ordered(rep(1:5, 3))
idata <- data.frame(phase, hour)

# Full ANCOVA model:
mod.1 <- lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5, post.1, post.2, post.3, post.4, post.5,
           fup.1, fup.2, fup.3, fup.4, fup.5) ~  treatment * gender + age, data=n.OBrienKaiser)
(av.1 <- Anova(mod.1, idata=idata, idesign=~phase*hour, type = 3))

#                             Df test stat approx F num Df den Df      Pr(>F)
# (Intercept)                  1     0.968    269.4      1      9 0.000000052 ***
# treatment                    2     0.443      3.6      2      9      0.0719 .
# gender                       1     0.305      3.9      1      9      0.0782 .
# age                          1     0.054      0.5      1      9      0.4902
# treatment:gender             2     0.222      1.3      2      9      0.3232
# phase                        1     0.811     17.2      2      8      0.0013 **
# ...

# removing the between-subject interaction does alter the lower order effects:
mod.2 <- lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5, post.1, post.2, post.3, post.4, post.5,
           fup.1, fup.2, fup.3, fup.4, fup.5) ~  treatment + gender + age, data=n.OBrienKaiser)
(av.2 <- Anova(mod.2, idata=idata, idesign=~phase*hour, type = 3))

# Type III Repeated Measures MANOVA Tests: Pillai test statistic
#                      Df test stat approx F num Df den Df       Pr(>F)
# (Intercept)           1     0.959    254.5      1     11 0.0000000059 ***
# treatment             2     0.428      4.1      2     11      0.04644 *
# gender                1     0.271      4.1      1     11      0.06832 .
# age                   1     0.226      3.2      1     11      0.10030
# phase                 1     0.792     19.0      2     10      0.00039 ***
# ...

# removing the within-subject interaction does NOT alter the lower order effects:
mod.3 <- lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5, post.1, post.2, post.3, post.4, post.5,
           fup.1, fup.2, fup.3, fup.4, fup.5) ~  treatment * gender + age, data=n.OBrienKaiser)
(av.3 <- Anova(mod.3, idata=idata, idesign=~phase+hour, type = 3))
# Type III Repeated Measures MANOVA Tests: Pillai test statistic
#                        Df test stat approx F num Df den Df      Pr(>F)
# (Intercept)             1     0.968    269.4      1      9 0.000000052 ***
# treatment               2     0.443      3.6      2      9      0.0719 .
# gender                  1     0.305      3.9      1      9      0.0782 .
# age                     1     0.054      0.5      1      9      0.4902
# treatment:gender        2     0.222      1.3      2      9      0.3232
# phase                   1     0.811     17.2      2      8      0.0013 **
# ...



On 22.07.2012 at 23:25, John Fox wrote:
Dear Henrik,

On Mon, 23 Jul 2012 00:56:16 +0200
Henrik Singmann <henrik.singmann at psychologie.uni-freiburg.de> wrote:
I'm afraid that Anova() won't handle repeated measures on covariates. I agree that it would be desirable to do so, and this capability is on my list of features to add to Anova(), but I can't promise when, or if, I'll get to it.

Sorry,
 John
On Jul 23, 2012, at 02:48, John Fox wrote:
[snip long discussion which I admit not to have studied in every detail...]
"Here There Be Tygers"... These models very easily get into territory that does not fall within the realm of standard (multivariate) linear modeling, and I'm not sure you really want it to be handled by a tool like Anova().

There is some risk that I will find myself writing half a treatise in email, but let's look at a simple example: a randomized block design with a treatment (say, Variety) and a covariate (say, Eelworm). In much of the ANCOVA ideology there is an assumption that the covariate is independent of treatment, typically a pre-randomization measurement. Now, using standard univariate theory, you can easily fit a model like

Yield ~ Variety + Eelworm + Block

in which there is a single regression coefficient on Eelworm, and the Variety effects are said to be "adjusted for differences in eelworm count". 

You can do this with lm(), or with aov() as you please. However, in the latter case, you might formulate the model with a random Block effect, i.e.

Yield ~ Variety + Eelworm + Error(Block)

In that case, you will find that you get two estimates of the Eelworm effect, one from each stratum. This comes about via interblock information: If there's a high average Yield in blocks where the average Eelworm is low, then this says something about the effect of Eelworm. The estimate from the within-Block stratum will be the same as in the model with non-random Block effects. 

If you believe in a mechanistic explanation for the Eelworm effect, you would likely believe that the two regression coefficients estimate the same quantity and you could try combining the estimates into one (recovery of interblock information). However, this messes up all standard theory and since the interblock estimate is usually quite inaccurate, one often decides to discard it. (Mixed-effects software happily fits such models, at the expense of precise "degrees of freedom"-theory.)
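The distinction above can be seen directly in R. A sketch with simulated data (all names and numbers hypothetical) contrasting the fixed-block ANCOVA with the Error()-stratum formulation:

```r
# simulated randomized block design: 6 blocks x 3 varieties
set.seed(1)
d <- data.frame(Block   = factor(rep(1:6, each = 3)),
                Variety = factor(rep(c("A", "B", "C"), times = 6)))
d$Eelworm <- rnorm(18, mean = 50, sd = 10)
d$Yield   <- 10 + 2 * (d$Variety == "B") + 4 * (d$Variety == "C") -
             0.1 * d$Eelworm +                     # covariate effect
             rep(rnorm(6, sd = 2), each = 3) +     # block effect
             rnorm(18)                             # residual error

# fixed blocks: one Eelworm coefficient, Variety adjusted for eelworm count
summary(lm(Yield ~ Variety + Eelworm + Block, data = d))

# random blocks: an Eelworm estimate appears in each error stratum
summary(aov(Yield ~ Variety + Eelworm + Error(Block), data = d))

# a mixed model pools the two sources of information on Eelworm
# (requires the lme4 package; not fitted here):
# lme4::lmer(Yield ~ Variety + Eelworm + (1 | Block), data = d)
```

In the aov() output, the Error: Block stratum carries the interblock estimate of the Eelworm effect and the Error: Within stratum carries the within-block estimate, which matches the coefficient from the fixed-block lm() fit.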

There's an alternative interpretation in the form of a two-dimensional model, 

cbind(Yield, Eelworm) ~ Variety + Error(Block)

In that model, you get two-dimensional contrasts, and covariance matrices for each stratum. Then you can utilize the fact that if the contrasts for the covariate are known to be zero, the mean of the response (i.e., Yield) is the same as the conditional mean given that the covariate contrasts equal zero; this is the intercept in the conditional regression model, which is in turn the adjusted ANCOVA contrast.

One difference is that in the two-dimensional response model, it is not obvious that the "between" and "within" covariance matrices need to share a common regression coefficient. If you think about this with a view to potential measurement errors in the covariate, it becomes clear that the two regression coefficients could well be different.

In the repeated measurements setting, we make a shift from an additive Block effect to a multidimensional Yield response (corresponding to a reshape from long to wide). Let us say, for convenience, that there are 3 Varieties; then we are looking at a 3-dimensional response. If we want to study Variety effects, we can decompose the response into a set of contrasts and an average, discard the latter, and use multivariate tests for zero mean of the contrasts.

To introduce a covariate at this point gets tricky because the standard linear model assumes the same design matrix for all responses, so you cannot have Yield1 depend on Eelworm1 only, Yield2 on Eelworm2, etc. although you could potentially have all responses depend on all covariates, leaving you with 9 regression coefficients. However, it is not at all clear that you can compare the intercepts between varieties in such a model. 

One viewpoint is that this is really a (2x3=6)-dimensional response problem if we consider Yield and Eelworm simultaneously. However, it is possible to transform both the Yield and the Eelworm variables to contrasts, for a 4-dimensional response consisting of two Yield contrasts and two Eelworm contrasts. If the latter are known to have mean zero, we can condition on them and look at the intercepts. That'll be a 2-d regression analysis with 2 covariates (4 regression coefficients) and I think the results should make OK sense. The annoying thing is that in the general case of p varieties, you get (p-1)^2 regression coefficients, but I suspect that it is not really possible to impose simplifying restrictions on them without losing simplicity of analysis.
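A rough sketch of this contrast-transformation approach (simulated wide-format data; all names and effect sizes hypothetical): both the Yield and Eelworm triples are reduced to two within-unit contrasts, and the Yield contrasts are then regressed on the Eelworm contrasts:

```r
set.seed(1)
n <- 20
E <- matrix(rnorm(n * 3, 50, 10), n, 3)         # Eelworm1..3 per unit
Y <- matrix(rnorm(n * 3, 10), n, 3) - 0.05 * E  # Yield1..3, depends on Eelworm

C  <- t(contr.helmert(3))   # 2 x 3 contrast matrix, rows sum to zero
Yc <- Y %*% t(C)            # two Yield contrasts per unit
Ec <- E %*% t(C)            # two Eelworm contrasts per unit

# 2-d regression with 2 covariates: 2 x 2 = 4 regression coefficients
fit <- lm(Yc ~ Ec)
coef(fit)

# intercept row: conditional means of the Yield contrasts given
# Eelworm contrasts of zero, i.e. the adjusted contrasts
coef(fit)["(Intercept)", ]
```

With p varieties, Yc and Ec each have p-1 columns and the coefficient matrix (excluding intercepts) has (p-1)^2 entries, as noted above.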