
car::Anova - Can it be used for ANCOVA with repeated-measures factors.

4 messages · Henrik Singmann, John Fox, Peter Dalgaard

Dear Henrik,

The within-subjects contrasts are constructed by Anova() to be orthogonal in the row-basis of the design, so you should be able to safely ignore the effects in which (for some reason that escapes me) you are uninterested. This would also be true (except for the estimated error) for the between-subjects design if you used "type-II" tests. It's true that the "type-III" between-subjects tests will be affected by the presence of an interaction, but for these tests to make sense at all, you have to formulate the model very carefully. For example, your type-III test for the "main effect" of treatment with the interaction in the model is for the treatment effect at age 0. Does that really make sense to you? Indeed, the type-III tests for the ANOVA (not ANCOVA) model only make sense because I was careful to use contrasts for the between-subjects factors that are orthogonal in the basis of the design:

 > contrasts(OBrienKaiser$treatment)
        [,1] [,2]
control   -2    0
A          1   -1
B          1    1
 > contrasts(OBrienKaiser$gender)
  [,1]
F    1
M   -1
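For readers who want to reproduce this setup, a minimal sketch of assigning contrasts that are orthogonal in the basis of the (balanced) design before fitting; the matrices simply mirror the output above:

```r
library(car)  # provides the OBrienKaiser data

# contrasts orthogonal in the row-basis of the balanced design;
# columns reproduce the matrices printed above
contrasts(OBrienKaiser$treatment) <- matrix(c(-2,  1, 1,
                                               0, -1, 1), ncol = 2)
contrasts(OBrienKaiser$gender)    <- matrix(c(1, -1), ncol = 1)

contrasts(OBrienKaiser$treatment)
```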

Best,
 John

On Sun, 22 Jul 2012 22:06:58 +0200
Henrik Singmann <henrik.singmann at psychologie.uni-freiburg.de> wrote:
Dear John,

indeed, you are very right. Including the covariate as is doesn't make any sense; the only correct way is to center it on its mean beforehand. So the examples in my first and second mail are actually bogus (I add a corrected example at the end) and the reported tests do not make much sense.

Let me try to explain why I want to discard the interactions of the covariate with the within-factors. The reason I want to exclude them is that I want to stay within the ANCOVA framework. I looked at the three books on experimental design I have on my desk (Winer, 1971; Kirk, 1982; Maxwell & Delaney, 2003) and they unanimously define the ANCOVA as the ANOVA on the responses controlled for the covariate only (i.e., not controlled for the covariate and the interactions with the other effects).
However, as you say, adding or removing an interaction with the orthogonal within-subjects factors does indeed not alter the remaining results (example at the end), so one could simply use the output and discard the unwanted effects, although admittedly this seems questionable when those effects are significant.

Unfortunately, my involvement with this issue has led me to another question. Winer and Kirk both discuss a split-plot ANCOVA in which one has measured a covariate for each observation, that is, a second matrix like the original data matrix; e.g., the body temperature of each person at each measurement for the OBrienKaiser dataset:

# hypothetical covariate: body temperature at each of the 15 measurements
OBK.cov <- OBrienKaiser
OBK.cov[, -(1:2)] <- runif(16 * 15, 36, 41)

Would it be possible to fit the data using this temperature matrix as a covariate with car::Anova (I thought about this but could not see how to specify the imatrix)?

Thanks a lot for the helpful responses,
Henrik


PS: Better examples:
# compare the treatment and the phase effect across models.
require(car)
set.seed(1)

# using scale for the covariate:
n.OBrienKaiser <- within(OBrienKaiser, age <- scale(sample(18:35, size = 16, replace = TRUE), scale = FALSE))

phase <- factor(rep(c("pretest", "posttest", "followup"), c(5, 5, 5)), levels=c("pretest", "posttest", "followup"))
hour <- ordered(rep(1:5, 3))
idata <- data.frame(phase, hour)

# Full ANCOVA model:
mod.1 <- lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5, post.1, post.2, post.3, post.4, post.5,
           fup.1, fup.2, fup.3, fup.4, fup.5) ~  treatment * gender + age, data=n.OBrienKaiser)
(av.1 <- Anova(mod.1, idata=idata, idesign=~phase*hour, type = 3))

#                             Df test stat approx F num Df den Df      Pr(>F)
# (Intercept)                  1     0.968    269.4      1      9 0.000000052 ***
# treatment                    2     0.443      3.6      2      9      0.0719 .
# gender                       1     0.305      3.9      1      9      0.0782 .
# age                          1     0.054      0.5      1      9      0.4902
# treatment:gender             2     0.222      1.3      2      9      0.3232
# phase                        1     0.811     17.2      2      8      0.0013 **
# ...

# removing the between-subject interaction does alter the lower order effects:
mod.2 <- lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5, post.1, post.2, post.3, post.4, post.5,
           fup.1, fup.2, fup.3, fup.4, fup.5) ~  treatment + gender + age, data=n.OBrienKaiser)
(av.2 <- Anova(mod.2, idata=idata, idesign=~phase*hour, type = 3))

# Type III Repeated Measures MANOVA Tests: Pillai test statistic
#                      Df test stat approx F num Df den Df       Pr(>F)
# (Intercept)           1     0.959    254.5      1     11 0.0000000059 ***
# treatment             2     0.428      4.1      2     11      0.04644 *
# gender                1     0.271      4.1      1     11      0.06832 .
# age                   1     0.226      3.2      1     11      0.10030
# phase                 1     0.792     19.0      2     10      0.00039 ***
# ...

# removing the within-subject interaction does NOT alter the lower order effects:
mod.3 <- lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5, post.1, post.2, post.3, post.4, post.5,
           fup.1, fup.2, fup.3, fup.4, fup.5) ~  treatment * gender + age, data=n.OBrienKaiser)
(av.3 <- Anova(mod.3, idata=idata, idesign=~phase+hour, type = 3))
# Type III Repeated Measures MANOVA Tests: Pillai test statistic
#                        Df test stat approx F num Df den Df      Pr(>F)
# (Intercept)             1     0.968    269.4      1      9 0.000000052 ***
# treatment               2     0.443      3.6      2      9      0.0719 .
# gender                  1     0.305      3.9      1      9      0.0782 .
# age                     1     0.054      0.5      1      9      0.4902
# treatment:gender        2     0.222      1.3      2      9      0.3232
# phase                   1     0.811     17.2      2      8      0.0013 **
# ...



On 22.07.2012 at 23:25, John Fox wrote:
Dear Henrik,

On Mon, 23 Jul 2012 00:56:16 +0200
Henrik Singmann <henrik.singmann at psychologie.uni-freiburg.de> wrote:
I'm afraid that Anova() won't handle repeated measures on covariates. I agree that it would be desirable to do so, and this capability is on my list of features to add to Anova(), but I can't promise when, or if, I'll get to it.

Sorry,
 John
On Jul 23, 2012, at 02:48, John Fox wrote:
[snip long discussion which I admit not to have studied in every detail...]
"Here There Be Tygers"... These models very easily get into territory that does not fall within the realm of standard (multivariate) linear modeling, and I'm not sure you really want it to be handled by a tool like Anova().

There is some risk that I will find myself writing half a treatise in email, but let's look at a simple example: a randomized block design with a treatment (say, Variety) and a covariate (say, Eelworm). In much of the ANCOVA ideology there is an assumption that the covariate is independent of treatment, typically a pre-randomization measurement. Now, using standard univariate theory, you can easily fit a model like

Yield ~ Variety + Eelworm + Block

in which there is a single regression coefficient on Eelworm, and the Variety effects are said to be "adjusted for differences in eelworm count". 

You can do this with lm(), or with aov() as you please. However, in the latter case, you might formulate the model with a random Block effect, i.e.

Yield ~ Variety + Eelworm + Error(Block)

In that case, you will find that you get two estimates of the Eelworm effect, one from each stratum. This comes about via interblock information: If there's a high average Yield in blocks where the average Eelworm is low, then this says something about the effect of Eelworm. The estimate from the within-Block stratum will be the same as in the model with non-random Block effects. 

If you believe in a mechanistic explanation for the Eelworm effect, you would likely believe that the two regression coefficients estimate the same quantity and you could try combining the estimates into one (recovery of interblock information). However, this messes up all standard theory and since the interblock estimate is usually quite inaccurate, one often decides to discard it. (Mixed-effects software happily fits such models, at the expense of precise "degrees of freedom"-theory.)
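The distinction above can be seen directly in R. A sketch with simulated data (all names and numbers hypothetical) contrasting the fixed-block ANCOVA with the Error()-stratum formulation:

```r
# simulated randomized block design: 6 blocks x 3 varieties
set.seed(1)
d <- data.frame(Block   = factor(rep(1:6, each = 3)),
                Variety = factor(rep(c("A", "B", "C"), times = 6)))
d$Eelworm <- rnorm(18, mean = 50, sd = 10)
d$Yield   <- 10 + 2 * (d$Variety == "B") + 4 * (d$Variety == "C") -
             0.1 * d$Eelworm +                     # covariate effect
             rep(rnorm(6, sd = 2), each = 3) +     # block effect
             rnorm(18)                             # residual error

# fixed blocks: one Eelworm coefficient, Variety adjusted for eelworm count
summary(lm(Yield ~ Variety + Eelworm + Block, data = d))

# random blocks: an Eelworm estimate appears in each error stratum
summary(aov(Yield ~ Variety + Eelworm + Error(Block), data = d))

# a mixed model pools the two sources of information on Eelworm
# (requires the lme4 package; not fitted here):
# lme4::lmer(Yield ~ Variety + Eelworm + (1 | Block), data = d)
```

In the aov() output, the Error: Block stratum carries the interblock estimate of the Eelworm effect and the Error: Within stratum carries the within-block estimate, which matches the coefficient from the fixed-block lm() fit.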

There's an alternative interpretation in the form of a two-dimensional model, 

cbind(Yield, Eelworm) ~ Variety + Error(Block)

In that model, you get two-dimensional contrasts, and covariance matrices for each stratum. Then you can utilize the fact that if the contrasts for the covariate are known to be zero, the mean of the response (i.e., Yield) is the same as the conditional mean given that the covariate contrasts equal zero; this is the intercept in the conditional regression model, which is in turn the adjusted ANCOVA contrast.

One difference is that in the two-dimensional response model, it is not obvious that the "between" and "within" covariance matrices need to share a common regression coefficient. If you think about this with a view to potential measurement errors in the covariate, it becomes clear that the two regression coefficients could well be different.

In the repeated measurements setting, we make a shift from an additive Block effect to a multidimensional Yield response (corresponding to a reshape from long to wide). Let us say, for convenience, that there are 3 Varieties; then we are looking at a 3-dimensional response. If we want to study Variety effects, we can decompose the response into a set of contrasts and an average, discard the latter, and use multivariate tests for zero mean of the contrasts.

To introduce a covariate at this point gets tricky because the standard linear model assumes the same design matrix for all responses, so you cannot have Yield1 depend on Eelworm1 only, Yield2 on Eelworm2, etc. although you could potentially have all responses depend on all covariates, leaving you with 9 regression coefficients. However, it is not at all clear that you can compare the intercepts between varieties in such a model. 

One viewpoint is that this is really a (2x3=6)-dimensional response problem if we consider Yield and Eelworm simultaneously. However, it is possible to transform both the Yield and the Eelworm variables to contrasts, for a 4-dimensional response consisting of two Yield contrasts and two Eelworm contrasts. If the latter are known to have mean zero, we can condition on them and look at the intercepts. That'll be a 2-d regression analysis with 2 covariates (4 regression coefficients) and I think the results should make OK sense. The annoying thing is that in the general case of p varieties, you get (p-1)^2 regression coefficients, but I suspect that it is not really possible to impose simplifying restrictions on them without losing simplicity of analysis.
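A rough sketch of this contrast-transformation approach (simulated wide-format data; all names and effect sizes hypothetical): both the Yield and Eelworm triples are reduced to two within-unit contrasts, and the Yield contrasts are then regressed on the Eelworm contrasts:

```r
set.seed(1)
n <- 20
E <- matrix(rnorm(n * 3, 50, 10), n, 3)         # Eelworm1..3 per unit
Y <- matrix(rnorm(n * 3, 10), n, 3) - 0.05 * E  # Yield1..3, depends on Eelworm

C  <- t(contr.helmert(3))   # 2 x 3 contrast matrix, rows sum to zero
Yc <- Y %*% t(C)            # two Yield contrasts per unit
Ec <- E %*% t(C)            # two Eelworm contrasts per unit

# 2-d regression with 2 covariates: 2 x 2 = 4 regression coefficients
fit <- lm(Yc ~ Ec)
coef(fit)

# intercept row: conditional means of the Yield contrasts given
# Eelworm contrasts of zero, i.e. the adjusted contrasts
coef(fit)["(Intercept)", ]
```

With p varieties, Yc and Ec each have p-1 columns and the coefficient matrix (excluding intercepts) has (p-1)^2 entries, as noted above.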