Skip to content

question about custom contrasts in ANOVA

4 messages · Scot W McNary, Spencer Graves, Brian Ripley

#
Hi,

I have a problem in which I have test score data on students from a number 
of schools.  In each school I have a measure of whether or not they 
received special programming.  I am interested in the interaction between 
school and attendance to the programming, but in a very select set of 
comparisons.  I'd like to cast the test as one in which students in each 
school who attend are compared with students who don't across all schools. 
So, I would be comparing school 1 attenders with school 1 non-attenders, 
school 2 attenders with school 2 non-attenders, etc.  The reason for the 
custom contrast is that the between school comparisons (e.g., school 1 
attenders vs. school 2 non-attenders) are of less interest.

This seems to require a custom contrast statement for the interaction 
term.  I have a toy example that seems to work as it should, but wonder if 
I've correctly created the contrast needed.

Here is a toy example (code put together from bits taken from MASS ch 6, 
and various R-help postings, (e.g., 
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/49077.html)):

# toy interaction contrast example, 10 schools, 100 kids, 5 attenders (1) 
# and 5 non-attenders (2) in each school

# make the data
school <- gl(10, 10)
attend <- gl(2, 5, 100)
# creates an interaction with schools 6 and 7
y <- c(sample(seq(450, 650, 1), 50), rep(c(rep(650, 5), rep(450, 5)), 2),
 	sample(seq(450, 650, 1), 30))

# anova
summary(aov(y ~ school * attend))

# graphically
Means <- tapply(y, list(school, attend), mean)

plot(Means[,1], col="red", type = "l", ylim = c(400,700))

points(Means[,2], col="blue", type = "l")

# create contrasts for hypothesis of interest
# school i attend j - school i attend j'
# for all schools
sxa <- interaction(school, attend)
sxam <- as.matrix(rbind(diag(1,10), diag(1,10) * -1))
contrasts(sxa) <- sxam

summary(aov(y ~ sxa), split=list(sxa=1:10), expand.split = T)

The actual problem has a few more schools, other covariates, considerably 
more students, and is somewhat unbalanced.

Thanks,

Scot


--
   Scot W. McNary  email:smcnary at charm.net
5 days later
#
Since I have not seen a reply to this post, I will attempt a brief 
comment:  I won't comment on the specifics, but your general approach 
seems appropriate.  You already seem to know that a factor with k levels 
is converted into (k-1) separate numerical variables, and a separate 
regression coefficient is estimated for each one.

	  To see in more detail what you were estimating, I looked at the 
following:

	  model.matrix(y~xma)
	  fit2 <- aov(y~xma)
	  attributes(fit2)

	  From the latter, I identified "contrasts" as something that might be 
interesting to examine, as follows:

	  fit2$contrasts

	  For clarity, I think I might reduce the number of observations 
substantially, limiting myself to only 2 or 3 schools and paramterize 
the problem manually, but preserving imbalance.  Then I'd use "lm", 
specifying the terms in different orders.  With imbalance, the answer 
depends on the order unfortunately.  When in doubt, I often experiment 
with changing the order:  If the changes do not affect the conclusions, 
I pick the simplest case to present to my audience.  If the changes do 
affect the conclusions, I know I need to worry about which answer seems 
most correct, and I also know something about the limits of the 
conclusions.

	  I know this doesn't answer your question, but I hope it helped with a 
solution methodology.

	  Best Wishes,
	  Spencer Graves
Scot W McNary wrote:

            

  
    
#
I was puzzled as to what the question actually was. If you set
the interaction contrasts are precisely those created manually.  Nothing 
fancier is required.  But I am not sure what you want to do with them once 
you have them.
On Tue, 30 Aug 2005, Spencer Graves wrote:

            

  
    
#
Dear Prof. Ripley:

	  Agreed.  I thought that coding the factor levels manually and / or 
looking at model.matrix(y~xma) after further simplifying the problem 
might help Scot clarify his question.

	  Thanks,
	  Spencer Graves
Prof Brian Ripley wrote: