question about custom contrasts in ANOVA
Since I have not seen a reply to this post, I will attempt a brief comment. I won't comment on the specifics, but your general approach seems appropriate. You already seem to know that a factor with k levels is converted into (k-1) separate numerical variables, and that a separate regression coefficient is estimated for each one.

To see in more detail what you were estimating, I looked at the following:

    model.matrix(y~xma)
    fit2 <- aov(y~xma)
    attributes(fit2)

From the latter, I identified "contrasts" as something that might be interesting to examine, as follows:

    fit2$contrasts

For clarity, I think I might reduce the number of observations substantially, limiting myself to only 2 or 3 schools, and parameterize the problem manually while preserving the imbalance. Then I'd use "lm", specifying the terms in different orders. With imbalance, the answer unfortunately depends on the order.

When in doubt, I often experiment with changing the order. If the changes do not affect the conclusions, I pick the simplest case to present to my audience. If the changes do affect the conclusions, I know I need to worry about which answer seems most correct, and I also know something about the limits of the conclusions.

I know this doesn't answer your question, but I hope it helps with a solution methodology.

Best Wishes,
Spencer Graves
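A minimal sketch of the order experiment described above, assuming a reduced data set with 3 schools and deliberately unequal cell sizes. The data, the seed, the cell sizes, and the object names (cells, n, d, a1, a2) are all made up for illustration and are not taken from the original problem:

```r
# Hypothetical reduced example: 3 schools x 2 attendance groups, unbalanced
set.seed(42)                          # arbitrary seed, for reproducibility only
cells <- expand.grid(school = factor(1:3), attend = factor(1:2))
n <- c(6, 3, 5, 2, 7, 4)              # made-up, non-proportional cell sizes
d <- cells[rep(seq_len(nrow(cells)), n), ]
d$y <- rnorm(nrow(d), mean = 550, sd = 50)

# Sequential sums of squares, with the two terms entered in both orders
a1 <- anova(lm(y ~ school + attend, data = d))
a2 <- anova(lm(y ~ attend + school, data = d))
print(a1)
print(a2)

# With imbalance, the sum of squares credited to 'school' generally depends
# on whether it enters first or last; the residual sum of squares does not.
ss_school <- c(entered_first = a1["school", "Sum Sq"],
               entered_last  = a2["school", "Sum Sq"])
print(ss_school)
```

If the two tables lead to the same conclusion, either order can be reported. If they do not, the gap between them gives a rough idea of how much the imbalance matters.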
Scot W McNary wrote:
Hi,

I have a problem in which I have test score data on students from a number of schools. In each school I have a measure of whether or not they received special programming. I am interested in the interaction between school and attendance to the programming, but in a very select set of comparisons. I'd like to cast the test as one in which students in each school who attend are compared with students who don't across all schools. So, I would be comparing school 1 attenders with school 1 non-attenders, school 2 attenders with school 2 non-attenders, etc.

The reason for the custom contrast is that the between school comparisons (e.g., school 1 attenders vs. school 2 non-attenders) are of less interest. This seems to require a custom contrast statement for the interaction term. I have a toy example that seems to work as it should, but wonder if I've correctly created the contrast needed.

Here is a toy example (code put together from bits taken from MASS ch 6, and various R-help postings, (e.g., http://finzi.psych.upenn.edu/R/Rhelp02a/archive/49077.html)):

    # toy interaction contrast example, 10 schools, 100 kids, 5 attenders (1)
    # and 5 non-attenders (2) in each school

    # make the data
    school <- gl(10, 10)
    attend <- gl(2, 5, 100)

    # creates an interaction with schools 6 and 7
    y <- c(sample(seq(450, 650, 1), 50),
           rep(c(rep(650, 5), rep(450, 5)), 2),
           sample(seq(450, 650, 1), 30))

    # anova
    summary(aov(y ~ school * attend))

    # graphically
    Means <- tapply(y, list(school, attend), mean)
    plot(Means[,1], col="red", type = "l", ylim = c(400,700))
    points(Means[,2], col="blue", type = "l")

    # create contrasts for hypothesis of interest
    # school i attend j - school i attend j'
    # for all schools
    sxa <- interaction(school, attend)
    sxam <- as.matrix(rbind(diag(1,10), diag(1,10) * -1))
    contrasts(sxa) <- sxam
    summary(aov(y ~ sxa), split=list(sxa=1:10), expand.split = T)

The actual problem has a few more schools, other covariates, considerably more students, and is somewhat unbalanced.
Thanks, Scot -- Scot W. McNary email:smcnary at charm.net
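One way to check that the contrast matrix lines up with the levels of the interaction factor is sketched below. It rebuilds the toy objects so that it runs on its own. The seed, the row labels, and the names fit, est, and half_diff are additions for illustration. The check assumes that the columns R adds to complete the contrast matrix are orthogonal to the ones supplied, in which case each of the first 10 coefficients should equal half the attender minus non-attender difference in cell means for that school:

```r
set.seed(1)                           # arbitrary seed, for reproducibility only
school <- gl(10, 10)
attend <- gl(2, 5, 100)
y <- c(sample(seq(450, 650, 1), 50),
       rep(c(rep(650, 5), rep(450, 5)), 2),
       sample(seq(450, 650, 1), 30))

sxa  <- interaction(school, attend)   # levels run 1.1, 2.1, ..., 10.1, 1.2, ..., 10.2
sxam <- rbind(diag(1, 10), diag(-1, 10))
rownames(sxam) <- levels(sxa)         # label rows to see which cell gets +1 / -1
print(sxam[c("1.1", "1.2", "2.1", "2.2"), 1:2])

contrasts(sxa) <- sxam                # R fills in the remaining 9 columns
fit <- lm(y ~ sxa)

# Compare the first 10 contrast coefficients with half the within-school
# difference in cell means (attend 1 minus attend 2)
Means     <- tapply(y, list(school, attend), mean)
est       <- unname(coef(fit)[2:11])
half_diff <- unname((Means[, 1] - Means[, 2]) / 2)
print(cbind(est, half_diff))
```

If the two columns agree, column i of the contrast is picking out the school i attender versus non-attender comparison as intended.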
Spencer Graves, PhD Senior Development Engineer PDF Solutions, Inc. 333 West San Carlos Street Suite 700 San Jose, CA 95110, USA spencer.graves at pdf.com www.pdf.com <http://www.pdf.com> Tel: 408-938-4420 Fax: 408-280-7915