Skip to content
Prev 131947 / 398506 Next

two-way categorical anova post-hoc data extraction


        
I've never found this particularly easy to do. For a recent client I wrote:
library(effects)
library(multcomp)
library(multcompView)


effectsum <- function(model, effect) {
  effects <- as.data.frame(all.effects(model)[[effect]])

  mcp <- list("Tukey")
  names(mcp) <- effect
  class(mcp) <- "mcp"

  glht_sum <- summary(glht(model, linfct = mcp))
  p <- as.vector(glht_sum$test$pvalues)
  names(p) <- gsub(" ", "", names(glht_sum$test$tstat))

  groups <- multcompLetters(p)
  effects[, 1] <- factor(effects[, 1])
  effects$group <- groups$Letters[as.character(effects[, 1])]

  effects
}


which allows you to do:

mtcars$cyl <- factor(mtcars$cyl)
simple <- lm(mpg ~ wt + cyl, data=mtcars)
effects <- effectsum(simple, "cyl")

library(ggplot2)
qplot(cyl, fit, data=effects, min = lower, max = upper,
geom="pointrange", ylab="Mean effect") + geom_text(aes(label = group,
y = min(lower) - diff(range(lower)) * 0.07))

ggplot(mtcars, aes(x = cyl, y = mpg)) + geom_crossbar(aes(min=lower,
max=upper, y=fit), data=effects, width=0.2) +
geom_point(aes(colour=wt))

This gives lsmeans (aka population marginal means) with their standard
errors, and groups generated from all pairwise comparisons adjusted
for multiple comparisons.

I would love to improve this code: to deal with all factors in a model
automatically.  Additionally, all.effects and multcompLetters are
rather fragile with respect to level names - if you get an error try
removing any non-alphanumeric characters,

Hadley