GLMM Q: Categorical fixed effects with more than 2 levels
On 2012-07-22 01:19, Paul Johnson wrote:
On Sat, Jul 21, 2012 at 9:24 AM, Julie Kern <juliekern27 at googlemail.com> wrote:
I?m running a lmm (with lme) to investigate the influence of several fixed effects on call rate. The model also includes the random terms of group & individual identity. Several of my fixed effects are categorical & some have more than 2 categories (up to 5). E.g. The fixed effect habitat contains the categories ?open?, ?medium? & ?dense?. The model has shown habitat to be a significant predictor of call rate, but I would now like to test the differences between these. E.g. Are call rates in open, medium & dense habitats different from each other? I am unsure how best to go about this as I still have to take into account repeated measures from the same individuals & groups so don?t think I can simply use a Wilcoxon signed-rank test.
I don't think there is an 'honestly significant difference' test that can check whether the 3 are different from each other. However, I
Assuming a factor X with three levels called "A", "B" and "C". Isn't
that just the simultaneous test of
H0_A: A = (B + C) / 2 <=> A - (B + C) / 2 = 0
H0_B: B = (A + C) / 2 <=> B - (A + C) / 2 = 0
H0_C: C = (A + B) / 2 <=> C - (A + B) / 2 = 0
? If so, that could easily be set-up using the 'multcomp' package:
> library("nlme")
> library("multcomp")
Loading required package: mvtnorm
Loading required package: survival
Loading required package: splines
> fm <- lme(yield ~ 0 + Variety, random = ~ nitro | Block, data = Oats)
> K <- rbind("GR vs (M, V)" = c( 1, -1/2, -1/2),
+ "M vs (GR, V)" = c(-1/2, 1, -1/2),
+ "V vs (GR, M)" = c(-1/2, -1/2, 1))
> fm.glht <- glht(fm, linfct = mcp(Variety = K))
> set.seed(123) # for reproducibility
> summary(fm.glht, test = adjusted("free")) # see ?adjusted
Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: User-defined Contrasts
Fit: lme.formula(fixed = yield ~ 0 + Variety, data = Oats, random =
~nitro |
Block)
Linear Hypotheses:
Estimate Std. Error z value Pr(>|z|)
GR vs (M, V) == 0 0.7917 3.9024 0.203 0.8392
M vs (GR, V) == 0 8.7292 3.9024 2.237 0.0470 *
V vs (GR, M) == 0 -9.5208 3.9024 -2.440 0.0391 *
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
(Adjusted p values reported -- free method)
So, "Golden Rain" does not differ significantly from the mean of the
others, "Marvellous" has significantly higher yield than mean of the
others, and "Victory" has significantly lower yield than mean of the
others. This seems consistent with
> boxplot(yield ~ Variety, data = Oats)
think you could ask a more specific comparison, such as "does it make
a difference if I assume the effects of open and medium are the same
as the effect of dense"?
You can create a new factor variable that recodes open and medium to
be the same thing. I find the procedure to do that is difficult to
describe to my students. To make that easier, I have a little package
of regression tools called "rockchalk". It has in there a function
combineLevels. you could run (if dat is the data frame)
dat$hab2 <- combineLevels(dat$habitat, levs = c("open", "medium"),
newLabel = c("oOrMed"))
if you run that model again, replacing habitat by hab2, then the anova
function will calculate a likelihood ratio test to check if you have
lost explanatory power by fixing the coefficients of open and medium
to be the same.
now, if you have a 5 level variable, say
c("lo","lom","medium","mhi","hi"), and you want to reduce the model
to a comparison that combines "lo" and "lom" versus
"medium","mhi","hi", you have to run combineLevels twice, first
putting together "lo" and "lom" and then "medium", "mhi", "hi".
You aren't required to do that with rockchalk, but otherwise you have
a somewhat conceptually difficult series of steps.
1. Take the levels of the existing variable,
2. Add the new level for the combined category
3. Create a new factor variable that uses the new levels object
4. Reassign the cases to the new label.
Huh?
> a <- gl(5, 2, labels = c("lo2", "lo1", "mid", "hi1", "hi2"))
> a
[1] lo2 lo2 lo1 lo1 mid mid hi1 hi1 hi2 hi2
Levels: lo2 lo1 mid hi1 hi2
> levels(a) <- c("lo", "lo", "mid+hi", "mid+hi", "mid+hi")
> a
[1] lo lo lo lo mid+hi mid+hi mid+hi mid+hi mid+hi mid+hi
Levels: lo mid+hi
>
This seems pretty straightforward. Honestly, what's "conceptually
difficult" here?
HTH,
Henric
If the factor is ordinal, then this is a little spicy to splice the new one into the levels list at the right spot. But you can see how if you read the code for combineLevels.
Any advice would be greatly appreciated, I'm new to R & have spent a long time searching for this on the web but am going round in ever more confusing circles! Thank you! Julie
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models