GLMM Q: Categorical fixed effects with more than 2 levels

On Sat, Jul 21, 2012 at 9:24 AM, Julie Kern <juliekern27 at googlemail.com> wrote:
I?m running a lmm (with lme) to investigate the influence of several
fixed effects on call rate. The model also includes the random terms
of group & individual identity. Several of my fixed effects are
categorical & some have more than 2 categories (up to 5). E.g. The
fixed effect habitat contains the categories ?open?, ?medium? &
?dense?. The model has shown habitat to be a significant predictor of
call rate, but I would now like to test the differences between these.
E.g. Are call rates in open, medium & dense habitats different from
each other? I am unsure how best to go about this as I still have to
take into account repeated measures from the same individuals & groups
so don?t think I can simply use a Wilcoxon signed-rank test.
I don't think there is an 'honestly significant difference' test that
can check whether the 3 are different from each other.  However, I
Assuming a factor X with three levels called "A", "B" and "C".  Isn't 
that just the simultaneous test of

H0_A: A = (B + C) / 2  <=>  A - (B + C) / 2 = 0
H0_B: B = (A + C) / 2  <=>  B - (A + C) / 2 = 0
H0_C: C = (A + B) / 2  <=>  C - (A + B) / 2 = 0

?  If so, that could easily be set-up using the 'multcomp' package:

 > library("nlme")
 > library("multcomp")
Loading required package: mvtnorm
Loading required package: survival
Loading required package: splines
 > fm <- lme(yield ~ 0 + Variety, random = ~ nitro | Block, data = Oats)
 > K <- rbind("GR vs (M, V)" = c(   1, -1/2, -1/2),
+            "M vs (GR, V)" = c(-1/2,    1, -1/2),
+            "V vs (GR, M)" = c(-1/2, -1/2,    1))
 > fm.glht <- glht(fm, linfct = mcp(Variety = K))
 > set.seed(123) # for reproducibility
 > summary(fm.glht, test = adjusted("free")) # see ?adjusted

          Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: User-defined Contrasts

Fit: lme.formula(fixed = yield ~ 0 + Variety, data = Oats, random = 
~nitro |
     Block)

Linear Hypotheses:
                   Estimate Std. Error z value Pr(>|z|)
GR vs (M, V) == 0   0.7917     3.9024   0.203   0.8392
M vs (GR, V) == 0   8.7292     3.9024   2.237   0.0470 *
V vs (GR, M) == 0  -9.5208     3.9024  -2.440   0.0391 *
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
(Adjusted p values reported -- free method)

So, "Golden Rain" does not differ significantly from the mean of the 
others, "Marvellous" has significantly higher yield than mean of the 
others, and "Victory" has significantly lower yield than mean of the 
others.  This seems consistent with

 > boxplot(yield ~ Variety, data = Oats)
think you could ask a more specific comparison, such as "does it make
a difference if I assume the effects of open and medium are the same
as the effect of dense"?

You can create a new factor variable that recodes open and medium to
be the same thing.  I find the procedure to do that is difficult to
describe to my students.  To make that easier, I have a little package
of regression tools called "rockchalk".  It has in there a function
combineLevels.  you could run (if dat is the data frame)

dat$hab2 <- combineLevels(dat$habitat, levs = c("open", "medium"),
newLabel = c("oOrMed"))

if you run that model again, replacing habitat by hab2, then the anova
function will calculate a likelihood ratio test to check if you have
lost explanatory power by fixing the coefficients of open and medium
to be the same.

now, if you have a 5 level variable, say
c("lo","lom","medium","mhi","hi"),  and you want to reduce the model
to a comparison that combines "lo" and "lom" versus
"medium","mhi","hi", you have to run combineLevels twice, first
putting together "lo" and "lom" and then "medium", "mhi", "hi".

You aren't required to do that with rockchalk, but otherwise you have
a somewhat conceptually difficult series of steps.

1. Take the levels of the existing variable,
2. Add the new level for the combined category
3. Create a new factor variable that uses the new levels object
4. Reassign the cases to the new label.
Huh?

 > a <- gl(5, 2, labels = c("lo2", "lo1", "mid", "hi1", "hi2"))
 > a
  [1] lo2 lo2 lo1 lo1 mid mid hi1 hi1 hi2 hi2
Levels: lo2 lo1 mid hi1 hi2
 > levels(a) <- c("lo", "lo", "mid+hi", "mid+hi", "mid+hi")
 > a
  [1] lo     lo     lo     lo     mid+hi mid+hi mid+hi mid+hi mid+hi mid+hi
Levels: lo mid+hi
 >

This seems pretty straightforward.  Honestly, what's "conceptually 
difficult" here?

HTH,
Henric
If the factor is ordinal, then this is a little spicy to splice the
new one into the levels list at the right spot.  But you can see how
if you read the code for combineLevels.

Any advice would be greatly appreciated, I'm new to R & have spent a
long time searching for this on the web but am going round in ever
more confusing circles!

Thank you!

Julie

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

GLMM Q: Categorical fixed effects with more than 2 levels

Thread (3 messages)