How do you report lmer results?

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Dear R-Gurus

I am a PhD student from South Africa working on chimpanzee behaviour.
I am looking at patterns of shade utilization and am using generalized
linear mixed models to examine the effects of various factors on
whether chimpanzees choose to spend time in the sun or shade.
OK.  Are these binary (sun vs shade) outcomes?

 I
realise that the lme4 package and the outputs of the lmer functions
have been discussed ad nauseum but I have been reading through many of
them and am finding it all extremely confusing. I have used programs
like Statistica to run glm's with no random factors but now that I
have to include random effects, this is no longer an option. Thus I
have turned to R (and hence I am a complete R virgin).
This is fine; you should mention that you previously tried on r-help
(this is a more appropriate list anyway).
What I would like to know is the following. What is the accepted
general consensus on how to report the outputs of a lmer model? What
is the currently accepted method for determining whether fixed effect
parameters are significant in predicting the outcomes of the model
(LHR, AIC, Wald X^2...?)? While I recognise that the "Pr(>|z|)" value
is not a definitive p-value (rather an approximation), can one treat
it loosely as an 'estimated' p-value?
Hard to say.  You could try searching google scholar for references to
"lmer" or "lme4", for a start, although there aren't a lot of hits
there.  I would recommend Bolker et al _Trends in Ecology and Evolution_
2008, too (of course); it might also be useful to look in Zuur's book on
mixed models.
My model comprises 2 categorical predictor variables (Time of day:
'Time'; Available amount of shade, coded as a three-way
classification: 'Tertile'), two continuous predictor variables
(maximum temperature: 'Max'; minimum temperature: 'Min') and three
random effects (Which experimental dataset the data were derived from:
'Exp'; Which individual chimpanzee was observed: 'Indiv'; Which
area/zone of the enclosure they occupied at the time of observation:
'Zone'). These are the outputs that I have generated thus far using
LHR testing. How should I be interpretting and reporting these
outputs?

Generalized linear mixed model fit by the Laplace approximation
Formula: prop ~ Time + Tertile + (1 | Exp) + (1 | Indiv) + (1 | Zone)
+      Max + Min
   Data: sdata
   AIC BIC logLik deviance
 215.5 259 -95.77    191.5
Random effects:
 Groups Name        Variance   Std.Dev.
 Zone   (Intercept) 2.6596e-01 5.1571e-01
 Indiv  (Intercept) 0.0000e+00 0.0000e+00
 Exp    (Intercept) 2.9021e-11 5.3871e-06
One thing that's clear here is that your model is a bit overfitted --
you are getting zero or near-zero variances for both 'indiv' and 'exp'.
 That doesn't necessarily mean anything is wrong, but you might want to
test whether the model performs differently if you drop these random
effects, or whether it gives significantly different results with
different combinations of random effects (again, unlikely but worth
checking).

  It's also not surprising that you get zero variance for Exp, with only
two levels.  The numbers of random-effects levels here are very small --
7 or 8 is probably the *minimum* you could expect to work.
Opinions differ (see <http://glmm.wikidot.com/faq>), but I would suggest
you make Exp a fixed factor ...
Number of obs: 276, groups: Zone, 8; Indiv, 7; Exp, 2
Fixed effects:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)   -2.15725    1.58304  -1.363  0.17297
Time11h00      0.96362    0.40956   2.353  0.01863 *
Time12h00      1.57906    0.49033   3.220  0.00128 **
Time13h00      1.58951    0.40705   3.905 9.43e-05 ***
Time14h00      1.07939    0.53876   2.003  0.04513 *
TertileLOW    -1.40906    0.53761  -2.621  0.00877 **
TertileMEDIUM -1.24862    0.57396  -2.175  0.02960 *
Max            0.10122    0.08611   1.175  0.23985
Min            0.13439    0.10292   1.306  0.19162
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
Correlation of Fixed Effects:
            (Intr) Tm1100 Tm1200 Tm1300 Tm1400 TrtLOW TMEDIU Max
Time11h00    0.056
Time12h00    0.258  0.447
Time13h00    0.115  0.510  0.486
Time14h00   -0.049  0.318  0.276  0.370
TertileLOW  -0.146 -0.119 -0.215 -0.236 -0.096
TertlMEDIUM -0.128  0.024 -0.145 -0.155 -0.224  0.707
Max         -0.914 -0.162 -0.277 -0.198 -0.084 -0.025 -0.022
Min          0.178  0.074 -0.023  0.105  0.244 -0.101 -0.077 -0.463

anova(m1,m2)
Data: sdata
Models:
m2: prop ~ Time + (1 | Exp) + (1 | Indiv) + (1 | Zone) + Max + Min
m1: prop ~ Time + Tertile + (1 | Exp) + (1 | Indiv) + (1 | Zone) +
m1:     Max + Min
   Df    AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)
m2 10 216.72 252.92 -98.359
m1 12 215.55 258.99 -95.773 5.1721      2    0.07532 .
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

anova(m1,m3)
Data: sdata
Models:
m3: prop ~ Tertile + (1 | Exp) + (1 | Indiv) + (1 | Zone) + Max +
m3:     Min
m1: prop ~ Time + Tertile + (1 | Exp) + (1 | Indiv) + (1 | Zone) +
m1:     Max + Min
   Df    AIC    BIC   logLik  Chisq Chi Df Pr(>Chisq)
m3  8 226.11 255.08 -105.057
m1 12 215.55 258.99  -95.773 18.567      4  0.0009556 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

anova(m1,m4)
Data: sdata
Models:
m4: prop ~ Time + Tertile + (1 | Exp) + (1 | Indiv) + (1 | Zone) +
m4:     Min
m1: prop ~ Time + Tertile + (1 | Exp) + (1 | Indiv) + (1 | Zone) +
m1:     Max + Min
   Df    AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)
m4 11 214.81 254.64 -96.407
m1 12 215.55 258.99 -95.773 1.2672      1     0.2603

anova(m1,m5)
Data: sdata
Models:
m5: prop ~ Time + Tertile + (1 | Exp) + (1 | Indiv) + (1 | Zone) +
m5:     Max
m1: prop ~ Time + Tertile + (1 | Exp) + (1 | Indiv) + (1 | Zone) +
m1:     Max + Min
   Df    AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)
m5 11 215.22 255.05 -96.613
m1 12 215.55 258.99 -95.773 1.6792      1      0.195

As I understand this output, the only significant predictor in the
model appears to be time of day. But, I don't really know how this
should be reported. Can you point me to some papers or examples where
lmer outputs have been reported formally? Any help that you could
offer would be MOST appreciated.
I would try drop1() [works with version ...-40 of lme4, I believe] for
more convenient output.

  What you are doing is reasonable, although likelihood ratio test
statistics are probably going to be somewhat liberal (anticonservative)
for "small" sample sizes.  *If* all the predictors were represented
equally in all the random-effect levels (i.e. individuals, zones,
experimental blocks all had similar levels of Tertile, Min/Max temp,
time) then this would be close to a randomized-block design, and in the
classical sense you would have a fairly large residual N (as opposed to
a nested design, where at least some of the treatments don't differ
within blocks).  LRT is probably appropriate with a caveat.  If you are
paranoid, use parametric bootstrapping (see the examples for
?"simulate-mer").
Sincerely (in desperation)

Luke Duncan

PhD Candidate
School of Animal, Plant and Environmental Sciences
University of the Witwatersrand
Johannesburg, South Africa

+27 11 717 6452

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk4xZjIACgkQc5UpGjwzenM5QACfSfnnneERl3Q4NN936N8i2/cl
TM8AniEQeCArTkuSf18p8qoTepFdEVQC
=Xc5e
-----END PGP SIGNATURE-----

How do you report lmer results?

Thread (3 messages)