Interpretation of GLMM output in R
Dear Yvonne, To answer Question 1, I would scale both predictors to unit standard deviation (ie fad.data$Volume_std<-scale(fad.data$Volume,scale=TRUE)) and then compare the back-transformed coefficients, as you have a Poisson model this would be then exp(fixef(fad)). Say that you get a value of 1.5 for the variable Volume, you would then interpret this as: an increase in one standard deviation of Volume leads to an increase of 1.5*Pollinator, put differently every increase of one standard deviation in Volume increase Pollinator by 50%. Having scaled the predictor would also allow you to compare the strength of their effects. Also note that effect sizes do not give informations about how much variance in the response variable is explained by individuals terms. Concerning the LRT to test the effect of predictors, it is not the best available option (see http://glmm.wikidot.com/faq#summary) the function "mixed" in the "afex" package allow you to do parametric bootstrap to test fixed effects (using type 2 or 3 tests). For effects with very small p-values (like volume) using LRT or parametric bootstrap will most certainly lead to the same interpretation (ie volume has a significant effect), for p-values close to 0.05 (like parasitoids) you might get different interpretations. Sincerely yours, Lionel
On 02/08/2015 11:31, Yvonne Hiller wrote:
Dear Paul, Thierry and list members,
first of all:
Thanks so much for going through my questions, although it came in an
inappropriate look with wired brackets and unspecified questions. Apologies for that. So I
try again to specify things.
The model includes following variables:
Pollinator = pollinator offspring
Parasitoids = parasitoid offspring
Volume = measured fig size
tree = tree ID
This is the output of the model:
fad <-glmer(Pollinator~Parasitoids+volume+(1|tree),data=fad.data,family="poisson",verbose=TRUE)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: poisson ( log )
Formula: Pollinator ~ Parasitoids + volume + (1 | tree)
Data: fad.data
AIC BIC logLik deviance df.resid
1550.6 1560.2 -771.3 1542.6 76
Scaled residuals:
Min 1Q Median 3Q Max
-7.4825 -2.4111 -0.4725 2.4321 9.3409
Random effects:
Groups Name Variance Std.Dev.
tree (Intercept) 0.1063 0.3261
Number of obs: 80, groups: tree, 8
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 2.3167832 0.1634876 14.171 <2e-16 ***
Parasitoids -0.0046908 0.0022169 -2.116 0.0344 *
volume 0.0069527 0.0006197 11.220 <2e-16 ***
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
Correlation of Fixed Effects:
(Intr) Prstds
Parasitoids -0.144
volume -0.656 -0.139
1. I want to know if the predictor variables have an effect on pollinator
offspring. Therefore, I tested if ?Parasitoids? and ?volume? are important factors in the model by the LRT.
fad1 <- glmer(Pollinator~1+volume+(1|tree),data=fad.data,family="poisson",verbose=TRUE)
fad2 <- glmer(Pollinator~Parasitoids+1+(1|tree),data=fad.data,family="poisson",verbose=TRUE)
anova(fad,fad1)
Models:
fad1: Pollinator ~ 1 + volume + (1 | tree)
fad: Pollinator ~ Parasitoids + volume + (1 | tree)
Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
fad1 3 1553.1 1560.3 -773.57 1547.1
fad 4 1550.6 1560.2 -771.32 1542.6 4.5105 1 0.03369 *
anova(fad,fad2)
Models:
fad2: Pollinator ~ Parasitoids + 1 + (1 | tree)
fad: Pollinator ~ Parasitoids + volume + (1 | tree)
Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
fad2 3 1669.2 1676.4 -831.61 1663.2
fad 4 1550.6 1560.2 -771.32 1542.6 120.59 1 < 2.2e-16 ***
From the output, I conclude that both predictor variables have an effect on pollinator offspring. Is it reasonable to do the LRT for testing the predictor variables? Is there any chance to get an effect size of how much the predictor variables explain variance in the response variable? Since I know they have an effect, interpreation would be different when the effect of one variable is small or large.
2. From suggestions of Thierry and Paul and from the the caterpillar plot: dotplot(ranef(fad, condVar=TRUE))$tree I do understand that there is indeed a fairly large remaining variability in the number of pollinator offspring among trees, but the relative effect of ?tree? does not depend on ?Parasitoids? and ?volume?. Again, I would highly appreciate your suggestions and feedback on this. Thank you very much in advance. Best wishes,Yvonne
________________________________
Date: Sun, 2 Aug 2015 00:08:35 +0200
Subject: Re: [R-sig-ME] Interpretation of GLMM output in R
From: thierry.onkelinx at inbo.be
To: pauljohn32 at gmail.com
CC: yvonne.hiller at hotmail.de; r-sig-mixed-models at r-project.org
Paul,
Note that exp(beta0 + beta1*parasitoids + btree) is equivalent to
exp(beta0 + beta1*parasitoids)*exp(btree). So the relative effect of
tree doesn't depend on the other effects.
I tend to look at the ratio of the 97.5% and 2.5% quantiles of the
random effect. exp(q*sigma)/exp(-q*sigma) or simplified exp(2*q*sigma)
with q =1.96 (97.5% quantile of a normal distribution) and sigma= the
standard deviation of the random effect. exp(2*1.96*0.3261) ~ 3.5 So
the 97.5% quantile tree has about 3.5 times the number of pollinators
of the 2.5% quantile tree.
Best regards,
Thierry
Op 1-aug.-2015 15:43 schreef "Paul Johnson"
<pauljohn32 at gmail.com<mailto:pauljohn32 at gmail.com>>:
Hi
Your message comes through with weird line breaks, you should turn off
the HTML compose option in your mail program, just write text.
On Fri, Jul 31, 2015 at 8:42 AM, Yvonne Hiller
<yvonne.hiller at hotmail.de<mailto:yvonne.hiller at hotmail.de>> wrote:
Dear Ben Bolker and list members,
I?am a PhD-student working on tropical plant-pollinator
interactions (the fig-fig wasp mutualism).
I have some difficulties with my analyses of my data
using lmer (family = Poisson). I have read a lot of threads and
searched for
solutions in Zuur 2009, though it was not completely satisfying. I
looked at
other papers using GLMM (espeacially for the fig-fig wasp mutualism),
but there
were so many different ways in reporting and interpreting p-values,
AICs etc.
Therefore, I would be very grateful, if you could go through my output and
answer my questions to hopefully fully understand GLMM. Thank you so
much in
advance.
One of my questions is:
Have parasitoids (offspring) and the volume size an
effect on the number of pollinator offspring?
As pollinators are parasitized by parasitoids, I would
expect a negative impact on pollinator offspring. As the size of the
fig fruits
might be an additionally factor (due to limited oviposition sites
inside the fruit
or due to the ovipositor length of a parasitoid is to short to reach
inner most
ovules galled by pollinators), I included it in the model.
I have poisson data (counts):
Pollinator
= pollinator offspring
Parasitoids = parasitoid offspring
volume = fig fruit size measurement
tree = Tree ID
I have data from trees in the forest and in the
farmland, as well as data of the rainy and dry season. I have chosen
to perform
lmer for each season and habitat separately. For each tree, I have
collected 10
fig fruits to count offspring of wasps at the trophic level (pollinator,
parasitoid) and to measure the size of the fruit (volume). As I have 10 fig
fruits per tree, I would use tree as a random effect, to account for
unmeasured
variance between trees. I have found different approaches to that in
literature. For instance: We did not include? tree ? and ?
syconium ? as random factors, because wasps were free to move between
syconia
on the same tree, and between trees.
Therefore,
I am not sure if I have to use GLMM with trees as a random effect or
if it?s
also possible to use GLM without the random effect.
The inclusion of
fig volume in this manner removed the need to use ?fig? as an
additional nested
factor within ?tree?. Is that right?
Here, I
present the model with the random effect.
So, lets start with the model of the farmland during
the dry season:
fad <-
lmer(Pollinator~Parasitoids+volume+(1|tree),family="poisson",verbose=TRUE)
summary (fad)
Generalized linear mixed model fit by the Laplace
approximation
Formula: Pollinator ~ Parasitoids + volume + (1 |
tree)
Data:
fad.data
AIC BIC logLik deviance
1166 1175
-578.9 1158
Random effects:
Groups
Name Variance Std.Dev.
tree (Intercept) 0.10634 0.3261
Number of obs: 80, groups: tree, 8
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept)
2.3167867 0.1632747 14.190
<2e-16 ***
Parasitoids -0.0046908
0.0021686 -2.163 0.0305 *
volume
0.0069525 0.0006108 11.383
<2e-16 ***
---
Signif.
codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05
?.? 0.1 ? ? 1
Correlation of Fixed Effects:
(Intr) Prstds
Parasitoids -0.160
volume
-0.657 -0.107
1. Random effects: What does the Random
Effect table - the Variance, Std. Dev. and Intercept tells me?
Your model assumes that the outcome is Poisson with expected value
exp(beta0 + beta1*parasitoids + btree)
btree is a unique added amount for each tree. The estimate of the
number btree's variance across trees is 0.1.
What that 0.1 means in terms of the predicted outcome? Well, that
mostly depends on how big beta0 + beta1*parasitoids is. If that
number is huge, say 1000, then adding a thing with variance 0.1 won't
matter much.
On the other hand, if it is 0.01, then the random effect at the tree
level is very large, compared to the systematic components in your
model. When the link function gets applied, the distribution of
outcomes changes in an interesting way.
If you run ranef(), it will spit out the estimates of the random
differences among trees (btree "BLUPS"). If you run the predict
method, you can see how those map out to predicted values (exp(beta0 +
beta1 parasitoids + btree)
So it says that the
between?tree variability is fairly large. But I don?t understand to what it
relates. Does it mean that pollinator offspring variance is high between
trees and might be overestimating the explanatory variables?
2. What can I
conclude from the model regarding the fixed effects and how to report
about it
(with or without p-values, z-values, estimates)?
So of what I know is that
the p-value is only a guide and that it is rather a comparison of two
models.
What are the two models and can I compare them.
I am puzzled why you see p values at all. In the version of lme4 I'm
running now, I don't see p values.
Lets compare versions, since I'm pretty sure p values were removed
quite a while ago.
sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] lme4_1.1-8 Matrix_1.2-2
loaded via a namespace (and not attached):
[1] grid_3.1.2 lattice_0.20-33 MASS_7.3-43 minqa_1.2.4
[5] nlme_3.1-121 nloptr_1.0.4 Rcpp_0.12.0 splines_3.1.2
[9] tools_3.1.2
Anyway...
If you had a huge sample, those p values would be accurate.
You have a small sample, there are other, more computationally
intensive ways, to get p values. Read the Jrnl Stat Software paper b
y the lmer team, they describe profiling and bootstrapping. You have
small enough sample, could do either one.
4. What does the
correlation of fixed effects tells me?
It is a hint about multicollinearity & numerical instability, so far as
I know.
5. Is it right
that the estimates tells me whether the relation of the fixed effects to
pollinator offspring is positive or negative?
Best way to get answer is to plot the predicted values from the model.
Use the predict function to plot for various values of the predictor.
6. Can I
calculate an effect size on each explanatory variable?
Only if you think the term "effect size" is meaningful. And if you
have a formula for one. In my experience with consulting here, it
means anything the researcher wants to call a summary number.
I've come to loathe the term because somebody in the US Dept. of
Education mandated all studies report standardized effect sizes,
forcing everybody to make Herculean assumptions about all kinds of
model parameters to get Cohen's D or whatnot.
I would highly appreciate your feedback on this.
Thanks so much in advance.
Good luck. Next time, use a text only email composer and try to ask 1
specific question. You are more likely to get attention if people can
easily read the message and see what you want. This one was difficult
to read (for me at least) and also somewhat vague.
Best wishes,
Yvonne Hiller
[[alternative HTML version deleted]]
_______________________________________________
R-sig-mixed-models at r-project.org<mailto:R-sig-mixed-models at r-project.org>
mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
--
Paul E. Johnson
Professor, Political Science Director
1541 Lilac Lane, Room 504 Center for Research Methods
University of Kansas University of Kansas
http://pj.freefaculty.org http://crmda.ku.edu
_______________________________________________
R-sig-mixed-models at r-project.org<mailto:R-sig-mixed-models at r-project.org>
mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models