GLMM- relationship between AICc weight and random effects?
I would add some caution when interpreting the estimated values when using random-intercept only GLMMs for RSFs. The standard errors for the fixed effects do not reflect the animal as the sampling unit (see Schielzeth and Forstmeier 2009). Thus, if your objective is to make inference to the larger population of animals, the standard errors for the fixed effects are far too narrow (i.e. overconfident). It is more appropriate to use 2-stage approaches or GLMMs that use random slopes for the variables of interest. Random-intercept only models are somewhat abused in the RSF literature in this regard, in my opinion......
On Sun, Jul 10, 2016 at 7:06 PM, Ben Bolker <bbolker at gmail.com> wrote:
On 16-07-09 07:20 PM, Teresa Oliveira wrote:
Dear list members, I am developing GLMM's in order to assess habitat selection (using GLMMs' coeficients to construct Resource selection functions). I have
(telemetry)
data from 5 study areas, and each area has a different number of individuals monitored. To develop GLMM's, the dependend variable is binary (1-used locations; 0-available locations), and I have a initial set of 14 continuous
variables
(8 land cover variables; 2 distance variables, to artificial areas and water sources; 4 topographic variables): a buffer was placed around each location and the area of each land cover within that buffer was accounted for; distances were measured from each point to the nearest feature, and topographic variables were obtained using DEM rasters. I tested for correlation using Spearman's Rank, so not all 14 were used in the GLMMs. All variables were transformed using z-score. As random effect, I used individual ID. I thought at the beggining to use study area as a random effect but I only had 5 levels and there was
almost
no variance when that random effect was used. I constructed a GLMM with 9 variables (not correlated) and a random
effect,
then used "dredge()" function and "model.avg(dredge)" to sort models by
AIC
values. This was the result (only models of AICc lower than 2
represented):
[1]Call:
model.avg(object = dredge.m1.1)
Component model call:
glmer(formula = Used ~ <512 unique rhs>, data = All_SA_Used_RP_Area_z,
family =
binomial(link = "logit"))
Component models:
df logLik AICc delta weight
123578 8 -4309.94 8635.89 0.00 0.14
1235789 9 -4309.22 8636.44 0.55 0.10
123789 8 -4310.52 8637.04 1.14 0.08
1235678 9 -4309.75 8637.50 1.61 0.06
12378 7 -4311.78 8637.57 1.67 0.06
1234578 9 -4309.79 8637.58 1.69 0.06
Variables 1 and 2 represent the distance variables; from 3 to 8 land
cover
variables, and 9 is a topographic variable. Weights seem to be very low, even if I average all those models as it seems to be common when delta values are low.
Well as far as we can tell from this, variables 4-9 aren't doing much (on the other hand, variables 1-3 seem to be in all of the top models you've shown us -- although presumably there are a bunch more models that are almost like these, and similar in weight, with other permutations of [123] + [some combination of 456789] ...) Even with this weights, I constructed GLMMs for each of the
combinations, and the results were simmilar for all 6 combinations. Here
are the results for the first one (GLMM + overdispersion + r-squared):
Random effects:
Groups Name Variance Std.Dev.
ID.CODE_1 (Intercept) 13.02 3.608
Number of obs: 32670, groups: ID.CODE_1, 55
Fixed effects:
Estimate Std. Error z value Pr(>|z|),
(Intercept) -0.54891 0.51174 -1.073 0.283433
3 -0.22232 0.04059 -5.478 4.31e-08 ***
5 -0.05433 0.02837 -1.915 0.055460 .
7 -0.13108 0.02825 -4.640 3.49e-06 ***
8 -0.15864 0.08670 -1.830 0.067287 .
1 0.28438 0.02853 9.968 < 2e-16 ***
2 0.11531 0.03021 3.817 0.000135 ***
Residual deviance: 0.256
r.squaredGLMM():
R2m R2c
0.01063077 0.80039950
This is what I get from this analysis:
1) Variance and SD of the random effect seems fine (definitely better
than
the "0" I got when using Study Areas as random effect);
yes -- SD of the random effects is much larger than any of the fixed effects, which means that the differences among individuals are large (presumably that means you have very different numbers of presences for different number of individuals [all individuals sharing a common pool of pseudo-absences ???)
2) Estimate values make sense from what I know of the species and the knowledge I have of the study areas;
Good!
3) Overdispersion values seem good, and R-squared values don't seem very good (at least when considering only fixed effects) but, as I read in several places, AIC and r-squared are not always in agreement.
Overdispersion is meaningless for binary data.
4) Weight values seem very low. Does it mean the models are not good?
It means there are many approximately equivalent models. Nothing in this output tells you very much about absolute goodness of fit (which is tricky for binary data).
Then what I did was construct a GLM ("glm()"), so no random effect was
used. I used the same set of variables used in [1], and here are the
results (only models of AICc lower than 2 represented):
[2] Call:
model.avg(object = dredge.glm_m1.1)
Component model call:
glm(formula = Used ~ <512 unique rhs>, family = binomial(link = "logit"),
data =
All_SA_Used_RP_Area_z)
Component models:
df logLik AICc delta weight
12345678 9 -9251.85 18521.70 0.00 0.52
123456789 10 -9251.77 18523.54 1.84 0.21
1345678 8 -9253.84 18523.69 1.99 0.19
In this case, weight values are higher.
Does this mean that it is better not to use a random effect? (I am not
sure
I can compare GLMM with GLM results, correct me if I am doing wrong assumptions)
No. You could do a likelihood ratio test with anova(), but note that the AICc values for the glm() fits are 10,000 (!!) units higher than the glmer fits. While it will potentially greatly complicate your life, I think you should at least *consider* interactions between your environment variables and ID, i.e. allow for the possibility that different individuals respond differently to habitat variation. Ben Bolker
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Craig DeMars, Ph.D. Postdoctoral Fellow Department of Biological Sciences University of Alberta Phone: 780-221-3971 [[alternative HTML version deleted]]