GLMM- relationship between AICc weight and random effects?

Mon, Jul 11, 2016 2:53 AM

Regarding SD: yes, I have several individuals with different numbers of
locations each. But when I consider study areas, for instance, I get a
very low variance (usually between 0 and 0.2; the SD is then also 0 or,
for instance, 0.3 when the variance is 0.1). Shouldn't I consider study
areas as a random effect? I mean, is the "variance" value enough to
include/exclude a random effect?
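
For scale, one common way to read these variances is the latent-scale
intraclass correlation (ICC) of a logit-link random-intercept model, which
treats the residual variance as pi^2/3. The sketch below is only an
illustration under that assumption (a single random intercept); the helper
name icc_logit is made up here, and the two variances are the ones quoted
in this thread. It does not by itself decide whether a random effect
should be kept.

```r
# Latent-scale ICC for a logistic random-intercept model.
# Assumption: one random intercept; residual variance fixed at pi^2 / 3.
icc_logit <- function(re_var) re_var / (re_var + pi^2 / 3)

icc_id   <- icc_logit(13.02)  # individual ID variance reported in the thread
icc_area <- icc_logit(0.1)    # example study-area variance mentioned above

round(c(ID = icc_id, study_area = icc_area), 3)
```

Under these assumptions the ID intercept accounts for roughly 80% of the
latent-scale variance, while a study-area variance of 0.1 accounts for
about 3%.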

How should I test for goodness of fit?

Regarding interactions: yes, I may construct models including interactions
between ID and other variables, but first I think it is better to find a
"strong" global model, and find which variables are most important.


Thank you very much for your help!!
Teresa

2016-07-11 2:06 GMT+01:00 Ben Bolker <bbolker at gmail.com>:


On 16-07-09 07:20 PM, Teresa Oliveira wrote:

Dear list members,

I am developing GLMMs in order to assess habitat selection (using the
GLMMs' coefficients to construct resource selection functions). I have
telemetry data from 5 study areas, and each area has a different number of
individuals monitored.

To develop the GLMMs, the dependent variable is binary (1 = used locations;
0 = available locations), and I have an initial set of 14 continuous
variables (8 land cover variables; 2 distance variables, to artificial
areas and water sources; 4 topographic variables). A buffer was placed
around each location and the area of each land cover within that buffer
was calculated; distances were measured from each point to the nearest
feature; and topographic variables were obtained from DEM rasters. I tested
for correlation using Spearman's rank correlation, so not all 14 were used
in the GLMMs. All variables were standardized to z-scores.

As a random effect, I used individual ID. At the beginning I thought of
using study area as a random effect, but I only had 5 levels and there was
almost no variance when that random effect was used.

I constructed a GLMM with 9 (uncorrelated) variables and a random effect,
then used the "dredge()" function and "model.avg(dredge)" to sort models by
AICc values. This was the result (only models with delta AICc lower than 2
are shown):

[1] Call:
model.avg(object = dredge.m1.1)

Component model call:
glmer(formula = Used ~ <512 unique rhs>, data = All_SA_Used_RP_Area_z,
family =
     binomial(link = "logit"))

Component models:
          df   logLik    AICc  delta weight
123578     8 -4309.94 8635.89   0.00   0.14
1235789    9 -4309.22 8636.44   0.55   0.10
123789     8 -4310.52 8637.04   1.14   0.08
1235678    9 -4309.75 8637.50   1.61   0.06
12378      7 -4311.78 8637.57   1.67   0.06
1234578    9 -4309.79 8637.58   1.69   0.06

Variables 1 and 2 are the distance variables, 3 to 8 are land cover
variables, and 9 is a topographic variable. Weights seem to be very low,
even if I average all those models, as seems to be common when delta
values are low.

Well as far as we can tell from this, variables 4-9 aren't doing much
(on the other hand, variables 1-3 seem to be in all of the top models
you've shown us -- although presumably there are a bunch more models
that are almost like these, and similar in weight, with other
permutations of [123] + [some combination of 456789] ...)


Even with these weights, I fitted GLMMs for each of the six combinations,
and the results were similar for all of them. Here are the results for the
first one (GLMM + overdispersion + R-squared):

Random effects:
 Groups    Name        Variance Std.Dev.
 ID.CODE_1 (Intercept) 13.02    3.608
Number of obs: 32670, groups:  ID.CODE_1, 55

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.54891    0.51174  -1.073 0.283433
3           -0.22232    0.04059  -5.478 4.31e-08 ***
5           -0.05433    0.02837  -1.915 0.055460 .
7           -0.13108    0.02825  -4.640 3.49e-06 ***
8           -0.15864    0.08670  -1.830 0.067287 .
1            0.28438    0.02853   9.968  < 2e-16 ***
2            0.11531    0.03021   3.817 0.000135 ***

Residual deviance: 0.256

r.squaredGLMM():
       R2m        R2c
0.01063077 0.80039950

This is what I get from this analysis:

1) Variance and SD of the random effect seem fine (definitely better than
the "0" I got when using Study Areas as random effect);

  yes -- SD of the random effects is much larger than any of the fixed
effects, which means that the differences among individuals are large
(presumably that means you have very different numbers of presences for
different individuals [all individuals sharing a common pool of
pseudo-absences ???])

2) Estimate values make sense from what I know of the species and the
knowledge I have of the study areas;

  Good!

3) Overdispersion values seem good, and R-squared values don't seem very
good (at least when considering only fixed effects) but, as I read in
several places, AIC and r-squared are not always in agreement.

  Overdispersion is meaningless for binary data.

4) Weight values seem very low. Does it mean the models are not good?

  It means there are many approximately equivalent models.  Nothing in
this output tells you very much about absolute goodness of fit (which is
tricky for binary data).
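
To make "many approximately equivalent models" concrete, here is a minimal
base-R sketch of how Akaike weights follow from the delta values. It uses
only the six delta values listed in table [1]; the weights it returns are
therefore renormalized over those six models, not over all 512 as in the
model.avg() output, so they are larger than the reported ones. The
variable names are illustrative.

```r
# Delta AICc values of the six top models from table [1]
delta <- c(0.00, 0.55, 1.14, 1.61, 1.67, 1.69)

# Relative likelihood of each model, then normalize to weights
rel_lik <- exp(-delta / 2)
w_top6  <- rel_lik / sum(rel_lik)
round(w_top6, 3)

# Weights reported by model.avg() over all 512 models
w_reported <- c(0.14, 0.10, 0.08, 0.06, 0.06, 0.06)
sum(w_reported)  # share of total weight carried by the models shown
```

The reported weights of the six models shown add up to 0.50, so about half
of the total weight sits on models outside the delta < 2 set. A low weight
for the best model reflects that spread across similar models, not a poor
absolute fit.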

Then I fitted a GLM ("glm()"), so no random effect was used. I used the
same set of variables as in [1], and here are the results (only models
with delta AICc lower than 2 are shown):

[2] Call:
model.avg(object = dredge.glm_m1.1)

Component model call:
glm(formula = Used ~ <512 unique rhs>, family = binomial(link = "logit"),
data =
     All_SA_Used_RP_Area_z)

Component models:
          df   logLik     AICc   delta weight
12345678   9 -9251.85 18521.70    0.00   0.52
123456789 10 -9251.77 18523.54    1.84   0.21
1345678    8 -9253.84 18523.69    1.99   0.19

In this case, weight values are higher.

Does this mean that it is better not to use a random effect? (I am not
sure I can compare GLMM with GLM results; correct me if I am making wrong
assumptions.)

  No.  You could do a likelihood ratio test with anova(), but note that
the AICc values for the glm() fits are 10,000 (!!) units higher than the
glmer fits.
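
As a rough check of that gap, the AICc values in tables [1] and [2] can be
reproduced from the reported log-likelihoods and df with the usual
small-sample formula. This is a base-R sketch; the function name aicc is
made up here, and it assumes the glm() fits used the same 32670
observations as the glmer() fits (both calls use the same data frame).

```r
# AICc = -2*logLik + 2*k + 2*k*(k + 1) / (n - k - 1)
aicc <- function(loglik, k, n) {
  -2 * loglik + 2 * k + 2 * k * (k + 1) / (n - k - 1)
}

n <- 32670  # "Number of obs" from the glmer summary (assumed equal for glm)

aicc_glmer <- aicc(-4309.94, k = 8, n = n)  # best glmer model in [1]
aicc_glm   <- aicc(-9251.85, k = 9, n = n)  # best glm model in [2]

aicc_gap <- aicc_glm - aicc_glmer
round(c(glmer = aicc_glmer, glm = aicc_glm, gap = aicc_gap), 2)
```

Weights are only comparable within one candidate set, so the higher
weights in [2] say nothing about GLM versus GLMM; the AICc gap between the
two sets is what matters.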

  While it will potentially greatly complicate your life, I think you
should at least *consider* interactions between your environment
variables and ID, i.e. allow for the possibility that different
individuals respond differently to habitat variation.
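
One way to express such an interaction in lme4 is a random slope, i.e.
letting the effect of a covariate vary among individuals. The following is
a minimal sketch on simulated data, not the actual analysis: the data
frame d, the covariate x1 and all simulation settings are invented for
illustration, and only the column names Used and ID echo the thread.

```r
library(lme4)

set.seed(1)
n_id  <- 20    # hypothetical number of individuals
n_obs <- 100   # hypothetical number of locations per individual

d <- data.frame(
  ID = factor(rep(seq_len(n_id), each = n_obs)),
  x1 = rnorm(n_id * n_obs)          # one standardized habitat covariate
)

# Individual-specific deviations in intercept and in the slope of x1
b0  <- rnorm(n_id, sd = 1)
b1  <- rnorm(n_id, sd = 0.5)
idx <- as.integer(d$ID)
eta <- -0.5 + b0[idx] + (0.3 + b1[idx]) * d$x1
d$Used <- rbinom(nrow(d), size = 1, prob = plogis(eta))

# Random intercept only, versus random intercept plus random slope for x1
m_int   <- glmer(Used ~ x1 + (1 | ID),      data = d, family = binomial)
m_slope <- glmer(Used ~ x1 + (1 + x1 | ID), data = d, family = binomial)

VarCorr(m_slope)        # among-individual variance in intercept and slope
anova(m_int, m_slope)   # likelihood ratio comparison of the two fits
```

With nine covariates and 55 individuals, allowing random slopes for every
variable would add many variance and covariance parameters, so in practice
one would probably start with a few covariates of main interest.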

  Ben Bolker
