Skip to content

Random effect 'overlap' with fixed effect

2 messages · Chung-Huey Wu, Thierry Onkelinx

#
Hello all,

This is my first post in r-sig-ME. I have benefited and learned a lot from
the threads here. In this email I would like to ask for comments on the
correct way to specify my random effect term (sampling site), which
'overlap' with the fixed effect of main interest (altitude).

I am now using nlme, lme4, and glmmADMB to analyze altitudinal survey data
of plant traits.

Below is the structure of the data I am now have trouble analyzing:
=======================================
a) Research design: In every Jan, Apr, Jul, Oct from Jul-2012 to Oct-2015,
we visited the 3 sites at low altitude and 3 sites at medium altitude,
haphazardly selected 6 individual plants of the focal species, and measured
traits and # herbivore for each individual.

b) Hypothesis to test:  Does the trait/# herbivore vary across altitude and
month? Are there interactions?

c) Notations:
alt (altitude): L (low) vs. M (medium); a 2-level factor.
m (month): Jan, Apr, Jul, Oct; a 4-level factor.
site (sampling site): 3 at low- and 3 at medium-altitude Taiwan; a 6-level
factor with unique location names.
y (year of survey): 2012-2015; a 4-level factor.
--------
C_count (# chewing herbivore):  discrete (count) response variable. Many
zeros, mostly 1-3, a few large values (10,17,22).
mT (physical leaf toughness): continuous response variable.


d) My model structure and outputs:
mod_T = lme(mT ~ m + alt + m:alt, random = ~ 1|site/y,
                 weights = varIdent(form = ~1|alt),
                 contrasts = c(list(m=contr.sum,alt=contr.sum)),
                 na.action = "na.omit", data = dat))
#### weights model form (1|alt) is AIC-selected to control
for heteroskedasticity.

Linear mixed-effects model fit by REML
 Data: dat
       AIC      BIC    logLik
  3533.131 3591.853 -1752.565

Random effects:
Formula: ~1 | site
             (Intercept)
StdDev:    5.603294

Formula: ~1 | y %in% site
               (Intercept)  Residual
StdDev:    4.142521  9.003203

Variance function:
 Structure: Different standard deviations per stratum
 Formula: ~1 | m
 Parameter estimates:
     July       Oct       Jan       Apr
1.0000000 0.7836473 0.7696658 1.0022467

Fixed effects: mT ~ m + alt + m:alt
                  Value     Std.Error  DF    t-value      p-value
(Intercept) 39.38305  3.590926  468 10.967380  0.0000
mApr         7.57426   1.548235  468  4.892190   0.0000
mJuly        8.44167   1.457696  468  5.791103   0.0000
mOct         4.19561   1.300168  468  3.226973   0.0013
altM          -2.72082   5.093589   4   -0.534166   0.6215
mApr:altM   -2.31793  2.224683 468 -1.041916   0.2980
mJuly:altM  -3.14538  2.098787 468 -1.498666   0.1346
mOct:altM   -3.09390  1.880431 468 -1.645314   0.1006
 Correlation:
           (Intr) mApr   mJuly  mOct   altM   mApr:M mJly:M
mApr       -0.160
mJuly      -0.191  0.394
mOct       -0.214  0.442  0.527
altM       -0.705  0.113  0.135  0.151
mApr:altM   0.111 -0.696 -0.274 -0.307 -0.171
mJuly:altM  0.133 -0.274 -0.695 -0.366 -0.201  0.414
mOct:altM   0.148 -0.305 -0.364 -0.691 -0.225  0.462  0.546

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-2.49626272 -0.70206429 -0.03138199  0.59483874  3.00467177

Number of Observations: 498
Number of Groups:
       site y %in% site
          6          24

------------------------------------------------------------------------------------------
mod_C = glmmadmb(C_count ~ m + alt + m:alt + (1|site/y),
                    family = "nbinom",
                    zeroInflation = TRUE, data= dat)
#### AIC(nbinom model) << AIC(poisson model), so use nbinom

AIC: 686.6

Coefficients:
                   Estimate Std. Error  z value  Pr(>|z|)
(Intercept)  -0.70537    0.47722   -1.48    0.1394
mApr          0.00304    0.63689    0.00    0.9962
mJuly         -1.24819    0.63544   -1.96   0.0495 *
mOct         -0.36723    0.60261   -0.61    0.5423
altM           -1.63873    0.73215   -2.24   0.0252 *
mApr:altM   0.81522    0.94844    0.86    0.3900
mJuly:altM  3.21364    0.88936    3.61    0.0003 ***
MonthOct:AltM   1.21148    0.89989    1.35    0.1782
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Number of observations: total=499, Site=6, Site:Year=24
Random effect variance(s):
Group=Site
            Variance StdDev
(Intercept)  0.03713 0.1927
Group=Site:Year
            Variance StdDev
(Intercept)   0.2289 0.4785

Negative binomial dispersion parameter: 0.17887 (std. err.: 0.036541)
Zero-inflation: 1.0132e-06  (std. err.:  0.00013482 )

Log-likelihood: -331.319
Warning message:
In .local(x, sigma, ...) :
  'sigma' and 'rdig' arguments are present for compatibility only: ignored
================================================

My Questions:
1) Is it reasonable to include (1|site/y)? Since site effect is mixed up
with altitude effect (my main focus), I am worried that adding 'site' as
random effect would disturb the estimation of altitude effect (which seems
strong based on plotted data), or even cause estimation problem?  (for
example, in summary(mod_T), the DF of the 'altM' term is so small (4)
compared to models without site random effect (~480, attached below for
your reference)).
2) One step back, are lme() (for controlling heteroskedasticity) and
glmmadmb() (for accounting for zero-inflation) the right choices for this
analysis?


I would appreciate your comments and help.
Thank very much!

Chung-Huey Wu


####################################################
[WITHOUT site random effect]
mod_T = lme(mT ~ m + alt + m:alt, random = ~ 1|y,
                 weights = varIdent(form = ~1|alt),
                 contrasts = c(list(m=contr.sum,alt=contr.sum)),
                 na.action = "na.omit", data = dat))

Linear mixed-effects model fit by REML
 Data: dat
       AIC      BIC   logLik
  3672.361 3726.888 -1823.18

Random effects:
 Formula: ~1 | y
        (Intercept) Residual
StdDev:    3.515026 9.344131

Variance function:
 Structure: Different standard deviations per stratum
 Formula: ~1 | m
 Parameter estimates:
     July       Oct       Jan       Apr
1.0000000 1.0896624 0.9214424 1.0804838
Fixed effects: mT ~ m + alt + m:alt
               Value Std.Error  DF   t-value p-value
(Intercept) 39.65193  2.135613 487 18.567007  0.0000
mApr         7.57426  1.805682 487  4.194680  0.0000
mJuly        8.17279  1.638495 487  4.987985  0.0000
mOct         3.92672  1.706423 487  2.301144  0.0218
altM        -3.48711  1.709890 487 -2.039375  0.0420
mApr:altM   -1.32758  2.588246 487 -0.512928  0.6082
mJuly:altM  -2.37910  2.312807 487 -1.028662  0.3041
mOct:altM   -2.32762  2.409046 487 -0.966198  0.3344
 Correlation:
           (Intr) mApr   mJuly  mOct   altM   mApr:M mJly:M
mApr       -0.356
mJuly      -0.421  0.464
mOct       -0.404  0.446  0.526
altM       -0.376  0.445  0.490  0.471
mApr:altM   0.248 -0.698 -0.324 -0.311 -0.661
mJuly:altM  0.278 -0.329 -0.682 -0.348 -0.739  0.488
mOct:altM   0.267 -0.316 -0.348 -0.684 -0.710  0.469  0.525

Standardized Within-Group Residuals:
       Min         Q1        Med         Q3        Max
-2.2019475 -0.7909630 -0.0548838  0.7254892  3.0912138

Number of Observations: 498
Number of Groups: 4
####################################################
#
Dear Chung-Huey,

You have 6 x 4 observations per site - year combination. That is sufficient
to use 1|site/year as random effect.
Site is an essential part of the design of your study. Therefore it should
be in the model. The site effect will take up information which is not
explained by the altitude. So if the altitude effect is smaller in
comparison to the site effect, then the site-to-site variation is more
important than the altitude effect.

Note that lots of zero's does not imply zero inflation see
https://rpubs.com/INBOstats/zeroinflation

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2017-01-18 5:59 GMT+01:00 Chung-Huey Wu <chung.huey.wu at gmail.com>: