Dear everyone,
I have a question regarding the comparison of Generalized Linear Mixed
Models (GLMM, using the lme4 package) and Markov chain Monte Carlo
generalized linear mixed models (MCMCglmm, using the MCMCglmm package).
I am running mixed models using both packages. While the estimates
provided by each approach are relatively comparable, the significance
levels differ strongly. My main questions are the following:
1. How can the divergence in results be explained, and why do both
methods give different results even though the model structure is the
same? (Note: diagnostic plots for both models look fine.)
2. I am aware that there is strong debate about the validity of p-values
within GLMMs. Is there similar concern regarding MCMCglmm?
3. What would be a sensible interpretation of both model outputs,
especially with respect to PRED2 and the interaction term?
Below I provide a short description and example output:
I want to test the effect of 3 continuous predictors (PRED2, PRED3,
PRED4) on a proportional response (RESP). PRED2 and PRED3 enter the
model as an interaction (PRED2 * PRED3). The response is modeled with a
Poisson error structure and a log link, and the denominator of the
proportion is therefore included as an offset
(offset(log(PRED1))).
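For clarity, the offset construction amounts to modelling the rate
RESP/PRED1 on the log scale. A minimal sketch with simulated data (all
variable names and coefficients here are hypothetical, just to
illustrate the mechanics):

```r
## Sketch: a Poisson rate model via an offset.
## log(E[RESP]) = log(PRED1) + b0 + b1 * x  is equivalent to
## E[RESP / PRED1] = exp(b0 + b1 * x), i.e. the offset turns a
## count model into a model for the proportion RESP/PRED1.
set.seed(1)
n     <- 200
x     <- rnorm(n)
PRED1 <- rpois(n, 50) + 1                      # denominator of the proportion
RESP  <- rpois(n, PRED1 * exp(-1 + 0.5 * x))   # count with rate exp(-1 + 0.5*x)

fit <- glm(RESP ~ offset(log(PRED1)) + x, family = poisson)
coef(fit)  # intercept should be near -1, slope near 0.5
```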
MODEL1 -> GLMM using the lme4-package:
MOD1 <- glmer(RESP ~ offset(log(PRED1)) + PRED2 * PRED3 + PRED4 +
(1|RANEF), family = "poisson", data = data,
control=glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=100000)))
summary(MOD1)
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) ['glmerMod']
 Family: poisson  ( log )
Formula: RESP ~ offset(log(PRED1)) + PRED2 * PRED3 + PRED4 + (1 | RANEF)
   Data: data
Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e+05))

     AIC      BIC   logLik deviance df.resid
  2451.3   2470.9  -1219.6   2439.3      190

Scaled residuals:
     Min       1Q   Median       3Q      Max
-10.8175  -1.2477  -0.1913   1.1076  11.0389

Random effects:
 Groups Name        Variance Std.Dev.
 RANEF  (Intercept) 0.1519   0.3898
Number of obs: 196, groups:  RANEF, 56

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.29896    0.06250 -52.784  < 2e-16 ***
PRED2        0.09089    0.01850   4.912 9.00e-07 ***
PRED3        0.57262    0.02375  24.114  < 2e-16 ***
PRED4        0.54726    0.01808  30.275  < 2e-16 ***
PRED2:PRED3  0.12190    0.01821   6.693 2.19e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) PRED2  PRED3  PRED4
PRED2       -0.142
PRED3       -0.056  0.102
PRED4       -0.042  0.135 -0.155
PRED2:PRED3  0.001  0.017 -0.678  0.273
MODEL2 -> MCMCglmm using the MCMCglmm-package:
Now I build the same model with MCMCglmm, using weakly informative
priors. The offset term was implemented following the suggestion by
Jarrod Hadfield in this post
(https://stat.ethz.ch/pipermail/r-sig-mixed-models/2017q3/026016.html).
See the model formula and summary below:
prior <- list(B = list(V = diag(6) * 10, mu = c(0, 1, 0, 0, 0, 0)),
              G = list(G1 = list(V = 1, nu = 0.002)),
              R = list(V = 1, nu = 0.002))
prior$B$V[2, 2] <- 1e-9  # tiny prior variance effectively fixes the log(PRED1) coefficient at its prior mean of 1
prior
PGLMM.MOD1 <- MCMCglmm(RESP ~ log(PRED1) + PRED2 * PRED3 + PRED4,
                       random = ~ RANEF,
                       family = "poisson",
                       prior = prior,
                       scale = TRUE,
                       pr = TRUE,
                       data = data,
                       nitt = 500000, burnin = 20000, thin = 200)
summary(PGLMM.MOD1)
 Iterations = 20001:499801
 Thinning interval  = 200
 Sample size  = 2400

 DIC: 1164.591

 G-structure:  ~RANEF

      post.mean  l-95% CI u-95% CI eff.samp
RANEF   0.02264 0.0002387   0.0762     2400

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units    0.2788   0.1946   0.3705     2400

 Location effects: RESP ~ log(PRED1) + PRED2 * PRED3 + PRED4

            post.mean l-95% CI u-95% CI eff.samp  pMCMC
(Intercept)  -3.40280 -3.50958 -3.28826     2400 <4e-04 ***
log(PRED1)    1.00000  0.99994  1.00006     2257 <4e-04 ***
PRED2         0.08369 -0.01995  0.18120     2400  0.113
PRED3         0.63058  0.52164  0.73889     2400 <4e-04 ***
PRED4         0.60488  0.50565  0.69963     2400 <4e-04 ***
PRED2:PRED3   0.06882 -0.02363  0.17414     2258  0.167
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
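To line the two outputs up directly, one can compare the lme4 point
estimates with the MCMCglmm posterior summaries. A sketch, assuming the
fitted objects MOD1 and PGLMM.MOD1 from above (column indices 1:6
assume the six location effects come first in Sol, as in the summary):

```r
## Side-by-side comparison of fixed effects (sketch).
## With pr = TRUE, PGLMM.MOD1$Sol also stores random-effect samples,
## so we restrict attention to the first six columns (location effects).
library(coda)

fixef(MOD1)                         # ML estimates (no log(PRED1) row: it is an offset)
colMeans(PGLMM.MOD1$Sol[, 1:6])     # posterior means (log(PRED1) pinned near 1 by the prior)
HPDinterval(PGLMM.MOD1$Sol[, 1:6])  # 95% highest posterior density intervals
```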
As you can see from the two summaries, the estimates are fairly
similar, with some divergence especially for PRED2 and the interaction
term. The p-values, however, differ dramatically: in the GLMM (lme4)
PRED2 and the interaction are highly significant, whereas in the
MCMCglmm they are non-significant.
My questions now are:
1. How can the divergence in results be explained, and why do both
methods give different results even though the model structure is the
same? (Note: diagnostic plots for both models look fine.)
2. I am aware that there is strong debate about the validity of p-values
within GLMMs. Is there similar concern regarding MCMCglmm?
3. What would be a sensible interpretation of both model outputs,
especially with respect to PRED2 and the interaction term?
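One thing worth checking before interpreting the pMCMC values (relevant
to question 1) is the mixing and autocorrelation of the chains, beyond
the usual diagnostic plots. A sketch using standard coda functions on
the MCMCglmm output (PGLMM.MOD1 as above; the 1:6 column selection
assumes the six location effects come first in Sol):

```r
library(coda)

## Trace and density plots for the location effects
plot(PGLMM.MOD1$Sol[, 1:6])

## Effective sample sizes and lag autocorrelation: low effective sizes
## or high autocorrelation would suggest running the chain longer or
## thinning differently
effectiveSize(PGLMM.MOD1$Sol[, 1:6])
autocorr.diag(PGLMM.MOD1$Sol[, 1:6])

## Variance components: the G-structure posterior here sits close to
## zero, so its trace is worth inspecting as well
plot(PGLMM.MOD1$VCV)
```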
Any insight and help on that matter would be highly appreciated.
Best regards,
Bernd
Bernd Lenzner PhD
Researcher
University of Vienna
Department of Botany and Biodiversity Research
Division of Conservation Biology, Vegetation and Landscape Ecology
Rennweg 14
A-1030 Vienna