Questions on the results from glmmPQL(MASS)
Ben Bolker's response to a glmmPQL question below raises a question. Does the issue of bias with binomial data reported by Breslow (2003) remain valid with respect specifically to Ripley's treatment of PQL in glmmPQL? Breslow makes no reference to this particular implementation. He does discuss that of SAS GLIMMIX, but it does not work exactly as glmmPQL. I've compared results between binomial models between these two approaches, and they usually give compatible results. But they can diverge markedly in enough cases that I wish I understood just how they differ, so wonder if relative vulnerability to bias could be involved. BTW Ben refers Zhijie to a separate user group that focuses on mixed models. I knew nothing of this group. Following through on the link I found their archive, which included a fairly extensive thread on a question I posed to the regular R group in October. My question was forwarded, by Ben Bolker in fact (Wald F tests thread), for which I'm grateful. But I'm embarassed to say I only learned of the thread, even though I initiated it, because of this email. I just assumed no responses, other than R-News, and that was mostly questions to me about glmmPQL, rather than attempts to answer my own question. I'm clearly not the only one unaware of the mixed-models group, and a very sad choice for asking questions about glmmPQL.
Mark Fowler
Population Ecology Division
Bedford Inst of Oceanography Dept Fisheries & Oceans Dartmouth NS Canada
B2Y 4A2 Tel. (902) 426-3529 Fax (902) 426-9710 Email fowlerm at mar.dfo-mpo.gc.ca Home Tel. (902) 461-0708 Home Email mark.fowler at ns.sympatico.ca -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Ben Bolker Sent: December 6, 2008 4:10 PM To: r-help at r-project.org Subject: Re: [R] Questions on the results from glmmPQL(MASS)
zhijie zhang wrote:
Dear Rusers, I have used R,S-PLUS and SAS to analyze the sample data "bacteria" in MASS package. Their results are listed below. I have three questions, anybody can give me possible answers? Q1:From the results, we see that R get 'NAs'for AIC,BIC and logLik, while S-PLUS8.0 gave the exact values for them. Why?
This is a philosophical difference between S-PLUS and R. Since glmmPQL uses quasi-likelihood, technically there is no log-likelihood (hence no AIC nor BIC, which are based on the log-likelihood) for this model -- the argument is that one is limited to looking at Wald tests (testing the Z- or t-statistics, i.e. parameter estimates divided by estimated standard errors) for inference in this case.
zhijie zhang wrote:
Q2: The model to analyse the data is logity=b0+u+b1*trt+b2*I(week>2), but the results for Random effects in R/SPLUS confused me. SAS may be
clearer.
Random effects:
Formula: ~1 | ID
(Intercept) Residual
StdDev: 1.410637 0.7800511
Which is the random effect 'sigma'? I think it is "1.410637", but
what does "0.7800511" mean? That is, i want ot know how to explain/use
the above two data for Random effects.
The (Intercept) random effect is the variance in intercept across grouping factors . The residual (0.78) is (I believe) the individual-level error estimated for the underlying linear mixed model -- you can probably ignore this.
zhijie zhang wrote:
Q3:In SAS and other softwares, we can get *p*-values for the random effect 'sigma', but i donot see the *p*-values in the results of R/SPLUS. I have used attributes() to look for them, but no *p* values.
Anybody knows how to get *p*-values for the random effect 'sigma',.
Any suggestions or help are greatly appreciated.
#R Results:MASS' version 7.2-44; R version 2.7.2
library(MASS)
summary(glmmPQL(y ~ trt + I(week > 2), random = ~ 1 | ID,family =
binomial, data = bacteria))
Linear mixed-effects model fit by maximum likelihood
Data: bacteria
AIC BIC logLik
NA NA NA
Random effects:
Formula: ~1 | ID
(Intercept) Residual
StdDev: 1.410637 0.7800511
Variance function:
Structure: fixed weights
Formula: ~invwt
Fixed effects: y ~ trt + I(week > 2)
Value Std.Error DF t-value
p-value
(Intercept) 3.412014 0.5185033 169 6.580506 0.0000
trtdrug -1.247355 0.6440635 47 -1.936696 0.0588
trtdrug+ -0.754327 0.6453978 47 -1.168779 0.2484
I(week > 2)TRUE -1.607257 0.3583379 169 -4.485311 0.0000
Correlation:
(Intr) trtdrg trtdr+
trtdrug -0.598
trtdrug+ -0.571 0.460
I(week > 2)TRUE -0.537 0.047 -0.001
#S-PLUS8.0: The results are the same as R except the followings:
AIC BIC logLik
1113.622 1133.984 -550.8111
#SAS9.1.3
proc nlmixed data=b;
parms b0=-1 b1=1 b2=1 sigma=0.4;
yy=b0+u+b1*trt+b2*week;
p=1/(1+exp(-yy));
Model response~binary(p);
Random u~normal(0,sigma) subject=id;
Run;
-2 Log Likelihood = 192.2
AIC (smaller is better)=200.2
AICC (smaller is better) =200.3
BIC (smaller is better)= 207.8
Parameter Estimates
Standard
Parameter Estimate Error DF t Value Pr > |t| Alpha
Lower Upper Gradient
b0 3.4966 0.6512 49 5.37 <.0001 0.05
2.1880 4.8052 -4.69E-6
trt -0.6763 0.3352 49 -2.02 0.0491 0.05
-1.3500 -0.00266 -0.00001
I(week>2) -1.6132 0.4785 49 -3.37 0.0015 0.05
-2.5747
-0.6516 -9.35E-7 sigma 1.5301 0.9632 49 1.59 0.1186 0.05 -0.4054 3.4656 -2.42E-6
In general (alas) it is *extremely* difficult to get *correct* p-values for effects (both fixed and random, although fixed might be even worse) in GLMMs, despite the fact that SAS will happily give them to you. In general you can get p-values for random effects via a likelihood ratio test on the difference of nested models with and without the relevant effects. In this case that's a little bit trickier because (1) glmmPQL won't give you log-likelihoods (2) glmmPQL won't fit models without any random effects at all, and comparing log-likelihoods across different fitting procedures is always a little risky -- you have to make sure they are including the same constants in the log-likelihood. A partial solution is to use the lmer function in the lme4 package: lme2 <- lmer(y ~ trt + I(week > 2)+(1|ID), family = binomial, data = bacteria) which gives you similar estimates (a good idea in any case, because in any case PQL is not reliable for binary data -- see Breslow 2003). This will give you a likelihood etc. for the model. You still need to work out whether comparing AIC/BIC log-likelihood between lmer and glm (which you need to fit the model without random effects) is sensible. I would strongly recommend that you follow up further questions on this topic to r-sig-mixed-models at r-project.org , which is a special mailing list for mixed models. good luck, Ben Bolker -- View this message in context: http://www.nabble.com/Questions-on-the-results-from-glmmPQL%28MASS%29-tp 20867935p20874037.html Sent from the R help mailing list archive at Nabble.com. ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.