An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20130121/4be742e2/attachment.pl>
testing for overdispersion in the mixed effects logistic regression
3 messages · Antonio P. Ramos, Ben Bolker
An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20130121/986fbdcd/attachment.pl>
1 day later
Antonio P. Ramos <ramos.grad.student at ...> writes:
Hi all,
I would like to check whether I am properly checking for overdispersion in
a mixed effects logistic regression. Here is the code:
m1 <- lmer(mortality.under.2 ~ maternal_age_d +B0 + B4 + V102 + BORD +
WLTHIND5+ time + new_time + democracy + V106 +
WLTHIND5*time + WLTHIND5*new_time + democracy*WLTHIND5 +
(1 |CASEID)-1,data=brazil1,family=binomial(link="logit"))
# testing for overdispersion
n <- length(residuals(m1)) # the number of obs
k <- length(unique(brazil1$CASEID)) # the number of random effects
z <- sum(residuals(m1,type=standardize)^2) # squared, standardize residuals
cat("the overdispersion factor is",z/(n-k),"\n")
cat("the p-value of the overdispersion test is", pchisq(z,n-k))
If your response variable is binary (as I'm guessing it is since
you appear to have a single response and not something of the form
cbind(success,failure), but I can't be sure), then testing for
overdispersion doesn't make any sense -- any among-observation
variance is unidentifiable.
In any case:
* type=standardize is unlikely to work, the normal expectation
is that type should be a string (e.g. see ?residuals.lme in the nlme
package); however, the residuals method
in the stable version of lme4 doesn't even take a "type" argument
getMethod("residuals","mer")@.Data
function (object, ...)
napredict(attr(object at frame, "na.action"), object at resid)
<environment: namespace:lme4.0>
-- yes, this should be better documented!
help("residuals,mer-method") [ugh] tells you that residuals() returns
the Pearson residuals by default:
?resid?: The residuals, y-mu, weighted by the ?sqrtrWt? slot (when
its length is >0).
The last comments are that (1) you should be using lower.tail=FALSE
in your pchisq() call; (2) it's not clear what the "residual df" really
are for random effects models (subtracting the number of random effects
estimates is probably conservative --- but why aren't you subtracting
the fixed effect parameter df too?); (3) the Pearson residuals are
a fairly poor estimate of the scale parameter/overdispersion when the
data are far from conditionally normal (e.g. binomial or Poisson with
small numbers of counts) -- see Venables and Ripley; (4) there's more
information on all of this at http://glmm.wikidot.com/faq#overdispersion.
Ben Bolker
(Yes, this should be better documented!)