Accounting for overdispersion in a mixed-effect model with a proportion response variable and categorical explanatory variables.

Ben Bolker · 2011-04-22T00:49:11Z

Richard Friedman cancercenter.columbia.edu> writes: > > Dear R-help-list, > > I have a problem in which the explanatory variables are categorical, > the response variable is a proportion, and experiment contains > technical replicates (pseudoreplicates) as well as biological > replicated. I am new to both generalized linear models and mixed- > effects models and would greatly appreciate the advice of experienced > analysts in this matter. > > I analyzed the data in

Ben Bolker

Thu, Apr 21, 2011 5:49 PM

Richard Friedman <friedman <at> cancercenter.columbia.edu> writes:

Off the top of my head, I would say that #4 is best.

X <- read.table("mouse.dat",header=TRUE,
colClasses=rep(c("factor","numeric"),c(3,2)))
Xagg <- aggregate(cbind(positive,negative)~treatment+mouse,data=X,FUN=sum)
## gets your aggregated data

model <- glm(cbind(positive,negative)~treatment,family=binomial,data=Xagg)

  If all the assumptions of the model were actually met this might be OK (since
if all observations on all mice were really independent, the sum of binomials
would be binomial) but the residual deviance suggests overdispersion (dev/df
approx. 2)

(note that your 2 NA observations got dropped during my aggregation
step)

yes (I'm working through this before I see your comments, but
you seem to be on the right track)

Seems reasonable.

[snip]

X$obs<-1:nrow(X)

library(lme4)
## it doesn't make sense to include treatment in the random effect
##  since it is already present as a fixed effect in the model
## "mouse within treatment" (= mouse:treatment) is what you want
model2<-lmer(cbind(positive,negative)~treatment+(1|mouse:treatment)+
             (1|obs) ,data=X, family=binomial)
## note that mouse:treatment variance comes out to zero

model2B<-lmer(cbind(positive,negative)~treatment+(1|mouse:treatment),
             data=X, family=binomial)

anova(model2,model2B)
## and yet model with observation-level RE appears significantly better
## (even though this is a conservative test because the null hypothesis
## is on the boundary)


I then compared the 2 models.

##   note that test="F" is ignored

I think so.  It may however comfort you to see that the coefficients
and their standard errors are practically the same under each model.

  For the gold standard you may want to do parametric bootstrapping
(see ?refit in the most recent (r-forge) version of lme4)

  I would suggest that future questions along these lines go to the
r-sig-mixed-models mailing list ...

  Ben Bolker

Accounting for overdispersion in a mixed-effect model with a proportion response variable and categorical explanatory variables.

Thread (2 messages)