Hi everyone,

I am having trouble with overdispersion when trying to model count data using a GLMM. Beyond going to a negative binomial or Poisson-lognormal distribution, I have seen the suggestion (from Ben Bolker I believe) to include observation as a random effect. For example, using the lme4 package my code would look something like this:

    glmer(count ~ SoilT + SoilT2 + RH + rain24 + drought + rain24*SoilT +
          drought*rain24 + (1 | plot) + (1 | obs),
          data = Data, family = poisson)

When I try this I get a fitted vs. residual plot with large residuals at low fitted values funneling down to small residuals as the fitted values get larger. This indicates heterogeneity. I was wondering if that is expected for some reason with observation-level random effects or if this model just doesn't meet the assumptions of GLMM for my data?

Thanks,
Dan

Daniel J. Hocking
122 James Hall
Department of Natural Resources & the Environment
University of New Hampshire
Durham, NH 03824
dhocking at unh.edu
http://sites.google.com/site/danieljhocking/
http://quantitativeecology.blogspot.com/
http://richnessoflife.blogspot.com/
"Without data, you are just another person with an opinion."
Using Observations as Random Effect in GLMM?
3 messages · Daniel Hocking, John Maindonald, Ben Bolker
I've been looking recently at animal count data that I've modeled as Poisson with an observation-level random effect, and have worried a bit about such issues.

The observation-level random effects model and the over-dispersion model add variances on different scales: for the observation-level random effects model the added variance is proportional to the square of the Poisson mean, whereas for the over-dispersion model it is proportional to the mean. (These comments assume small additional error; but they do delineate the broad ballparks in which the two models operate.) The glmer() function is making its own very specific assumptions about the scale on which to add the additional normal error. The models are thus pretty much equivalent only if the range of expected values is small.

It would be useful to have more flexibility, at the observation level at least, in the modelling of the extra-Poisson error. Among the various packages that handle GLMMs, do any of them offer such flexibility, maybe allowing e.g. a quasi-Poisson error?

(Sure, there are issues about how legit quasi-Poisson errors are. I expect however someone will sometime work out how to give them full theoretical respectability, and they will duly be admitted to the part of the statistical pantheon allocated to those models that are thus theoretically respectable.)

John Maindonald
email: john.maindonald at anu.edu.au
phone: +61 2 (6125)3473
fax: +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
http://www.maths.anu.edu.au/~johnm
On 22/01/2012, at 7:44 AM, Daniel Hocking wrote:
Hi everyone,

I am having trouble with overdispersion when trying to model count data using a GLMM. Beyond going to a negative binomial or Poisson-lognormal distribution, I have seen the suggestion (from Ben Bolker I believe) to include observation as a random effect. For example, using the lme4 package my code would look something like this:

    glmer(count ~ SoilT + SoilT2 + RH + rain24 + drought + rain24*SoilT +
          drought*rain24 + (1 | plot) + (1 | obs),
          data = Data, family = poisson)

When I try this I get a fitted vs. residual plot with large residuals at low fitted values funneling down to small residuals as the fitted values get larger. This indicates heterogeneity. I was wondering if that is expected for some reason with observation-level random effects or if this model just doesn't meet the assumptions of GLMM for my data?

Thanks,
Dan
John Maindonald <john.maindonald at ...> writes:
I've been looking recently at animal count data that I've modeled as Poisson with an observation-level random effect, and have worried a bit about such issues.

The observation-level random effects model and the over-dispersion model add variances on different scales: for the observation-level random effects model the added variance is proportional to the square of the Poisson mean, whereas for the over-dispersion model it is proportional to the mean. (These comments assume small additional error; but they do delineate the broad ballparks in which the two models operate.) The glmer() function is making its own very specific assumptions about the scale on which to add the additional normal error. The models are thus pretty much equivalent only if the range of expected values is small.

It would be useful to have more flexibility, at the observation level at least, in the modelling of the extra-Poisson error. Among the various packages that handle GLMMs, do any of them offer such flexibility, maybe allowing e.g. a quasi-Poisson error?
Recent versions of the glmmADMB package offer two flavors of negative binomial model: either with variance = mu*(1+mu/k) (the classic, (almost) 'quadratic' parameterization, which Hardin and Hilbe call NB2) or with variance = phi*mu (which Hardin and Hilbe call NB1; I believe this is what you are calling "quasi-Poisson" above). The variance-mean relationships of NB2 and of the lognormal-Poisson model are the same, although the details do differ ...
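To make the comparison of variance-mean relationships concrete, here is a minimal base-R sketch. It assumes everything is written in terms of the marginal mean mu, and that the lognormal-Poisson model adds a N(0, sigma^2) effect on the log scale, which gives variance = mu + mu^2*(exp(sigma^2) - 1). The function names and the values of sigma and phi are made up for illustration; they are not part of any package.

```r
# Variance as a function of the marginal mean mu under three models.
# All function names here are hypothetical helpers, not package functions.
var_nb1 <- function(mu, phi)   phi * mu                        # NB1: linear in mu
var_nb2 <- function(mu, k)     mu * (1 + mu / k)               # NB2: quadratic in mu
var_lnp <- function(mu, sigma) mu + mu^2 * (exp(sigma^2) - 1)  # lognormal-Poisson

mu    <- c(0.5, 2, 10, 50)
sigma <- 0.5                       # illustrative SD of the observation-level effect
k     <- 1 / (exp(sigma^2) - 1)    # the k at which NB2 matches the lognormal-Poisson variance
phi   <- 2                         # illustrative NB1 dispersion

vf <- data.frame(mu  = mu,
                 nb1 = var_nb1(mu, phi),
                 nb2 = var_nb2(mu, k),
                 lnp = var_lnp(mu, sigma))
vf$nb1_over_mu <- vf$nb1 / vf$mu   # constant for NB1
vf$lnp_over_mu <- vf$lnp / vf$mu   # grows with mu for NB2 / lognormal-Poisson
print(vf)
```

The nb2 and lnp columns coincide once k is set to 1/(exp(sigma^2) - 1), while the variance-to-mean ratio stays constant only for NB1. That is the sense in which the models are close only when the range of expected values is small.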
(Sure, there are issues about how legit quasi-Poisson errors are. I expect however someone will sometime work out how to give them full theoretical respectability, and they will duly be admitted to the part of the statistical pantheon allocated to those models that are thus theoretically respectable.)
I haven't tried it yet, but my response to the original poster would have been to try a well-behaved simulation and see whether the same phenomenon occurred ...
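A minimal sketch of such a simulation might look like the following. Everything about the simulated data is an assumption chosen for illustration (a single covariate x, 20 plots with 25 observations each, and the particular parameter values), not the original poster's design. It uses glmer() from lme4 with an observation-level random effect, as in the original post.

```r
library(lme4)

set.seed(101)
n_plot <- 20   # assumed number of plots
n_per  <- 25   # assumed observations per plot

d <- data.frame(plot = factor(rep(seq_len(n_plot), each = n_per)),
                x    = runif(n_plot * n_per, -1, 1))
d$obs <- factor(seq_len(nrow(d)))   # one level per row: the observation-level effect

# Data generated from exactly the model being fitted (lognormal-Poisson)
b_plot  <- rnorm(n_plot,  sd = 0.5)
b_obs   <- rnorm(nrow(d), sd = 0.7)
eta     <- 1 + 0.8 * d$x + b_plot[as.integer(d$plot)] + b_obs
d$count <- rpois(nrow(d), lambda = exp(eta))

fit <- glmer(count ~ x + (1 | plot) + (1 | obs),
             data = d, family = poisson)

# Same diagnostic as in the original post: residuals against fitted values
plot(fitted(fit), residuals(fit, type = "pearson"),
     log = "x", xlab = "Fitted value", ylab = "Pearson residual")
abline(h = 0, lty = 2)
```

If a funnel shape appears even here, where the fitted model is the true one, the pattern in the real data need not indicate a violated assumption.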