Dear all,

I'm trying to analyze some strongly overdispersed Poisson-distributed data using R's mixed effects model function "lmer". Recently, several people have suggested incorporating an observation-level random effect, which would model the excess variation and solve the problem of underestimated standard errors that arises with overdispersed data. It seems to be working, but I feel uneasy using this method because I don't actually understand conceptually what it is doing. Does it package up the extra, non-Poisson variation into a miniature variance component for each data point? But then I don't understand how one ends up with non-zero residuals and why one can't just do this for any analyses (even with normally-distributed data) in which one would like to reduce noise.

I may be way off base here, but does this approach model some kind of mixture distribution that's a combination of Poisson and whatever distribution the extra variation is? I've read that people often use a negative binomial distribution (aka Poisson-gamma) to model overdispersed count data in which they assume that the process is Poisson (so they use a log link) but the extra variation is a gamma distribution (in which variance is proportional to square of the mean). The frequently referred to paper by Elston et al (2001) describes modeling a Poisson-lognormal distribution in which overdispersion arises from errors taking on a lognormal distribution. Is the approach of using the observation-level random effect doing something similar, and simply assuming some kind of Poisson-normal mixed distribution? Does this approach therefore assume that the observation-level variance is normally distributed?

If anyone could give me any guidance on this, I would appreciate it very much.

Martina Muller
Observation-level random effect to model overdispersion
4 messages · M.S.Muller, Ben Bolker, Jarrod Hadfield +1 more
On 11-03-21 07:51 AM, M.S.Muller wrote:
[...] Is the approach of using the observation-level random effect doing something similar, and simply assuming some kind of Poisson-normal mixed distribution? Does this approach therefore assume that the observation-level variance is normally distributed?
Exactly. The observation-level random effect approach is equivalent to assuming that the individual observations are [x]-normal distributed, i.e. a compound of a normal distribution transformed by the inverse link function and the specified distribution family. Sorry that's a bit clunky, but it translates to what you said above:

* lognormal-Poisson for Poisson with log link;
* logit-normal-binomial for binomial with logit link;

etc.

For what it's worth, the Elston paper is philosophically sensible but I'm not sure that it's computationally sound; as I have said before <https://stat.ethz.ch/pipermail/r-sig-mixed-models/2010q2/003967.html>, using PQL with observation-level random effects is explicitly *dis*recommended in the Genstat documentation; I had convergence problems fitting the data in lme4, and MCMCglmm told me that the data were under-specified and I should consider a more informative prior ...

(I see Jarrod Hadfield has just answered this question too.)
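A minimal simulation sketch of the lognormal-Poisson compounding described above, in plain Python rather than R so that it runs without extra packages. The values of `eta` and `sigma`, the helper names `rpois`, `simulate` and `dispersion`, and the use of Knuth's sampler are all assumptions made for illustration; this is not how lme4 or MCMCglmm fit the model, only a way to see that adding a normal deviate to the linear predictor produces overdispersed counts.

```python
import math
import random
import statistics


def rpois(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method.

    Adequate for the moderate rates used here; not meant for very large lam.
    """
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate(eta, sigma, n, seed=1):
    """Return (plain Poisson counts, lognormal-Poisson counts).

    eta   : linear predictor on the log scale (assumed value)
    sigma : SD of the observation-level normal effect (assumed value)
    """
    rng = random.Random(seed)
    plain = [rpois(math.exp(eta), rng) for _ in range(n)]
    # One normal deviate per observation, added on the link (log) scale.
    compound = [rpois(math.exp(eta + rng.gauss(0.0, sigma)), rng) for _ in range(n)]
    return plain, compound


def dispersion(counts):
    """Variance-to-mean ratio; about 1 for Poisson data."""
    return statistics.variance(counts) / statistics.mean(counts)


plain, compound = simulate(eta=math.log(5.0), sigma=0.8, n=20000)
print("plain Poisson dispersion:     ", round(dispersion(plain), 2))
print("lognormal-Poisson dispersion: ", round(dispersion(compound), 2))
```

With these assumed settings the first ratio should come out close to 1 and the second well above 1, which is the extra variation the observation-level effect is meant to absorb.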
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Hi,

Your intuition is correct: using an observation-level random effect is the same as using a log-normal mixing distribution if the log link is used. The marginal distribution of the observation-level effects is assumed to be normal. It goes back to HINDE J (1982) GLIM 82-109 at least.

Jarrod
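For readers who want the mixing written out, the following is a sketch of the model under a log link. The symbols $\eta_i$ (the linear predictor without the observation-level effect), $\varepsilon_i$ (the observation-level effect) and $\sigma^2$ (its variance) are notation introduced here for illustration, not taken from the thread.

```latex
\begin{align*}
y_i \mid \varepsilon_i &\sim \mathrm{Poisson}\bigl(\exp(\eta_i + \varepsilon_i)\bigr),
  \qquad \varepsilon_i \sim N(0, \sigma^2), \\
\mu_i = \mathrm{E}[y_i] &= \exp\bigl(\eta_i + \sigma^2/2\bigr), \\
\mathrm{Var}(y_i) &= \mathrm{E}\bigl[\mathrm{Var}(y_i \mid \varepsilon_i)\bigr]
  + \mathrm{Var}\bigl(\mathrm{E}[y_i \mid \varepsilon_i]\bigr)
  = \mu_i + \mu_i^2 \bigl(\exp(\sigma^2) - 1\bigr).
\end{align*}
```

The variance is the Poisson part $\mu_i$ plus a term quadratic in the mean, which is the same mean-variance shape as the Poisson-gamma (negative binomial) model mentioned in the question, with a lognormal rather than a gamma mixing distribution.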
One point, additional to other responses. There is just one "miniature variance component" (by the way, not miniature in the sense that it has to be small). As I understand it, a normal distribution with this variance generates one random effect, on the scale of the linear predictor, for each observation. On the scale of the response, well . . .

John Maindonald
email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473
fax : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194, John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
http://www.maths.anu.edu.au/~johnm
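The point about a single variance component can be sketched numerically. The code below assumes a log link and an arbitrary value `sigma = 0.5`; all variable names are hypothetical. It draws one effect per observation from the same normal distribution and then looks at what those effects do on the response scale, where each one multiplies the expected count by `exp(effect)`.

```python
import math
import random
import statistics

sigma = 0.5      # SD of the single variance component (assumed for illustration)
n_obs = 50000    # number of observations (assumed)
rng = random.Random(42)

# One random effect per observation, all drawn from the same N(0, sigma^2).
effects = [rng.gauss(0.0, sigma) for _ in range(n_obs)]

# Under a log link, each effect acts multiplicatively on the expected count.
multipliers = [math.exp(e) for e in effects]

sd_effects = statistics.stdev(effects)
mean_multiplier = statistics.mean(multipliers)

print("SD of effects on the link scale:", round(sd_effects, 3))
print("mean multiplier on the response scale:", round(mean_multiplier, 3))
print("theoretical lognormal mean exp(sigma^2 / 2):", round(math.exp(sigma**2 / 2), 3))
```

On the link scale the effects are symmetric around zero with one shared standard deviation; on the response scale the multipliers are lognormal, so they are skewed and average to `exp(sigma**2 / 2)` rather than 1.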