Skip to content

GLMM for underdispersed count data: Conway-Maxwell-Poisson and Ordinal

9 messages · Mollie Brooks, Ben Bolker, Jonathan Judge +2 more

#
Dear all,

I am trying to find an appropriate GLMM (with temporal and individual 
crossed random effects) to model underdispersed count data (clutch 
size). I have found several possible ways of doing that. A good 
distribution for data like this would seem to be the 
Conway-Maxwell-Poisson but I have not found a way of using it within a 
GLMM in R (I have asked here 
<http://stats.stackexchange.com/questions/249738/how-to-define-the-nu-parameter-of-conway-maxwell-poisson-in-spamm-package> 
and here 
<http://stats.stackexchange.com/questions/249798/conway-maxwell-poisson-with-crossed-random-effects-in-r>).
I have seen that Ben Bolker suggested (here 
<https://stat.ethz.ch/pipermail/r-sig-mixed-models/2014q1/021945.html>and 
here 
<http://stats.stackexchange.com/questions/92156/how-to-handle-underdispersion-in-glmm-binomial-outcome-variable>) 
to use an ordinal model in cases like this(e.g. _ordinal:clmm_). I have 
tried this solution and the results I obtain makes (biological) sense to 
me. However, I wonder why but I cannot put all the three crossed random 
effects I have in the clmm model (_Error: no. random effects (=1254) >= 
no. observations (=854)_) whereas it is not a problem for the glmer 
model (the no. of levels of each single random effect does not exceed 854)*.
Beyond that, and that's what I would like to ask you, *I cannot find a 
reference to justify I used the ordinal model* to deal with 
underdispersed count data (referee will ask it for sure).
Best,

Simone

* FMglmer<- glmer(fledges ~ habitatF * (areaPatchFath + poligF01 + 
StdLayingDate + ageFath1 + ageMoth1) + (1|year) + (1|ringMoth) + 
(1|ringFath), data = datiDRS)
    FMclmm<- glmer(as.factor(fledges)~ habitatF * (areaPatchFath + 
poligF01 + StdLayingDate + ageFath1 + ageMoth1) + (1|year) + 
(1|ringMoth) + (1|ringFath), data = datiDRS)
1 day later
#
Dear Simone,

I?ve been working on adding the Conway-Maxwell-Poisson distribution to the glmmTMB package. It isn?t published yet, but I?ve tested it with simulated data and 2 real data sets. It seems to be working well, so I plan to introduce it in a manuscript on Biorxiv in the near future. You?re welcome to try it with your data and tell me how it goes. 

First, you?ll have to install the genpois branch of glmmTMB with the following

devtools::install_github("glmmTMB/glmmTMB/glmmTMB", ref="genpois")

Then, you could fit your model with this code

FMCMP <- glmmTMB(fledges ~ habitatF * (areaPatchFath + poligF01 + 
StdLayingDate + ageFath1 + ageMoth1) + (1|year) + (1|ringMoth) + 
(1|ringFath), data = datiDRS, family="compois")

For an explanation of the dispersion parameter, see ?sigma.glmmTMB

If you want to try it out on simulated data, there?s an rCMP function available here
https://github.com/James-Thorson/Conway-Maxwell-Poisson <https://github.com/James-Thorson/Conway-Maxwell-Poisson>

cheers,
Mollie

???????????
Mollie E. Brooks, Ph.D.
Postdoctoral Researcher
National Institute of Aquatic Resources
Technical University of Denmark

  
  
#
One reference that uses ordinal regression in a similar situation
(litter size of Florida panthers) is
http://link.springer.com/article/10.1007/s00442-011-2083-0 ("Does
genetic introgression improve female reproductive performance? A test on
the endangered Florida panther")

  Not sure about the number-of-random-effects error: a reproducible
example would probably be needed (smaller is better!)

  Ben Bolker
On 16-12-06 08:41 AM, Simone Santoro wrote:
1 day later
#
The Conway-Maxwell-Poisson may well be the way to go here; I?ll note only that I have sometimes found underdispersed counting data being driven by excess zeroes. If zero-inflation of some kind is in fact the culprit, and you still wish to use multi-level modeling, the brms front-end for Stan offers a variety of easy-to-use fitting options for zero-inflation / hurdle / adjustment.



Best,

Jonathan



Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows 10



From: Mollie Brooks<mailto:mollieebrooks at gmail.com>
Sent: Wednesday, December 7, 2016 7:36 PM
To: Simone Santoro<mailto:santoro at ebd.csic.es>
Cc: r-sig-mixed-models at r-project.org<mailto:r-sig-mixed-models at r-project.org>
Subject: Re: [R-sig-ME] GLMM for underdispersed count data: Conway-Maxwell-Poisson and Ordinal



Dear Simone,

I?ve been working on adding the Conway-Maxwell-Poisson distribution to the glmmTMB package. It isn?t published yet, but I?ve tested it with simulated data and 2 real data sets. It seems to be working well, so I plan to introduce it in a manuscript on Biorxiv in the near future. You?re welcome to try it with your data and tell me how it goes.

First, you?ll have to install the genpois branch of glmmTMB with the following

devtools::install_github("glmmTMB/glmmTMB/glmmTMB", ref="genpois")

Then, you could fit your model with this code

FMCMP <- glmmTMB(fledges ~ habitatF * (areaPatchFath + poligF01 +
StdLayingDate + ageFath1 + ageMoth1) + (1|year) + (1|ringMoth) +
(1|ringFath), data = datiDRS, family="compois")

For an explanation of the dispersion parameter, see ?sigma.glmmTMB

If you want to try it out on simulated data, there?s an rCMP function available here
https://github.com/James-Thorson/Conway-Maxwell-Poisson <https://github.com/James-Thorson/Conway-Maxwell-Poisson>

cheers,
Mollie

???????????
Mollie E. Brooks, Ph.D.
Postdoctoral Researcher
National Institute of Aquatic Resources
Technical University of Denmark
_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
#
Hi,

Thank you all very much your hints. They have been really really helpful 
for me. Below you may find a reproducible code to see how three 
approaches fit a simulated data set (clmm::ordinal, glmmTMB::glmmTMB, 
fitme:spaMM). Results seem to me qualitatively similar but with 
clmm:ordinal I cannot use the three crossed random effects because I get 
an error like this:

Error: no. random effects (=135) >= no. observations (=100)


set.seed(1234)
library(ordinal)
library(glmmTMB)
library(spaMM)
dati<- data.frame(fledges= rpois(100,10), habitatF= 
as.factor(rbinom(100,1,0.5)), areaPatchFath= rnorm(100), poligF01= 
as.factor(rbinom(100,1,0.5)),StdLayingDate= rnorm(100), ageFath1= 
rpois(100,3), ageMoth1= rpois(100,3), year= as.factor(rpois(100,200)), 
ringMoth= as.factor(rpois(100,200)), ringFath= as.factor(rpois(100,200)))
str(dati)

system.time(Fitclm<- clmm(as.factor(fledges) ~ 
habitatF*(areaPatchFath+poligF01+StdLayingDate+ageFath1+ageMoth1)+(1|year)+(1|ringMoth)+(1|ringFath),data=dati,Hess=T))
# this way it works...
system.time(Fitclm1<- clmm(as.factor(fledges) ~ 
habitatF*(areaPatchFath+poligF01+StdLayingDate+ageFath1+ageMoth1)+(1|year)+(1|ringFath),data=dati,Hess=T))
summary(Fitclm1)

system.time(FitglmmTMB<- glmmTMB(as.factor(fledges) ~ 
habitatF*(areaPatchFath+poligF01+StdLayingDate+ageFath1+ageMoth1)+(1|year)+(1|ringMoth)+(1|ringFath),data=dati,family= 
"compois"))
summary(FitglmmTMB)

system.time(FitglmmTMB<- glmmTMB(as.factor(fledges) ~ 
habitatF*(areaPatchFath+poligF01+StdLayingDate+ageFath1+ageMoth1)+(1|year)+(1|ringMoth)+(1|ringFath),data=dati,family= 
"compois"))
summary(FitglmmTMB)

# This lasts much more (3-4')
system.time(Fitfitme<- fitme(fledges ~ 
habitatF*(areaPatchFath+poligF01+StdLayingDate+ageFath1+ageMoth1)+(1|year)+(1|ringFath)+(1|ringMoth),data=dati,COMPoisson(),method 
= "ML"))
summary(Fitfitme)











El 08/12/2016 a las 4:32, Ben Bolker escribi?:

  
    
#
You could try using brms::brm with family "cumulative" as an alternative to
ordinal::clmm as the former is able to handle models with more random
effects than observations. The call would look as follows:

brm(as.factor(fledges) ~
habitatF*(areaPatchFath+poligF01+StdLayingDate+ageFath1+ageMoth1)+(1|year)+(
1|ringMoth)+(1|ringFath),
data=dati, family = cumulative())

Note that you need a C++ compiler to get this to work since brms is a
front-end of Stan.

Best,
Paul

2016-12-09 19:40 GMT+01:00 Simone Santoro <santoro at ebd.csic.es>:

  
  
4 days later
#
Hi Simone,

For the glmmTMB model with the Conway-Maxwell Poisson distribution, the left side of the equation should technically by fledges rather than as.factor(fledges). However, it looks like glmmTMB doesn?t evaluate the as.factor() command and fits the model with fledges as the response anyway.

If you end up needing zero-inflation also, it can be specified using the ziformula command. See vignette("glmmTMB") or here https://github.com/glmmTMB/glmmTMB/blob/master/misc/salamanders.pdf <https://github.com/glmmTMB/glmmTMB/blob/master/misc/salamanders.pdf>
for an example.

cheers,
Mollie

???????????
Mollie E. Brooks, Ph.D.
Postdoctoral Researcher
National Institute of Aquatic Resources
Technical University of Denmark

  
  
#
Thank you all so much for it. Really a very useful discussion, hope it 
may be so to others too.

El 14/12/2016 a las 16:06, Mollie Brooks escribi?:

  
    
#
On 16-12-14 11:05 AM, Simone Santoro wrote:
I'd be careful with this conclusion.  I think these are *not* the same
model.  For the fake data set (where all the true effects are zero) the
results aren't that different, but what happens when you convert an
integer value to a factor is that the unique values get converted to
codes 1, 2, ...  This would potentially be disastrous.

  I got the clmm model to run with Rune's development version; it's a
little hard to see whether the results are comparable or not since it's
fitting a qualitatively different model ...