
Mixed multinomial regression for count data with overdispersion & zero-inflation

3 messages · Stéphanie Périquet, Ben Bolker

#
Dear list members,



First, sorry for this very long first post.


I am looking for advice on fitting a mixed multinomial regression to count
data that are overdispersed and zero-inflated. My aim is to evaluate
the effect of season and moonlight on the diet composition of bat-eared foxes.
My dataset comprises 14 possible prey items, 20 individual foxes
observed, 4 seasons and a moon illumination index ranging from 0 to 1 in
increments of 0.1 (treated as a continuous variable even though it takes only 11
values). For each unique combination of individual*season*moon, I thus have
14 lines, one for the count of each prey item.
I plan to fit a model of the following form to answer my question (i.e. a
multinomial regression):

glmer(count ~ item + item:season + item:moon + offset(logduration) +
      (1|indiv) + (1|obs) + (1|id), family=poisson)

where count is the number of prey of a given type recorded eaten;

item is the prey type;

logduration is the log(total time observed for a given combination of
individual*season*moon);

obs is a unique id for each combination of individual*season*moon, so each
obs value regroups 14 lines (one for each prey item) with the same
individual*season*moon;

id is a unique id for each line to account for overdispersion (as
quasi-poisson or negative binomial distributions are not implemented in
lme4, Elston et al. 2001).
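
The data layout described above can be sketched in R as follows. This is only a minimal illustration on a hypothetical full grid: the level names ("prey1", "fox1", "s1", ...) are invented, and it assumes every individual*season*moon combination was observed, which the real data need not satisfy.

```r
# Hypothetical full grid: 14 prey items x 11 moon values x 4 seasons x 20 foxes
diet <- expand.grid(
  item   = factor(paste0("prey", 1:14)),
  moon   = seq(0, 1, by = 0.1),
  season = factor(c("s1", "s2", "s3", "s4")),
  indiv  = factor(paste0("fox", 1:20))
)

# obs: one level per individual*season*moon combination (14 rows each)
diet$obs <- interaction(diet$indiv, diet$season, diet$moon, drop = TRUE)

# id: one level per row (observation-level random effect for overdispersion)
diet$id <- factor(seq_len(nrow(diet)))
```

With this layout, (1|obs) groups the 14 prey-item rows of one observation session, while (1|id) gives every row its own random effect.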



However, there are a lot of zeros in my data, i.e. many prey items have
never been observed being eaten for many combinations of
individual*season*moon.

Following Ben Bolker's wiki (http://glmm.wikidot.com/faq), I gather that I
should use one of the following methods to answer my question:


   - glmmADMB, with family=nbinom
   - MCMCglmm, with family=zipoisson
   - "expectation-maximization (EM) algorithm" in lme4



Here come the questions:

1.   Is it correct to assume that I could use the same model
structure (count~item+item:season+item:moon+offset(logduration)+(1|indiv)+(1|obs))
in glmmADMB or MCMCglmm to answer my question?

2.   I then wouldn't need the (1|id) to correct for overdispersion as both
methods would already account for it, correct?

3.   I am totally new to MCMCglmm, so would it be correct to define the
priors and model as follows (inspired by Ben Bolker et al. 2012, "Owls
example: a zero-inflated, generalized linear mixed model for count data")?

library(MCMCglmm)

# define the fixed effects

fixef2 <- count~trait-1+  at.level(trait,1):logduration  +
at.level(trait,1):(item*season) +  at.level(trait,1):(item*moon)

# Set up a variable that will pick out the offset (duration) parameter,
# which will be in 3rd position

offvec <- c(1,1,2,rep(1,))

# define the priors with 2 random factors and log(duration) as offset

prior2 <- list(R=list(V=diag(c(1,1)),nu=0.002,fix=2),
               G=list(G1=list(V=diag(c(1,1e-6)),nu=0.002,fix=2),
                      G2=list(V=diag(c(1,1e-6)),nu=0.002,fix=2)),
               # fixed effects: tight prior (mean 1) on the offset
               # coefficient, diffuse prior (mean 0) on all others
               B=list(mu=c(0,1)[offvec],
                      V=diag(c(1e6,1e-6)[offvec])))

# define the model

mfit1 <- MCMCglmm(fixef2, rcov=~idh(trait):units,
                  random=~idh(trait):indiv+idh(trait):obs,
                  prior=prior2, data=diet,
                  family="zipoisson", verbose=FALSE)

4.   If I were to use the EM algorithm method, how should the results
be interpreted?



Thanks in advance for your help!

Stephanie
3 days later
#
Stéphanie Périquet <stephanie.periquet at ...> writes:
That's OK.  I'm only going to answer part of it, because it's long.

Yes, but I don't know if this will account for the possible dependence
*among* prey types.

Seems about right.

There is glmer.nb now, but you might not want it; it tends to
be slower and more fragile, and you'd still have to deal with
zero-inflation.

That doesn't *necessarily* mean you need zero-inflation. Large
numbers of zeros might just reflect low probabilities, not ZI per se.
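
The point about low probabilities can be illustrated with a small simulation. This is only a sketch: the mean of 0.2 prey per row is an assumed value, not something taken from the fox data.

```r
set.seed(42)
lambda <- 0.2                    # assumed low mean count per row
y <- rpois(10000, lambda)        # plain Poisson counts, no zero-inflation

obs_zero <- mean(y == 0)         # proportion of zeros in the simulated data
exp_zero <- dpois(0, lambda)     # proportion a Poisson predicts: exp(-lambda)

c(observed = obs_zero, expected = exp_zero)
```

Here more than 80% of the counts are zero even though nothing beyond an ordinary Poisson process generated them, so a high share of zeros is not by itself evidence of zero-inflation.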

Note there's a marginally newer version at
https://rawgit.com/bbolker/mixedmodels-misc/master/glmmFAQ.html

Another, newer choice is glmmTMB (available on GitHub) with
family="nbinom2"

glmmADMB or glmmTMB, yes: I'm not sure about MCMCglmm

That's right, I think.

I'm going to let Jarrod Hadfield, or someone else, answer this one.

The result is composed of two models -- a 'binary' (structural zero vs
non-structural zero) and a 'conditional' (count) part.
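
To make the two-part structure concrete, here is a minimal EM sketch for an intercept-only zero-inflated Poisson in base R. It is not the lme4-based procedure from the FAQ: there are no random effects or covariates, and the simulated data (40% structural zeros, Poisson mean 3) are assumptions made for illustration.

```r
set.seed(2)
n <- 2000
# Hypothetical data: structural zero with probability 0.4, else Poisson(3)
y <- ifelse(rbinom(n, 1, 0.4) == 1, 0L, rpois(n, 3))

p   <- 0.5        # starting value: probability of a structural zero
lam <- mean(y)    # starting value: Poisson mean of the count part

for (iter in 1:200) {
  # E-step: probability that each observed zero is a structural zero
  z <- ifelse(y == 0, p / (p + (1 - p) * exp(-lam)), 0)
  # M-step, 'binary' part: with covariates this would be a logistic regression on z
  p <- mean(z)
  # M-step, 'conditional' part: with covariates, a Poisson regression weighted by 1 - z
  lam <- sum((1 - z) * y) / sum(1 - z)
}

c(p_structural_zero = p, poisson_mean = lam)
```

The two estimates correspond to the two parts of the result: p belongs to the binary model, lam to the conditional count model.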
#
Hi Ben,

Thank you very much for your answer!

I am aware that a lot of zeros doesn't mean zero inflation, but if my
understanding is correct, the only way to check for ZI would be to compare
one model that doesn't take it into account with another one that does, right?
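
Such a comparison could look like the sketch below, which is deliberately simplified: intercept-only models, no random effects, base R only, and simulated data (40% structural zeros, Poisson mean 3) that are assumptions for illustration. With the real data the same idea would apply to two mixed models that differ only in the zero-inflation term.

```r
set.seed(1)
n <- 2000
# Hypothetical data: structural zero with probability 0.4, else Poisson(3)
y <- ifelse(rbinom(n, 1, 0.4) == 1, 0L, rpois(n, 3))

# Model without zero-inflation: intercept-only Poisson GLM
fit_pois <- glm(y ~ 1, family = poisson)

# Model with zero-inflation: negative log-likelihood of an intercept-only ZIP
zip_nll <- function(par, y) {
  p   <- plogis(par[1])   # zero-inflation probability (logit scale parameter)
  lam <- exp(par[2])      # Poisson mean (log scale parameter)
  lik <- ifelse(y == 0, p + (1 - p) * exp(-lam), (1 - p) * dpois(y, lam))
  -sum(log(lik))
}
fit_zip <- optim(c(0, 0), zip_nll, y = y)

aic_pois <- AIC(fit_pois)
aic_zip  <- 2 * fit_zip$value + 2 * 2   # 2 * NLL + 2 * number of parameters

c(poisson = aic_pois, zip = aic_zip)
```

A clearly lower AIC for the zero-inflated model suggests the extra zeros are not explained by the count distribution alone.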

With the model example I gave (count~item+item:
moon+item:season+offset(logduration)+(1|indiv)+(1|obs)), glmmADMB doesn't run, but I'm
going to dig a bit more into this and come back to you if I can't figure it out.

Best,
Stephanie
On 17 May 2016 at 00:41, Ben Bolker <bbolker at gmail.com> wrote: