Mixed multinomial regression for count data with overdispersion & zero-inflation

Wed, May 18, 2016 12:26 AM

Yeah thanks Alain, I'm definitely planning to buy this book!

So I looked at the zeros in my data based on your advice and did the
following:
mod<-glmer(count~item+item:season+item:moon+item:season:moon+(1|indiv/obs)+(1|id),family=poisson,nAGQ=0,data=diet3)
z<-simulate(mod,nsim=1000)

For the original data I have 69.3% zeros, while the average over the 1000
simulations is 63.5%. Is there a way to statistically compare these 2
values? Or could you say that these 2 figures are not very different, so
zero-inflation models might not be necessary?

Best,
Stephanie

On 17 May 2016 at 20:21, Highland Statistics Ltd <highstat at highstat.com>
wrote:


On 17/05/2016 18:53, Stéphanie Périquet wrote:

Dear Alain,

Thanks for your reply and advice! I will try that and wait for your
very timely paper to come out to be sure I did the right thing!


Stephanie,

Although it does not cover multinomial models directly, this one may be of
use as well:

Beginner's Guide to Zero-Inflated Models with R (2016). Zuur AF and Ieno EN
http://highstat.com/BGZIM.htm

Sorry for the self-references.

Kind regards,

Alain


Best,
Stephanie

On 17 May 2016 at 12:08, Highland Statistics Ltd <highstat at highstat.com>
wrote:

----------------------------------------------------------------------

Message: 1
Date: Tue, 17 May 2016 08:28:42 +0200
From: Stéphanie Périquet <stephanie.periquet at gmail.com>
To: Ben Bolker <bbolker at gmail.com>
Cc: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] Mixed multinomial regression for count data
      with overdispersion & zero-inflation
Message-ID:
      <CAMKTVFXZnvS1g-FaNVQ1FQUj5u84S-fd=k4u_6x5PwJUZ2R+bQ at mail.gmail.com>

Content-Type: text/plain; charset="UTF-8"

Hi Ben,

Thank you very much for your answer!

I am aware that a lot of zeros doesn't mean zero inflation, but if my
understanding is correct the only way to check for ZI would be to compare
one model that doesn't take it into account with another one that does,
right?

Incorrect.
1. Calculate the percentage of zeros for your observed data.
2. Fit a model....this can be a model without zero inflation stuff.
3. Simulate 1000 data sets from your model and for each simulated data
set assess the percentage of zeros.
4. Compare the results in 3 with those in 1.

5. Even nicer....
5a. Plot a simple frequency table for the original data
(plot(table(Response), type = "h")).
5b. Calculate a table() for each of your simulated data.
5c. Calculate the average frequency table.
5d. Compare 5a and 5c.
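
Steps 1 to 5 above can be sketched in base R roughly as follows. This is only an illustration under stated assumptions: the data are simulated toy counts (not the fox data), glm() stands in for glmer() so that no extra package is needed, and the names dat, x, sim_zero, sim_range and p_sim are made up for the example. Looking at where the observed value falls within the simulated distribution is one informal way to do the comparison in step 4; the thread itself does not prescribe a formal test.

```r
# Hypothetical toy data (not the fox data): Poisson counts with one covariate.
set.seed(1)
n   <- 500
x   <- runif(n)
dat <- data.frame(x = x, Response = rpois(n, lambda = exp(-0.5 + 0.8 * x)))

# 1. Proportion of zeros in the observed data.
obs_zero <- mean(dat$Response == 0)

# 2. Fit a model without any zero-inflation component.
#    glm() stands in for glmer() so that no extra package is needed.
mod <- glm(Response ~ x, family = poisson, data = dat)

# 3. Simulate 1000 data sets; each column of 'sims' is one simulated response.
sims     <- simulate(mod, nsim = 1000)
sim_zero <- colMeans(sims == 0)

# 4. Where does the observed value fall in the simulated distribution?
sim_range <- quantile(sim_zero, c(0.025, 0.975))
p_sim     <- mean(sim_zero >= obs_zero)  # share of simulations with at least as many zeros

# 5a-5c. Frequency tables on a common set of count values.
lev     <- 0:max(dat$Response, unlist(sims))
obs_tab <- table(factor(dat$Response, levels = lev))
sim_tab <- sapply(sims, function(y) table(factor(y, levels = lev)))
avg_tab <- rowMeans(sim_tab)

# 5d. Observed (black) against average simulated (red) frequencies.
if (interactive()) {
  plot(obs_tab, type = "h", ylab = "Frequency")
  points(lev + 0.2, avg_tab, type = "h", col = "red")
}
```

If obs_zero sits well inside sim_range, the fitted model already reproduces the number of zeros; if it sits far in the upper tail, the model produces too few zeros.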

For a nice example and R code, see:
A protocol for conducting and presenting results of regression-type
analyses. Zuur & Ieno
doi: 10.1111/2041-210X.12577
Methods in Ecology and Evolution 2016

Comes out in 2 weeks or so.

Kind regards,

Alain

With the model example I gave
(count~item+item:season+item:moon+offset(logduration)+(1+indiv)+(1|obs))
glmmADMB doesn't run, but I'm gonna dig a bit more into this and come back
to you if I can't figure it out.

Best,
Stephanie

On 17 May 2016 at 00:41, Ben Bolker <bbolker at gmail.com> wrote:

Stéphanie Périquet <stephanie.periquet at ...> writes:

Dear list members,

First, sorry for this very long first post.

   That's OK.  I'm only going to answer part of it, because it's long.

I am looking for advice on fitting a mixed multinomial regression to count
data that are overdispersed and zero-inflated. My question is to evaluate
the effect of season and moonlight on the diet composition of bat-eared
foxes.

My dataset is composed of 14 possible prey items, 20 individual foxes
observed, 4 seasons and a moon illumination index ranging from 0 to 1 in
0.1 increments (considered as a continuous variable even though it takes
only 11 values). For each unique combination of individual*season*moon, I
thus have 14 lines, one for the count of each prey item.

From what I gathered, it would be possible to use a standard GLMM of the
following form to answer my question (i.e. a multinomial regression):

glmer(count~item+item:season+item:moon+offset(logduration)+
(1+indiv)+(1|obs)+
(1|id), family=poisson)

   Yes, but I don't know if this will account for the possible dependence
*among* prey types.

where count is the number of prey of a given type recorded eaten;

item is the prey type;

logduration is the log(total time observed for a given combination of
individual*season*moon);

obs is a unique id for each combination of individual*season*moon, so each
obs value groups 14 lines (one for each prey item) with the same
individual*season*moon;

id is a unique id for each line to account for overdispersion (as
quasi-Poisson or negative binomial distributions are not implemented in
lme4, Elston et al. 2001).
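
The data layout described above can be sketched as follows. This is a hypothetical toy construction, not the actual fox data: the prey, fox and season labels, the simulated counts and the observation durations are all made up, and the grid is fully crossed, whereas a real data set would only contain the combinations actually observed.

```r
# Hypothetical labels; the real data use actual prey types, foxes and seasons.
set.seed(42)
dat <- expand.grid(item   = paste0("prey", 1:14),
                   indiv  = paste0("fox", 1:20),
                   season = paste0("season", 1:4),
                   moon   = seq(0, 1, by = 0.1))

# obs: one level per individual*season*moon combination (14 lines each).
dat$obs <- interaction(dat$indiv, dat$season, dat$moon, drop = TRUE)

# id: one level per line, i.e. an observation-level random effect
# that absorbs overdispersion in a Poisson model.
dat$id <- factor(seq_len(nrow(dat)))

# logduration: log of the time observed, constant within each obs (made-up values).
duration        <- runif(nlevels(dat$obs), min = 1, max = 10)
dat$logduration <- log(duration[as.integer(dat$obs)])

# Made-up counts, only so that the data frame is complete.
dat$count <- rpois(nrow(dat), lambda = 0.5)

head(dat)
```

With this layout, (1|obs) groups the 14 prey lines of one observation session, while (1|id) gives every single line its own random effect.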

    Seems about right.
    There is glmer.nb now, but you might not want it; it tends to
be slower and more fragile, and you'd still have to deal with
zero-inflation.

However, there are a lot of zeros in my data, i.e. many prey items have
never been observed being eaten for many combinations of
individual*season*moon.

   That doesn't *necessarily* mean you need zero-inflation. Large
numbers of zeros might just reflect low probabilities, not ZI per se.
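
As a small worked illustration of this point (the means below are arbitrary example values, not estimates from the fox data): under a plain Poisson distribution the probability of a zero is exp(-mean), so a mean of 0.3 already gives about 74% zeros without any zero-inflation.

```r
# Arbitrary example means (not estimated from any data).
mu <- c(0.1, 0.3, 0.5, 1, 2)

# Probability of observing a zero under a plain Poisson model: exp(-mu).
p0_pois <- dpois(0, lambda = mu)

# Overdispersion raises this further: negative binomial with the same means
# and an arbitrary dispersion parameter size = 0.5.
p0_nb <- dnbinom(0, size = 0.5, mu = mu)

round(data.frame(mean = mu, poisson = p0_pois, negbin = p0_nb), 3)
```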

Following Ben Bolker's wiki (http://glmm.wikidot.com/faq), I gather that I
should use one of the following methods to answer my question:

    - glmmADMB, with family=nbinom
    - MCMCglmm, with family=zipoisson
    - "expectation-maximization (EM) algorithm" in lme4

   Note there's a marginally newer version at
https://rawgit.com/bbolker/mixedmodels-misc/master/glmmFAQ.html

   Another, newer choice is glmmTMB (available on Github) with
family="nbinom2"

Here come the questions:
1.   Is it correct to assume that I could use the same model structure
(count~item+item:season+item:moon+offset(logduration)+(1+indiv)+(1|obs))
in glmmADMB or MCMCglmm to answer my question?

   glmmADMB or glmmTMB, yes: I'm not sure about MCMCglmm

2.   I then wouldn't need the (1|id) to correct for overdispersion as both
methods would already account for it, correct?

    That's right, I think.

3.   I am totally new to MCMCglmm, so  ...

   I'm going to let Jarrod Hadfield, or someone else, answer this one.

4.   If I were to use the EM algorithm method, how should the results be
interpreted?

   The result is composed of two models -- a 'binary' (structural zero vs
non-structural zero) and a 'conditional' (count) part.
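
To make the two parts concrete, here is a minimal sketch of a zero-inflated Poisson mixture in base R. The function dzip() and the values of psi and mu are hypothetical, chosen only for illustration; psi is the probability of a structural zero (the binary part) and mu is the mean of the count part.

```r
# Hypothetical helper: probability mass function of a zero-inflated Poisson.
#   psi: probability of a structural zero (binary part)
#   mu : Poisson mean of the conditional (count) part
dzip <- function(y, mu, psi) {
  ifelse(y == 0,
         psi + (1 - psi) * dpois(0, mu),  # structural zero or Poisson zero
         (1 - psi) * dpois(y, mu))        # positive counts come only from the count part
}

psi <- 0.3   # arbitrary example values
mu  <- 1.5

p_zero_total      <- dzip(0, mu, psi)           # all zeros
p_zero_structural <- psi                        # zeros from the binary part
p_zero_count      <- (1 - psi) * dpois(0, mu)   # zeros from the count part

# Check by simulation: draw the binary part first, then the count part.
set.seed(7)
n          <- 10000
structural <- rbinom(n, size = 1, prob = psi)
y          <- ifelse(structural == 1, 0L, rpois(n, mu))
c(simulated = mean(y == 0), theoretical = p_zero_total)
```

The coefficients of the binary part therefore describe what changes the probability of a structural zero, and the coefficients of the conditional part describe what changes the expected count once structural zeros are set aside.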

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models


*Stéphanie PERIQUET (PhD)* - Bat-eared Fox Research Project
*Dept of Zoology & Entomology*
*University of the Free State, Qwaqwa Campus*
*Cell: +27 79 570 2683*
ResearchGate profile
<https://www.researchgate.net/profile/Stephanie_Periquet>


Kalahari bat-eared foxes on Twitter <https://twitter.com/kal_batearedfox>
