mixed zero-inflated poisson regression
Hi Jarrod (and all other r-sig-mixed-members who answered me) First, thanks to all who replied. I learned a lot from all answers, and it seems I have got a better understanding on the topic now. Jarrod Hadfields suggestion about priors helped me a lot to get betters models. As you suggested my models did not run well at all and there were several issues about my code. There were a lot of serious problems associated with the models (autocorrelation, correlation between chains from different parameters, poor convergence). Refining my code according to suggestions makes the models now run much better. I also centralised the predictor variable, which helped a lot on the correlation between the chains of the parameter estimates for the fixed factors. It also appears that the models have to run for some time in order to converge, and that I need a quite high thinning to reduce autocorrelation. Finally, using the priors suggested by Jarrod (with the fix=2-adjustment) of course did the trick and made models more stable and reliable. Currently I am running 3 MCMCglmm for each model, with nitt=1000000, burn=500000 and thin=1000. This seems to do give reasonable output (Gelman diagnostics < 1.02 for all chains), which are consistent among similar runs, although it takes some time. As suggested by other list members, it may be a good idea to verify results using a MLE-based approach, for instance using ADMB. A recent (future) book was also suggested (even as a Christmas present for myself. I was thinking more in line of a pair of new skis, but perhaps a book about zero-inflated models and generalized linear mixed models would be just as fun). Link to the book: http://www.highstat.com/book4.htm Again, thanks to all for the help. Ivar
On 16.11.2011 21:43, Jarrod Hadfield wrote:
Hi Denise, Yes I did. Thanks for correcting it. Jarrod On 16 Nov 2011, at 19:58, Denise Rocha wrote:
Hi Dr. Jarrod.
Didn't you want to say:
"prior<-list(R=list(V=diag(2), nu=0.002, fix=*2*)..." ?
Thank you.
Denise.
2011/11/14 Jarrod Hadfield <j.hadfield at ed.ac.uk
<mailto:j.hadfield at ed.ac.uk>>
Hi,
ZIP models generally mix poorly and often need to be run for long
periods of time. However, you seem to be estimating a residual
variance for the zero-inflation bit and this is not identifiable.
In this case you will never get convergence or good mixing. I
recommend you fix the residual variance of the zero-inflation
process at one using a prior like:
prior<-list(R=list(V=diag(2), nu=0.002, fix=1),
G=list(G1=list(V=1, nu=1, alpha.mu <http://alpha.mu>=0,
alpha.V=1000)))
I have used a parameter expansion for the random effects.
However, your specification is a bit odd because you assume that
the random effects for the poisson and zero-inflation have equal
variance and a correlation of 1. ~us(trait):gr or
~idh(trait):gr are probably more reasonable and you *could* use
the priors:
prior<-list(R=list(V=diag(2), nu=0.002, fix=1),
G=list(G1=list(V=diag(2), nu=2, alpha.mu
<http://alpha.mu>=c(0,0), alpha.V=diag(2)*1000)))
for ~us(trait):gr, and
prior<-list(R=list(V=diag(2), nu=0.002, fix=1),
G=list(G1=list(V=diag(2), nu=1, alpha.mu
<http://alpha.mu>=c(0,0), alpha.V=diag(2)*1000)))
~idh(trait):gr.
~gr might make more sense if you use a zero-altered poisson
("zapoisson") rather than a zero-inflated poisson.
Cheers,
Jarrod
On 14 Nov 2011, at 16:17, Ivar Herfindal wrote:
Dear Thierry
Thank you for your quick response. I have previously tried to
follow the CourseNote you suggested without success. I gave
it another try today, and it seems I have manged to generate
such a zero-inflated model as I want during the day. The
syntax looks something like this:
mcmodm1 <- MCMCglmm(Y ~ trait -1 + at.level(trait, 1:2):A,
data=dataa,
family="zipoisson", rcov=~idh(trait):units, nitt=30000,
random=~gr)
However, the main challenge is that the model seems quite
unstable withmy "real" data. Even with 100000 iterations, two
identical specified models gives quite different output, and
plotting the mcmodm1$Sol suggest not very stable estimates.
This occurs also on my simulated data.
Do anyone have any suggestion on how to make the model more
stable (e.g. by the rcov or random arguments)? Or do I just
have to run it as long as it takes to get "fairly precise"
estimates?
Ivar
On 14.11.2011 10:40, ONKELINX, Thierry wrote:
Dear Ivar,
Your assumption on MCMCglmm is incorrect. You can specify
different predictor for both the count and the binomial
component. Have a look at the CourseNotes vignette for an
example.
Best regards,
Thierry
----------------------------------------------------------------------------
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie& Kwaliteitszorg
Gaverstraat 4
9500 Geraardsbergen
Belgium
Research Institute for Nature and Forest
team Biometrics& Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be <mailto:Thierry.Onkelinx at inbo.be>
www.inbo.be <http://www.inbo.be>
To call in the statistician after the experiment is done
may be no more than asking him to perform a post-mortem
examination: he may be able to say what the experiment
died of.
~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data.
~ Roger Brinner
The combination of some data and an aching desire for an
answer does not ensure that a reasonable answer can be
extracted from a given body of data.
~ John Tukey
-----Oorspronkelijk bericht-----
Van: r-sig-mixed-models-bounces at r-project.org
<mailto:r-sig-mixed-models-bounces at r-project.org>
[mailto:r-sig-mixed-models- <mailto:r-sig-mixed-models->
bounces at r-project.org <mailto:bounces at r-project.org>]
Namens Ivar Herfindal
Verzonden: maandag 14 november 2011 9:57
Aan: r-sig-mixed-models at r-project.org
<mailto:r-sig-mixed-models at r-project.org>
Onderwerp: [R-sig-ME] mixed zero-inflated poisson
regression
Dear all
It seems like I have reached the limit of my
statistical skills, and now turn to the
mixed-list for some assistance.
I have a dataset with a response variable that
appears to have a distribution that
is zero-inflated poisson (ZIP) (or close to). The
response variable is number of
fledglings from nests with known mothers.
The number of fledglings has a lot of zeros that
comes from failed breeding
attempts. The number of fledgings can therefore be
considered as a two-step
process:
1. do the female breed, and
2. if the female breed, what was the number of
fledglings.
I want to relate the number of fledglings to some
predictor variables, and I am
particularly interested in the combined effect of
these variables on the
probability to breed and number of fledglings given
breeding. I have several
observations per female (multiple years) and each
nest location is also
represented by several females (in different years).
I therefore need/want to
account for random factors. I guess I could run two
separate models (one
binomial and one poisson), but I need to make
predictions on how my
explanatory variables affect the number of fledglings
irrespective of the
mechanism behind.
I have searched R-sites a lot, and it found some
packages that can model ZIP.
1. pscl (function zeroinfl)
Pros: have the option to relate both the poisson and
binomial process to the
predictor variable, so I can evaluate how probability
of breeding and number of
fledglings depend on my predictor variables
Cons: Can not include random factors.
2. glmmADMB (function glmmadmb), MCMCglmm, and gamlss
Pros: allows random factors
Cons: 1. all packages seems to assume that all zeros
have the same probability
of belonging to the zero component, i.e. the
logit-link structure does not depend
on the predictor variable.
2. For some of the packages (particularly MCMCglmm) I
have some challenges
about how to specify the model correctly.
Does anyopne in the R(mixed) community have
experience on this? I provide a
simple example below to illustrate what I want.
Ivar Herfindal
## A simple reproducible example
library(pscl)
library(glmmADMB)
library(MCMCglmm)
library(gamlss)
set.seed(4242)
# create some dummy data that have a lot of zeros in
the dependent variable
dataa<- cbind.data.frame(Y=rbinom(100, 1, 0.7))
dataa$Y<- dataa$Y *
rpois(100, 3) # the response dataa$A<- rnorm(100) #
explanatory variable
dataa$gr<- factor(1:10) # grouping variable
# fit models with different packages
## First non-mixed models
### pscl zeroinfl
zimod<- zeroinfl(Y ~ A, data=dataa, dist="poisson") #
Get estimate of A on
poisson and binomial process separate, this is what I
want (I believe so...)
zimod2<- zeroinfl(Y ~ A|1, data=dataa,
dist="poisson") # All zero counts have
the same probability of belonging to the zero component
### glmmADMB
admod<- glmmadmb(Y ~ A, data=dataa, family="poisson",
zeroInflation=TRUE)
# Same results as zimod2.
# Q: Is it possible to obtain similar results as
zimod, i.e.
# separate the effect of A on the poisson and
binomial process?
### MCMCglmm
mcmod<- MCMCglmm(Y ~ A, data=dataa, family="zipoisson",
rcov=~idh(trait):units)
# more or less same as admod and zimod2, but
"plot(mcmod$Sol)" reveals large
variation.
# Can change burn and nitt to get more stable(?)
mcmod2<- MCMCglmm(Y ~ A, data=dataa, family="zipoisson",
rcov=~idh(trait):units, burn=10000, nitt=20000) # Q:
Is it possible to separate the
effect of A on the poisson and binomial process?
### gamlss
gamod<- gamlss(Y ~ A, data=dataa, family=ZIP) # same
as zimod2 and admod
## Some attempts on mixed models with gr as random
factor admodm<-
glmmadmb(Y ~ A, data=dataa, family="poisson",
zeroInflation=TRUE,
random=~(1|gr)) mcmodm<- MCMCglmm(Y ~ A, data=dataa,
family="zipoisson", rcov=~idh(trait):units,
random=~idh(trait):gr) gamodm<-
gamlss(Y ~ A + random(gr), data=dataa, family=ZIP) #
For glmmadmb and gamlss
I think the models are specified ok # For MCMCglmm I
am not sure if it is
specified correctly # But I still want to be able to
separate the effect of A on the
two processes.
--
--------------------------------------------------
Ivar Herfindal (PhD, Researcher)
Centre for Conservation Biology
Department of Biology
Norwegian University for Science and Technology
N-7491 Trondheim, Norway
email: Ivar.Herfindal at bio.ntnu.no
<mailto:Ivar.Herfindal at bio.ntnu.no>
web: http://www.ntnu.no/employees/ivar.herfindal
phone: +47 73596253
Cellphone: +47 91625333
fax: +47 73596090
_______________________________________________
R-sig-mixed-models at r-project.org
<mailto:R-sig-mixed-models at r-project.org> mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
--
--------------------------------------------------
Ivar Herfindal (PhD, Researcher)
Centre for Conservation Biology
Department of Biology
Norwegian University for Science and Technology
N-7491 Trondheim, Norway
email: Ivar.Herfindal at bio.ntnu.no
<mailto:Ivar.Herfindal at bio.ntnu.no>
web: http://www.ntnu.no/employees/ivar.herfindal
phone: +47 73596253
Cellphone: +47 91625333
fax: +47 73596090
_______________________________________________
R-sig-mixed-models at r-project.org
<mailto:R-sig-mixed-models at r-project.org> mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
_______________________________________________
R-sig-mixed-models at r-project.org
<mailto:R-sig-mixed-models at r-project.org> mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
--
*Denise Rocha Ayres*
Mestre em Gen?tica e Melhoramento Animal
Doutoranda em Gen?tica e Melhoramento Animal
UNESP-C?mpus de Jaboticabal
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
-------------------------------------------------- Ivar Herfindal (PhD, Researcher) Centre for Conservation Biology Department of Biology Norwegian University for Science and Technology N-7491 Trondheim, Norway email: Ivar.Herfindal at bio.ntnu.no web: http://www.ntnu.no/employees/ivar.herfindal phone: +47 73596253 Cellphone: +47 91625333 fax: +47 73596090