-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org
[mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of
Christopher Desjardins
Sent: Friday, 29 October 2010 16:47
To: Steve Hong
CC: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] analysis of count data with many zero values
Hi Steve,
The MCMCglmm package has several different models that you
could fit to zero-inflated count data. You can fit
zero-inflated Poisson models, hurdle models, zero-altered
and zero-truncated models. I don't believe you can fit
zero-inflated negative binomials with that package, but I
could be wrong.
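
To make the difference between the zero-inflated and hurdle formulations
concrete, here is a minimal sketch in plain Python (it is not MCMCglmm code).
The function names and the values pi = 0.80, p_zero = 0.86 and lam = 1.5 are
made up for illustration and are not estimates from any data in this thread.

```python
import math

def poisson_pmf(k, lam):
    """Ordinary Poisson probability of observing the count k."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def zip_pmf(k, pi, lam):
    """Zero-inflated Poisson: a structural zero with probability pi,
    otherwise a Poisson(lam) draw (which can itself be zero)."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base

def hurdle_pmf(k, p_zero, lam):
    """Hurdle Poisson: zero with probability p_zero; positive counts
    come from a zero-truncated Poisson(lam)."""
    if k == 0:
        return p_zero
    return (1.0 - p_zero) * poisson_pmf(k, lam) / (1.0 - math.exp(-lam))

# Illustrative (assumed) parameter values
pi, p_zero, lam = 0.80, 0.86, 1.5

# In the ZIP model the zeros are a mix of structural and Poisson zeros.
print("ZIP    P(Y = 0):", zip_pmf(0, pi, lam))
print("  of which Poisson zeros:", (1.0 - pi) * math.exp(-lam))
# In the hurdle model every zero comes from the single zero process.
print("Hurdle P(Y = 0):", hurdle_pmf(0, p_zero, lam))
```

In the ZIP version some zeros are attributed to the count process itself,
whereas the hurdle version models all zeros separately from the positive
counts.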
Also, I believe that ZINB models work well when the data are
zero-inflated and the non-zero counts are overdispersed. You could
also roll your own using rjags or R2WinBUGS, etc.
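
As a rough illustration of why overdispersion matters here, the sketch below
compares the mean and variance implied by a ZIP and a ZINB model with the same
zero-inflation probability and the same mean for the count component. It uses
the standard moment formulas; the names pi, mu and size and the numbers are
assumptions chosen for illustration, not values from anyone's data.

```python
def zip_mean_var(pi, mu):
    """Mean and variance of a zero-inflated Poisson with
    zero-inflation probability pi and Poisson mean mu."""
    mean = (1.0 - pi) * mu
    var = mean * (1.0 + pi * mu)
    return mean, var

def zinb_mean_var(pi, mu, size):
    """Mean and variance of a zero-inflated negative binomial with
    count mean mu and dispersion parameter size (smaller size means
    more overdispersion in the count component)."""
    mean = (1.0 - pi) * mu
    var = mean * (1.0 + mu * (pi + 1.0 / size))
    return mean, var

def zinb_zero_prob(pi, mu, size):
    """Probability of a zero under the ZINB model."""
    return pi + (1.0 - pi) * (size / (size + mu)) ** size

# Illustrative (assumed) values
pi, mu, size = 0.80, 2.0, 0.5

print("ZIP  mean, variance:", zip_mean_var(pi, mu))
print("ZINB mean, variance:", zinb_mean_var(pi, mu, size))
print("ZINB P(Y = 0):", zinb_zero_prob(pi, mu, size))
```

Both models have the same mean, but the ZINB variance carries the extra
mu/size term, which is what lets it absorb overdispersion in the non-zero
counts that a ZIP model cannot.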
There are lots of publications out there examining
zero-inflation especially using MCMC based approaches. (Do a
quick Google Scholar search for zero-inflated multilevel
models). In addition, Jarrod Hadfield's CourseNotes (they
come w/ MCMCglmm) are also quite informative and provide some
examples of how you might fit such a model. In my experience
with count data that are highly zero-inflated (86% of all
data were zeroes), the ZIP model worked well but converged
very slowly and required about 60,000 MCMC iterations. If
you'd like to see the code I can share it as well. This topic
has also come up several times on this list, and I would
encourage you to search through the archives of R-Sig-Mixed-Models.
HTH,
Chris
On Fri, Oct 29, 2010 at 9:32 AM, Steve Hong
<emptican at gmail.com> wrote:
Dear list,
This is the first time I have worked with this type of data. I have count
data collected repeatedly from the same plot over multiple years
and have found that the proportion of 'zero' values is very
of proportion is about 92 %, min: 53 %, max: 100 %). Only one year
has 53% zeros in the data, and each of the remaining years has
more than 86% zeros.
The objective of the study is to develop predictive models and
validate them, for example, using cross validation.
Variables collected are: year, insect count, longitude,
properties (x1...x4).
Since the data have so many zero observations, I am thinking of
fitting a zero-inflated model to the data. However, I am very
My questions are:
1. Is it possible to use zero inflated model to fit data with about
90% zeros? I am wondering if zero proportion is too high
inference using statistical methods.
2. If I can use zero-inflated models, can I use either the Poisson
distribution or the negative binomial distribution? Or both?
3. Do you have any good reference (paper and/or website)
'easy' tutorial for me to study?
I am not sure whether I have provided enough information or submitted this
to the correct mailing list. Please let me know if you have any
comments or suggestions.
I would greatly appreciate that.
Thank you very much in advance!!!
Steve