hurdle model

Thu, Aug 19, 2010 6:22 AM
On Thu, 2010-08-19 at 14:54 +0300, Gavin Simpson wrote:
I know that at least Jane Elith has an email address (I used it
years ago), so you could ask her. However, it may be that their hurdle
model uses just Poisson, and there is a minor mistake in their Table 2.

You can use quasipoisson() or poisson() in glm() in a very natural way:
the fitting happens via iteratively reweighted least squares, and all
you need to define is the relationship between fitted values and
variance. If you look at the poisson() and quasipoisson() functions in R
(these provide the backbone of glm(..., family=)), you see that the
differences are that quasipoisson()$aic() always returns NA, and
quasipoisson() lacks the simulate() item. Otherwise they work in a
similar way, except that in poisson() you take the scale (\phi) to be 1,
and in quasipoisson() you estimate the scale from the fitted model. Then
you just multiply the standard errors by the square root of the scale,
use F tests instead of Chisq in anova() etc.
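
To make the comparison concrete, here is a minimal sketch using base R
only. The data are simulated and purely hypothetical (the names x, y,
fit_p and fit_qp are mine, not from the paper under discussion); the
point is only to show that the two families give the same coefficients
and differ in the scale and in what follows from it.

```r
# Hypothetical overdispersed counts, simulated for illustration only
set.seed(1)
x <- runif(200)
y <- rnbinom(200, mu = exp(0.5 + 1.2 * x), size = 1.5)

fit_p  <- glm(y ~ x, family = poisson())
fit_qp <- glm(y ~ x, family = quasipoisson())

# Same IRLS fit, so the coefficients agree
print(all.equal(coef(fit_p), coef(fit_qp)))

# The scale is estimated as Pearson chi-square / residual df
phi <- sum(residuals(fit_p, type = "pearson")^2) / df.residual(fit_p)
print(all.equal(phi, summary(fit_qp)$dispersion))

# Standard errors are inflated by sqrt(phi)
se_p  <- coef(summary(fit_p))[, "Std. Error"]
se_qp <- coef(summary(fit_qp))[, "Std. Error"]
print(all.equal(se_qp, se_p * sqrt(phi)))

# No likelihood for quasipoisson, hence no AIC; use F tests instead
print(fit_qp$aic)
print(anova(fit_qp, test = "F"))
```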

I am not sure (or actually, I don't think) that this fitting parallelism
extends to the *truncated* Poisson that is used in pscl::hurdle().
Although you can do the fitting in stages, and fit a quasipoisson() glm
to the above-zero values, I don't think this is the correct thing to do
when the count part is not allowed to produce new zeros. However, the
truncated Poisson likelihood model is a huge improvement over
hand-fitting a glm with iteratively reweighted least squares and assuming
a constant variance/fit relationship.
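
As a rough illustration of why the truncation matters, the sketch below
fits a zero-truncated Poisson by direct maximum likelihood with optim()
and compares it with an ordinary Poisson glm on the same above-zero
counts. This is my own hand-rolled likelihood on simulated data, not the
code that pscl::hurdle() runs internally; all object names are
hypothetical.

```r
# Hypothetical data: Poisson counts, keeping only the positive ones,
# which makes them zero-truncated Poisson by construction
set.seed(2)
x_all <- runif(500)
y_all <- rpois(500, exp(0.2 + 0.8 * x_all))
pos <- y_all > 0
y  <- y_all[pos]
xp <- x_all[pos]

# Negative log-likelihood of the zero-truncated Poisson with log link:
# P(Y = y | Y > 0) = dpois(y, mu) / (1 - exp(-mu))
negll <- function(beta) {
  mu <- exp(beta[1] + beta[2] * xp)
  -sum(dpois(y, mu, log = TRUE) - log1p(-exp(-mu)))
}
fit_tr <- optim(c(0, 0), negll, method = "BFGS")

# Naive alternative: ordinary Poisson glm that ignores the truncation
fit_naive <- glm(y ~ xp, family = poisson())

# Compare the estimates; the data were simulated with (0.2, 0.8)
print(rbind(truncated = fit_tr$par, naive = coef(fit_naive)))
```

The naive fit treats the inflated mean of the positive counts as if it
were the Poisson mean, so its intercept tends to be too high when the
expected counts are small.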

If you are worried about the overdispersion of the above-zero count
data, use the truncated negative binomial model offered by
pscl::hurdle(). It is designed for the purpose (and has a more exciting
narrative for ecologists).
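
A minimal sketch of such a fit follows, assuming the pscl package is
installed. The data are simulated and hypothetical, and rztnb() is a
small helper I wrote for the simulation (it is not part of pscl); only
hurdle() with its dist and zero.dist arguments comes from the package.

```r
library(pscl)

# Hypothetical helper: zero-truncated negative binomial draws by rejection
rztnb <- function(n, mu, size) {
  out <- integer(n)
  for (i in seq_len(n)) {
    repeat {
      draw <- rnbinom(1, mu = mu[i], size = size)
      if (draw > 0) break
    }
    out[i] <- draw
  }
  out
}

# Simulated hurdle data: binomial presence/absence, then positive counts
set.seed(3)
n <- 400
x <- runif(n)
present <- rbinom(n, 1, plogis(-0.5 + 1.5 * x))
y <- present * rztnb(n, mu = exp(0.5 + 1.0 * x), size = 1.2)
dat <- data.frame(y = y, x = x)

# Binomial hurdle for zero vs. non-zero, truncated count model above zero
fit_hp  <- hurdle(y ~ x, data = dat, dist = "poisson", zero.dist = "binomial")
fit_hnb <- hurdle(y ~ x, data = dat, dist = "negbin",  zero.dist = "binomial")

print(summary(fit_hnb))
# Both are likelihood models, so they can be compared by AIC
print(AIC(fit_hp, fit_hnb))
```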

Cheers, Jari Oksanen