Dear group - I am currently fitting a Poisson glmer to data with a large
excess of zero outcomes (>95%). I am debating how to proceed and have come
up with three options:
1.) Just fit a regular glmer to the complete data. I am not fully sure how
to interpret the coefficients then: do they mostly reflect the distinction
between zero and non-zero outcomes, or do they also capture the differences
among those outcomes that are non-zero?
2.) Leave all zeros out of the data and fit a glmer to only those outcomes
that are non-zero. I would then only learn about differences among the
non-zero outcomes, though.
3.) Use a zero-inflated Poisson model. My data set is quite large, so I
am currently playing around with the EM implementation of Bolker et al.
that alternates between fitting a glmer with observations weighted
according to their zero probability, and fitting a logistic regression for
the probability that a data point is zero. The method is worked through for
the OWL data in:
https://groups.nceas.ucsb.edu/non-linear-modeling/projects/owls/WRITEUP/owls.pdf
I am not fully sure how to interpret the results of the zero-inflated
version, though. Would I interpret the glmer coefficients in the same way
as I would for option 2), and then, on top of that, interpret the
coefficients of the logistic regression as describing whether an
observation is in the perfect or imperfect state? I am also not quite
sure what the common approach for the zformula is here. The OWL write-up
only uses zformula=z~1, i.e. an intercept only and no random effect; I
would use the same formula as for the glmer.
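
To make the question concrete, this is roughly the model I have in mind for
option 3, written out as a sketch in my own notation. The symbols (pi_i,
lambda_i, beta, gamma, b, w_i) are assumptions for illustration and are not
taken from the write-up, and the E-step is the textbook one for a
zero-inflated Poisson, which may differ in detail from what the
implementation actually does:

```latex
% Zero-inflated Poisson mixture for observation i in group g(i):
% with probability \pi_i the observation is a structural zero ("perfect state"),
% otherwise it is an ordinary Poisson count ("imperfect state").
P(y_i = 0) = \pi_i + (1 - \pi_i)\, e^{-\lambda_i}, \qquad
P(y_i = k) = (1 - \pi_i)\, \frac{e^{-\lambda_i} \lambda_i^{k}}{k!}, \quad k \ge 1

% Count part (the glmer, with random effect b) and zero part (the zformula).
\log \lambda_i = x_i^{\top} \beta + b_{g(i)}, \qquad
\operatorname{logit} \pi_i = z_i^{\top} \gamma

% E-step: posterior probability that observation i is a structural zero.
w_i =
\begin{cases}
  \dfrac{\pi_i}{\pi_i + (1 - \pi_i)\, e^{-\lambda_i}} & \text{if } y_i = 0 \\[1.5ex]
  0 & \text{if } y_i \ge 1
\end{cases}
```

As I understand it, the M-step then refits the glmer with weights 1 - w_i
and refits the logistic regression with w_i as the response. With
zformula=z~1 the zero part reduces to a single constant pi for all
observations.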
I would appreciate any help and pointers.
Thanks!
Philipp