Multilevel zero-inflated models: model selection and model assumption checks

1 message · Phillip Alday

I won't address everything, but will give a few short pointers.
On 28/4/20 1:58 am, Kate R wrote:
For this type of thing, my advice is always "plot the model against the
data!". You can use an effects-type plot overlaid with the data, or you
can plot both the fitted and observed values against the predictors (in
addition to the classical fitted vs. observed plot). Which models
capture which aspects of your data? Which models break down in
unacceptable ways? As George Box said, "all models are wrong, but some
models are useful." Which models are most useful (i.e. wrong in ways you
can accept and right in ways you need)?
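To make that concrete, here is a minimal numpy sketch (simulated, purely hypothetical data, not the poster's dataset) of the kind of fitted-vs-observed comparison meant above: fit a plain Poisson to zero-inflated counts and compare the zero proportion the model implies with the one actually observed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical zero-inflated data: with prob. 0.4 a structural zero,
# otherwise a Poisson(3) count.
n = 5000
structural_zero = rng.random(n) < 0.4
y = np.where(structural_zero, 0, rng.poisson(3.0, size=n))

# Fit a plain Poisson by maximum likelihood (lambda-hat = sample mean) and
# compare the observed zero proportion with the one the model implies.
lam_hat = y.mean()
observed_zeros = np.mean(y == 0)
predicted_zeros = np.exp(-lam_hat)  # P(Y = 0) under Poisson(lambda-hat)

print(f"observed P(y=0): {observed_zeros:.3f}")
print(f"Poisson  P(y=0): {predicted_zeros:.3f}")
# The plain Poisson badly under-predicts zeros -- exactly the kind of
# "model breaks down in unacceptable ways" a plot of fitted against
# observed values makes visible at a glance.
```

In a real analysis you would overlay fitted and observed values against each predictor (e.g. with a plotting library) rather than reducing it to one number; the zero proportion is just the simplest summary for a zero-inflation problem.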
This is in some sense exactly what a hurdle model does. :)
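A sketch of the two parts of a hurdle model, again on simulated hypothetical data and intercept-only for brevity: a binary part for zero vs. positive, and a zero-truncated count part for the positives. The truncated-Poisson fit uses the identity E[Y | Y > 0] = lam / (1 - exp(-lam)), solved by fixed-point iteration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical zero-inflated counts: structural zeros plus Poisson(3) counts.
n = 5000
y = np.where(rng.random(n) < 0.4, 0, rng.poisson(3.0, size=n))

# Part 1 of the hurdle: whether a count clears the "hurdle" at zero.
# (Intercept-only here; in practice this is a logistic regression.)
p_positive = np.mean(y > 0)

# Part 2: a zero-truncated Poisson for the positive counts.
# E[Y | Y > 0] = lam / (1 - exp(-lam)), so the MLE solves the
# fixed-point equation lam = m * (1 - exp(-lam)), m = mean of positives.
m = y[y > 0].mean()
lam = m
for _ in range(100):
    lam = m * (1.0 - np.exp(-lam))

print(f"P(y > 0) part:         {p_positive:.3f}")
print(f"truncated-Poisson lam: {lam:.3f}")
```

The point of the two-part structure is exactly what the thread asks about: the zero process and the count process are modelled, and can be inspected, separately.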
With difficulty if you want classical tests (they aren't nested, so you
can't do likelihood-ratio tests, and I find even AIC/BIC less than ideal
here because the different link functions mean that you're not quite
comparing the same aspects of the data). You could use stratified
cross-validation to get at predictive power, but this can be
challenging depending on the exact nesting/crossing structure of your
data (sorry, I didn't have time to read and think about your particular
data in detail). Otherwise, I recommend the graphical comparison I
mentioned above.
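One way to sketch the cross-validation idea: hold out whole groups at a time (respecting the grouping structure, as cautioned above) and compare held-out log-likelihood across non-nested models. Everything below is hand-rolled on simulated hypothetical data; in practice you would use your actual grouping factors and fitted models.

```python
from math import lgamma

import numpy as np

rng = np.random.default_rng(2)

def pois_logpmf(y, lam):
    # log P(Y = y) under Poisson(lam)
    return y * np.log(lam) - lam - np.array([lgamma(v + 1) for v in y])

# Hypothetical grouped data: 10 "subjects", zero-inflated Poisson counts.
n_groups, per_group = 10, 400
groups = np.repeat(np.arange(n_groups), per_group)
y = np.where(rng.random(groups.size) < 0.4, 0,
             rng.poisson(3.0, size=groups.size))

def fit_poisson(y):
    return y.mean()  # MLE for a plain Poisson

def fit_hurdle(y):
    p = np.mean(y > 0)                    # binary part
    m = y[y > 0].mean()
    lam = m
    for _ in range(100):                  # truncated-Poisson MLE
        lam = m * (1.0 - np.exp(-lam))
    return p, lam

def ll_poisson(y, lam):
    return pois_logpmf(y, lam).sum()

def ll_hurdle(y, p, lam):
    # P(0) = 1 - p; P(k > 0) = p * Pois(k; lam) / (1 - exp(-lam))
    return np.where(y == 0, np.log(1 - p),
                    np.log(p) + pois_logpmf(y, lam)
                    - np.log1p(-np.exp(-lam))).sum()

# Leave two whole groups out per fold, never splitting a group.
score_pois = score_hurdle = 0.0
for fold in range(5):
    test = np.isin(groups, [2 * fold, 2 * fold + 1])
    ytr, yte = y[~test], y[test]
    score_pois += ll_poisson(yte, fit_poisson(ytr))
    score_hurdle += ll_hurdle(yte, *fit_hurdle(ytr))

print(f"held-out log-lik, Poisson: {score_pois:.1f}")
print(f"held-out log-lik, hurdle:  {score_hurdle:.1f}")
```

Because the comparison is on held-out predictive log-likelihood rather than on AIC/BIC from the fitted objects, it sidesteps the non-nestedness and differing-link problems, at the cost of the fold-design headaches mentioned above.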

Hope that helps,
Phillip