GLMM post-hoc analysis with bootstrapping - R-SIG-mixed-models

Mon, Oct 28, 2024 1:33 AM #
Hello all
I hope you are all doing well!
I was also hoping someone could perhaps help me out with the following:
For an ecological study, we?re interested in comparing the diversity of different organism groups between one land use type (let?s call it ?type 1?) and three other land use types.
Our model is: Diversity ~ landuse + (1|location)
Currently, I am using the emmeans-package to conduct the post-hoc analysis and thereby obtain

  1.  Estimated marginal means of the diversity per land use type and their 95% confidence intervals, transformed to the response scale (with a bias adjustment for the random effects)
  2.  Contrasts between the emmeans (more specifically ratios, in the response scale) and the 95%CI thereof, with a Dunnett-like adjustment for multiple comparisons
  3.  P-values for the pairwise comparisons between the estimated mean of land use type 1 and the three other land use types, tests performed on the log-scale, with a Dunnet-like adjustment for multiple comparisons

However, as in the emmeans package Wald-intervals and -tests are used, I was hoping I could shift to an approach that uses bootstrap intervals which would be a bit more reliable for my small sample sizes.
Since I don?t manage to figure out how to code this to obtain the same output as above (a-c), I was hoping someone could help me with this.
Below is a minimal reproducible example for how I currently approach the post-hoc analysis.
It is not the best example given the singularity and really low random effect variance, but I hope this doesn?t bother for this question. Otherwise I could provide another dataset.
Thank you very much in advance for any help!
Lieke

# example data (D1 stands for the Shannon diversity)
example_df <- data.frame(
  code = c(1:88),
  landuse = c(rep("type1", 28), rep("type2", 24),
              rep("type3", 24), rep("type4", 12)),
  location = c(rep(letters[c(1,2,2:13)], each = 2),
               rep(letters[c(1,3:13)], each = 2),
               rep(letters[c(1:6,8:13)], each = 2),
               rep(letters[c(5,6,7,9,10,12)], each = 2)),
  D1 = c(2.59, 2.23, 1, 1, 1, 2.46, 2.94, 1, 2.94, 1.71,
          3.92, 2.94, 0, 3.78, 1, 2, 1.3, 1, 0, 0, 0, 3.75,
          0, 1, 0, 2, 2.38, 2, 1.51, 2.38, 1.89, 0, 2.97,
          1.18, 3.26, 5.86, 3.59, 1, 4.63, 2.94, 0, 0, 2.7,
          3.26, 2, 3.79, 0, 1, 4, 0, 2.8, 2, 0, 0, 1.7, 1,
          2, 0, 1, 0, 0, 0, 0, 0, 0, 2, 1, 0.98, 0, 0, 0, 1, 0,
          0, 1.38, 2, 0, 0, 0, 1.05, 0, 0, 1, 1, 1.16, 0, 2, 1)
)

# tweedie model
model <- glmmTMB(D1 ~ landuse + (1|location),
                 family = tweedie,
                 data = example_df)

# get "sigma"/extra dispersion necessary for bias adjustment (for random effects) when transforming estimates and intervals from log to response scale
extra_disp <- sqrt(sum(insight::get_variance(model)[["var.intercept"]]))
### singularity --> can't compute random effect variances reliably, but hope this is OK for the example
# a) get estimated marginal means and their 95%CI (Wald here) at the response scale
(model.emm <- emmeans(model, ~ landuse, type = "response",
                       bias.adjust = TRUE, sigma = extra_disp))
# b) get contrasts and their 95%CI at the response scale, with a Dunnet-like adjustment for multiple comparisons
(model.ratios <- confint(contrast(emmeans(model, ~ landuse),
                                     type = "response",
                                     bias.adjust = TRUE, sigma = extra_disp,
                                     "trt.vs.ctrl", ref = 1, reverse = TRUE)))

# c) get p-values for contrasts (tests performed on the log scale)
contrast(emmeans(model, ~ landuse), "trt.vs.ctrl", ref = 1, reverse = TRUE)