Truncated Negative Binomial Model Unexpected Marginal Means
A quick test suggests that emmeans is predicting the response based on the mean of the *un*truncated distribution (I don't remember and/or haven't looked into all of the guts of emmeans). Don't know if Russ Lenth (emmeans maintainer) is reading ...

n <- 1000
dd <- data.frame(f = factor(rep(1:2, each = n)))
gb <- log(c(2,4))
set.seed(101)
dd <- transform(dd, y = rnbinom(2*n, mu = exp(gb[f]), size = 2))
dd2 <- subset(dd, y > 0)

## un-truncated means
aggregate(y ~ f, data = dd, FUN = mean)
##   f     y
## 1 1 2.047
## 2 2 3.917

## truncated means
aggregate(y ~ f, data = dd2, FUN = mean)
##   f        y
## 1 1 2.781250
## 2 2 4.446084

library(glmmTMB)
library(emmeans)
m1 <- glmmTMB(y ~ f, family = truncated_nbinom2, data = dd2)

## doesn't match exactly but close to untruncated means
emmeans(m1, ~ f, type = "response")
##  f response     SE   df lower.CL upper.CL
##  1     2.15 0.0891 1614     1.99     2.34
##  2     3.98 0.1262 1614     3.74     4.23

## matches means exactly
m2 <- glmmTMB(y ~ f, family = nbinom2, data = dd)
emmeans(m2, ~ f, type = "response")
##  f response     SE   df lower.CL upper.CL
##  1     2.05 0.0651 1997     1.92     2.18
##  2     3.92 0.1094 1997     3.71     4.14
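For reference, here is a sketch of how the two kinds of mean relate, assuming the standard nbinom2 parameterization with mean \mu and size (dispersion) k. The symbols are mine, not taken from the glmmTMB or emmeans source. The numbers use the simulation settings above (mu = 2 and 4, size = 2):

```latex
\begin{align*}
P(Y = 0) &= \left(\frac{k}{k + \mu}\right)^{k}, \\
E[Y \mid Y > 0] &= \frac{\mu}{1 - P(Y = 0)}, \\
\mu = 2,\ k = 2:&\quad P(Y = 0) = (2/4)^2 = 0.25,
  \quad E[Y \mid Y > 0] = 2 / 0.75 \approx 2.67, \\
\mu = 4,\ k = 2:&\quad P(Y = 0) = (2/6)^2 = 1/9,
  \quad E[Y \mid Y > 0] = 4 / (8/9) = 4.5.
\end{align*}
```

These theoretical truncated means (about 2.67 and 4.5) are in line with the truncated sample means above (2.78 and 4.45). The conditional mean E[Y | Y > 0] can never be below 1, whereas \mu can be. So if emmeans is indeed reporting \mu, values below 1 for the conditional part of a hurdle model are not in themselves a sign of a mistake. The same ratio should apply to other zero-truncated count families once the matching P(Y = 0) is substituted.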
On 2/15/22 10:04 AM, Alex Waldman wrote:
Dear All,

Hope all is well! This may be a naive question but I am running a hurdle negative binomial model to look at the differences in counts of differing types in different locations. My major interest is the conditional model (i.e., when counts are above 0). I run the following code:

model <- glmmTMB(Count ~ Location*Type + (1 | ID),
                 zi = ~Location*Type + (1|ID),
                 data = data,
                 family = "truncated_nbinom1",
                 control = glmmTMBControl(optimizer = optim, optArgs = list(method = "BFGS")))

var.corr <- VarCorr(model)

Conditional model:
 Groups Name        Std.Dev.
 ID     (Intercept) 0.37105

Zero-inflation model:
 Groups Name        Std.Dev.
 ID     (Intercept) 1.3207

emmeans <- emmeans(model, ~ Location*Type, type = "response", sigma = 0.37105, bias.adjust = TRUE)

 Location Type response    SE  df lower.CL upper.CL
 0        0       1.117 0.277 631    0.687     1.82
 1        0       0.940 0.251 631    0.556     1.59
 2        0       0.893 0.266 631    0.498     1.60
 0        1       1.325 0.254 631    0.909     1.93
 1        1       1.090 0.248 631    0.698     1.70
 2        1       1.452 0.300 631    0.967     2.18

Confidence level used: 0.95
Intervals are back-transformed from the log scale
Bias adjustment applied based on sigma = 0.37105

However, I'm not sure why the estimated means and confidence intervals will include values below 1 in the conditional model, as I anticipated these values would represent the average number of non-zero counts? Is there something I may be doing wrong or not understanding?

Thanks in advance for your help!

Warm Regards,
Alex
_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
(Acting) Graduate chair, Mathematics & Statistics