Hi everyone,
I'm writing because I have a couple of questions about analyzing data with
complete separation when fitting a Binomial GLMM. I've been following the
suggestions made by Ben Bolker in this StackOverflow post<
https://stats.stackexchange.com/questions/132677/binomial-glmm-with-a-categorical-variable-with-full-successes/132678#132678>
and in the glmmTMB FAQ<
https://bbolker.github.io/mixedmodels-misc/ecostats_chap.html>.
(You can generate data with the lines below)
set.seed(123)
# Number of observations per Trial
n_29 <- 50
n_30 <- 50
Data <- data.frame(
Ind = 1:(n_29 + n_30),
Trial = rep(c(29, 30), times = c(n_29, n_30)),
Response = c(rep(1, n_29 * 0.76), rep(0, n_29 * 0.24),
rep(1, n_30 * 1), rep(0, n_30 * 0))
)
I've been using the bglmer alternative (blme package) mentioned there,
which involves including a prior on the fixed effects to penalize the
parameter estimates. The model looks like this:
m1 <- bglmer(Response ~ as.factor(Trial) + (1|Ind),
family = binomial,
fixef.prior = normal(cov = diag(8,2)),
data = Data)
(where 8 is the variance parameter and 2 the number of levels of the
factor Trial).
I have two main questions:
Penalization Method in bglmer:
I couldn?t determine which method blme uses for penalization in bglmer. Is
it based on maximum likelihood, or does it use Markov-Chain Monte Carlo
(MCMC) methods? I?m relatively new to Bayesian methods and have been
interpreting the bglmer results similarly to how I interpret GLMMs in a
frequentist framework (e.g., calculating p-values). However, I?m still
uncertain about the specific method applied by bglmer. The package
documentation doesn?t provide much detail on this aspect. Additionally, is
it acceptable to analyze the results of this model using a frequentist
approach?
Choosing the Prior Variance:
Regarding the specification of the prior in bglmer function (e.g.,
fixef.prior = normal(cov = diag(variance, n_fixed_effects)) ), what
information should I consider to choose an appropriate variance value?
Would it be reasonable to use as a reference the standard error of a
parameter estimate not affected by complete separation in the non-penalized
GLMM (e.g., using glmmTMB function)? For example, given the summary of the
following non-penalized model:
m1NonPenalized <- glmmTMB(Response ~ as.factor(Trial) +(1|Ind),
family = binomial,
data = DataM1E2)
summary(m1NonPenalized)
Conditional model:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.153e+00 4.958e-01 2.325 0.0201 *
as.factor(Trial)30 3.338e+01 4.458e+06 0.000 1.0000
Here, we observe that Trial 30 is affected by separation as its standard
error is extremely large, whereas the intercept (Trial 29) is not. Would
the standard error of Trial 29 serve as a reference for choosing an
appropriate variance value (i.e., as the std.error is 0.49, a variance
value of 1 or higher would be reasonable)?
Additionally, I encountered an issue with the estimation of probabilities
in the model fitted with bglmer (m1). You can generate an histogram to
inspect the observed frequencies of data in order to follow the discussion
below:
hist <- Data %>% ggplot(aes(x = as.factor(Response))) +
geom_bar(aes(y = after_stat(prop) * 100,group = Trial, fill = Trial),
colour = "black") +
scale_y_continuous(limits = c(0,100), n.breaks = 10) +
ylab("% of responses") +
xlab("Response") +
facet_grid(. ~ Trial)
While the observed probabilities of response were around 0.76 for Trial
29 and 1 for Trial 30, the model estimated probabilities of approximately
0.999 for both, with incredibly small standard errors (much smaller than
expected given the data variability):
Trial prob SE df asymp.LCL asymp.UCL
29 0.99985 2.1918e-04 Inf 0.99747 0.99999
30 0.99998 3.3708e-05 Inf 0.99921 1.00000
I noticed that reducing the variance value from 8 to 1 inside fixef.prior
produced estimates closer to the observed values, as you can see below:
Trial prob SE df asymp.LCL asymp.UCL
29 0.828 0.0769 Inf 0.626 0.933
30 0.973 0.0195 Inf 0.893 0.994
What might be the cause of this issue?
Thank you very much in advance for your time and help!
Cheers,
Tom?s
[[alternative HTML version deleted]]