Dear list,
I would really appreciate your input on an issue in which the fixed-effects estimates of a binomial GLMM with a relatively low rate of "successes" are too low (compared to the observed effect) by around 0.5%.
The data come from six comparable experiments in which we measured participants' performance errors across a series of trials, in a design with one two-level factor ("a" and "b"). Overall, the difference in error rate between the two levels was very small (~0.5%), but when analyzing all the data (with participant and experiment as random effects) the effect of the factor reached significance (p = .03), which is also consistent with other published data.
However, the estimated effects are around 0.5% lower than the observed mean error rates, indicating that the model may not describe the data adequately. Furthermore, I also get a convergence warning ("Model failed to converge with max|grad| = 0.0013716 (tol = 0.001)").
####### code 1 ##########
require(lme4)
dat <- read.table("http://pastebin.com/raw.php?i=KWjD5EG3")
dat$experiment <- factor(dat$experiment)
# observed effects
aggregate(errors ~ factor, dat, mean) #unweighted
## factor errors
## 1 a 0.02266914
## 2 b 0.02772540
by(dat, dat$factor, function(x) with(x, sum(errors*weights)/sum(weights))) #weighted
## dat$factor: a
## [1] 0.0222343
## ------------------------------------------------------------
## dat$factor: b
## [1] 0.02777651
m1 <- glmer(errors ~ factor + (factor|id) + (1|experiment), dat, weights = dat$weights, family = binomial)
## Warning message:
## In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.0013716 (tol = 0.001)
coef(summary(m1))[2,, drop = FALSE]
## Estimate Std. Error z value Pr(>|z|)
## factorb 0.1788881 0.08005533 2.234556 0.02544653
# estimated effect (fixed effects mapped back to the probability scale):
logistic <- function(x) 1/(1+exp(-x)) # inverse logit
logistic(cumsum(fixef(m1)))
## (Intercept) factorb
## 0.01825229 0.02174991
####### end code 1 ##########
Interestingly, I can replicate this behavior when simulating binomial data similar to mine. When the overall mean is relatively small, the estimates from the GLMM are consistently too low. When the overall mean is increased, this consistent bias in the estimated effect seems to go away (and the observed effect gets larger due to the non-linear nature of the logistic model). The code of the simulation for one data set follows below, with a small numeric check of this mu-dependence after it.
My question is: does anyone have an idea what I could do to reach better agreement between observed and predicted values? I tried different link functions, but that didn't help, and I am unsure whether I can trust my model given this discrepancy.
Thanks a lot,
Henrik
##### code 2 #####
mu <- -3.7 # when larger, problem disappears
n_experiments <- 6
n_participants <- 20
n_trials <- 400
sigma_effect <- 0.2
effect_size <- 0.4
sigma_p <- 0.7
sigma_e <- 0.1
prob_level <- 0.5
# create data: full crossing of experiment, participant, and factor level
df <- expand.grid(experiment = factor(seq(1, n_experiments)),
                  id = factor(seq(1, n_participants)),
                  factor = c(-1, 1))
df$id <- with(df, experiment:id) # make participant ids unique across experiments
# draw the random effects: experiment intercepts, participant intercepts,
# and participant-specific effect (slope) deviations
df <- merge(df, data.frame(experiment = seq(1, n_experiments), re_e = rnorm(n_experiments, sd = sigma_e)))
df <- merge(df, data.frame(id = levels(df$id), re_p = rnorm(length(levels(df$id)), sd = sigma_p)))
df <- merge(df, data.frame(id = levels(df$id), re_a = rnorm(length(levels(df$id)), sd = sigma_effect)))
df <- with(df, df[order(id, factor),])
# split each participant's n_trials trials between the two factor levels
tmp <- rbinom(n_experiments*n_participants, n_trials, prob_level)
tmp <- rep(tmp, each = 2)
tmp[c(FALSE, TRUE)] <- n_trials-tmp[c(FALSE, TRUE)]
# draw error counts from the logistic model and convert to proportions
# (logistic() as defined in code 1 above)
df$errors <- mapply(function(x, y) rbinom(1, y, x), with(df, logistic(mu+re_p+(factor*(effect_size/2)+factor*re_a)+re_e)), tmp)/tmp
df$weights <- tmp
df$factor <- factor(df$factor)
aggregate(errors ~ factor, df, mean) #unweighted
## factor errors
## 1 -1 0.02352719
## 2 1 0.03718523
s1 <- glmer(errors ~ factor + (factor|id) + (1|experiment), df, weights = df$weights, family = binomial)
## Warning message:
## In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.0130648 (tol = 0.001)
coef(summary(s1))[2,, drop = FALSE]
## Estimate Std. Error z value Pr(>|z|)
## factor1 0.4765694 0.07578773 6.288213 3.211416e-10
logistic(cumsum(fixef(s1)))
## (Intercept) factor1
## 0.01849934 0.02946117
#### end code 2 ####
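To see the mu-dependence directly without refitting models, here is a minimal numeric sketch (it reuses the logistic() helper from code 1; the sd of 0.7 and the effect of 0.4 mirror sigma_p and effect_size above). It compares the condition difference on the probability scale implied by the fixed effects alone (conditional) with the difference after averaging over a random intercept (marginal):
set.seed(2)
b <- rnorm(1e6, sd = 0.7) # random intercepts on the logit scale
for (mu in c(-3.7, -1)) {
  cond <- logistic(mu + 0.2) - logistic(mu - 0.2)
  marg <- mean(logistic(mu + 0.2 + b)) - mean(logistic(mu - 0.2 + b))
  cat(sprintf("mu = %4.1f: conditional %.4f, marginal %.4f, ratio %.2f\n",
              mu, cond, marg, marg/cond))
}
## at mu = -3.7 the marginal effect is clearly larger than the conditional
## one; at mu = -1 the two nearly coincide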
> Dear list,
> I would really appreciate your input on an issue in which the fixed-effects
> estimates of a binomial GLMM with a relatively low rate of "successes" are
> too low (compared to the observed effect) by around 0.5%.
Don't know, but one comment and one thought:
(1) the convergence warning goes away with the most recent version
of lme4 (1.1-7)
(2) I'm _guessing_ that you will find that this phenomenon is
due to Jensen's inequality (i.e., a nonlinear averaging effect) --
that would mean its magnitude should be proportional to the
random-effects variances and to the strength of the nonlinear
(inverse link) relationship.
(I think there was a similar, possibly neglected, question along
these lines on the mailing list earlier ...)
Ben Bolker
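A quick way to see this point numerically (a minimal sketch, assuming a participant sd of 0.7 on the logit scale as in the simulation, and the fitted intercept probability of about 0.0183 from m1):
logistic <- function(x) 1/(1+exp(-x)) # inverse logit, repeated for self-containedness
set.seed(1)
b <- rnorm(1e6, sd = 0.7)            # random intercepts on the logit scale
mu <- qlogis(0.0183)                 # logit of the fitted intercept probability
c(conditional = logistic(mu),        # what logistic(fixef(m1)) reports
  marginal = mean(logistic(mu + b))) # what the raw, population-level mean tracks
## conditional ~0.018 vs marginal ~0.023 -- roughly the 0.5% gap reported above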
>> Dear list,
>> I would really appreciate your input on an issue in which the fixed-effects
>> estimates of a binomial GLMM with a relatively low rate of "successes" are
>> too low (compared to the observed effect) by around 0.5%.
> Don't know, but one comment and one thought:
> (1) the convergence warning goes away with the most recent version
> of lme4 (1.1-7)
This is true, thanks.
> (2) I'm _guessing_ that you will find that this phenomenon is
> due to Jensen's inequality (i.e., a nonlinear averaging effect) --
> that would mean its magnitude should be proportional to the
> random-effects variances and to the strength of the nonlinear
> (inverse link) relationship.
> (I think there was a similar, possibly neglected, question along
> these lines on the mailing list earlier ...)
Thanks for the pointer. In fact, when entering experiment as a fixed effect (which removes the random effect with the smallest variance), the estimates agree much more closely with the observed values; they are almost perfect. Perhaps I should then use this approach (although I would prefer treating experiment as random). Or what would you do?
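For reference, a minimal sketch of that fixed-effect variant (the thread does not show the exact call; m2 and nd are placeholder names, and the code assumes the dat object from code 1):
m2 <- glmer(errors ~ factor + experiment + (factor|id), dat,
            weights = dat$weights, family = binomial)
# condition means on the probability scale, averaged over the experiment
# dummies, with the random effects set to zero (re.form = NA):
nd <- expand.grid(factor = levels(dat$factor),
                  experiment = levels(dat$experiment))
nd$p <- predict(m2, newdata = nd, re.form = NA, type = "response")
aggregate(p ~ factor, nd, mean)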