fixed effects estimates too small in binomial GLMM with low "success"-rate
Just to give one more comment (in lieu of actually doing the work): if my conjecture is correct (that it's a Jensen's inequality issue), then the comparison of means on the linear predictor scale would be correct. The problem is that with a lot of GLMM problems you can't sensibly look at the raw data on the linear predictor scale -- e.g. you can't logit-transform binary data and get anything sensible. You could try to work out the expected effect of nonlinear averaging/Jensen's inequality analytically or numerically and see if it matched up with the observed discrepancy ...
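As a quick numerical check, one can compare the average of the inverse-link-transformed linear predictor with the inverse link of its mean; a minimal sketch along those lines (the values of mu and sigma below are made up, chosen only to mimic the low error rates in this thread):

## numerical illustration of the nonlinear-averaging (Jensen's inequality) effect:
## for eta ~ Normal(mu, sigma^2), mean(logistic(eta)) differs from logistic(mu),
## and the gap grows with sigma and with the curvature of the inverse link
logistic <- function(x) 1/(1 + exp(-x))
set.seed(1)
mu    <- c(a = -3.76, b = -3.58)  # hypothetical linear-predictor means for the two levels
sigma <- 0.7                      # hypothetical random-intercept SD
eta   <- sapply(mu, function(m) rnorm(1e6, mean = m, sd = sigma))
colMeans(logistic(eta))           # probabilities averaged on the response scale
logistic(mu)                      # naive back-transform of the linear-predictor means
diff(colMeans(logistic(eta)))     # effect implied by correct (response-scale) averaging
diff(logistic(mu))                # effect implied by back-transforming the means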
On 14-05-19 06:22 AM, Solis wrote:
Dear Ben and others on the list,

The 'possibly neglected' mail exchange is probably this one: https://stat.ethz.ch/pipermail/r-sig-mixed-models/2014q1/021919.html

I hope the link helps (I'm about to board a plane), but I'm excited to see my question has been picked up and might have an answer.

Kind regards,
Tom
On 18 May 2014 at 12:00, <r-sig-mixed-models-request at r-project.org> wrote:

Today's Topics:

  1. fixed effects estimates too small in binomial GLMM with low "success"-rate (Henrik Singmann)
  2. Re: fixed effects estimates too small in binomial GLMM with low "success"-rate (Ben Bolker)
  3. Re: fixed effects estimates too small in binomial GLMM with low "success"-rate (Henrik Singmann)
  4. "parm" parameter in lme4::confint.merMod (Jake Westfall)

----------------------------------------------------------------------
Message: 1
Date: Sun, 18 May 2014 00:07:13 +0200
From: Henrik Singmann <henrik.singmann at psychologie.uni-freiburg.de>
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] fixed effects estimates too small in binomial GLMM with low "success"-rate
Message-ID: <5377DD91.9030705 at psychologie.uni-freiburg.de>
Content-Type: text/plain; charset=ISO-8859-15; format=flowed
Dear list,
I would really appreciate your input on an issue in which the fixed-effects estimates of a binomial GLMM with a relatively low rate of "successes" are too low (compared to the observed error rates) by around 0.5 percentage points.
The data come from six comparable experiments in which we measured participants' performance errors on a series of trials, in a design with one factor with two levels ("a" and "b"). Overall, the difference in error rate between the two levels was very small (about 0.5 percentage points), but when analyzing all the data (with participant and experiment as random effects), the effect of the factor reached significance (p = .03), which is also consistent with other published data.
However, the estimated error rates are around 0.5 percentage points lower than the observed mean error rates, indicating that the model may not adequately describe the data. Furthermore, I also get a convergence warning ("Model failed to converge with max|grad| = 0.0013716 (tol = 0.001)").
####### code 1 ##########
require(lme4)
dat <- read.table("http://pastebin.com/raw.php?i=KWjD5EG3")
dat$experiment <- factor(dat$experiment)

# observed effects
aggregate(errors ~ factor, dat, mean)  # unweighted
##   factor     errors
## 1      a 0.02266914
## 2      b 0.02772540
by(dat, dat$factor, function(x) with(x, sum(errors*weights)/sum(weights)))  # weighted
## dat$factor: a
## [1] 0.0222343
## ------------------------------------------------------------
## dat$factor: b
## [1] 0.02777651

m1 <- glmer(errors ~ factor + (factor|id) + (1|experiment), dat,
            weights = dat$weights, family = binomial)
## Warning message:
## In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
##   Model failed to converge with max|grad| = 0.0013716 (tol = 0.001)
coef(summary(m1))[2, , drop = FALSE]
##          Estimate Std. Error  z value   Pr(>|z|)
## factorb 0.1788881 0.08005533 2.234556 0.02544653

# estimated effect:
logistic <- function(x) 1/(1+exp(-x))
logistic(cumsum(fixef(m1)))
## (Intercept)     factorb
##  0.01825229  0.02174991
####### end code 1 ##########
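One way to gauge how much of the gap reflects nonlinear averaging over the random effects is to turn the fitted conditional estimates into population-averaged predictions by integrating the inverse link over the fitted random-effect distribution. A rough sketch (it lumps the participant and experiment intercept SDs together and ignores the random slope for factor and its correlation with the intercept, so it is only an approximation):

## population-averaged (marginal) predictions from m1, obtained by averaging the
## inverse link over random intercepts simulated from the fitted variance components
vc     <- as.data.frame(VarCorr(m1))
sd_id  <- vc$sdcor[vc$grp == "id"         & vc$var1 == "(Intercept)" & is.na(vc$var2)]
sd_exp <- vc$sdcor[vc$grp == "experiment" & vc$var1 == "(Intercept)" & is.na(vc$var2)]
set.seed(1)
re <- rnorm(1e5, sd = sqrt(sd_id^2 + sd_exp^2))   # combined random-intercept variation
c(a = mean(logistic(fixef(m1)[1] + re)),
  b = mean(logistic(sum(fixef(m1)) + re)))        # compare with the observed means above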
Interestingly, I can replicate this behavior when simulating binomial data similar to mine. When the overall mean is relatively small, the estimates from the GLMM are consistently too low. When the overall mean is increased, this consistent bias in the estimated effect seems to go away (and the observed effect gets larger due to the non-linear nature of the logistic model). The code of the simulation for one data set follows below.
My question is: Does anyone have an idea what I could do to reach a
better agreement between observed and predicted values? I tried
different link functions, but that didn't help. And I am unsure
whether I can trust my model given this discrepancy.
Thanks a lot, Henrik
##### code 2 #####
mu <- -3.7   # when larger, problem disappears
n_experiments <- 6
n_participants <- 20
n_trials <- 400
sigma_effect <- 0.2
effect_size <- 0.4
sigma_p <- 0.7
sigma_e <- 0.1
prob_level <- 0.5

# create data:
df <- expand.grid(experiment = factor(seq(1, n_experiments)),
                  id = factor(seq(1, n_participants)),
                  factor = c(-1, 1))
df$id <- with(df, experiment:id)
df <- merge(df, data.frame(experiment = seq(1, n_experiments),
                           re_e = rnorm(n_experiments, sd = sigma_e)))
df <- merge(df, data.frame(id = levels(df$id),
                           re_p = rnorm(length(levels(df$id)), sd = sigma_p)))
df <- merge(df, data.frame(id = levels(df$id),
                           re_a = rnorm(length(levels(df$id)), sd = sigma_effect)))
df <- with(df, df[order(id, factor), ])
tmp <- rbinom(n_experiments*n_participants, n_trials, prob_level)
tmp <- rep(tmp, each = 2)
tmp[c(FALSE, TRUE)] <- n_trials - tmp[c(FALSE, TRUE)]
df$errors <- mapply(function(x, y) rbinom(1, y, x),
                    with(df, logistic(mu + re_p + (factor*(effect_size/2) + factor*re_a) + re_e)),
                    tmp) / tmp
df$weights <- tmp
df$factor <- factor(df$factor)

aggregate(errors ~ factor, df, mean)  # unweighted
##   factor     errors
## 1     -1 0.02352719
## 2      1 0.03718523

s1 <- glmer(errors ~ factor + (factor|id) + (1|experiment), df,
            weights = df$weights, family = binomial)
## Warning message:
## In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
##   Model failed to converge with max|grad| = 0.0130648 (tol = 0.001)
coef(summary(s1))[2, , drop = FALSE]
#          Estimate Std. Error  z value     Pr(>|z|)
# factor1 0.4765694 0.07578773 6.288213 3.211416e-10

logistic(cumsum(fixef(s1)))
# (Intercept)     factor1
#  0.01849934  0.02946117
#### end code 2 ####
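The size of this conditional-versus-marginal gap can also be approximated analytically: for a logit link with normally distributed random intercepts, a well-known approximation (often credited to Zeger et al., 1988) attenuates the linear predictor by 1/sqrt(1 + c^2 * sigma^2) with c = 16*sqrt(3)/(15*pi). A sketch using the simulation parameters above (it treats the participant and experiment intercepts as the only random variation, so it is only a rough guide):

## analytic approximation of the marginal (population-average) baseline error rate
attenuate <- function(eta, sigma) eta / sqrt(1 + (16 * sqrt(3) / (15 * pi))^2 * sigma^2)
sigma_total <- sqrt(sigma_p^2 + sigma_e^2)  # combined random-intercept SD in the simulation
logistic(attenuate(mu, sigma_total))        # approximate marginal error rate at factor = 0
logistic(mu)                                # conditional (median-participant) error rate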
--
Dr. Henrik Singmann
Albert-Ludwigs-Universität Freiburg, Germany
http://www.psychologie.uni-freiburg.de/Members/singmann
------------------------------
Message: 2
Date: Sat, 17 May 2014 18:19:05 -0400
From: Ben Bolker <bbolker at gmail.com>
To: Henrik Singmann <henrik.singmann at psychologie.uni-freiburg.de>, r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] fixed effects estimates too small in binomial GLMM with low "success"-rate
Message-ID: <5377E059.2050800 at gmail.com>
Content-Type: text/plain; charset=ISO-8859-15
On 14-05-17 06:07 PM, Henrik Singmann wrote:

Dear list, I would really appreciate your input on an issue in which the fixed-effects estimates of a binomial GLMM with a relatively low rate of "successes" are too low (compared to the observed error rates) by around 0.5 percentage points.
Don't know, but one comment and one thought:

(1) the convergence warning goes away with the most recent version of lme4 (1.1-7)

(2) I'm _guessing_ that you will find that this phenomenon is due to Jensen's effect (i.e., a nonlinear averaging effect) -- that would mean its magnitude should be proportional to the random-effects variances and to the strength of the nonlinear (inverse-link) relationship. (I think there was a similar, possibly neglected, question along these lines on the mailing list earlier ...)

Ben Bolker

------------------------------

Message: 3
Date: Sun, 18 May 2014 00:38:20 +0200
From: Henrik Singmann <henrik.singmann at psychologie.uni-freiburg.de>
To: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] fixed effects estimates too small in binomial GLMM with low "success"-rate
Message-ID: <5377E4DC.6080809 at psychologie.uni-freiburg.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

On 18.05.2014 00:19, Ben Bolker wrote:
On 14-05-17 06:07 PM, Henrik Singmann wrote:
Dear list, I would really appreciate your input on an issue in which the fixed-effects estimates of a binomial GLMM with a relatively low rate of "successes" are too low (compared to the observed error rates) by around 0.5 percentage points.
Don't know, but one comment and one thought: (1) the convergence warning goes away with the most recent version of lme4 (1.1-7)
This is true, thanks.
(2) I'm _guessing_ that you will find that this phenomenon is due to Jensen's effect (i.e., a nonlinear averaging effect) -- that would mean its magnitude should be proportional to the random-effects variances and to the strength of the nonlinear (inverse link) relationship. (I think there was a similar, possibly neglected, question along these lines on the mailing list earlier ...)
Thanks for the pointer. In fact, when entering experiment as a fixed effect (which removes the random effect with the smallest variance), the estimates agree much more closely with the observed values; they are almost perfect. Perhaps I should then use this approach (although I would prefer treating experiment as random). Or what would you do?
Ben Bolker
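For reference, a sketch of the fixed-effect alternative mentioned above (the model name m2 is just illustrative; with the default treatment contrasts the intercept refers to experiment 1, whereas contr.sum would center it on an average experiment):

## sketch: experiment as a fixed effect instead of a random intercept
m2 <- glmer(errors ~ factor + experiment + (factor | id),
            data = dat, weights = dat$weights, family = binomial)
## the first two cumulative fixed effects give the estimated error rates
## for the two factor levels (in experiment 1, under treatment contrasts)
logistic(cumsum(fixef(m2))[1:2])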
--
Dr. Henrik Singmann
Albert-Ludwigs-Universität Freiburg, Germany
http://www.psychologie.uni-freiburg.de/Members/singmann

------------------------------

Message: 4
Date: Sun, 18 May 2014 00:03:52 -0600
From: Jake Westfall <jake987722 at hotmail.com>
To: "r-sig-mixed-models at r-project.org" <r-sig-mixed-models at r-project.org>
Subject: [R-sig-ME] "parm" parameter in lme4::confint.merMod
Message-ID: <BAY172-W2375062D4CFCE0B6E5F9CBCB330 at phx.gbl>
Content-Type: text/plain

Hi all,

According to the documentation, in confint.merMod() you can use the "parm" parameter to get confidence intervals for only a specified subset of parameters by indicating the integer positions of the desired parameters. But how is one supposed to know which integer positions correspond to which parameters? I poked around a little but found nothing in the methods or slots of the merMod object that indicates this, and nothing in the documentation detailing it. Have I missed something?

I guess you can call the function leaving that argument at its default, and the order of the parameters in that output will correspond to the integer positions, but this approach would seem to defeat the point of being able to specify particular subsets of parameters...

Jake
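A minimal sketch of the workaround described above, using the sleepstudy data that ship with lme4 (in general the variance-covariance parameters come before the fixed effects in the output, but checking the row names of a full call once is the safest route):

## discover which integer positions correspond to which parameters
library(lme4)
fm <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
ci_all <- confint(fm, method = "Wald")    # quick; rows for variance parameters are NA
rownames(ci_all)                          # e.g. ".sig01", ".sig02", ".sig03", ".sigma", "(Intercept)", "Days"
pos <- which(rownames(ci_all) == "Days")  # integer position of the fixed effect of interest
confint(fm, parm = pos)                   # profile interval for just that parameter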