Dear R-sig-mixed:
I was struck today by the way the Internet has accelerated research.
At one time, it might have taken a month or two to track down the
articles on this problem and conclude I need to ask for advice. Now,
however, I realize the need within hours.
Recall that the question that started our debate a few days ago
concerned a logistic regression in which the OP noticed a mismatch
between the predicted probability of success and the observed
fraction. We were
debating that, and it had completely slipped my mind that there is a
separate literature on exactly that kind of problem. Yesterday,
somebody else asked me to estimate a logit model in which there were
more than 40000 cases but only a few hundred "successes". That's what
reminded me of the "rare events" problem and the bias it produces in
logistic regression parameter estimates.
And I think that's the issue that we need to clear up with glmer. What
do you think? Since a multilevel model can be seen as penalized ML
estimation (à la Pinheiro and Bates, or as explained in Simon Wood's
Generalized Additive Models), are we able to get a bias-corrected
variant?
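
To make the kind of bias at issue concrete, here is a minimal sketch for
the simplest possible case: an intercept-only logit with no random
effects. It computes, exactly rather than by simulation, the bias of the
usual ML estimate of the logit when events are rare, and compares it with
the familiar adjustment of adding 0.5 to both the success and failure
counts. The sample size (200), the true probability (0.02), and the
function names are assumptions chosen only for illustration. The add-0.5
adjustment is a stand-in for "some bias correction"; it is not the
estimator from the rare-events literature and not anything glmer does.

```python
import math


def binom_pmf(n, y, p):
    """Binomial probability of y successes in n trials."""
    return math.comb(n, y) * p**y * (1.0 - p) ** (n - y)


def logit_bias(n, p, add=0.0):
    """Exact bias of log((y + add) / (n - y + add)) as an estimate of logit(p).

    With add == 0 this is the plain ML estimate, which is undefined when
    y == 0 or y == n; those samples are dropped and the remaining
    probabilities renormalized (an assumption of this sketch).
    """
    true_logit = math.log(p / (1.0 - p))
    weight_sum = 0.0
    weighted_estimate = 0.0
    for y in range(n + 1):
        if add == 0.0 and (y == 0 or y == n):
            continue  # ML estimate does not exist for these samples
        w = binom_pmf(n, y, p)
        weight_sum += w
        weighted_estimate += w * math.log((y + add) / (n - y + add))
    return weighted_estimate / weight_sum - true_logit


# Hypothetical illustrative values: about 4 expected events in 200 cases.
N_CASES = 200
TRUE_P = 0.02

bias_mle = logit_bias(N_CASES, TRUE_P)            # plain ML estimate
bias_adj = logit_bias(N_CASES, TRUE_P, add=0.5)   # add-0.5 adjusted estimate

print(f"bias of ML logit:       {bias_mle:+.4f}")
print(f"bias of adjusted logit: {bias_adj:+.4f}")
```

In this setup the plain ML logit comes out too low on average, and the
adjusted version is much closer to the truth. What drives the bias is the
number of events rather than the number of cases, so whether anything
comparable carries over to a model with covariates and random effects is
exactly the open question.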