New Variant of Same Question: bias corrected logit estimates - R-SIG-mixed-models

Fri, Apr 12, 2013 8:33 AM

Dear R-sig-mixed:

I was struck today by the way the Internet has accelerated research.
At one time, it might have taken a month or two to track down the
articles on this problem and conclude I need to ask for advice. Now,
however, I realize the need within hours.

Recall that the question that started us debating a few days ago was a
logistic regression in which the OP noticed the mismatch between the
predicted probability of success and the observed fraction.  We were
debating that, and it had completely slipped my mind that there is a
separate literature on exactly that kind of problem. Yesterday,
somebody else asked me to estimate a logit model in which there were
more than 40000 cases but only a few hundred "successes". That's what
reminded me of the "rare events" problem and logistic regression
parameter estimate bias.

And I think that's the issue that we need to clear up with glmer. What
do you think? Since a multilevel model can be seen as penalized ML
estimation (à la Pinheiro and Bates, or as explained in Simon Wood,
Generalized Additive Models), are we able to get a bias-corrected
variant?

Furthermore, could lme4's predict method be made to produce "good"
confidence intervals?  That leads down a separate path to a huge
hassle about competing ways to estimate CIs in glm and the possible
need to apply extra corrections in some special cases. I'll write down
that problem to ask you about it later if you help me understand this
one.

Here's my brief novel on what I've been Googling about for the past 10
hours or so. If it helps you, let me know. If you think I'm wrong,
let me know especially urgently.

To the political science audience, that's a "rare events" logistic
regression problem; our most heavily cited methods paper on that is:

King, G., & Zeng, L. (2001). Logistic Regression in Rare Events Data.
Political Analysis, 9(2), 137-163.
http://pan.oxfordjournals.org/content/9/2/137.abstract

Logistic parameter estimates (mainly the intercept) are biased, and so
are the estimated probabilities. King & Zeng provided Stata code for
a function "relogit" and later adapted it for R (package: Zelig).
Zelig tries to re-organize the whole regression experience for the R
user, and I didn't want that, so I started looking into the various
corrections to see if I could write an adapter to take a glm or a
glmer output and "bias correct" it. It appears, superficially at
least, that I only need to adjust the intercept estimate by a
weighting factor, which would be super easy to do.
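
A minimal sketch of that intercept adjustment, assuming it is the "prior
correction" from King and Zeng (2001): subtract
ln[((1 - tau)/tau) * (ybar/(1 - ybar))] from the estimated intercept,
where tau is the population fraction of events and ybar is the sample
fraction. It is written in plain Python only to stay dependency-free;
the function name and all numbers are hypothetical.

```python
import math

def prior_correct_intercept(b0_hat, tau, ybar):
    """Prior-corrected intercept: b0_hat - ln[((1 - tau)/tau) * (ybar/(1 - ybar))]."""
    if not (0.0 < tau < 1.0 and 0.0 < ybar < 1.0):
        raise ValueError("tau and ybar must lie strictly between 0 and 1")
    return b0_hat - math.log(((1.0 - tau) / tau) * (ybar / (1.0 - ybar)))

# Hypothetical numbers: events are 1% of the population but were
# oversampled so that they make up 50% of the estimation sample.
b0_sampled = -0.3
b0_corrected = prior_correct_intercept(b0_sampled, tau=0.01, ybar=0.50)

# With a simple random sample, ybar equals tau and nothing changes.
b0_unchanged = prior_correct_intercept(b0_sampled, tau=0.01, ybar=0.01)

print(b0_corrected, b0_unchanged)
```

Note that this adjustment only does something when the sample was
selected on the outcome. If the 40000 cases are a random sample, ybar
equals tau and the intercept is left as it was, so whatever bias
remains has to be handled some other way.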

Quite by chance, I found this blog post by Paul Allison, and it's
really interesting!

Logistic Regression for Rare Events (2012-02-13)
http://www.statisticalhorizons.com/logistic-regression-for-rare-events

And, wow, is it subtle. Read that over a few times, see if you agree
with me. In a kind way, he says the "rare events" business is a red
herring, and instead we need bias-corrected logistic regression
estimates. The "prior correction of the intercept" discussed in King
and Zeng is not the best approach. Instead, we should see this as a
symptom of the more general problem that ML estimates are biased, and
the bias is greatest when there are few "successes".  Allison suggests
an estimator proposed by David Firth, which uses penalized ML.

Firth, D. (1993). Bias reduction of maximum likelihood estimates.
Biometrika, 80, 27-38.
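
For reference, a sketch of the penalty for the logistic case, as it is
usually stated (for example in the Heinze and Schemper paper cited
further down): half the log determinant of the Fisher information is
added to the log-likelihood, which amounts to a Jeffreys prior. The
notation is chosen here for illustration and is not copied from the
paper; h_i denotes the i-th diagonal element of the hat matrix.

```latex
\ell^{*}(\beta) = \ell(\beta) + \tfrac{1}{2}\log\left|I(\beta)\right|,
\qquad
I(\beta) = X^{\top} W X,
\quad
W = \mathrm{diag}\{\pi_i(1-\pi_i)\}

% Modified score equations for coefficient r:
U^{*}(\beta_r) = \sum_{i=1}^{n}
  \left\{ y_i - \pi_i + h_i\left(\tfrac{1}{2} - \pi_i\right) \right\} x_{ir} = 0,
\qquad
H = W^{1/2} X \left(X^{\top} W X\right)^{-1} X^{\top} W^{1/2}
```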

I don't think King and Zeng disagree; they also propose an option to
bias-correct the whole vector of coefficients. That bias correction
ends up addressing the more general problem. The Stata module for
relogit (the version I found was dated 1999-10-28) says "Relogit
for Stata does not yet support the FIRTH option", but it does have an
alternative weighting correction.

While fiddling around to see if I could implement that, I learned it
has been done in R:

logistf: Firth's bias reduced logistic regression

http://cran.r-project.org/web/packages/logistf/index.html

That is often discussed as a solution to the problem of separation, as
on the UCLA stats website,
(http://www.ats.ucla.edu/stat/mult_pkg/faq/general/complete_separation_logit_models.htm)

Heinze, G., & Schemper, M. (2002). A solution to the problem of
separation in logistic regression. Statistics in Medicine, 21,
2409-2419.

But it is a two-fer, so far as I can tell. We get bias correction and
separation-proofness.
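
The two-fer is easiest to see in the intercept-only case, where the
Firth estimate has a closed form: maximizing the penalized binomial
log-likelihood gives p = (y + 0.5)/(n + 1). The sketch below is plain
Python rather than a call to logistf, and the counts are hypothetical.

```python
import math

def logit_mle(y, n):
    """Ordinary ML intercept, log(y / (n - y)); infinite when y is 0 or n."""
    if y == 0:
        return -math.inf
    if y == n:
        return math.inf
    return math.log(y / (n - y))

def logit_firth(y, n):
    """Firth (Jeffreys-penalized) intercept for an intercept-only model.

    Maximizing y*log(p) + (n-y)*log(1-p) + 0.5*log(n*p*(1-p)) gives
    p = (y + 0.5) / (n + 1), so the logit is log((y + 0.5) / (n - y + 0.5)).
    """
    return math.log((y + 0.5) / (n - y + 0.5))

# Hypothetical counts: a large rare-events sample, and a tiny sample
# in which no success was observed at all.
for y, n in [(300, 40000), (0, 20)]:
    print(y, n, logit_mle(y, n), logit_firth(y, n))
```

With a few hundred successes the two estimates barely differ, which
fits the point that the bias is driven by the number of successes
rather than by their rate. With zero successes the ML estimate runs
off to minus infinity while the penalized one stays finite.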

Heinze, G., & Puhr, R. (2010). Bias-reduced and separation-proof
conditional logistic regression with small or sparse data sets.
Statistics in Medicine, 29(7-8), 770-777. doi:10.1002/sim.3794

The part I don't understand (yet) is how the bias correction links to
mixed models. And that's why I'm asking you.

OK?

--
Paul E. Johnson
Professor, Political Science      Assoc. Director
1541 Lilac Lane, Room 504      Center for Research Methods
University of Kansas                 University of Kansas
http://pj.freefaculty.org               http://quant.ku.edu

Ben Bolker

Fri, Apr 12, 2013 1:28 PM

Paul Johnson <pauljohn32 at ...> writes:

I don't really know the answer to the full question, but I would
venture this:

  * There is no explicit bias-reduction capacity built into the
    fixed-effects estimation component of glmer.
  * I'm aware of Firth's algorithm and have used the R implementations,
    but haven't read the paper and don't know the details.
  * glmer does handle some of the typical problems with 'rare events'
    by doing shrinkage across random effects, but if the events are
    rare in the *entire* data set (and not just in individual/small/
    undersampled regions), I don't think that will help.
  * Vince Dorie and Andrew Gelman's blme package, or Jarrod Hadfield's
    MCMCglmm package, could be used with more or less informative priors
    to achieve a degree of shrinkage.

  I don't know whether there's a clever way to adapt glmer
itself to do shrinkage/bias correction on a single sample.

  Hopefully others with more knowledge will chime in.

David Atkins

Fri, Apr 12, 2013 5:01 PM

Paul--

I should state upfront that I didn't read the previous thread closely,
but I *thought* that the primary issue related to conditional vs.
marginal effects -- where GLMMs with non-identity link functions yield
conditional fixed effects (i.e., they do not 'average over' the
random effects, but are conditional on particular values of the
random effects).

This shows up periodically on the listserv, e.g.,

https://stat.ethz.ch/pipermail/r-sig-mixed-models/2011q1/015736.html

Though, perhaps your point below was in the later traffic in that thread 
(and if so, please disregard!).
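
A minimal sketch of the conditional vs. marginal distinction for a
random-intercept logit model, in plain Python with a simple trapezoid
rule. The intercept and random-effect SD are hypothetical, and the
helper names are made up for illustration.

```python
import math

def plogis(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def marginal_prob(beta0, sigma, n_grid=4001, width=8.0):
    """Average plogis(beta0 + b) over b ~ Normal(0, sigma^2).

    Uses a plain trapezoid rule on [-width*sigma, width*sigma].
    """
    if sigma == 0.0:
        return plogis(beta0)
    lo, hi = -width * sigma, width * sigma
    step = (hi - lo) / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        b = lo + i * step
        dens = math.exp(-0.5 * (b / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, n_grid - 1) else 1.0
        total += weight * dens * plogis(beta0 + b)
    return total * step

beta0, sigma = -3.0, 1.5                 # hypothetical intercept and random-intercept SD
conditional = plogis(beta0)              # probability for a group with b = 0
marginal = marginal_prob(beta0, sigma)   # probability averaged over groups
print(conditional, marginal)
```

With a negative intercept (a rare outcome), the averaged probability
comes out larger than the probability at b = 0, so a prediction from
the fixed effects alone need not match the observed fraction even when
nothing is wrong with the estimates.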

cheers, Dave

Dave Atkins, PhD

Department of Psychiatry and Behavioral Science
University of Washington
datkins at u.washington.edu
206-616-3879 	
http://depts.washington.edu/cshrb/

"We are drowning in information and starving for knowledge."
Rutherford Roger

