
warning associated with Logistic Regression

On Sunday, Jan 25, 2004, at 18:06 Europe/London, (Ted Harding) wrote:

This seems arguable.  Accepting that we are talking about point 
estimation (the desirability of which is of course open to question!!), 
then old-fashioned criteria like bias, variance and mean squared error 
can be used as a guide.  For example, we might desire to use an 
estimation method for which the MSE of the estimated logistic 
regression coefficients (suitably standardized) is as small as 
possible; or some other such thing.

The simplest case is estimation of log(pi/(1-pi)) given an observation 
r from binomial(n,pi).  Suppose we find that r=n -- what then can we 
say about pi?  Clearly not much if n is small, rather more if n is 
large.  Better in terms of MSE than the MLE (whose MSE is infinite) is 
to use log(p/(1-p)), with p = (r+0.5)/(n+1).  See for example Cox & 
Snell's book on binary data.  This corresponds to penalizing the 
likelihood by the Jeffreys prior, a penalty function which has good 
frequentist properties also in the more general logistic regression 
context.  References given in the brlr package give the theory and some 
empirical evidence.  The logistf package, also on CRAN, is another 
implementation.
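To make the boundary problem concrete, here is a minimal sketch (in Python rather than R, purely for illustration) of the binomial case above: at r = n the MLE of the logit is infinite, so its MSE is infinite, while the adjusted estimator with p = (r+0.5)/(n+1) stays finite. The function names and the choice n = 10, pi = 0.9 are illustrative, not from the original post.

```python
from math import comb, inf, log

def mle_logit(r, n):
    # MLE of log(pi/(1-pi)); blows up at the boundary r = 0 or r = n
    if r == 0:
        return -inf
    if r == n:
        return inf
    return log(r / (n - r))

def empirical_logit(r, n):
    # Adjusted estimator log(p/(1-p)) with p = (r+0.5)/(n+1),
    # corresponding to a Jeffreys-prior penalty; always finite
    p = (r + 0.5) / (n + 1)
    return log(p / (1 - p))

def exact_mse(estimator, n, pi):
    # Exact MSE by summing over all binomial outcomes r = 0..n,
    # weighted by the binomial(n, pi) probabilities
    target = log(pi / (1 - pi))
    mse = 0.0
    for r in range(n + 1):
        w = comb(n, r) * pi**r * (1 - pi) ** (n - r)
        mse += w * (estimator(r, n) - target) ** 2
    return mse

n, pi = 10, 0.9
print(exact_mse(mle_logit, n, pi))        # inf: r = 0 and r = n have positive probability
print(exact_mse(empirical_logit, n, pi))  # finite
```

Since the boundary outcomes occur with probability pi^n + (1-pi)^n > 0 for any pi in (0,1), the MLE's MSE is infinite at every sample size; the penalized estimator sidesteps this entirely.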

I do not mean to imply that the Jeffreys-prior penalty will be the 
right thing for all applications -- it will not.  (E.g., if you really do 
have prior information, it would be better to use it.)

In general I agree wholeheartedly that it is best to get more/better 
data!
(cut)

All good wishes,
David