warning associated with Logistic Regression
David Firth <d.firth at warwick.ac.uk> writes:
On Sunday, Jan 25, 2004, at 13:59 Europe/London, Guillem Chust wrote:
Hi All,

When I tried to do logistic regression (with a high maximum number of iterations) I got the following warning message:

    Warning message:
    fitted probabilities numerically 0 or 1 occurred in: (if (is.empty.model(mt)) glm.fit.null else glm.fit)(x = X, y = Y,

As I checked in the R-Help mail archive, it seems that this happens when the dataset exhibits complete separation.
Yes, correct.
Sufficient but not necessary. It can happen just by numerical roundoff if the effect is strong enough. (I have an example with age and prevalent menarche: for nearly all women this happens between the age of 10 and 18, so if you have a couple of 40-year olds in your data set, they'll get a fitted p of 1. Happens even more easily if you throw in a cubic term.)
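A minimal sketch of that situation, on made-up counts (hypothetical data, not the menarche data mentioned above): the outcomes overlap between ages 13 and 15, so there is no separation and the fit converges to a finite slope, yet the two much older subjects get a fitted probability that is numerically 1 and the warning appears.

```r
# Hypothetical data: 10 subjects at each age 10..18, plus two adults.
age   <- c(rep(10:18, each = 10), 40, 45)
n_yes <- c(0, 0, 0, 2, 5, 8, 10, 10, 10)   # "yes" counts at ages 10..18
y     <- c(unlist(lapply(n_yes, function(k) rep(c(1, 0), c(k, 10 - k)))), 1, 1)

# Fit the model and record any warnings instead of just printing them.
msgs <- character(0)
fit <- withCallingHandlers(
  glm(y ~ age, family = binomial),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

# No separation: the youngest "yes" is younger than the oldest "no".
min(age[y == 1]) < max(age[y == 0])

msgs                 # includes the "numerically 0 or 1" warning
fit$converged        # the iterations still converge
coef(fit)            # finite estimates
max(fitted(fit))     # indistinguishable from 1 for the two adults
```

Here the warning reflects only the extreme linear predictor for the two adults, not a divergent fit.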
However, p-values tend to 1
The reported p-values cannot be trusted: the asymptotic theory on which they are based is not valid in such circumstances.
, and residual deviance tends to 0.
This, however, is a clear sign that the fit has diverged, and in that case (but not necessarily otherwise) the asymptotic theory is invalid.
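For comparison, a small sketch of the divergent case, again on made-up data (the variable names are illustrative): when the response is completely separated by the predictor, the estimates run off towards infinity, the residual deviance collapses towards 0, and the Wald p-values drift towards 1 because the standard errors blow up.

```r
# Hypothetical, completely separated data: y is 0 for x <= 10 and 1 for x > 10.
x <- 1:20
y_sep <- as.numeric(x > 10)

# Fit the model and record any warnings instead of just printing them.
msgs_sep <- character(0)
fit_sep <- withCallingHandlers(
  glm(y_sep ~ x, family = binomial),
  warning = function(w) {
    msgs_sep <<- c(msgs_sep, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

msgs_sep                         # includes the "numerically 0 or 1" warning
deviance(fit_sep)                # essentially zero
coef(summary(fit_sep))           # huge estimates, far larger standard errors
p_wald <- coef(summary(fit_sep))[, "Pr(>|z|)"]
p_wald                           # close to 1 for both coefficients
```

Raising the maximum number of iterations does not help in this case; it only lets the estimates grow further.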
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907