logistic regression - glm.fit: fitted probabilities numerically 0 or 1 occurred

On Dec 1, 2011, at 23:43 , Ben quant wrote:

It's easier to explain why you got the warning before. If the odds ratio (OR) for a one-unit change is 3000, the OR for a 14-unit change is 3000^14, which is on the order of 10^48, and that causes over/underflow in the conversion to probabilities.
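
To put numbers on that, here is a small sketch in R. It assumes only the OR of 3000 per unit cited above; the variable names are mine. As far as I know, glm.fit warns once a fitted probability comes within a small multiple of machine epsilon of 0 or 1, so treat that comparison as illustrative rather than as the exact internal check.

```r
# Assumed from the thread: an odds ratio of 3000 per one-unit change in x.
or_per_unit <- 3000
beta  <- log(or_per_unit)   # coefficient on the log-odds scale, about 8
or_14 <- or_per_unit^14     # OR for a 14-unit change, about 4.8e48
eta   <- 14 * beta          # shift in the linear predictor, about 112

p_hi <- plogis(eta)         # inverse logit; rounds to exactly 1 in double precision
p_lo <- plogis(-eta)        # about 2e-49: nonzero, but far below machine epsilon

log10(or_14)
p_hi == 1
p_lo < .Machine$double.eps
```

So a 14-unit move in x takes you from a probability indistinguishable from 0 to one indistinguishable from 1.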

I'm still baffled at how you can get that model fitted to your data, though. You can have situations where fitted probabilities of one correspond to data that are all one, and/or fitted zeros to data that are all zero, but you seem to have both zeros and ones at both ends of the range of x. Fitting a zero to a one or vice versa would make the likelihood zero, so you'd expect the algorithm to find a better set of parameters rather quickly. Perhaps the extremely large number of observations that you have has something to do with it?
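
The point about the likelihood can be checked on a toy example. The three observations and fitted probabilities below are made up purely for illustration and are not taken from your data.

```r
# Hypothetical data: three Bernoulli observations.
y <- c(0, 1, 1)

# Two hypothetical sets of fitted probabilities.
p_ok  <- c(0.1, 0.9, 0.8)   # every observation has positive probability
p_bad <- c(0.1, 0.9, 0)     # third observation is a 1 but is fitted as 0

# Bernoulli likelihood = product of the per-observation probabilities.
lik_ok  <- prod(dbinom(y, size = 1, prob = p_ok))    # 0.9 * 0.9 * 0.8
lik_bad <- prod(dbinom(y, size = 1, prob = p_bad))   # one zero factor gives 0

lik_ok
lik_bad
```

A single observation fitted on the wrong side is enough to zero the whole product, whatever the other observations do.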

You'll get the warning if the fitted zeros or ones occur at any point of the iterative procedure. Maybe it isn't actually true for the final model, but that wouldn't seem consistent with the OR that you cited.

Anyway, your real problem lies with the distribution of the x values. I'd want to try transforming x to something more sane. Taking logarithms is the obvious idea, but you'd need to find out what to do about the zeros -- perhaps log(x + 1e-4)? Or maybe just cut the outliers down to size with pmin(x, 1).
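
A quick sketch of the two suggestions, run on a made-up skewed vector (hypothetical values, not your data). The offset 1e-4 and the cap at 1 are just the values mentioned above and would need tuning to the real scale of x.

```r
# Hypothetical predictor: zeros, small values, and a couple of large outliers.
x <- c(0, 0, 0.001, 0.02, 0.3, 0.9, 5, 400)

# Option 1: log transform with a small offset so that x = 0 stays finite.
x_log <- log(x + 1e-4)

# Option 2: cap the outliers; pmin() takes the elementwise minimum of x and 1.
x_cap <- pmin(x, 1)

range(x_log)
range(x_cap)
```

Either transformed variable can then be used in place of x in the glm() formula; which one is more sensible depends on how the raw values are actually distributed.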