logistic regression - glm.fit: fitted probabilities numerically 0 or 1 occurred
On Dec 1, 2011, at 18:54 , Ben quant wrote:
Sorry if this is a duplicate: This is a re-post because the pdf's mentioned below did not go through.
Still not there. Sometimes it's because your mailer doesn't label them with the appropriate mime-type (e.g. as application/octet-stream, which is "arbitrary binary"). Anyways, see below [snip]
With the above data I do:
l_logit = glm(y~x, data=as.data.frame(l_yx),
              family=binomial(link="logit"))

Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred

Why am I getting this warning when I have data points of varying values for y=1 and y=0? In other words, I don't think I have the linear separation issue discussed in one of the links I provided.
I bet that you do... You can get the warning without that effect (one of my own examples is the probability of menarche in a data set that includes infants and old age pensioners), but not with a huge odds ratio as well. Take a look at

d <- as.data.frame(l_yx)
with(d, y[order(x)])

If it comes out as all zeros followed by all ones, or vice versa, then you have the problem.
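To make the check above concrete, here is a minimal sketch on made-up data (not the poster's l_yx, which is not shown here). The data frame d, its values, and the helper names y_sorted, n_switches and separated are all assumptions for illustration. It assumes no ties in x; with ties the ordering among tied rows is arbitrary and the count needs more care.

```r
# Hypothetical data: y is 0 for the five smallest x and 1 for the five largest
d <- data.frame(x = c(7, 2, 9, 4, 1, 10, 3, 8, 5, 6),
                y = c(1, 0, 1, 0, 0, 1, 0, 1, 0, 1))

# Sort y by increasing x, as suggested above
y_sorted <- with(d, y[order(x)])
print(y_sorted)

# Count how often y changes value along the sorted sequence.
# Exactly one change means a single block of one value followed by a
# single block of the other, i.e. x separates the two groups completely.
n_switches <- sum(diff(y_sorted) != 0)
separated <- n_switches == 1
print(separated)
```

If separated comes out TRUE on the real data, the likelihood has no finite maximum, which would be consistent with both the warning and the very large odds ratio.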
PS - Then I do this and I get an odds ratio of a crazy size:
l_sm = summary(l_logit)  # coef pval is $coefficients[8], log odds $coefficients[2]
l_exp_coef = exp(l_logit$coefficients)[2]  # exponentiate the coefficients
l_exp_coef
       x
3161.781

So for a one-unit increase in the predictor variable I get a 3160.781% (3161.781 - 1 = 3160.781) increase in odds? That can't be correct either. How do I correct for this issue? (I tried multiplying the predictor variables by a constant and the odds ratio goes down, but the warning above still persists. And shouldn't the odds ratio be independent of the predictor variable's size?)
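On the scale question, a small sketch may help. It uses made-up, non-separated data (the data frame d and the names fit1, fit100, or1 and or100 are assumptions for illustration, not anything from the original post). The point it illustrates is that the odds ratio is always "per one unit of x", so rescaling x changes the reported number without changing the fitted model. Separately, an odds ratio r corresponds to a change in odds of (r - 1) * 100 percent.

```r
# Hypothetical data in which the 0s and 1s overlap in x (no separation)
d <- data.frame(
  x = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  y = c(0, 0, 1, 0, 0, 1, 0, 1, 1, 1)
)

# Fit on the original scale
fit1 <- glm(y ~ x, data = d, family = binomial(link = "logit"))

# Fit again with the predictor multiplied by a constant
d$x100 <- d$x * 100
fit100 <- glm(y ~ x100, data = d, family = binomial(link = "logit"))

# Odds ratio per one unit of each predictor
or1   <- exp(coef(fit1)[["x"]])
or100 <- exp(coef(fit100)[["x100"]])
print(c(or1 = or1, or100 = or100))

# One unit of x equals 100 units of x100, so the odds ratio over the
# same change in the underlying quantity should agree
print(all.equal(or1, or100^100, tolerance = 1e-6))

# Percent change in odds implied by an odds ratio
pct_change <- (or1 - 1) * 100
print(pct_change)
```

So a smaller odds ratio after multiplying the predictor by a constant is expected, and rescaling would not be expected to remove the warning, since it does not change the fitted probabilities.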
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com