Message-ID: <2F36BA38-DD8B-4660-ACA1-709750BE0D35@gmail.com>
Date: 2011-12-01T18:55:18Z
From: Peter Dalgaard
Subject: logistic regression - glm.fit: fitted probabilities numerically 0 or 1 occurred
In-Reply-To: <CAG2PC+en68F391wQo3VO+qve=9rTW5oK5+M1BTHQPZK1mfiQDg@mail.gmail.com>
On Dec 1, 2011, at 18:54 , Ben quant wrote:
> Sorry if this is a duplicate: This is a re-post because the pdf's mentioned
> below did not go through.
Still not there. Sometimes that happens because your mailer doesn't label attachments with the appropriate MIME type (e.g., it sends them as application/octet-stream, which means "arbitrary binary"). Anyway, see below.
[snip]
>
> With the above data I do:
>> l_logit = glm(y~x, data=as.data.frame(l_yx), family=binomial(link="logit"))
> Warning message:
> glm.fit: fitted probabilities numerically 0 or 1 occurred
>
> Why am I getting this warning when I have data points of varying values for
> y=1 and y=0? In other words, I don't think I have the linear separation
> issue discussed in one of the links I provided.
I bet that you do... You can get the warning without separation (one of my own examples is the probability of menarche in a data set that includes infants and old-age pensioners), but not together with a huge odds ratio. Take a look at
d <- as.data.frame(l_yx)
with(d, y[order(x)])
If it comes out as all zeros followed by all ones, or vice versa, then you have the problem.
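
For illustration, here is a minimal sketch of that check on made-up data. The data frames d_sep and d_ok are hypothetical toy examples, not your l_yx, and is_separated() is a helper name invented here, not a base R function. It assumes a single numeric predictor, y coded 0/1, and both classes present, and it only detects complete separation:

```r
# Hypothetical toy data: every y = 0 has x < 0.5, every y = 1 has x > 0.5
d_sep <- data.frame(x = seq(0.05, 0.95, by = 0.1),
                    y = rep(c(0, 1), each = 5))

# Hypothetical toy data where the zeros and ones overlap in x
d_ok <- data.frame(x = seq(0.05, 0.95, by = 0.1),
                   y = c(0, 0, 1, 0, 0, 1, 0, 1, 1, 1))

# Illustrative helper (not part of base R): TRUE when one class lies
# entirely below the other on x, i.e. complete separation
is_separated <- function(x, y) {
  x0 <- x[y == 0]
  x1 <- x[y == 1]
  stopifnot(length(x0) > 0, length(x1) > 0)
  max(x0) < min(x1) || max(x1) < min(x0)
}

# The visual check from above: sorted by x, y is one run of 0s, then 1s
print(with(d_sep, y[order(x)]))
print(is_separated(d_sep$x, d_sep$y))
print(is_separated(d_ok$x, d_ok$y))

# Fit the separated data, collecting any warnings instead of printing them
msgs <- character(0)
fit_sep <- withCallingHandlers(
  glm(y ~ x, data = d_sep, family = binomial(link = "logit")),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
print(msgs)                        # warnings raised by glm.fit, if any
print(exp(coef(fit_sep))[["x"]])   # the "odds ratio" for x
```

On your own data the corresponding call would be is_separated(d$x, d$y).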
>
> PS - Then I do this and I get an odds ratio of a crazy size:
>> l_sm = summary(l_logit) # coef pval is $coefficients[8], log odds $coefficients[2]
>> l_exp_coef = exp(l_logit$coefficients)[2] # exponentiate the coefficients
>> l_exp_coef
> x
> 3161.781
>
> So for one unit increase in the predictor variable I get 3160.781%
> (3161.781 - 1 = 3160.781) increase in odds? That can't be correct either.
> How do I correct for this issue? (I tried multiplying the predictor
> variables by a constant and the odds ratio goes down, but the warning above
> still persists and shouldn't the odds ratio be predictor variable size
> independent?)
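
On the scaling question, a hedged aside: the odds ratio is per one unit of x, so it is not independent of the units. If you multiply x by k, the coefficient is divided by k and the odds ratio becomes the k'th root of the old one, while the fitted probabilities (and hence the warning) stay the same. Also, 3161.781 - 1 is the increase as a multiple of the baseline odds, so in percent it would be about 316078%, not 3160.781%. A small sketch on simulated, non-separated data (all names and numbers below are made up for illustration):

```r
set.seed(1)

# Simulated data with overlapping classes (an assumption, not the poster's data)
n <- 200
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(0.5 + x))
d_sim <- data.frame(x = x, x10 = 10 * x, y = y)

# Same model, predictor on two different scales
fit1  <- glm(y ~ x,   data = d_sim, family = binomial(link = "logit"))
fit10 <- glm(y ~ x10, data = d_sim, family = binomial(link = "logit"))

b1  <- unname(coef(fit1)["x"])
b10 <- unname(coef(fit10)["x10"])

# The slope for x10 is one tenth of the slope for x ...
print(c(b1 = b1, ten_times_b10 = 10 * b10))

# ... so the odds ratio per unit of x10 is the 10th root of the one per unit of x
print(c(or_x = exp(b1), or_x10_to_10th = exp(b10)^10))

# The fitted probabilities do not depend on the scale of the predictor
print(max(abs(fitted(fit1) - fitted(fit10))))
```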
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com