question regarding logit regression using glm
The "problem" is that with 40 parameters (an intercept plus 39 MSA coefficients), the model can fit the observations in at least some MSAs perfectly. To achieve this, the fitting algorithm sends the corresponding parameters toward +/-Inf. Of course, it quits before it gets to Inf, but most of your parameter estimates exceed 1e13 in absolute value.

What do you want? Do you really need MSA to be a factor, requiring you to estimate 39 parameters for MSA? Does it make sense to parameterize it some other way, such as by latitude and longitude? I suspect you could fit a polynomial in lat and lon and gain substantial insight that you can't get from the factor coefficients.

Spencer Graves
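To make the point concrete, here is a minimal sketch on made-up data, not the lease data. Everything in it is an assumption for illustration: the data frame `d`, the four toy levels "A" to "D", and the `lat` and `lon` values are all hypothetical. It cross-tabulates the response against the factor to find levels where every observation has the same outcome, shows the factor coefficient for such a level running off toward -Inf, and then fits the same response on coordinates instead.

```r
# Toy stand-in for the lease data; all names and values are hypothetical.
msa <- factor(rep(c("A", "B", "C", "D"), each = 20))
y <- c(rep(c(0, 1), times = c(6, 14)),   # A: 14 of 20 terminated early
       rep(c(0, 1), times = c(10, 10)),  # B: 10 of 20
       rep(c(0, 1), times = c(14, 6)),   # C:  6 of 20
       rep(0, 20))                       # D: none, so D can be fit perfectly
# Made-up coordinates, one pair per level.
lat <- c(A = 45, B = 40, C = 35, D = 30)[as.character(msa)]
lon <- c(A = -75, B = -90, C = -100, D = -120)[as.character(msa)]
d <- data.frame(y = y, msa = msa, lat = lat, lon = lon)

# 1. A zero cell in the cross-tabulation flags a level with a perfect fit.
tab <- table(d$msa, d$y)
separated <- rownames(tab)[apply(tab == 0, 1, any)]
print(tab)
print(separated)

# 2. With the factor, the coefficient for that level heads toward -Inf
#    and its standard error becomes enormous.
fit_factor <- suppressWarnings(glm(y ~ msa, family = binomial, data = d))
print(coef(summary(fit_factor)))

# 3. With coordinates, the same data give finite estimates. Four toy
#    levels only support first-order terms; with about 40 MSAs one could
#    try higher-order terms such as poly(lat, 2) + poly(lon, 2).
fit_geo <- glm(y ~ lat + lon, family = binomial, data = d)
print(coef(fit_geo))
```

On the real data, the analogous first check would be `table(Lease$MSA, Lease$ET)`, assuming those columns are as shown in the post below.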
Haibo Huang wrote:
I got the following warning messages when I did a binomial logit regression using glm():

Warning messages:
1: Algorithm did not converge in: glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart,
2: fitted probabilities numerically 0 or 1 occurred in: glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart,

Can someone share thoughts on how to solve this problem? Please read the following for details. Thank you very much!

Best,
Ed

Lease=read.csv("lease.csv", header=TRUE)
Lease$ET = factor(Lease$EarlyTermination)
SICCode=factor(Lease$SIC.Code)
TO=factor(Lease$TenantHasOption)
LO=factor(Lease$LandlordHasOption)
TEO=factor(Lease$TenantExercisedOption)
RegA=glm(ET~1+MSA,
+ family=binomial(link=logit), data=Lease, weights=Origil.SQFT)
Warning messages:
1: Algorithm did not converge in: glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart,
2: fitted probabilities numerically 0 or 1 occurred in: glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart,
summary(RegA)
Call:
glm(formula = ET ~ 1 + MSA, family = binomial(link = logit),
    data = Lease, weights = Origil.SQFT)

Deviance Residuals:
       Min          1Q      Median          3Q         Max
-6.038e+03  -2.066e-06   0.000e+00   0.000e+00   6.720e+03

Coefficients:
                      Estimate  Std. Error     z value  Pr(>|z|)
(Intercept)          5.711e+00   8.466e-02   6.745e+01    <2e-16 ***
MSAAnchorage        -6.493e+00   8.541e-02  -7.602e+01    <2e-16 ***
MSAAtlanta           6.894e+14   2.310e+04   2.985e+10    <2e-16 ***
MSAAustin           -9.362e+14   4.954e+04  -1.890e+10    <2e-16 ***
MSABoston           -2.474e+15   2.151e+04  -1.150e+11    <2e-16 ***
MSACharlotte        -2.150e+15   7.265e+04  -2.960e+10    <2e-16 ***
MSAChicago          -1.174e+15   2.057e+04  -5.707e+10    <2e-16 ***
MSACleveland        -7.607e+14   7.046e+04  -1.080e+10    <2e-16 ***
MSAColumbus         -2.768e+15   1.685e+05  -1.642e+10    <2e-16 ***
MSADallas            2.061e+14   3.261e+04   6.321e+09    <2e-16 ***
MSADenver            5.470e+14   3.366e+04   1.625e+10    <2e-16 ***
MSAEast Bay         -6.191e+01   1.344e+05   -4.61e-04         1
MSAFt. Worth        -6.565e+00   8.483e-02  -7.739e+01    <2e-16 ***
MSAHouston          -2.735e+15   3.576e+04  -7.648e+10    <2e-16 ***
MSAIndianapolis     -7.483e+14   6.588e+04  -1.136e+10    <2e-16 ***
MSALos Angeles      -1.388e+15   2.887e+04  -4.809e+10    <2e-16 ***
MSAMinneapolis      -1.011e+15   2.731e+04  -3.702e+10    <2e-16 ***
MSANashville         2.143e+01   9.395e+04    2.28e-04         1
MSANew Orleans      -3.370e+15   5.038e+04  -6.689e+10    <2e-16 ***
MSANew York         -2.526e+15   2.969e+04  -8.507e+10    <2e-16 ***
MSANorfolk          -5.614e+01   2.020e+06   -2.78e-05         1
MSAOakland-East Bay -2.272e+15   3.642e+04  -6.239e+10    <2e-16 ***
MSAOrange County    -5.165e+14   2.428e+04  -2.128e+10    <2e-16 ***
MSAOrlando          -3.215e+15   1.096e+05  -2.933e+10    <2e-16 ***
MSAPhiladelphia     -8.871e+14   4.948e+04  -1.793e+10    <2e-16 ***
MSAPhoenix          -1.156e+01   8.807e-02  -1.313e+02    <2e-16 ***
MSAPortland          7.604e+14   3.841e+04   1.980e+10    <2e-16 ***
MSARaleigh-Durham   -4.312e+01   1.294e+05   -3.33e-04         1
MSARiverside         1.626e+15   4.645e+05   3.500e+09    <2e-16 ***
MSASacramento       -9.873e+14   5.345e+04  -1.847e+10    <2e-16 ***
MSASalt Lake City    1.793e+15   2.029e+05   8.839e+09    <2e-16 ***
MSASan Antonio       9.451e+14   9.473e+04   9.977e+09    <2e-16 ***
MSASan Diego        -3.740e+15   6.651e+04  -5.623e+10    <2e-16 ***
MSASan Francisco     3.109e+14   2.394e+04   1.299e+10    <2e-16 ***
MSASan Jose          7.392e+14   2.961e+04   2.497e+10    <2e-16 ***
MSASeattle          -2.250e+15   1.581e+04  -1.423e+11    <2e-16 ***
MSASt. Louis        -2.606e+15   1.801e+05  -1.447e+10    <2e-16 ***
MSAStamford         -6.592e+00   8.469e-02  -7.784e+01    <2e-16 ***
MSAWashington DC     8.460e+13   3.319e+04   2.549e+09    <2e-16 ***
MSAWest Palm Beach  -3.924e+01   2.308e+05   -1.70e-04         1
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance:  123111026  on 9302  degrees of freedom
Residual deviance: 3028559052  on 9263  degrees of freedom
AIC: 3028559132

Number of Fisher Scoring iterations: 25

anova(RegA)
Analysis of Deviance Table

Model: binomial, link: logit

Response: ET

Terms added sequentially (first to last)

     Df Deviance Resid. Df Resid. Dev
NULL                  9302  123111026
MSA  39        0      9263 3028559052

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA
spencer.graves at pdf.com
www.pdf.com
Tel: 408-938-4420
Fax: 408-280-7915