
binomial GLM quasi separation

On Fri, 2011-10-14 at 02:32 -0700, lincoln wrote:
Have you read ?glm, specifically this bit:

Details:

....

     For the background to warning messages about 'fitted probabilities
     numerically 0 or 1 occurred' for binomial GLMs, see Venables &
     Ripley (2002, pp. 197-8).

There, V&R say (me paraphrasing) that if some of the fitted \beta_i are
large, the curvature of the log-likelihood at the fitted \beta can be
much less than at \beta_i = 0. The Wald approximation then
underestimates the change in the log-likelihood on setting
\beta_i = 0, and as the absolute value of the fitted \beta_i grows
(tends to infinity) the t statistic tends to 0. This is the
Hauck-Donner effect.

Whilst I am (so very) far from being an expert here, this does seem to
fit the results you showed.

Furthermore, did you follow the steps Ioannis Kosmidis took me through
with my data in that email thread? I have done so with your data and
everything follows the explanation/situation given by Ioannis. Namely,
if you increase the maximum number of iterations and tighten the
convergence tolerance in the glm() call, you get the same fit as with
a default glm() call:
Call:
glm(formula = sex ~ twp + hwp + hcp + hnp, family = binomial, 
    data = dat)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-2.9703  -0.1760   0.3181   0.6061   3.5235  

Coefficients:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept)    1.4362     0.2687   5.345 9.02e-08 ***
twp            5.5673     1.3602   4.093 4.26e-05 ***
hwp           -4.2793     2.3781  -1.799   0.0719 .  
hcp         -450.1918    56.6823  -7.942 1.98e-15 ***
hnp           -4.5302     3.2825  -1.380   0.1676    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 581.51  on 425  degrees of freedom
Residual deviance: 294.00  on 421  degrees of freedom
  (41 observations deleted due to missingness)
AIC: 304

Number of Fisher Scoring iterations: 8

Call:
glm(formula = sex ~ twp + hwp + hcp + hnp, family = binomial, 
    data = dat, control = glm.control(epsilon = 1e-16, maxit = 1000))

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-2.9703  -0.1760   0.3181   0.6061   3.5235  

Coefficients:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept)    1.4362     0.2687   5.345 9.02e-08 ***
twp            5.5673     1.3602   4.093 4.26e-05 ***
hwp           -4.2793     2.3781  -1.799   0.0719 .  
hcp         -450.1918    56.6823  -7.942 1.98e-15 ***
hnp           -4.5302     3.2825  -1.380   0.1676    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 581.51  on 425  degrees of freedom
Residual deviance: 294.00  on 421  degrees of freedom
  (41 observations deleted due to missingness)
AIC: 304

Number of Fisher Scoring iterations: 9
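
The comparison above can be scripted along these lines. This is only a
sketch: I don't have your `dat`, so it uses simulated data and made-up
variable names; the point is the glm.control() mechanics, not the
numbers.

```r
## Simulated data standing in for the poster's 'dat' (hypothetical)
set.seed(42)
n <- 400
x <- rnorm(n)
dat2 <- data.frame(x = x, y = rbinom(n, 1, plogis(1 + 2 * x)))

## Default control settings: epsilon = 1e-8, maxit = 25
m1 <- glm(y ~ x, family = binomial, data = dat2)

## Much stricter tolerance and many more iterations allowed
m2 <- glm(y ~ x, family = binomial, data = dat2,
          control = glm.control(epsilon = 1e-16, maxit = 1000))

## Under separation the estimates would keep growing as the tolerance
## tightens; here the two fits agree essentially exactly
cbind(default = coef(m1), strict = coef(m2))
```

If the coefficients (and standard errors) are stable under the stricter
control, the iteration limit is not what is holding the estimates at
their reported values.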

Profiling the model shows that the log-likelihood starts to increase
again at low parameter values, but does so slowly. The log-likelihood
is very flat around the estimates and is far from 0, which corresponds
with the description of the Hauck-Donner effect given by Venables and
Ripley in their book. In your case, however, the statistic is still
sufficiently large for it to be identified as significant via the
Wald test.
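
The profiling itself can be done like this (again a sketch on simulated
data; confint() on a glm fit profiles the likelihood, refitting the
model over a grid of fixed values for each coefficient):

```r
## Simulated stand-in for the real model (hypothetical data)
set.seed(42)
n <- 400
x <- rnorm(n)
y <- rbinom(n, 1, plogis(1 + 2 * x))
m <- glm(y ~ x, family = binomial)

## Profile-likelihood intervals; under the Hauck-Donner effect these
## can disagree badly with the Wald intervals below
confint(m)          # profile likelihood
confint.default(m)  # Wald: estimate +/- z * std. error
```

A large discrepancy between the two sets of intervals is a warning sign
that the Wald-based z values in summary() should not be trusted.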

If we fit the model via brglm() we get essentially the same "result" as
fitted by glm():
Call:
brglm(formula = sex ~ twp + hwp + hcp + hnp, family = binomial, 
    data = dat)


Coefficients:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept)    1.4262     0.2662   5.357 8.47e-08 ***
twp            5.3696     1.3323   4.030 5.57e-05 ***
hwp           -4.2813     2.3504  -1.821   0.0685 .  
hcp         -435.9212    55.0566  -7.918 2.42e-15 ***
hnp           -4.6295     3.2459  -1.426   0.1538    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 560.48  on 425  degrees of freedom
Residual deviance: 294.08  on 421  degrees of freedom
Penalized deviance: 302.8212 
  (41 observations deleted due to missingness)
AIC:  304.08

Ioannis points out that if there were separation, the brglm results
would differ markedly from the glm() ones, and they don't.
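
That check is easy to script (sketch only: it needs the brglm package
from CRAN, and again uses simulated data in place of the real dat):

```r
library(brglm)  # Ioannis Kosmidis's bias-reduction package (CRAN)

## Hypothetical data in place of the poster's 'dat'
set.seed(42)
n <- 400
x <- rnorm(n)
dat2 <- data.frame(x = x, y = rbinom(n, 1, plogis(1 + 2 * x)))

m.ml <- glm(y ~ x, family = binomial, data = dat2)
m.br <- brglm(y ~ x, family = binomial, data = dat2)

## Under separation the ML estimates diverge while the bias-reduced
## ones stay finite; broadly similar columns argue against separation
cbind(ML = coef(m.ml), BR = coef(m.br))
```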

As Ioannis mentioned in that thread, you can get the warning about the
fitted probabilities without any separation problem. In my case it was
because there were some very small fitted probabilities, which R was
simply warning about.
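
You can see which observations trigger that warning by checking the
fitted probabilities against the same cutoff glm.fit uses internally
(a sketch on simulated, deliberately extreme but non-separated data):

```r
## Steep but overlapping classes: no separation, yet some fitted
## probabilities are numerically 0 or 1 (hypothetical data)
set.seed(42)
n <- 1000
x <- rnorm(n, sd = 2)
y <- rbinom(n, 1, plogis(10 * x))
m <- glm(y ~ x, family = binomial)  # warns: fitted probabilities
                                    # numerically 0 or 1 occurred

eps <- 10 * .Machine$double.eps     # glm.fit's own cutoff
p <- fitted(m)
sum(p < eps | p > 1 - eps)          # count of offending observations
range(p)
```

A handful of such observations in an otherwise well-behaved fit is
harmless; the warning only becomes worrying when it comes with the
diverging coefficients and huge standard errors typical of separation.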

HTH

G