Std errors in glm models w/ and w/o intercept

3 messages · Brian Ripley, David Winsemius

Sun, Mar 16, 2008 6:11 PM #

I am doing a reanalysis of results that have previously been published. 
My hope was to demonstrate the value of adoption of more modern 
regression methods in preference to the traditional approach of 
univariate stratification. I have encountered a puzzle regarding 
differences between I thought would be two equivalent analyses. Using a 
single factor, I compare poisson models with and without the intercept 
term. As expected, the estimated coefficient and std error of the 
estimate are the same for the intercept and the base level of the 
factor in the two models. The sum of the intercept with each 
coefficient is equal to the individual factor coefficients in the no-
intercept model. The overall model fit statistics are the same. 
However, the std errors for the other factors are much smaller in the 
model without the intercept. 

The offset = log(expected) is based on person-years of follow-up 
multiplied by the annual mortality experience of persons with known 
age, gender, and smoking status in a much larger cohort. My logic in 
removing the intercept was that the offset should be considered the 
baseline, and that I should estimate each level compared with that 
baseline. "18.5-24.9" was used as the reference level in the model with 
intercept. Removing the intercept appears to be a "successful" 
strategy. but have I committed any statistical sin?

Actual_Deaths
BMI           0   1   2   3   4   5   6   7  11  13
  18.5-24.9 311  21   1   0   0   0   0   0   0   0
  15.0-18.4 353  33   8   2   0   1   0   0   0   0
  25.0-29.9 367  19   0   0   0   0   0   0   0   0
  30.0-34.9 349  95  39  17   8   9   3   4   0   1
  35.0-39.9 351  90  50  21  20   3   3   2   1   0
  40.0-55.0 386  60  15   7   4   0   0   1   0   0

BMI + offset(log( MMI_VBT_Expected)),  family="poisson"))

Call:
glm(formula = Actual_Deaths ~ BMI + offset(log(MMI_VBT_Expected)),
    family = "poisson")

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.6385  -0.5245  -0.2722  -0.1041   3.4426

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)   0.42920    0.20851   2.058   0.0395 *
BMI15.0-18.4  0.31608    0.24524   1.289   0.1974
BMI25.0-29.9 -0.22795    0.30999  -0.735   0.4621
BMI30.0-34.9 -0.09669    0.21506  -0.450   0.6530
BMI35.0-39.9 -0.04290    0.21455  -0.200   0.8415
BMI40.0-55.0  0.19348    0.22569   0.857   0.3913
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 1485.0  on 2654  degrees of freedom
Residual deviance: 1470.0  on 2649  degrees of freedom
AIC: 2760.9

Number of Fisher Scoring iterations: 6
-----------------------------------------------------

BMI + offset(log(MMI_VBT_Expected)) -1 ,
                            family="poisson"))

Call:
glm(formula = Actual_Deaths ~ BMI + offset(log(MMI_VBT_Expected)) -
    1, family = "poisson")

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.6385  -0.5245  -0.2722  -0.1041   3.4426

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
BMI18.5-24.9  0.42920    0.20851   2.058   0.0395 *
BMI15.0-18.4  0.74529    0.12910   5.773 7.79e-09 ***
BMI25.0-29.9  0.20125    0.22939   0.877   0.3803
BMI30.0-34.9  0.33251    0.05270   6.309 2.81e-10 ***
BMI35.0-39.9  0.38631    0.05057   7.639 2.19e-14 ***
BMI40.0-55.0  0.62268    0.08639   7.208 5.67e-13 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 1630.7  on 2655  degrees of freedom
Residual deviance: 1470.0  on 2649  degrees of freedom
AIC: 2760.9

It does look statistically sensible that an estimate for BMI="40.0-
55.0" with over 100 events should have a much narrower CI than 
BMI="18.5-24.9" which only has 23 events. Is the model with an  
intercept term somehow "spreading around uncertainty" that really 
"belongs" to the reference category with its relatively low number of 
events?

David Winsemius

Brian Ripley

Sun, Mar 16, 2008 11:30 PM #

On Mon, 17 Mar 2008, David Winsemius wrote:

No, but you have apparently not understood what the 'intercept' means 
here.  With a single factor and the default contr.treatment, it is the 
coefficient used to predict the first category of the factor, and the 
remaining coefficients are log ratios of mean rate for the named category 
to the first.  When you drop the intercept, the coefficients are no longer 
contrasts.

When you drop the intercept, the coding (and hence the interpretation of 
the coefficients) of the first factor in the model changes. See MASS 
chapter 6.  So you are comparing apples with oranges.

with(bmi, table(BMI,Actual_Deaths))

          Actual_Deaths
BMI           0   1   2   3   4   5   6   7  11  13
 18.5-24.9 311  21   1   0   0   0   0   0   0   0
 15.0-18.4 353  33   8   2   0   1   0   0   0   0
 25.0-29.9 367  19   0   0   0   0   0   0   0   0
 30.0-34.9 349  95  39  17   8   9   3   4   0   1
 35.0-39.9 351  90  50  21  20   3   3   2   1   0
 40.0-55.0 386  60  15   7   4   0   0   1   0   0

bmi.base <- with(bmi,glm(Actual_Deaths ~

        BMI + offset(log( MMI_VBT_Expected)),  family="poisson"))

summary(bmi.base)

Call:
glm(formula = Actual_Deaths ~ BMI + offset(log(MMI_VBT_Expected)),
   family = "poisson")

Deviance Residuals:
   Min       1Q   Median       3Q      Max
-2.6385  -0.5245  -0.2722  -0.1041   3.4426

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   0.42920    0.20851   2.058   0.0395 *
BMI15.0-18.4  0.31608    0.24524   1.289   0.1974
BMI25.0-29.9 -0.22795    0.30999  -0.735   0.4621
BMI30.0-34.9 -0.09669    0.21506  -0.450   0.6530
BMI35.0-39.9 -0.04290    0.21455  -0.200   0.8415
BMI40.0-55.0  0.19348    0.22569   0.857   0.3913
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

   Null deviance: 1485.0  on 2654  degrees of freedom
Residual deviance: 1470.0  on 2649  degrees of freedom
AIC: 2760.9

Number of Fisher Scoring iterations: 6
-----------------------------------------------------

bmi.no.int <- with(bmi,glm(Actual_Deaths ~

                           BMI + offset(log(MMI_VBT_Expected)) -1 ,
                           family="poisson"))

summary(bmi.no.int)

Call:
glm(formula = Actual_Deaths ~ BMI + offset(log(MMI_VBT_Expected)) -
   1, family = "poisson")

Deviance Residuals:
   Min       1Q   Median       3Q      Max
-2.6385  -0.5245  -0.2722  -0.1041   3.4426

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
BMI18.5-24.9  0.42920    0.20851   2.058   0.0395 *
BMI15.0-18.4  0.74529    0.12910   5.773 7.79e-09 ***
BMI25.0-29.9  0.20125    0.22939   0.877   0.3803
BMI30.0-34.9  0.33251    0.05270   6.309 2.81e-10 ***
BMI35.0-39.9  0.38631    0.05057   7.639 2.19e-14 ***
BMI40.0-55.0  0.62268    0.08639   7.208 5.67e-13 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

   Null deviance: 1630.7  on 2655  degrees of freedom
Residual deviance: 1470.0  on 2649  degrees of freedom
AIC: 2760.9

It does look statistically sensible that an estimate for BMI="40.0-
55.0" with over 100 events should have a much narrower CI than
BMI="18.5-24.9" which only has 23 events. Is the model with an
intercept term somehow "spreading around uncertainty" that really
"belongs" to the reference category with its relatively low number of
events?

--
David Winsemius

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

David Winsemius

Mon, Mar 17, 2008 8:09 AM #

Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote in
news:alpine.LFD.1.00.0803170624220.5706 at gannet.stats.ox.ac.uk:

Thank you for your interest in my question, Prof Ripley. I did 
understand that the intercept coeff was the log(ratio) of the base 
group to the offset and that exp(coeff$intercept)) can be interpreted 
as a mortality ratio. Also, that the coefficients in the first model 
were log ratios of effect(BMI) to coefficient(BMI-reference), so that 
exp(coeff$level+coeff$intercept) would be a level's ratio relative to 
the "expected". My concern was with the markedly lower std errors 
around the "other" level coefficients when the intercept was removed. 
My preference would be to use the non-intercept model.

MASS.2ed.ch6, "Linear Statistical Models", says that the lm() models 
with and without intercepts have different contrast matrices and 
discusses interpretation of coefficients. If I to consult a later 
edition, will I find a discussion of the impact of those differences on 
the std errors of the coefficients?

snipped model output ...appended SE(coeff)'s to factor counts

To my eye, the SE's in the no-intercept model make much more sense as 
far as their relationship to the sum of counts.  I also have a related 
concern that I may have in the past been using less efficient 
inferential methods when analyzing models with external standards by 
accepting the default intercept.

David Winsemius