
When can the intercept be removed from regression models

7 messages · Thierry Onkelinx, Shadiya Al Hashmi, Martin Maechler +2 more

#
Good morning,

I am in a dilemma regarding the inclusion of the intercept in my mixed
effects logistic regression models.  Most statisticians that I talked to
insist that I shouldn't remove the constant from my models.  One of the
pros of keeping it is that the models would fit well, since the R2 value
would be improved. Conversely, removing the constant means that there is no
guarantee against getting biased coefficients, since the slopes would be
forced to originate from 0.

I found only one textbook which does not state it but rather seems to imply
that sometimes we can remove the constant. This is the reference provided
below.

Cornillon, P.A., Guyader, A., Husson, F., Jégou, N., Josse, J., Kloareg,
M., Matzner-Løber, E. and Rouvière, L. (2012). *R for Statistics*. CRC Press,
Taylor & Francis Group.



On p.136, it says that "The p-value of less than 5% for the constant
(intercept) indicates that the constant must appear in the model".  So
based on this, I am assuming that a p-value of more than 5% for the
intercept would mean that the intercept should be removed.

I would appreciate it if someone could help me with this conundrum.
#
Dear Shadiya,

Thou shall always keep the intercept in the model. Its p-value doesn't
matter.

I use two exceptions against that rule:
1. There is a physical/biological/... reason why the intercept should be 0
2. Removing the intercept gives a different, more convenient
parametrisation (but does not change the model fit!)
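
The second exception can be illustrated with a minimal sketch in base R. It uses a plain `glm` on simulated data rather than a mixed model, and the names `g` and `y` and the simulation settings are illustrative assumptions, not taken from this thread:

```r
# Simulated data: one three-level factor and a binary response
# (names and probabilities are illustrative assumptions).
set.seed(1)
g <- factor(rep(c("a", "b", "c"), each = 100))
p <- c(a = 0.3, b = 0.5, c = 0.7)[as.character(g)]
y <- rbinom(length(g), size = 1, prob = p)

m1 <- glm(y ~ g,     family = binomial)  # intercept = log-odds of reference level "a"
m0 <- glm(y ~ 0 + g, family = binomial)  # one log-odds coefficient per level

# Same fit: identical log-likelihood and fitted probabilities
same_loglik <- isTRUE(all.equal(as.numeric(logLik(m1)), as.numeric(logLik(m0)),
                                tolerance = 1e-6))
same_fitted <- isTRUE(all.equal(fitted(m1), fitted(m0), tolerance = 1e-6))

# Different parametrisation: the coefficients map onto each other
b1 <- unname(coef(m1))
b0 <- unname(coef(m0))
same_coefs <- isTRUE(all.equal(b0, c(b1[1], b1[1] + b1[2], b1[1] + b1[3]),
                               tolerance = 1e-6))
```

This equivalence only holds because the intercept is replaced by a full set of factor levels; dropping the intercept from a model that has only numeric predictors does change the fit.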

Note that in logistic regression you use a logit transformation. Hence
forcing the model through the origin on the logit scale forces the model to
50% probability on the original scale when all predictors are zero. I
haven't seen an example where that makes sense.
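
A minimal sketch of this point in base R, assuming simulated data with a single numeric predictor (the names `x` and `y` and the true coefficients are illustrative assumptions):

```r
# Simulated data whose true probability at x = 0 is well below 50%
# (names and coefficients are illustrative assumptions).
set.seed(42)
x <- rnorm(500)
y <- rbinom(500, size = 1, prob = plogis(-1 + 0.8 * x))

with_int    <- glm(y ~ x,     family = binomial)
without_int <- glm(y ~ 0 + x, family = binomial)

at_zero   <- data.frame(x = 0)
p_with    <- predict(with_int,    newdata = at_zero, type = "response")
p_without <- predict(without_int, newdata = at_zero, type = "response")

plogis(0)   # inverse logit of 0 is 0.5
p_without   # forced to 0.5, whatever the data say
p_with      # estimated from the data
```

Without the intercept the linear predictor at `x = 0` is exactly 0, so the predicted probability there is `plogis(0)`, regardless of what the data show.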

Bottom line: only remove the intercept when you really know what you are
doing.

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2016-07-26 9:50 GMT+02:00 Shadiya Al Hashmi <saah500 at york.ac.uk>:

  
  
#
Thanks Thierry for your response. 

I tried the model before and after removing the intercept a while ago and I remember that the coefficients were pretty much the same. The only salient difference was that the levels of the first categorical variable in the model formula were all given in the output table instead of the reference level being embedded in the intercept as in the model with intercept.

It would be nice to find examples from the literature where the intercept is removed from the model. Can you think of any?

Shadiya


  
  
#
> Thanks Thierry for your response.  I tried the model
    > before and after removing the intercept a while ago and I
    > remember that the coefficients were pretty much the same.

but other things are *not* pretty much the same, and you
really really really should obey the advice by Thierry:

   ALWAYS KEEP THE INTERCEPT IN THE MODEL !!!

(at least until you become a very experienced statistician / data
 scientist / .. )
 

    >> p-value doesn't matter.
    >  The only salient difference was that the levels of
    > the first categorical variable in the model formula were
    > all given in the output table instead of the reference
    > level being embedded in the intercept as in the model with
    > intercept.

    > It would be nice to find examples from the literature
    > where the intercept is removed from the model. 

hopefully *not*!  at least not apart from the exceptions that
Thierry mentions below.

    > Can you think of any?


    >> On Jul 26, 2016, at 11:32 AM, Thierry Onkelinx
    >> <thierry.onkelinx at inbo.be> wrote:
    >>
    >> Dear Shadiya,
    >>
    >> Thou shall always keep the intercept in the model. Its
    >> p-value doesn't matter.
    >>
    >> I use two exceptions against that rule:
    >> 1. There is a physical/biological/... reason why the
    >> intercept should be 0
    >> 2. Removing the intercept gives a different, more
    >> convenient parametrisation (but does not change the
    >> model fit!)
#
Thanks Martin:)

I will update my models with the intercept and that for sure will take some
time.

Best,

Shadiya




On 26 July 2016 at 13:08, Martin Maechler <maechler at stat.math.ethz.ch>
wrote:

  
    
#
Hi,

since all the stats experts are on this list, I have to ask a question
in relation to models without intercept.

In my layman's conception in a model without intercept like this one:

glmer(response ~ 0 + condition + (1 | study_participant) + (1 | test_item),
      data = data_frame, family = binomial,
      control = glmerControl(optimizer = "bobyqa"))

the levels of the predictor condition are not estimated relative to
the intercept but against an absolute zero (on the logit scale, i.e. 50%
probability). With binomial data this seems quite handy, as for each
condition level the model tells me whether performance was significantly
different from chance (like multiple intercepts), something a binomial
test could do as well (albeit without accounting for the random
components structure).
This can be (and in psycholinguistic research often is) a research question.

Or is this total nonsense?

I have to say that I am confused when it comes to the intercepts in
the random components ....

Tom

---

Tom Fritzsche
University of Potsdam
Department of Linguistics
Karl-Liebknecht-Straße 24-25
14476 Potsdam
Germany

office: 14.140
phone: +49 331 977 2296
fax: +49 331 977 2095
e-mail: tom.fritzsche at uni-potsdam.de
web:    www.ling.uni-potsdam.de/~fritzsche




2016-07-26 12:08 GMT+02:00 Martin Maechler <maechler at stat.math.ethz.ch>:
#
Comments below.
On 16-07-26 06:31 AM, Tom Fritzsche wrote:
In this case (where the model has a categorical variable as a main
effect), you're right that the overall model fit is identical whether we
use 0+condition or 1+condition; the model is just differently
parameterized.  I think that in general computing these individual
effects *after* model-fitting, e.g. via the effects or lsmeans package,
is more sensible.  Also keep in mind that if you're comparing lots of
individual levels to zero, (1) you might want to take multiple
comparisons into account (see the multcomp package), and (2) you shouldn't
fall into the trap of saying that two levels are different because one is
significantly different from zero and the other isn't.
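
The trap in point (2) can be sketched with a small base R example. The counts below are invented for illustration, and a plain `glm` without random effects stands in for the `glmer` model discussed above:

```r
# Hypothetical aggregated data: 100 trials per condition
# (all numbers are invented for illustration).
d <- data.frame(condition = factor(c("A", "B")),
                successes = c(62, 57),
                failures  = c(38, 43))

# 0 + condition: each coefficient is the log-odds of one level,
# so its Wald test compares that level with 0 (50% probability)
m0 <- glm(cbind(successes, failures) ~ 0 + condition,
          family = binomial, data = d)
p_vs_chance <- summary(m0)$coefficients[, "Pr(>|z|)"]

# 1 + condition: the coefficient conditionB is the B - A difference
# in log-odds, i.e. the direct comparison of the two levels
m1 <- glm(cbind(successes, failures) ~ condition,
          family = binomial, data = d)
p_difference <- summary(m1)$coefficients["conditionB", "Pr(>|z|)"]

round(p_vs_chance, 3)
round(p_difference, 3)
```

With these counts, condition A differs from chance at the 5% level and condition B does not, yet the direct test of A against B is far from significant. The two parametrisations answer different questions about the same fit.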