LRT significant but new variable's beta not

5 messages · Clara Neudecker, Alex Fine, Thompson,Paul +1 more

#
Dear all,

I'm looking for some hints on how to interpret my results. I have a
logistic mixed effects model in which I include a single new variable.
Comparing the old and new model with a likelihood ratio test yields a
significant difference (p < .001), but when I look at the new variable's
beta it's not significant at all.

How do I interpret this? After thinking and googling I have only one
suspicion left: Is it possible that including the new variable makes the
other variables more informative because there is some kind of suppressor
effect in the data? Or is there another explanation?

(The phenomenon cannot be a coincidence; the same happens with other
variables as well.)

I attach some output in case it helps.

Best regards and thanks in advance,
Clara Neudecker 



My model without the new variable:
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) ['glmerMod']
 Family: binomial  ( logit )
Formula: umzug50000 ~ 1 + gebjahr_c + sex + (1 | zp12401) + (1 | ror96)
   Data: master_5

     AIC      BIC   logLik deviance df.resid 
  1077.9   1110.7   -534.0   1067.9     5167 

Scaled residuals: 
   Min     1Q Median     3Q    Max 
-0.547 -0.169 -0.102 -0.065 35.188 

Random effects:
 Groups  Name        Variance Std.Dev.
 ror96   (Intercept) 0.23757  0.4874  
 zp12401 (Intercept) 0.01509  0.1229  
Number of obs: 5172, groups:  ror96, 96; zp12401, 7

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept) -4.433991   0.001499 -2957.3   <2e-16 ***
gebjahr_c    0.075061   0.001449    51.8   <2e-16 ***
sex.L        0.113758   0.001498    75.9   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
          (Intr) gbjhr_
gebjahr_c -0.001       
sex.L      0.000  0.000




With the new variable pol_fit_ror:
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) ['glmerMod']
 Family: binomial  ( logit )
Formula: umzug50000 ~ 1 + gebjahr_c + sex + pol_fit_ror + (1 | zp12401) +
    (1 | ror96)
   Data: master_5

     AIC      BIC   logLik deviance df.resid 
  1027.6   1066.5   -507.8   1015.6     4857 

Scaled residuals: 
   Min     1Q Median     3Q    Max 
-0.564 -0.171 -0.104 -0.066 33.626 

Random effects:
 Groups  Name        Variance  Std.Dev.
 ror96   (Intercept) 0.2125987 0.46108 
 zp12401 (Intercept) 0.0008857 0.02976 
Number of obs: 4863, groups:  ror96, 88; zp12401, 7

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept) -4.086460   0.351692 -11.619   <2e-16 ***
gebjahr_c    0.074675   0.006972  10.711   <2e-16 ***
sex.L        0.158924   0.132901   1.196    0.232    
pol_fit_ror -0.256504   0.288639  -0.889    0.374    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) gbjhr_ sex.L 
gebjahr_c   -0.293              
sex.L        0.084 -0.152       
pol_fit_ror -0.872 -0.030 -0.010

LRT of the two models:
Data: master_5
Models:
fm501: umzug50000 ~ 1 + gebjahr_c + sex + (1 | zp12401) + (1 | ror96)
fm510: umzug50000 ~ 1 + gebjahr_c + sex + pol_fit_ror + (1 | zp12401) + 
fm510:     (1 | ror96)
      Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)    
fm501  5 1077.9 1110.7 -533.96   1067.9                             
fm510  6 1027.5 1066.5 -507.78   1015.5 52.362      1  4.615e-13 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#
The LRT is robust to multicollinearity but the coefficient-based test is
not, so that could be it (collinearity involving the new predictor could be
inflating the standard error on the coefficient).
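
One way to check this suspicion is to compute variance inflation factors for the fixed-effect design columns. The following is only a sketch in base R on simulated data, not the thread's master_5 data. The helper name vif_from_design is hypothetical (it is not part of lme4); it computes the textbook VIF, 1 / (1 - R^2), by regressing each predictor on the others. For a fitted glmer model, the fixed-effect design matrix could presumably be obtained with getME(fit, "X") and passed to the same helper.

```r
# Hypothetical helper: textbook VIF for each non-intercept column
# of a fixed-effect design matrix.
vif_from_design <- function(X) {
  X <- X[, colnames(X) != "(Intercept)", drop = FALSE]
  vapply(colnames(X), function(j) {
    others <- X[, colnames(X) != j, drop = FALSE]
    r2 <- summary(lm(X[, j] ~ others))$r.squared
    1 / (1 - r2)
  }, numeric(1))
}

# Simulated predictors (assumption: stand-ins for the real covariates)
set.seed(42)
n  <- 500
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)   # nearly a copy of x1: strongly collinear
x3 <- rnorm(n)                  # unrelated to the other two
X  <- model.matrix(~ x1 + x2 + x3)

vifs <- vif_from_design(X)
print(round(vifs, 1))
```

Values near 1 suggest a predictor is close to orthogonal to the others; large values (a common rule of thumb is above 5 or 10) suggest that its standard error is being inflated by collinearity.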

On Tue, Apr 19, 2016 at 4:20 AM, Clara Neudecker <clara.hildegard.ruecker at uni-jena.de> wrote:
#
The first model has 5172 obs
The second model has 4863 obs

That is a difference of 309 obs. Clearly the new variable has a bunch of missing values. I would pay attention to that. You are fitting models in different groups, although they do overlap.
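
A minimal sketch of the usual remedy, refitting both models on the rows that are complete for every variable in the larger model, so that the likelihood ratio test compares like with like. It uses base R glm() and simulated data as a stand-in; the variable names (x1, x_new) are invented for illustration. With the thread's models the same pattern would presumably apply by subsetting master_5 and refitting fm501 and fm510 with glmer.

```r
# Simulated data with missing values in the new predictor only
set.seed(1)
n <- 1000
d <- data.frame(x1 = rnorm(n), x_new = rnorm(n))
d$x_new[sample(n, 60)] <- NA
d$y <- rbinom(n, 1, plogis(-1 + 0.5 * d$x1))

# Fitted naively, the two models silently use different rows
m0_all <- glm(y ~ x1,         family = binomial, data = d)
m1     <- glm(y ~ x1 + x_new, family = binomial, data = d)
print(c(reduced = nobs(m0_all), full = nobs(m1)))

# Restrict to rows complete for all variables in the larger model, then refit
d_cc  <- d[complete.cases(d[, c("y", "x1", "x_new")]), ]
m0_cc <- update(m0_all, data = d_cc)
m1_cc <- update(m1,     data = d_cc)

# Now both models see the same observations and the LRT is meaningful
lrt <- anova(m0_cc, m1_cc, test = "Chisq")
print(lrt)
```

Comparing nobs() of the two fits before running anova() is a cheap sanity check whenever a newly added predictor may contain missing values.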

-----Original Message-----
From: R-sig-mixed-models [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Clara Neudecker
Sent: Tuesday, April 19, 2016 3:21 AM
To: R-sig-mixed-models at r-project.org
Subject: [R-sig-ME] LRT significant but new variable's beta not

_______________________________________________
R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
#
I agree with Alex.  I do find these results *mildly* surprising, but note that:

- estimated coefficient of gebjahr_c doesn't change much, but Z-score
goes from 52 (model 1) to 11 (model 2)
- fixed effects in first model are nearly perfectly independent
- fairly strong correlation (-0.87) between the new variable and the intercept

  Note that these kinds of questions are not specific to mixed models,
but are general to essentially all linear/generalized models as soon
as the experimental/observation design no longer provides
orthogonal/independent estimates of the coefficients.

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.433991   0.001499 -2957.3   <2e-16 ***
gebjahr_c    0.075061   0.001449    51.8   <2e-16 ***
sex.L        0.113758   0.001498    75.9   <2e-16 ***
---
Correlation of Fixed Effects:
          (Intr) gbjhr_
gebjahr_c -0.001
sex.L      0.000  0.000

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.086460   0.351692 -11.619   <2e-16 ***
gebjahr_c    0.074675   0.006972  10.711   <2e-16 ***
sex.L        0.158924   0.132901   1.196    0.232
pol_fit_ror -0.256504   0.288639  -0.889    0.374
--
Correlation of Fixed Effects:
            (Intr) gbjhr_ sex.L
gebjahr_c   -0.293
sex.L        0.084 -0.152
pol_fit_ror -0.872 -0.030 -0.010
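
The "Correlation of Fixed Effects" block reports the estimated correlation between coefficient estimates, not between the predictors themselves. In the table above, the -0.872 entry is in the (Intr) column of the pol_fit_ror row. One common source of a strong intercept/slope correlation is a predictor whose values lie far from zero; whether that applies to pol_fit_ror is only a guess. The sketch below uses base R glm() on simulated data (all names are invented for illustration) to show how the same table can be obtained with cov2cor(vcov(fit)) and how centering changes it.

```r
# Simulated binary outcome with a predictor centred far from zero
set.seed(7)
n <- 2000
x <- rnorm(n, mean = 3, sd = 0.5)
y <- rbinom(n, 1, plogis(-1 + 0.3 * (x - 3)))

fit_raw <- glm(y ~ x, family = binomial)     # predictor as recorded
x_c     <- x - mean(x)
fit_ctr <- glm(y ~ x_c, family = binomial)   # same predictor, mean-centred

# Correlation matrix of the coefficient estimates
cor_raw <- cov2cor(vcov(fit_raw))
cor_ctr <- cov2cor(vcov(fit_ctr))
print(round(cor_raw, 3))
print(round(cor_ctr, 3))

# Centering is only a reparametrisation: the slope itself is unchanged
print(c(raw = coef(fit_raw)[["x"]], centred = coef(fit_ctr)[["x_c"]]))
```

Centering moves the intercept and shrinks its correlation with the slope, but leaves the slope estimate and its standard error as they were, so on its own it would not change the Wald test for the new variable.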
On Tue, Apr 19, 2016 at 2:29 PM, Alex Fine <abfine at gmail.com> wrote:
#
Good catch!  That's probably it.  I'm a little bit confused that
this isn't caught by lmer, as in the following example:
Error in anova.merMod(fm1, fm2) :
  models were not all fitted to the same size of dataset

On Tue, Apr 19, 2016 at 2:32 PM, Thompson,Paul
<Paul.Thompson at sanfordhealth.org> wrote: