Skip to content

Anomalous results with glmer().

6 messages · Ben Bolker, David Duffy, Rolf Turner

#
Some months back I sent an inquiry to this list concerning the analysis 
of some linguistics data with which I am involved.  I am *still* 
(psigh!!!) struggling with these data and am getting results which are
making no sense to me.

Basically if I fit a (reasonably sensible) model using an old version of 
lme4 (0.999999-0) I get sensible looking estimates for the fixed effect 
coefficients, but the estimates of the variances of the random effects 
are essentially zero.  Which is silly.

If I fit the same model using lme4 version 1.1-7 (and ignore the warning 
about failure to converge) I get sensible looking estimates of the 
variances of the random effects, but an impossibly wrong estimate
of at least one of the fixed effect coefficients.  (The estimate says 
that the success probability is larger for phoneme type "Mclus" than it 
is for the baseline type "Fclus".  However a raw tabulation show that 
the success probability for Mclus is much, much smaller than for Fclus.

I have included more detail in the attached file notesME.txt for those 
who are interested.  This file induces explicit specification of the 
model that I used. The results of the fit using version 0.999999-0 are 
in the file oldLme4Rslts.txt; the results from version 1.1-7 are in 
newLme4Rslts.txt.

The data set is a bit too big to attach; it has 62601 records.  I have 
therefore made it available (as a *.csv file) on my web page:

     https://www.stat.auckland.ac.nz/~rolf

Click on "Linguistics data for R-SIG-ME".

I am really being driven nuts by this weirdness and would appreciate 
some avuncular advice from the knowledgeable.  (Ben???)

cheers,

Rolf

--
Rolf Turner
Technical Editor ANZJS
-------------- next part --------------

The model that I used in R with the lme4 package was:

    fit <- glmer(y ~ sex + type + age + (1|student) + (1|word),
                 family=binomial,data=X)

The type variable is the type of phoneme.  Of course sex has two
levels, M and F. The age variable consists of the midpoint of the
age intervals (5.0 - 5.5, 5.6 - 5.9, 6.0 - 6.5, etc.) into which the
students were classified.  This variable is treated as a numerical
covariate in the model that I use.  The variables student and word
are treated as random effects.  The response y is binary; 1 for
correct pronunciation of the phoneme in question, 0 for incorrect.

The model given above is simplistic and I was previously advised
to include interactions between the fixed effects and the random
effects.  However (a) this brings my computer to its knees, and
(b) sufficient unto the day is the evil thereof.

What is puzzling me is the estimates of the type coefficients.
When I fit the model using an elderly version of lme4 (0.999999-0)
I get what appear to me to be sensible results for these (and
for the other fixed effects) but the variance estimates for the
random effects are essentially both zero.  This *can't* be so!

When I fit the model using the latest version of lme4 (1.1-7), installed
from the r-forge repository, I get sensible estimates for the variances
of the random effects, but something is very strange about the
estimates of the type coefficients.  Explicitly the Mclus (medial
consonant cluster) coefficient is positive --- "significantly" so
at the 0.10 level (p-value = about 0.08).

This is saying that Mclus yields higher success rate than the
baseline type Fclus (which is the first, and hence reference,
level of the type factor).  Note that I stick with the default
"treatment contrasts".

However when I tabulated successes by phoneme type and I got the
following table:

   type
y   Fclus  Fcon Iclus  Icon Mclus  Mcon Vowel
  0   245  1216  2451   793  1077  1113   768
  1   733  9216  2439  8661   227  9645 24017


Note that the (raw) success rate for "Mclus" is ***MUCH*** smaller
than for any other phoneme type, in particular than for Fclus.
So how can the model produce an indication that it is larger than
the success rate for "Fclus"?

Admittedly the raw tabulation does not (of course) allow for the random
student effect and word effect which are included in the "formal" model.
Even so, I find it hard to believe that there could be such a huge
discrepancy between the "raw result" an the "model result".

Also:  Why is there this complete discordance between the results from
lme4 0.999999-0 and those from lme4 1.1-7?

I am toadally mystified.
-------------- next part --------------

Results from lme4 version 0.999999-0:
=====================================

Generalized linear mixed model fit by the Laplace approximation 
Formula: y ~ sex + type + age + (1 | student) + (1 | word) 
   Data: lingDat 
   AIC   BIC logLik deviance
 35707 35807 -17843    35685
Random effects:
 Groups  Name        Variance   Std.Dev.  
 student (Intercept) 3.8939e-13 6.2401e-07
 word    (Intercept) 0.0000e+00 0.0000e+00
Number of obs: 62601, groups: student, 326; word, 50

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept) -0.80082    0.13114  -6.107 1.02e-09 ***
sexM        -0.10162    0.02838  -3.581 0.000343 ***
typeFcon     0.93906    0.08036  11.686  < 2e-16 ***
typeIclus   -1.11890    0.07972 -14.035  < 2e-16 ***
typeIcon     1.30674    0.08309  15.726  < 2e-16 ***
typeMclus   -2.69497    0.10455 -25.778  < 2e-16 ***
typeMcon     1.07398    0.08080  13.292  < 2e-16 ***
typeVowel    2.36251    0.08287  28.507  < 2e-16 ***
age          0.30475    0.01667  18.280  < 2e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
          (Intr) sexM   typFcn typIcl typIcn typMcl typMcn typVwl
sexM      -0.170                                                 
typeFcon  -0.533 -0.002                                          
typeIclus -0.510  0.005  0.861                                   
typeIcon  -0.518 -0.003  0.827  0.833                            
typeMclus -0.370  0.009  0.656  0.663  0.635                     
typeMcon  -0.531 -0.002  0.850  0.856  0.822  0.653              
typeVowel -0.523 -0.003  0.829  0.835  0.802  0.636  0.824       
age       -0.815  0.057  0.012 -0.023  0.014 -0.041  0.013  0.019
-------------- next part --------------

Results from lme4 version 1.1-7:
================================

Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: y ~ sex + type + age + (1 | student) + (1 | word)
   Data: lingDat

     AIC      BIC   logLik deviance df.resid 
 27456.3  27555.8 -13717.2  27434.3    62590 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-37.166   0.049   0.122   0.267  10.350 

Random effects:
 Groups  Name        Variance Std.Dev.
 student (Intercept) 0.166    0.4074  
 word    (Intercept) 3.301    1.8170  
Number of obs: 62601, groups:  student, 326 word, 50

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept) -3.55421    0.37306   -9.53   <2e-16 ***
sexM        -0.13841    0.05597   -2.47   0.0134 *  
typeFcon     4.45879    0.17727   25.15   <2e-16 ***
typeIclus    1.47383    0.17061    8.64   <2e-16 ***
typeIcon     4.33003    0.18051   23.99   <2e-16 ***
typeMclus    0.33574    0.19876    1.69   0.0912 .  
typeMcon     3.41373    0.17724   19.26   <2e-16 ***
typeVowel    5.67714    0.17745   31.99   <2e-16 ***
age          0.39553    0.03253   12.16   <2e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
          (Intr) sexM   typFcn typIcl typIcn typMcl typMcn typVwl
sexM      -0.120                                                 
typeFcon  -0.435 -0.005                                          
typeIclus -0.410 -0.001  0.912                                   
typeIcon  -0.431 -0.005  0.947  0.883                            
typeMclus -0.375  0.001  0.838  0.804  0.841                     
typeMcon  -0.429 -0.004  0.936  0.881  0.938  0.850              
typeVowel -0.436 -0.006  0.953  0.906  0.943  0.846  0.939       
age       -0.568  0.068  0.020  0.002  0.019 -0.006  0.015  0.025
#
I will try to have a look.
  I agree that the positive variance estimates seem (much) more sensible
in a data set of this size ...

  Obviously it will take me a little while to get all the fits done on a
data set of this non-trivial size, but here are some preliminary thoughts:

 * I will try out the 'allFit.R' code mentioned on the mailing list
previously just to see how the results from 5 or 6 different optimizers
compare.
 * I will try lme4.0 and hope to get results the same as lme4 0.999999-0
 * I will evaluate the deviances of both the old and new fits to see
which fit is actually better, and use bbmle::slicetrans to look at the
shape of the likelihood surface between the two points

  As for explaining the 'crazy' result, if it actually turns out to be
(close to) the MLE for this data set: I would look at pictures of the
data, predictions, etc., and try to see if there's some sort of
confounder/Simpson's paradox thing going on here where the marginal
effect (= raw tabulation) is in fact very different from the conditional
effect ...

  Ben Bolker
On 14-05-27 09:59 PM, Rolf Turner wrote:
#
Thanks Ben.  I now feel that there is hope! :-)
On 28/05/14 14:14, Ben Bolker wrote:
<SNIP>
I have tried to think of ways, like Simpson's paradox, to explain the 
anomaly, but nothing sensible comes to me.  If anyone can point me in 
the right direction .....

And is it not weird that the old version gives an intuitively plausible
estimate for the Mclus coefficient whereas the new version doesn't?

cheers,

Rolf
#
On Wed, 28 May 2014, Rolf Turner wrote:

            
FWIW,
                   MClus
glm               -2.6950  (0.10455)

glmer (+Stud)     -2.74464 (0.10536)
glmer (+Words)     0.36826 (0.19783)
glmer (+S+W)       0.33574 (0.19981)

glmmML (+Stud)    -2.7444 (0.10546)
glmmML (+Words)    0.3683 (0.19816)

It's words that will always get you into trouble...

| David Duffy (MBBS PhD)
| email: David.Duffy at qimrberghofer.edu.au  ph: INT+61+7+3362-0217 fax: -0101
| Genetic Epidemiology, QIMR Berghofer Institute of Medical Research
| 300 Herston Rd, Brisbane, Queensland 4006, Australia  GPG 4D0B994A
#
Very useful observation.  Thanks.
On 14-05-27 10:49 PM, David Duffy wrote:
#
Thanks.

Yes.  That makes quite a bit of sense.  Phoneme types are going to be 
partially confounded with words.  A glimmer of light is starting to seep 
through the cracks.

Still leaves a bit of a puzzle as to why the 0.999999-0 and 1.1-7 
results are so different.

cheers,

Rolf
On 28/05/14 14:49, David Duffy wrote: