
how reliable are inferences drawn from binomial models for small datasets fitted with lme4?

2 messages · Luca Borger, Roger Levy

#
Hello,

but for m01 I think the model output:

####
> print(m01 <- lmer(Response ~ Treatment + (Treatment - 1 | F1) + (1 | F2), dat, family=binomial))
Generalized linear mixed model fit by the Laplace approximation
Formula: Response ~ Treatment + (Treatment - 1 | F1) + (1 | F2)
    Data: dat
    AIC   BIC logLik deviance
  87.58 107.1 -37.79    75.58
Random effects:
  Groups Name        Variance   Std.Dev.   Corr
  F2     (Intercept) 1.6467e-12 1.2832e-06
  F1     Treatment1  9.0240e+00 3.0040e+00
         Treatment2  3.4832e+00 1.8663e+00 -1.000
Number of obs: 190, groups: F2, 24; F1, 16
###


indicates that there is not enough information to estimate that model (Corr = -1.000) and that the random-effects part should be simplified, notwithstanding the logLik value. Is that right? Sorry for adding a question; I hope this is relevant.
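
For concreteness, a minimal sketch of the comparison I have in mind,
assuming dat holds the Response, Treatment, F1 and F2 columns used
above (with current lme4 the binomial fits would go through glmer()
rather than lmer(..., family = binomial)):

library(lme4)

## richer model, as printed above
m01 <- glmer(Response ~ Treatment + (Treatment - 1 | F1) + (1 | F2),
             data = dat, family = binomial)
## simplified random-effects structure: intercept only for F1
m1  <- glmer(Response ~ Treatment + (1 | F1) + (1 | F2),
             data = dat, family = binomial)

anova(m1, m01)   # likelihood ratio test on 2 df (extra variance plus correlation)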




Cheers,

Luca





----- Original Message -----
From: "Roger Levy" <rlevy at ling.ucsd.edu>
To: "Jarrod Hadfield" <j.hadfield at ed.ac.uk>
Cc: r-sig-mixed-models at r-project.org
Sent: Sunday, July 5, 2009 20:06:56 GMT -05:00 U.S.A./Canada, Eastern
Subject: Re: [R-sig-ME] how reliable are inferences drawn from binomial models for small datasets fitted with lme4?

Dear Jarrod,

Many thanks for your thoughts on this!  (more below)
On Jul 5, 2009, at 2:44 PM, Jarrod Hadfield wrote:

            
Actually in the original data there are 61 successes and no failures.   
In the imagined version of the data with 60 successes and one failure,  
the lme4 and Bayesian results are more in agreement.
Ah, but the trouble is that there *does* seem to be a reasonable  
amount of evidence for treatment-specific random effects of at least  
one of F1 and F2.  For example:

 > print (m1 <- lmer(Response ~ Treatment + (1 | F1) + (1  | F2), dat,  
family=binomial))
Generalized linear mixed model fit by the Laplace approximation
Formula: Response ~ Treatment + (1 | F1) + (1 | F2)
    Data: dat
    AIC   BIC logLik deviance
  90.15 103.1 -41.08    82.15
Random effects:
  Groups Name        Variance Std.Dev.
  F2     (Intercept) 0.0000   0.0000
  F1     (Intercept) 3.4413   1.8551
Number of obs: 190, groups: F2, 24; F1, 16

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)   2.9542     0.6277   4.706 2.52e-06 ***
Treatment2    2.6760     1.2083   2.215   0.0268 *
 > print (m01 <- lmer(Response ~ Treatment + (Treatment - 1 | F1) +  
(1  | F2), dat, family=binomial))
Generalized linear mixed model fit by the Laplace approximation
Formula: Response ~ Treatment + (Treatment - 1 | F1) + (1 | F2)
    Data: dat
    AIC   BIC logLik deviance
  87.58 107.1 -37.79    75.58
Random effects:
  Groups Name        Variance   Std.Dev.   Corr
  F2     (Intercept) 1.6467e-12 1.2832e-06
  F1     Treatment1  9.0240e+00 3.0040e+00
         Treatment2  3.4832e+00 1.8663e+00 -1.000
Number of obs: 190, groups: F2, 24; F1, 16

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)   3.8284     0.9857   3.884 0.000103 ***
Treatment2    1.3655     2.0034   0.682 0.495487

Going by the reported log-likelihoods and applying the reasoning of
Baayen, Bates, and Davidson, a likelihood ratio test should be
conservative, and so there is quite a bit of evidence for m01 over m00
above.
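
(Spelling out that arithmetic with the log-likelihoods reported above:

ll_simple <- -41.08   # the (1 | F1) + (1 | F2) fit above
ll_rich   <- -37.79   # the (Treatment - 1 | F1) + (1 | F2) fit above
lrt <- 2 * (ll_rich - ll_simple)           # about 6.58
pchisq(lrt, df = 2, lower.tail = FALSE)    # about 0.037

with 2 df for the extra variance component and its correlation, and the
nominal p-value conservative because the simpler model sits on the
boundary of the parameter space.)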

This leads back to the original issue, though.  One intuition that I  
have is that for the two models

   Response ~ Treatment + (Treatment - 1 | F1) + (Treatment - 1 | F2)
   Response ~ 1 + (Treatment - 1 | F1) + (Treatment - 1 | F2)

the log-likelihood should really have to be a fair bit better in the  
former than in the latter model (more than a difference of 0.42!),  
because whereas both models have to explain the observed proportions,  
the latter model also has to be lucky enough for the random effects to  
line up such that the observed proportions have a non-trivial  
likelihood.  That is, with the log-likelihood of interest being

   f(y | \beta, \theta) = \int P(y | \beta, b) P(b | \sigma) db

the former model should have much more probability from P(b | \sigma)  
in the region of b where P(y | \beta, b) is large enough to matter.
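
As a toy illustration of that point, a sketch with made-up numbers for
a single group of Bernoulli trials sharing one random effect b:

set.seed(1)
marg_lik <- function(y, eta_fixed, sigma, nsim = 1e5) {
  b <- rnorm(nsim, 0, sigma)                      # draws from P(b | sigma)
  p <- plogis(eta_fixed + b)                      # inverse-logit linear predictor
  mean(p^sum(y) * (1 - p)^(length(y) - sum(y)))   # Monte Carlo estimate of the integral
}
y <- rep(1, 8)                           # say, eight successes and no failures in one group
marg_lik(y, eta_fixed = 3, sigma = 1)    # fixed effect already near the data
marg_lik(y, eta_fixed = 0, sigma = 1)    # the random effect has to "line up" instead

The first call comes out much larger, which is the sense in which the
intercept-only model has to be lucky with its random effects.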

Best

Roger
--

Roger Levy                      Email: rlevy at ling.ucsd.edu
Assistant Professor             Phone: 858-534-7219
Department of Linguistics       Fax:   858-534-4789
UC San Diego                    Web:   http://ling.ucsd.edu/~rlevy

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
#
On Jul 5, 2009, at 8:58 PM, Luca Borger wrote:

            
Dear Luca,

Many thanks for your thoughts on this -- I did notice the correlation  
for F1 but in my thinking, this was sensible behavior by the model and  
didn't necessarily indicate a need to simplify the F1 random effects  
component.  Let me explain my reasoning.  Going back to the marginal  
mean responses per F1/Treatment combination:

 > with(dat,tapply(Response, list(F1, Treatment), mean))
            1  2
1  1.0000000  1
2  0.5714286  1
3  0.8888889  1
4  1.0000000 NA
5  0.0000000  1
6  1.0000000 NA
7  0.6666667  1
8  1.0000000  1
9  1.0000000  1
10 1.0000000  1
11 0.8181818  1
12 0.6000000  1
13 1.0000000 NA
14 1.0000000  1
15 1.0000000  1
16 1.0000000  1

it seems to me that the only way a random effect can contribute to the
likelihood is by bringing down the linear predictor in Treatment 1 for
levels 2, 3, 7, 11, and 12 of F1.  So the random effects for these
levels in Treatment 1 must be negative.  Now, given this constraint,
what choice of random effects for these levels in Treatment 2 maximizes
the likelihood?  Since there are never any Treatment 2 failures, these
random effects should be as high as possible.  This leads to an
inferred correlation of -1 for the random effects of F1.
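
One can also see this in the fitted object itself; a quick sketch,
assuming m01 is the fit printed above and that the conditional modes
carry the Treatment1/Treatment2 names shown in that output:

re_F1 <- ranef(m01)$F1                     # conditional modes for each level of F1
cor(re_F1$Treatment1, re_F1$Treatment2)    # essentially -1
plot(re_F1$Treatment1, re_F1$Treatment2)   # the modes fall on a downward-sloping line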

The surprise to me isn't really that the random-effect correlation is  
so extreme, but rather that the model log-likelihoods seem to indicate  
that there's sufficient evidence to go with the model with more  
complex random-effects structure.  This leads me to ask how reliable
log-likelihood differences for mixed-effects logit models fitted to
small datasets like this one really are.
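
One way to probe that would be a parametric bootstrap of the
likelihood-ratio statistic: simulate responses from the simpler fit,
refit both models, and see how large the statistic gets when the
simpler model is true.  A rough sketch with current lme4 (m1 and m01
as above; slow, and some refits may fail to converge on extreme
simulated datasets):

set.seed(123)
lr_null <- replicate(200, {
  y_sim <- simulate(m1)[[1]]                 # responses generated under the simpler model
  f_simple <- refit(m1,  y_sim)
  f_rich   <- refit(m01, y_sim)
  as.numeric(2 * (logLik(f_rich) - logLik(f_simple)))
})
lr_obs <- as.numeric(2 * (logLik(m01) - logLik(m1)))
mean(lr_null >= lr_obs)                      # bootstrap p-value for the observed difference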

Many thanks once more!

Roger

--

Roger Levy                      Email: rlevy at ling.ucsd.edu
Assistant Professor             Phone: 858-534-7219
Department of Linguistics       Fax:   858-534-4789
UC San Diego                    Web:   http://ling.ucsd.edu/~rlevy