
lmer vs glmmPQL

10 messages · Ken Beath, Federico Calboli, Daniel Ezra Johnson +3 more

#
Hi All,

I'm doing a simple logistic regression with one fixed and one random  
effect, and I'm comparing the results I get from lmer() and  
glmmPQL(). I'm finding that lmer gives me a "better" p-value for my  
fixed effects. Because I'm a paranoid old man I'd go for the glmmPQL  
results, but my collaborators are less paranoid and I'm sure  
they'd prefer the results from lmer. Am I too conservative? (I realise  
it looks like I'm asking for counselling more than advice, but there  
you go...).

Best,

Federico

My models:

mod1 = glmmPQL(y ~ genotype, random = ~1|block, family = binomial, data)
mod2 = lmer(y ~ genotype + (1|block), family = binomial, data)

my data:

 > data
   genotype block y.1 y.2
 1        A     a  16  29
 2        B     a  19  26
 3        C     a  23  23
 4        A     c   6  24
 5        B     c  11  11
 6        C     c  13  14
 7        A     b   4  17
 8        B     b  10   8
 9        C     b  12   6


 > data[[1]]
 [1] A B C A B C A B C
 attr(,"contrasts")
   [,1] [,2]
 B    1   -1
 A   -2    0
 C    1    1
 Levels: B A C

my results:

 > summary(mod1)
Linear mixed-effects model fit by maximum likelihood
  Data: dat
   AIC BIC logLik
    NA  NA     NA

Random effects:
  Formula: ~1 | block
          (Intercept)  Residual
StdDev: 1.285532e-06 0.8077838

Variance function:
  Structure: fixed weights
  Formula: ~invwt
Fixed effects: y ~ genotype
                  Value  Std.Error DF   t-value p-value
(Intercept) -0.3327269 0.12516679  4 -2.658269  0.0565
genotype1    0.3288359 0.09065856  4  3.627190  0.0222
genotype2    0.1138920 0.14947659  4  0.761938  0.4885
  Correlation:
           (Intr) gntyp1
genotype1 -0.068
genotype2 -0.027 -0.019

Standardized Within-Group Residuals:
        Min         Q1        Med         Q3        Max
-1.0807836 -0.8047002 -0.4620287  0.8940787  1.5832300

Number of Observations: 9
Number of Groups: 3


 > summary(mod2)
Generalized linear mixed model fit by the Laplace approximation
Formula: y ~ genotype + (1 | block)
    Data: dat
    AIC   BIC logLik deviance
  13.92 14.71 -2.960    5.919
Random effects:
  Groups Name        Variance Std.Dev.
  block  (Intercept)  0        0
Number of obs: 9, groups: block, 3

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.33273    0.12652  -2.630 0.008541 **
genotype1    0.32884    0.09164   3.588 0.000333 ***
genotype2    0.11389    0.15109   0.754 0.450965
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
           (Intr) gntyp1
genotype1 -0.068
genotype2 -0.027 -0.019




--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com
#
On 24/06/2009, at 2:18 AM, Federico Calboli wrote:

This seems to result from the use of a t-test with few df in glmmPQL  
and z in lmer. z seems fine to me. What is more of a problem is that  
your random effects variance is effectively 0. There are only 3 blocks  
so fitting a random effects model will be difficult and appears  
unnecessary.
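
(Numerically: the same statistic referred to a t distribution with 4 df
versus a normal distribution gives quite different p-values; a minimal
sketch using the intercept t-value from the glmmPQL output above:)

```r
stat <- -2.658269        # intercept t-value reported by glmmPQL above
2 * pt(stat, df = 4)     # t reference with 4 df: ~ 0.0565 (glmmPQL's p-value)
2 * pnorm(stat)          # normal (z) reference:  ~ 0.0079
```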

Ken
#
On 23 Jun 2009, at 22:46, Ken Beath wrote:
That was a sample dataset so I could see what kind of data I had to  
deal with; the 'real' thing should have a variance > 0 for the random  
effect. My philosophical issue was: given such a relatively  
straightforward model, should I be more (glmmPQL) or less (lmer)  
conservative?

Best,

Federico

#
Federico Calboli wrote:
My take would be to pick lmer over glmmPQL every time, provided
it can handle your problem -- in general it should be more accurate.
Picking on the basis of more or less conservative for any given problem
feels biased.

  Ben Bolker
#
Regarding this thread, what about the method of fitting nested models
and using anova() to estimate a p-value.

For example:

mod2 = lmer(y ~ genotype + (1|block), family = binomial, data)
mod0 = lmer(y ~ (1|block), family = binomial, data)

anova(mod0,mod2)

How does that p-value (which is one value for the whole term
"genotype") relate to the individual coefficient p-values derived from
the Z-scores inside summary(mod2)?
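
(For reference, the two calculations can be compared directly. Below is a
self-contained sketch using glm() on the example counts from earlier in
the thread, dropping the random effect since its variance is estimated at
zero there anyway. The LRT covers the whole genotype term, 2 df here,
while the Wald z tests one contrast at a time:)

```r
# Example counts from the dataset earlier in the thread
d <- data.frame(genotype = factor(rep(c("A", "B", "C"), 3)),
                succ = c(16, 19, 23, 6, 11, 13, 4, 10, 12),
                fail = c(29, 26, 23, 24, 11, 14, 17, 8, 6))
m0 <- glm(cbind(succ, fail) ~ 1,        family = binomial, data = d)
m1 <- glm(cbind(succ, fail) ~ genotype, family = binomial, data = d)

# Likelihood-ratio p-value for the whole term, as anova() reports:
lr   <- as.numeric(2 * (logLik(m1) - logLik(m0)))
p.lr <- pchisq(lr, df = 2, lower.tail = FALSE)

# Wald z p-value for a single contrast, as summary() reports:
z   <- coef(summary(m1))["genotypeB", "z value"]
p.z <- 2 * pnorm(-abs(z))
```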

Thanks,
Dan
#
Daniel Ezra Johnson wrote:
The Z-scores are Wald tests.  The anova results are likelihood
ratio tests.  The latter are in general more reliable but are known
to be *un*reliable for small-sample (i.e.
small-number-of-random-effects-levels) LMMs (Pinheiro and Bates).
Wald tests are not *known* to be unreliable in the small-sample
case, but I believe that is a statement of ignorance rather than
a statement that they're OK ...

  Ben Bolker
#
Ben Bolker wrote:
This is an interesting issue -- as Ben points out, likelihood-ratio 
tests are known to be anti-conservative for tests between linear mixed 
models differing only in fixed-effects structure.  I think the 
best-known demonstration of this can be found in Pinheiro & Bates, 2000, 
pages 87-92.  F-tests are less problematic.

For logit mixed models, presumably the results will depend a bit on how 
fitting is done because the likelihood has to be approximated.  For the 
Laplace approximation currently in use in lme4, I've done some 
simulations and generally found that likelihood-ratio tests are still 
anti-conservative; tests based on Wald Z are not as anti-conservative, 
though they don't seem quite nominal.  You can do simulations along 
these lines:

library(lme4)  # for lmer()

set.seed(2)
invlogit <- function(x) {
   temp <- exp(x)
   return(temp/(1+temp))
}
L <- 10        # number of random-effect levels
REP <- 4       # observations per level
N <- L*REP
fac <- gl(L,REP)
f <- function(ignore) {
   x <- runif(N)                    # covariate with no true effect
   r <- rnorm(L)                    # random effects
   y <- rbinom(N,1,invlogit(r[fac]))
   m0 <- lmer(y ~ (1 | fac), family="binomial")
   m1 <- lmer(y ~ x + (1 | fac), family="binomial")
   temp <- pnorm(fixef(m1)[2]/sqrt(vcov(m1)[2,2]))
   p.z <- 2*min(temp, 1-temp)       # two-sided Wald Z p-value
   p.lr <- 1-pchisq(as.numeric(2*(logLik(m1)-logLik(m0))),1)
   return(c(p.z,p.lr))
}
K <- 5000
p.vals <- sapply(1:K,f)
apply(p.vals, 1, function(x) sum(x<0.05))/K
# this gave me 0.0562 for Wald Z, 0.0608 for LRT

# Is Wald Z anti-conservative?  The below says "yes" with p=0.044
2*(1-pbinom(sum(p.vals[1,]<0.05), K, 0.05))


Unlike what Ben states above, however, it's my intuition (and informal 
observation on the basis of simulations) that what determines the LRT's 
anti-conservatism is not the number of random-effect levels but rather 
the ratio of the degrees of freedom involved in the LRT to the residual 
degrees of freedom in the more general model.  The more residual degrees 
of freedom (basically, the more observations), the less anti-conservative 
the LRT is.  This can be assessed for a specific dataset by fitting a 
null-hypothesis model, repeatedly generating new, artificial responses 
from it, and looking at the distribution of LRTs on the artificial 
responses (along the lines of P&B 2000).
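
(The null-simulation recipe just described can be sketched compactly; for
speed this toy version uses glm() with no random effect, but the same
refit-both-models loop applies to a glmm fit:)

```r
set.seed(1)
n <- 40
x <- runif(n)
y <- rbinom(n, 1, 0.5)               # data generated under the null (no x effect)

m0 <- glm(y ~ 1, family = binomial)  # null-hypothesis model
obs.lrt <- as.numeric(2 * (logLik(glm(y ~ x, family = binomial)) - logLik(m0)))

# Refit both models to artificial responses simulated from the null fit
null.lrt <- replicate(500, {
  y.star <- simulate(m0)[[1]]
  as.numeric(2 * (logLik(glm(y.star ~ x, family = binomial)) -
                  logLik(glm(y.star ~ 1, family = binomial))))
})
p.sim <- mean(null.lrt >= obs.lrt)   # simulation-based p-value for the LRT
```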

Roger
#
Roger Levy wrote:
But note that F tests may not be available or well-defined for
GLMMs ...  (Distinguishing here between conditional F tests (e.g. p. 90
of Pinheiro & Bates 2000) and Wald F tests of parameter combinations --
as far as I know the former are not applicable to GLMMs.
Wald F tests are always available but don't necessarily (???) share
the same properties/reliability as the conditional F tests in the LMM
case ...)
Hmmm.  Can't we separate estimation (for which Laplace and AGQ are
probably adequate) from inference?  Would there be any point in
considering inference on estimates from a less accurate (PQL)
approximation scheme as long as we have more accurate schemes available?

  I worry whether the Wald Z tests are consistently better or
just "differently wrong" ...

Your conclusion agrees with what Pinheiro and Bates say (p. 88:
"as the number of parameters being removed from the fixed effects
becomes large, compared to the total number of observations, this
inaccuracy in the reported p-values can be substantial").
I don't really know how to decide what is more relevant to
"asymptopia" -- if N is the total number of obs., n the number
of blocks (say m = number per block so N=mn), and p the number of
parameters (fixed? random? variances counting as 1, or n-1, or
something in between?), is it 1-p/N [as PB suggest], or (N-p) [suggested
above], or n? Demidenko p. 145 shows that for LMMs  N \to \infty is
not sufficient for consistency, that n \to \infty is also required ...
I don't know if that helps in practice.

  I don't know where my intuition comes from -- perhaps it's just
conservatism/pessimism, since it's often easier to get N>>p than to get
n large ...

  I do agree that the gold standard is simulations as suggested above.
There are examples in the "worked example" section of glmm.wikidot.com .
These simulations can be very slow -- hopefully (??) Doug Bates will be
able to work out an mcmcsamp() scheme for GLMMs soon, and we can see
if that (much faster) approach generally agrees with null simulations ...


@book{demidenko_mixed_2004,
	edition = {1},
	title = {Mixed Models: Theory and Applications},
	isbn = {0471601616},
	publisher = {{Wiley-Interscience}},
	author = {Eugene Demidenko},
	month = jul,
	year = {2004}
}

#
On Wed, Jun 24, 2009 at 4:51 PM, Ben Bolker <bolker at ufl.edu> wrote:
If you look at Doug's schedule for July (www.stat.wisc.edu/~bates) you
will understand why he has been missing in action recently.  In my
"spare time" - when not working on a major report finished last week
and a grant proposal due this week and in preparing these various
courses - I am working on the draft of a book about lme4 because if I
don't get some part of that manuscript to John Kimmel soon he is going
to be very rude to me in Rennes.
#
Just to clarify, I wasn't trying to nag ... very excited about
the prospect of the book!  (How would you, and Springer/John Kimmel,
feel about drafts of the book being accessible prior to publication?)

  Ben Bolker