lmer vs glmmPQL

Roger Levy wrote:
Ben Bolker wrote:
Daniel Ezra Johnson wrote:
Regarding this thread, what about the method of fitting nested models
and using anova() to estimate a p-value.

For example:

mod2 = lmer(y ~ genotype + (1|block), family = binomial, data)
mod0 = lmer(y ~ (1|block), family = binomial, data)

anova(mod0,mod2)

How does that p-value (which is one value for the whole term
"genotype") relate to the individual coefficient p-values derived from
the Z-scores inside summary(mod2)?

Thanks,
Dan
? ?The Z-scores are Wald tests. ?The anova results are likelihood
ratio tests. ?The latter are in general more reliable but are known
to be *un*reliable for small-sample (i.e.
small-number-of-random-effects-levels) LMMs (Pinheiro and Bates).
Wald tests are not *known* to be unreliable in the small-sample
case, but I believe that is a statement of ignorance rather than
a statement that they're OK ...
This is an interesting issue -- as Ben points out, likelihood-ratio
tests are known to be anti-conservative for tests between linear mixed
models differing only in fixed-effects structure. ?I think the
best-known demonstration of this can be found in Pinheiro & Bates, 2000,
pages 87-92. ?F-tests are less problematic.
?But note that F tests may not be available or well-defined for
GLMMs ... ?(Distinguishing here between conditional F tests (e.g. p. 90
of Pinheiro & Bates 2000) and Wald F tests of parameter combinations --
as far as I know the former are not applicable to GLMMs.
Wald F tests are always available but don't necessarily (???) share
the same properties/reliability as the conditional F tests in the LMM
case ...)

For logit mixed models, presumably the results will depend a bit on how
fitting is done because the likelihood has to be approximated. ?For the
Laplace approximation currently in use in lme4, I've done some
simulations and generally found that likelihood-ratio tests are still
anti-conservative; tests based on Wald Z are not as anti-conservative,
though they don't seem quite nominal.
?Hmmm. ?Can't we separate estimation (for which Laplace and AGQ are
probably adequate) from inference? ?Would there be any point in
considering inference on estimates from a less accurate (PQL)
approximation scheme as long as we have more accurate schemes available?

?I worry whether the Wald Z tests are consistently better or
just "differently wrong" ...

You can do simulations along
these lines:

set.seed(2)
invlogit <- function(x) {
? ?temp <- exp(x)
? ?return(temp/(1+temp))
}
L <- 10
REP <- 4
N <- 4*10
fac <- gl(L,REP)
f <- function(ignore) {
? ?x <- runif(N)
? ?r <- rnorm(L)
? ?y <- rbinom(N,1,invlogit(r[fac]))
? ?m0 <- lmer(y ~ (1 | fac), family="binomial")
? ?m1 <- lmer(y ~ x + (1 | fac), family="binomial")
? ?temp <- pnorm(fixef(m1)[2]/sqrt(vcov(m1)[2,2]))
? ?p.z <- 2*min(temp, 1-temp)
? ?p.lr <- 1-pchisq(as.numeric(2*(logLik(m1)-logLik(m0))),1)
? ?return(c(p.z,p.lr))
}
K <- 5000
p.vals <- sapply(1:K,f)
apply(p.vals, 1, function(x) sum(x<0.05))/K
# this gave me 0.0562 for Wald Z, 0.0608 for LRT

# Is Wald Z anti-conservative? ?The below says "yes" with p=0.044
2*(1-pbinom(sum(p.vals[1,]<0.05), K, 0.05))

Unlike what Ben states above, however, it's my intuition (and informal
observation on the basis of simulations) that what determines how
anticonservative the LRT is is not the number of random effect levels
but rather the ratio between the degrees of freedom involved in the LRT
to the residual degrees of freedom in the more general model. ?The more
residual degrees of freedom (basically, the more observations), the less
anti-conservative the LRT is. ?This can be assessed for a specific
dataset by fitting a null-hypothesis model, repeatedly generating new,
artificial responses from it, and looking at the distribution of LRTs on
the artificial responses (along the lines of P&B 2000).
? ?Your conclusion agrees with what Pinheiro and Bates say (p. 88:
"as the number of parameters being removed from the fixed effects
becomes large, compared to the total number of observations, this
inaccuracy in the reported p-values can be substantial").
I don't really know how to decide what is more relevant to
"asymptopia" -- if N is the total number of obs., n the number
of blocks (say m = number per block so N=mn), and p the number of
parameters (fixed? random? variances counting as 1, or n-1, or
something in between?), is it 1-p/N [as PB suggest], or (N-p) [suggested
above], or n? Demidenko p. 145 shows that for LMMs ?N \to \infty is
not sufficient for consistency, that n \to \infty is also required ...
I don't know if that helps in practice.

?I don't know where my intuition comes from -- perhaps it's just
conservatism/pessimism, since it's often easier to get N>>p than to get
n large ...

?I do agree that the gold standard is simulations as suggested above.
There are examples in the "worked example" section of glmm.wikidot.com .
These simulations can be very slow -- hopefully (??) Doug Bates will be
able to work out an mcmcsamp() scheme for GLMMs soon, and we can see
if that (much faster) approach generally agrees with null simulations ...
If you look at Doug's schedule for July (www.stat.wisc.edu/~bates) you
will understand why he has been missing in action recently.  In my
"spare time" - when not working on a major report finished last week
and a grant proposal due this week and in preparing these various
courses - I am working on the draft of a book about lme4 because if I
don't get some part of that manuscript to John Kimmel soon he is going
to be very rude to me in Rennes.

@book{demidenko_mixed_2004,
? ? ? ?edition = {1},
? ? ? ?title = {Mixed Models: Theory and Applications},
? ? ? ?isbn = {0471601616},
? ? ? ?publisher = {{Wiley-Interscience}},
? ? ? ?author = {Eugene Demidenko},
? ? ? ?month = jul,
? ? ? ?year = {2004}
}

Roger

--
Ben Bolker
Associate professor, Biology Dep't, Univ. of Florida
bolker at ufl.edu / www.zoology.ufl.edu/bolker
GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

lmer vs glmmPQL

Thread (10 messages)