specifying crossed random effects for glmmPQL / lme
There are a few issues here: see comments inline.
On 17-09-26 05:03 PM, Van Rynald Liceralde wrote:
Hello,
I'm trying to fit a GLMM on simulated response time data (continuous,
positively skewed) obtained from hypothetical participants (Subject)
responding to the same set of hypothetical items (Item), so it's a
fully-crossed design. I intend to include several crossed-random effects
for Subject and Item, so in lme4 language, it would look like the following:
glmer(y ~ x1*x2*z1 + (1+x1+x2|Subject) + (1|Item),
family=Gamma("identity"), data=foo)
I've seen the arguments that one should use a Gamma with identity link for response time data; I didn't find them 100% convincing, but whatever (can someone remind me of the reference?). Nevertheless, be aware that fitting models where the link function doesn't constrain the predicted value to the domain of the specified probability distribution (e.g. Gamma/inverse, Gamma/identity, binomial/identity ...) is much more likely to be computationally problematic.
However, as I read from Ben Bolker's GLMM FAQ (https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#fn1), the estimation procedure used by glmer (adaptive Gauss-Hermite quadrature) can only handle up to 2-3 random effects. Indeed, running glmer on my simulated data not only fails to converge but also takes a very long time to run.
AGHQ is not glmer's default; Laplace (equivalently, AGHQ with a single quadrature point) is.
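A minimal sketch of how the nAGQ argument controls this. It uses the cbpp data set that ships with lme4 rather than the simulated data from the question, so the model and family here are only illustrative.

```r
library(lme4)

# glmer's default is nAGQ = 1, i.e. the Laplace approximation
formals(glmer)$nAGQ

# Laplace fit (default)
fit_laplace <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
                     data = cbpp, family = binomial)

# AGHQ with 10 quadrature points; lme4 only allows nAGQ > 1 when the model
# has a single scalar random-effects term, as here
fit_aghq <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
                  data = cbpp, family = binomial, nAGQ = 10)

# Compare fixed-effect estimates from the two approximations
cbind(laplace = fixef(fit_laplace), aghq = fixef(fit_aghq))
```

With several random-effects terms, as in the model above with (1+x1+x2|Subject) + (1|Item), only nAGQ = 1 (or nAGQ = 0) is available.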
Someone recommended that I use MASS::glmmPQL instead, because the cases in which penalized quasi-likelihood (PQL) would perform poorly (count/binomial DV, mean DV < 5) don't apply to my data (continuous DV, identity link, many items, and many subjects). Moreover, PQL could handle more random effects than GHQ; it could also allow for correlations of random effects to be estimated; and it estimates the model faster than GHQ. (I don't actually know whether any of those are accurate characterizations of PQL and GHQ; I would be happy to be corrected and pointed in the right direction.)
The underlying characteristic for whether glmmPQL works well is how close the sampling distributions of the conditional modes are to being Gaussian. This generally fails badly in settings where there is little information on each cluster, which is true for low-count data; I'm not quite sure how "information per observation" maps onto the Gamma distribution, although very small shape parameters/skewed distributions would probably be worse than approximately Normal responses. If you have many items per subject you're probably OK. It is certainly true that where it is sufficiently accurate, PQL is faster than Laplace or AGHQ. I'm not sure what you mean by "also allow for correlations of random effects to be estimated" ...
The solution suggested online on CrossValidated is as follows:
bar <- glmmPQL(y ~ x1*x2*z1, random=list(Subject=~1+x1+x2, Item=~1),
family=Gamma("identity"),data=foo)
but this way of doing it seems to model the random effect for Item as if it
were nested under Subject, but I want them to be identified as crossed. I
was wondering if someone can point me to how I'd be able to specify my
model using glmmPQL such that the effects of Subject and Item are truly
crossed. Thank you so much!
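One way to check the nesting concern is to count the levels of the innermost grouping factor that nlme actually builds. The sketch below does this with lme on made-up Gaussian data; the data frame, the design sizes, the effect sizes, and the seed are all assumptions for illustration, not the simulated data from the question.

```r
library(nlme)

set.seed(101)
# Hypothetical fully crossed design: 10 subjects x 8 items, 2 replicates per cell
foo <- expand.grid(Subject = factor(1:10), Item = factor(1:8), rep = 1:2)
b_subj <- rnorm(10, sd = 0.5)   # made-up subject effects
b_item <- rnorm(8, sd = 0.5)    # made-up item effects
foo$y <- 5 + b_subj[foo$Subject] + b_item[foo$Item] + rnorm(nrow(foo), sd = 0.5)

# A list of grouping factors in nlme is read as "Item within Subject"
nested <- lme(y ~ 1, random = list(Subject = ~ 1, Item = ~ 1), data = foo)

nlevels(foo$Item)            # number of distinct items in the data
nlevels(getGroups(nested))   # innermost grouping as built by lme (Subject/Item combinations)
```

If the second count is the number of Subject/Item combinations rather than the number of items, the Item effect is being treated as nested within Subject.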
Unfortunately crossed effects are rather challenging to implement in nlme (the platform underlying glmmPQL). There is one example in one of the later chapters of Pinheiro and Bates (2000), but I'm not in a position to look it up right now ...
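The nlme idiom usually quoted for crossed random intercepts is a single dummy grouping level combined with a pdBlocked structure of pdIdent blocks. A minimal sketch with lme on made-up Gaussian data follows; the data frame, the `dummy` column, and the effect sizes are assumptions for illustration. It covers crossed intercepts only, not the random slopes for x1 and x2, and it calls lme directly rather than going through glmmPQL.

```r
library(nlme)

set.seed(101)
# Hypothetical fully crossed design: every subject responds to every item
n_subj <- 20
n_item <- 15
foo <- expand.grid(Subject = factor(seq_len(n_subj)),
                   Item = factor(seq_len(n_item)))
foo$x1 <- rnorm(nrow(foo))
b_subj <- rnorm(n_subj, sd = 0.5)   # made-up subject intercepts
b_item <- rnorm(n_item, sd = 0.3)   # made-up item intercepts
foo$y <- 5 + 0.5 * foo$x1 + b_subj[foo$Subject] + b_item[foo$Item] +
  rnorm(nrow(foo), sd = 0.5)

# One grouping level containing all observations
foo$dummy <- factor(1)

# Each pdIdent block gives one variance shared by all indicator columns of
# that factor; pdBlocked keeps the Subject and Item blocks uncorrelated
crossed <- lme(y ~ x1, data = foo,
               random = list(dummy = pdBlocked(list(pdIdent(~ Subject - 1),
                                                    pdIdent(~ Item - 1)))))

VarCorr(crossed)   # one repeated variance per block, plus the residual
```

Because everything sits in a single group, lme works with one dense random-effects block of size n_subj + n_item, so this approach can become slow as the number of subjects and items grows.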
Sincerely, Van Liceralde