Maximal random-effects lmer not converging

2 messages · Stephen Politzer-Ahles, Ben Bolker

Hello,

I am trying to model a somewhat complicated dataset (which includes a
2x2x4 interaction) with maximal random effects, based on the
suggestions from Barr et al. (2013). The maximal model is of course
not converging, and there are several things I don't understand about
how to proceed.

1. I've seen several suggestions that, when a model fails to converge,
you should look at the non-convergent model and then kick out
whichever random slope accounted for the least variance. But since my
model includes a four-level factor, I get different variances for each
level of the factor (and the problem is compounded by the interaction
terms, see the snippet below; there are also other random effects for
control variables, which I have not shown):

Random effects:
 Groups   Name              Variance  Std.Dev.  Corr
 Subject  Factor1a:Factor2a 1.553e-08 1.246e-04
          Factor1b:Factor2a 2.000e-08 1.414e-04 0.69
          Factor1a:Factor2b 7.322e-09 8.557e-05 0.69 0.99
          Factor1b:Factor2b 2.624e-08 1.620e-04 0.55 0.70 0.71
          Factor1a:Factor2c 5.017e-08 2.240e-04 0.41 0.65 0.65 0.89
          Factor1b:Factor2c 2.220e-08 1.490e-04 0.25 0.48 0.55 0.78 0.90
          Factor1a:Factor2d 3.972e-08 1.993e-04 0.50 0.67 0.72 0.93 0.94 0.95
          Factor1b:Factor2d 1.642e-08 1.282e-04 0.36 0.79 0.78 0.83 0.81 0.71 0.81

So how do I evaluate the amount of variance accounted for by a
particular factor (or interaction), in order to determine which ones
to remove from the model?

2. I am trying to model the random effects structure without
correlations, since I'm having a hard time getting convergence. Barr
et al. (2013) suggest that if you're not using correlations, then the
factors should be coded with deviation coding rather than treatment
coding. However, deviation coding does not make theoretical sense for
the variables I'm looking at; my design has a 4-level factor, and one
of those is a 'baseline' level against which I want to compare the
other three (my dependent measure is reaction times, and I want to see
which conditions are faster than baseline). So in this case should I
estimate the model with deviation coding, and then use post-hoc tests
(with something like glht from the multcomp package) later on to
compare conditions? Or just go ahead using treatment coding instead
of deviation coding?

Thank you,
Steve


Stephen Politzer-Ahles
New York University, Abu Dhabi
Neuroscience of Language Lab
http://www.nyu.edu/projects/politzer-ahles/
4 days later
Stephen Politzer-Ahles <spa268 at ...> writes:
When you say "not converging", what do you mean exactly?  Are you getting
warnings, and if so what are they (precisely)?  Or are you stating the
fact that you're getting estimates of random-effects variances that are
effectively zero, or estimates of correlations that are +/- 1?

Regarding (1): well, the Subject block you show is *one* component of the
variance structure -- there's no way to drop one part of it.  (You can't
say "I want to fit an interaction among A, B, and C, but I want to drop
the B:C term" -- or at least it's difficult and unlikely to be sensible).
You could try (Factor1+Factor2|Subject) instead of
(Factor1:Factor2|Subject) -- that would reduce this block from an 8x8
variance-covariance matrix (dimension=nlevels(1)*nlevels(2)) to a 5x5
(nlevels(1)+nlevels(2)-1) variance-covariance matrix, or from 8*9/2=36
parameters to 5*6/2=15 ...
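
The counts above follow from the fact that a k x k variance-covariance
matrix has k(k+1)/2 free parameters (k variances plus k(k-1)/2
covariances). A minimal sketch of that bookkeeping in plain Python; it
is arithmetic only, not lme4 code, and the function names are made up
for illustration:

```python
def n_cov_params(k):
    """Free parameters (variances + covariances) in a k x k covariance matrix."""
    return k * (k + 1) // 2


def block_size_interaction(levels):
    """Columns for a cell-means term such as Factor1:Factor2 (one per combination)."""
    size = 1
    for n in levels:
        size *= n
    return size


def block_size_additive(levels):
    """Columns for an intercept plus main effects, as in Factor1 + Factor2."""
    return 1 + sum(n - 1 for n in levels)


levels = [2, 4]  # Factor1 has 2 levels, Factor2 has 4 (from the posted output)
k_int = block_size_interaction(levels)
k_add = block_size_additive(levels)
print("interaction block:", k_int, "x", k_int, "->", n_cov_params(k_int), "parameters")
print("additive block:   ", k_add, "x", k_add, "->", n_cov_params(k_add), "parameters")
```

The same functions can be used to see how quickly the count grows if
further factors are added to the by-subject term.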

Regarding (2): I can't help you with this one without spending a lot more
time thinking about it.  Sorry.  The fundamental problem is that when you
force correlations to zero, the predictions about what's going on at any
particular combination of factor levels then depend on the coding -- the
model is no longer invariant to the coding chosen ...
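
One way to see the coding dependence is a toy case with a single
two-level factor and an uncorrelated random intercept and slope. The
sketch below is plain Python, not lme4 code; the variance values and
names are invented for illustration. With the covariance forced to
zero, the implied between-subject variance at a level coded x is
var_intercept + x^2 * var_slope, so the 0/1 and -0.5/+0.5 codings
describe different models:

```python
def implied_variance(x, var_intercept, var_slope):
    """Between-subject variance of (b0 + b1 * x) when cov(b0, b1) is forced to 0."""
    return var_intercept + (x ** 2) * var_slope


# Made-up variance components, for illustration only.
var_intercept, var_slope = 1.0, 0.4

# Numeric value assigned to each level of the factor under each coding.
codings = {
    "treatment": {"baseline": 0.0, "other": 1.0},
    "deviation": {"baseline": -0.5, "other": 0.5},
}

implied = {
    name: {
        level: implied_variance(x, var_intercept, var_slope)
        for level, x in code.items()
    }
    for name, code in codings.items()
}

for name, by_level in implied.items():
    print(name, by_level)
```

Under treatment coding the baseline level gets only the intercept
variance and the other level can never have less; under deviation
coding both levels are constrained to the same variance. Neither
constraint is implied when the correlation is estimated freely.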