
Model is nearly unidentifiable with lmer

7 messages · Alex Fine, Chunyun Ma, Ben Bolker

#
Dear all,

This is my first post in the mailing list.
I have been running some models with lmer and came across this warning
message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model is nearly unidentifiable: very large eigenvalue

   - Rescale variables?

Here is the formula of my model (I substituted variables names with generic
names):

y ~ Intercept + Xc + Xd1 + Xd2 + Xc:Xd1 + Xc:Xd2 + Zd + Zd:Xc + Zd:Xd1 +
Zd:Xd2 + (1 + Xc + Xd1 + Xd2 | sub)

Xc: continuous var
Xd: level-1 dummy variable(s)
Zd: level-2 dummy variable

A snapshot of the data (I can also provide the full dataset if necessary):

sub  Xc  Xd1  Xd2  Zd  y
  1  36    0    0   1  1346
  1  45    0    1   1  1508
  1  72    1    0   1  1246
  1  12    1    0   1  1164
  1  24    1    0   1  1295
  1  36    1    0   1  1403

When I reduced the random effects to (1 + Xc | sub), the warning message
disappeared, but the model fit became poorer.
My question is: which variable(s) should I rescale? I'd be happy to
better understand the warning message if anyone could kindly suggest
some reference paper/book.

Thank you very much for your help!

Chunyun
#
Short answer: try rescaling all of your continuous variables.  It
can't hurt; it will only change the interpretation of the
coefficients.  If you get the same log-likelihood with the rescaled
variables, that indicates that the large eigenvalue was not actually a
problem in the first place.
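A minimal sketch of this suggestion in base R, using made-up values for the
thread's continuous predictor Xc (the lmer refit at the end is shown only in
comments, assuming lme4 is loaded and `dat` holds the full data):

```r
## scale() centers and standardizes a numeric variable:
## (x - mean(x)) / sd(x)
Xc <- c(36, 45, 72, 12, 24, 36)
Xc_s <- scale(Xc)

## Equivalent manual computation:
stopifnot(all.equal(as.numeric(Xc_s), (Xc - mean(Xc)) / sd(Xc)))

## With the real data you would refit on the rescaled predictor and
## compare log-likelihoods, e.g.:
##   m1 <- lmer(y ~ Xc * (Xd1 + Xd2) + Zd * (Xc + Xd1 + Xd2) +
##              (1 + Xc + Xd1 + Xd2 | sub), data = dat)
##   m2 <- update(m1, data = transform(dat, Xc = scale(Xc)))
##   logLik(m1); logLik(m2)  # equal if scaling was the only issue
```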

   I don't think the standard citation from the R citation file
<https://cran.r-project.org/web/packages/lme4/citation.html>, or the
book chapter I wrote recently (chapter 13 of Fox et al, Oxford
University Press 2015 -- online supplements at
<http://ms.mcmaster.ca/~bolker/R/misc/foxchapter/bolker_chap.html>)
cover rescaling in much detail. Schielzeth 2010
doi:10.1111/j.2041-210X.2010.00012.x gives a coherent argument about
the interpretive advantages of scaling.

   Ben Bolker
On Sun, Oct 11, 2015 at 6:37 PM, Chunyun Ma <mcypsy at gmail.com> wrote:
#
You might also try using sum-coding rather than (the default) dummy coding
with the categorical predictors.  Assuming the design is roughly balanced,
this is like mean-centering the categorical variables.  This will change
the interpretation of the coefficients.

Here is some further reading:  http://talklab.psy.gla.ac.uk/tvw/catpred/
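The difference between the two coding schemes can be seen in plain R, with a
made-up two-level factor (no lme4 needed):

```r
## Default treatment ("dummy") coding: the first level is the
## reference category, coded 0; the other level is coded 1.
g <- factor(c("a", "a", "b", "b"))
contrasts(g)            # a = 0, b = 1

## Sum ("deviation") coding: levels coded 1 / -1, so with a roughly
## balanced design the intercept becomes the grand mean rather than
## the mean of the reference cell.
contrasts(g) <- contr.sum(2)
contrasts(g)            # a = 1, b = -1
```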
On Sun, Oct 11, 2015 at 8:18 PM, Ben Bolker <bbolker at gmail.com> wrote:

6 days later
#
Hi dear Ben and Alex!

Thank you very much for your help and guidance! I just started reading your
references. As I was exploring the alternatives you suggested, another
question came up. This may sound silly, but I haven't found a definitive
answer online: in the lmer formula, is it necessary to convert the random
factor to a factor using factor()?  Given that I have a RM design, my
random factor will always be subject, which is numeric unless I coerce it
to a factor...

Thank you again!

Warmly,  Chunyun
On Sun, Oct 11, 2015 at 8:28 PM, Alex Fine <abfine at gmail.com> wrote:

#
Hi again dear Ben and Alex!

I scaled the continuous predictor (Xc) using scale(Xc, center = TRUE,
scale = TRUE) and the warning did disappear! Also, the log-likelihood
remains the same.
As Ben suggested, this indicates the large eigenvalue was not actually a
problem in the first place, although I still feel hazy about why the
warning appeared previously (I need to refresh my memory of what
eigenvalues are).

I also converted subject using factor(). I would love to better
understand when it is necessary to convert a variable to a factor. I did
find a post from stackoverflow
from stackoverflow
<http://stackoverflow.com/questions/21226069/when-are-factors-necessary-appropriate-in-r>
on a similar topic, but it did not mention the random factor in a lmer
formula.

Alex, I tried both dummy coding and sum coding as you suggested. I got the
same warning message with either coding scheme. I still need to carefully
read your full paper to understand what "maximal random-effects structure"
is.

To recap, my remaining questions are:

   - Can I ignore the eigenvalue warning and proceed with the raw variable
   (because the rescaling makes it hard to interpret) since the log likelihood
   does not change?
   - In using lmer for an RM design, if the random factor is
   subject/participant, should I always make sure subject has been converted
   to a factor using factor()? Any further reference would be appreciated.

Many thanks!

Warmly, Chunyun
On Sun, Oct 18, 2015 at 11:46 AM, Chunyun Ma <mcypsy at gmail.com> wrote:
Hi dear Ben and Alex!

  
  
#
lme4 always treats grouping variables (those on the right side of a
bar in a random-effects term such as (1|g) ) as factors, no matter
what their underlying type is.  This is particularly useful for models
such as  z ~ year + (1|year), which treats year as numeric (i.e.
fitting a linear regression line) in the fixed-effects part of the
model but as a categorical grouping variable (i.e. fitting year-level
deviations from the regression line) in the random-effects part of the
model.

  That said, if you have variables that are numeric in appearance but
are always going to be treated as categorical (e.g. subject IDs that
are arbitrary numeric codes), it's best practice to explicitly convert
them to factors early in your workflow.
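The coercion described above can be illustrated in plain R with made-up
subject IDs (no lme4 needed to see the point):

```r
## Numeric subject IDs are arbitrary labels, not quantities.
sub <- c(1, 1, 2, 2, 10, 10)
is.numeric(sub)        # TRUE: as numeric, 10 looks like "ten times 1"

## Coerced to a factor, the values become unordered category labels,
## which is how lme4 treats any grouping variable in (1 | sub):
sub_f <- factor(sub)
nlevels(sub_f)         # 3 distinct subjects
levels(sub_f)          # "1" "2" "10"
```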
On Sun, Oct 18, 2015 at 11:46 AM, Chunyun Ma <mcypsy at gmail.com> wrote:
#
[cc'd to r-sig-mixed-models]
On Sun, Oct 18, 2015 at 1:06 PM, Chunyun Ma <mcypsy at gmail.com> wrote:
Yes.
See previous e-mail.