
Model is nearly unidentifiable with lmer

7 messages · Alex Fine, Chunyun Ma, Ben Bolker

#
Dear all,

This is my first post in the mailing list.
I have been running some models with lmer and came across this warning
message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model is nearly unidentifiable: very large eigenvalue

   - Rescale variables?

Here is the formula of my model (I substituted variables names with generic
names):

y ~ Intercept + Xc + Xd1 + Xd2 + Xc:Xd1 + Xc:Xd2 + Zd + Zd:Xc + Zd:Xd1 +
Zd:Xd2 + (1 + Xc + Xd1 + Xd2 | sub)

Xc: continuous var
Xd: level-1 dummy variable(s)
Zd: level-2 dummy variable

A snapshot of the data (I can also provide the full dataset if necessary):

sub  Xc  Xd1  Xd2  Zd  y
  1  36    0    0   1  1346
  1  45    0    1   1  1508
  1  72    1    0   1  1246
  1  12    1    0   1  1164
  1  24    1    0   1  1295
  1  36    1    0   1  1403

When I reduced the random effects to (1 + Xc | sub), the warning message
disappeared, but the model fit became poorer.
My question is: which variable(s) should I rescale? I'd be happy to
better understand the warning message if anyone could kindly suggest
some reference paper/book.

Thank you very much for your help!

Chunyun
#
Short answer: try rescaling all of your continuous variables.  It
can't hurt; it will only change the interpretation of the
coefficients.  If you get the same log-likelihood with the rescaled
variables, that indicates that the large eigenvalue was not actually a
problem in the first place.
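A minimal sketch of this suggestion in base R, using made-up values for the
thread's continuous predictor Xc (the lmer refit at the end is shown only in
comments, assuming lme4 is loaded and `dat` holds the full data):

```r
## scale() centers and standardizes a numeric variable:
## (x - mean(x)) / sd(x)
Xc <- c(36, 45, 72, 12, 24, 36)
Xc_s <- scale(Xc)

## Equivalent manual computation:
stopifnot(all.equal(as.numeric(Xc_s), (Xc - mean(Xc)) / sd(Xc)))

## With the real data you would refit on the rescaled predictor and
## compare log-likelihoods, e.g.:
##   m1 <- lmer(y ~ Xc * (Xd1 + Xd2) + Zd * (Xc + Xd1 + Xd2) +
##              (1 + Xc + Xd1 + Xd2 | sub), data = dat)
##   m2 <- update(m1, data = transform(dat, Xc = scale(Xc)))
##   logLik(m1); logLik(m2)  # equal if scaling was the only issue
```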

   I don't think the standard citation from the R citation file
<https://cran.r-project.org/web/packages/lme4/citation.html>, or the
book chapter I wrote recently (chapter 13 of Fox et al, Oxford
University Press 2015 -- online supplements at
<http://ms.mcmaster.ca/~bolker/R/misc/foxchapter/bolker_chap.html>)
cover rescaling in much detail. Schielzeth 2010
doi:10.1111/j.2041-210X.2010.00012.x gives a coherent argument about
the interpretive advantages of scaling.

   Ben Bolker
On Sun, Oct 11, 2015 at 6:37 PM, Chunyun Ma <mcypsy at gmail.com> wrote:
#
You might also try using sum-coding rather than (the default) dummy coding
with the categorical predictors.  Assuming the design is roughly balanced,
this is like mean-centering the categorical variables.  This will change
the interpretation of the coefficients.

Here is some further reading:  http://talklab.psy.gla.ac.uk/tvw/catpred/
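The difference between the two coding schemes can be seen in plain R, with a
made-up two-level factor (no lme4 needed):

```r
## Default treatment ("dummy") coding: the first level is the
## reference category, coded 0; the other level is coded 1.
g <- factor(c("a", "a", "b", "b"))
contrasts(g)            # a = 0, b = 1

## Sum ("deviation") coding: levels coded 1 / -1, so with a roughly
## balanced design the intercept becomes the grand mean rather than
## the mean of the reference cell.
contrasts(g) <- contr.sum(2)
contrasts(g)            # a = 1, b = -1
```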
On Sun, Oct 11, 2015 at 8:18 PM, Ben Bolker <bbolker at gmail.com> wrote:

6 days later
#
Hi dear Ben and Alex!

Thank you very much for your help and guidance! I just started reading your
references. As I was exploring the alternatives you suggested, another
question came up. This may sound silly, but I haven't found a definitive
answer online: in the lmer formula, is it necessary to convert the random
factor to a factor using factor()?  Given that I have a RM design, my
random factor will always be subject, which is numeric unless I coerce it
to a factor...

Thank you again!

Warmly,  Chunyun
On Sun, Oct 11, 2015 at 8:28 PM, Alex Fine <abfine at gmail.com> wrote:

#
Hi again dear Ben and Alex!

I scaled the continuous predictor (Xc) using scale(Xc, center = TRUE,
scale = TRUE) and the warning did disappear! Also, the log-likelihood
remains the same.
As Ben suggested, this indicates the large eigenvalue was not actually a
problem in the first place, although I still feel hazy about why the
warning appeared previously (I need to refresh my memory of what
eigenvalues are).

I also converted subject using factor(). I would love to better
understand when it is necessary to convert a variable to a factor. I did
find a post from stackoverflow
from stackoverflow
<http://stackoverflow.com/questions/21226069/when-are-factors-necessary-appropriate-in-r>
on a similar topic, but it did not mention the random factor in a lmer
formula.

Alex, I tried both dummy coding and sum coding as you suggested. I got the
same warning message with either coding scheme. I still need to carefully
read your full paper to understand what "maximal random-effects structure"
is.

To recap, my remaining questions are:

   - Can I ignore the eigenvalue warning and proceed with the raw variable
   (because the rescaling makes it hard to interpret) since the log likelihood
   does not change?
   - In using lmer for an RM design, if the random factor is
   subject/participant, should I always make sure subject has been converted
   to a factor using factor()? Any further reference would be appreciated.

Many thanks!

Warmly, Chunyun
On Sun, Oct 18, 2015 at 11:46 AM, Chunyun Ma <mcypsy at gmail.com> wrote:
Hi dear Ben and Alex!

  
  
#
lme4 always treats grouping variables (those on the right side of a
bar in a random-effects term such as (1|g) ) as factors, no matter
what their underlying type is.  This is particularly useful for models
such as  z ~ year + (1|year), which treats year as numeric (i.e.
fitting a linear regression line) in the fixed-effects part of the
model but as a categorical grouping variable (i.e. fitting year-level
deviations from the regression line) in the random-effects part of the
model.

  That said, if you have variables that are numeric in appearance but
are always going to be treated as categorical (e.g. subject IDs that
are arbitrary numeric codes), it's best practice to explicitly convert
them to factors early in your workflow.
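The coercion described above can be illustrated in plain R with made-up
subject IDs (no lme4 needed to see the point):

```r
## Numeric subject IDs are arbitrary labels, not quantities.
sub <- c(1, 1, 2, 2, 10, 10)
is.numeric(sub)        # TRUE: as numeric, 10 looks like "ten times 1"

## Coerced to a factor, the values become unordered category labels,
## which is how lme4 treats any grouping variable in (1 | sub):
sub_f <- factor(sub)
nlevels(sub_f)         # 3 distinct subjects
levels(sub_f)          # "1" "2" "10"
```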
On Sun, Oct 18, 2015 at 11:46 AM, Chunyun Ma <mcypsy at gmail.com> wrote:
#
[cc'd to r-sig-mixed-models]
On Sun, Oct 18, 2015 at 1:06 PM, Chunyun Ma <mcypsy at gmail.com> wrote:
Yes.
See previous e-mail.