Dear all,

This is my first post to the mailing list. I have been running some models with lmer and came across this warning message:

In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model is nearly unidentifiable: very large eigenvalue
- Rescale variables?

Here is the formula of my model (I substituted the variable names with generic names):

y ~ Intercept + Xc + Xd1 + Xd2 + Xc:Xd1 + Xc:Xd2 + Zd + Zd:Xc + Zd:Xd1 + Zd:Xd2 + (1 + Xc + Xd1 + Xd2 | sub)

Xc: continuous variable
Xd: level-1 dummy variable(s)
Zd: level-2 dummy variable

A snapshot of the data (I can also provide the full dataset if necessary):

sub  Xc  Xd1  Xd2  Zd     y
  1  36    0    0   1  1346
  1  45    0    1   1  1508
  1  72    1    0   1  1246
  1  12    1    0   1  1164
  1  24    1    0   1  1295
  1  36    1    0   1  1403

When I reduced the random effects to (1 + Xc | sub), the warning message disappeared, but the model fit became poorer.

My question is: which variable(s) should I rescale? I'd also be happy to better understand the warning message if anyone could kindly suggest a reference paper or book.

Thank you very much for your help!

Chunyun
Model is nearly unidentifiable with lmer
7 messages · Alex Fine, Chunyun Ma, Ben Bolker
Short answer: try rescaling all of your continuous variables. It can't hurt; it will change only the interpretation of the coefficients. If you get the same log-likelihood with the rescaled variables, that indicates that the large eigenvalue was not actually a problem in the first place.

I don't think the standard citation from the R citation file <https://cran.r-project.org/web/packages/lme4/citation.html>, or the book chapter I wrote recently (chapter 13 of Fox et al., Oxford University Press 2015; online supplements at <http://ms.mcmaster.ca/~bolker/R/misc/foxchapter/bolker_chap.html>), covers rescaling in much detail. Schielzeth 2010 (doi:10.1111/j.2041-210X.2010.00012.x) gives a coherent argument for the interpretive advantages of scaling.

Ben Bolker
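For concreteness, here is a minimal sketch of what that rescaling looks like in practice, following the original poster's formula. The data-frame name `dat` is a placeholder for your own data, and the `Intercept` term is dropped (lmer includes the intercept automatically):

```r
library(lme4)

## Fit with the raw continuous predictor; this reproduces the setting
## that triggered the "very large eigenvalue" warning.
m_raw <- lmer(y ~ Xc + Xd1 + Xd2 + Xc:Xd1 + Xc:Xd2 +
                Zd + Zd:Xc + Zd:Xd1 + Zd:Xd2 +
                (1 + Xc + Xd1 + Xd2 | sub), data = dat)

## Refit with Xc centered and divided by its standard deviation.
dat$Xc_s <- as.numeric(scale(dat$Xc))
m_scaled <- lmer(y ~ Xc_s + Xd1 + Xd2 + Xc_s:Xd1 + Xc_s:Xd2 +
                   Zd + Zd:Xc_s + Zd:Xd1 + Zd:Xd2 +
                   (1 + Xc_s + Xd1 + Xd2 | sub), data = dat)

## If these match, the warning reflected numerical scaling only,
## not a substantive identifiability problem.
logLik(m_raw)
logLik(m_scaled)
```

If you want to report results on the original scale, the slope for `Xc_s` is simply `sd(dat$Xc)` times the raw-scale slope, so the two fits are interchangeable for interpretation.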
You might also try using sum-coding rather than (the default) dummy coding with the categorical predictors. Assuming the design is roughly balanced, this is like mean-centering the categorical variables. This will change the interpretation of the coefficients. Here is some further reading: http://talklab.psy.gla.ac.uk/tvw/catpred/
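A sketch of what switching the coding scheme looks like in base R (the data-frame and variable names are placeholders):

```r
## Default: treatment (dummy) coding, reference level = first level
dat$Xd1 <- factor(dat$Xd1)
contrasts(dat$Xd1)

## Sum (deviation) coding: the two levels are coded +1/-1, so with a
## roughly balanced design the intercept becomes the grand mean and
## main effects are interpretable "on average" across conditions.
contrasts(dat$Xd1) <- contr.sum(2)
```

The same fit can then be rerun unchanged; only the meaning of the coefficients (and the numerical conditioning) differs.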
Alex Fine
Ph. (336) 302-3251
web: http://internal.psychology.illinois.edu/~abfine/
6 days later
Hi dear Ben and Alex!

Thank you very much for your help and guidance! I just started reading your references. As I was exploring the alternatives you suggested, another question came up. This may sound silly, but I haven't found a definitive answer online: in the lmer formula, is it necessary to convert the random factor into a factor using factor()? Given that I have a repeated-measures design, my random factor will always be subject, which is numeric unless I force it into a factor.

Thank you again!

Warmly,
Chunyun
Hi again dear Ben and Alex!

I scaled the continuous predictor (Xc) using scale(Xc, center = TRUE, scale = TRUE) and the warning did disappear! Also, the log-likelihood remains the same. As Ben suggested, this indicates the large eigenvalue was not actually a problem in the first place, although I still feel hazy about why the warning appeared previously (I need to refresh my memory of what eigenvalues are).

I also converted the subject variable using factor(). I would love to better understand when it is necessary to convert a variable to a factor. I did find a post on Stack Overflow <http://stackoverflow.com/questions/21226069/when-are-factors-necessary-appropriate-in-r> on a similar topic, but it did not mention the random factor in an lmer formula.

Alex, I tried both dummy coding and sum coding as you suggested. I got the same warning message with either coding scheme. I still need to carefully read your full paper to understand what "maximal random-effects structure" is.

To recap, my remaining questions are:
- Can I ignore the eigenvalue warning and proceed with the raw variable (because the rescaling makes it hard to interpret), since the log-likelihood does not change?
- In using lmer for a repeated-measures design, if the random factor is subject/participant, should I always make sure subject has been converted to a factor using factor()?

Any further reference would be appreciated. Many thanks!

Warmly,
Chunyun
lme4 always treats grouping variables (those on the right side of a bar in a random-effects term such as (1|g) ) as factors, no matter what their underlying type is. This is particularly useful for models such as z ~ year + (1|year), which treats year as numeric (i.e. fitting a linear regression line) in the fixed-effects part of the model but as a categorical grouping variable (i.e. fitting year-level deviations from the regression line) in the random-effects part of the model. That said, if you have variables that are numeric in appearance but are always going to be treated as categorical (e.g. subject IDs that are arbitrary numeric codes), it's best practice to explicitly convert them to factors early in your workflow.
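A small sketch of that distinction (the names z, year, dat, and sub are placeholders, not from the original model):

```r
library(lme4)

## year enters the fixed effects as numeric (a linear trend) and the
## random effects as a grouping factor (year-level deviations from
## that trend) -- no explicit conversion needed on the random side:
m <- lmer(z ~ year + (1 | year), data = dat)

## A numeric subject ID that is really categorical is best converted
## once, early in the workflow, so it is never accidentally treated
## as a continuous covariate:
dat$sub <- factor(dat$sub)
```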
[cc'd to r-sig-mixed-models]
On Sun, Oct 18, 2015 at 1:06 PM, Chunyun Ma <mcypsy at gmail.com> wrote:
Can I ignore the eigenvalue warning and proceed with the raw variable (because the rescaling makes it hard to interpret), since the log-likelihood does not change?
Yes.
In using lmer for a repeated-measures design, if the random factor is subject/participant, should I always make sure subject has been converted to a factor using factor()? Any further reference would be appreciated.
See previous e-mail.