Hi all
I have a question about how the ordering of variable names in the
random effects structure of an lmer model leads to different results.
These are the data:
###############
x <- structure(list(OVERLAParcsine = c(0.232077682862713, 0.656060590924923,
0.546850950695944, 0.668742703202372, 0.631058840778021, 0.433445320069886,
0.315193032440724, 0.656060590924923, 0.389796296474261, 0.455598673395823,
0.500654712404588, 0.477995198518952, 0.304692654015398, 0.631058840778021,
0.489290778014116, 0.694498265626556, 0.656060590924923, 0.466765339047296,
0.411516846067488, 0.582364237868743, 0.33630357515398, 0.36826789343664,
0.489290778014116, 0.582364237868743, 0.283794109208328, 0.631058840778021,
0.33630357515398, 0.606505855213087, 0.512089752934148, 0.150568272776686,
0.273393031467473, 0.466765339047296, 0.160690652951911, 0.120289882394788,
0.558600565342801, 0.400631592701372, 0.273393031467473, 0.72081876087009,
0.444492776935819, 0.681553211563117, 0.546850950695944, 0.523598775598299,
0.273393031467473, 0.694498265626556, 0.294226837748982, 0.500654712404588,
0.411516846067488, 0.618728690672251), NAME = structure(c(1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 1L, 2L, 3L, 4L, 5L, 6L, 7L,
8L, 9L, 10L, 11L, 12L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L,
11L, 12L), .Label = c("Anne", "Aran", "Becky", "Carl", "Dominic",
"Gail", "Joel", "John", "Liz", "Nicole", "Ruth", "Warren"), class = "factor"),
PERSON = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("caretaker",
"child"), class = "factor"), PHASE = c(1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L)), class =
"data.frame", row.names = c(NA, -48L))
###############
With the following order of variables in the random-effects structure,
I get convergence warnings,
###############
summary(m1a <- lme4::lmer(OVERLAParcsine ~ 1+PERSON*PHASE +
(1+PERSON+PHASE|NAME), data=x), correlation=F) # warning
logLik(m1a) # 31.89056
###############
but not with this order:
###############
summary(m1b <- lme4::lmer(OVERLAParcsine ~ 1+PERSON*PHASE +
(1+PHASE+PERSON|NAME), data=x), correlation=F) # fine
logLik(m1b) # 31.89128
###############
Why does the order of the random effects matter when PHASE is still
considered numeric? Thanks for any input you may have,
STG
Order of terms for random slopes
4 messages · Stefan Th. Gries, Ben Bolker, Thierry Onkelinx
Thanks. This is a known issue: https://github.com/lme4/lme4/issues/449 At the risk of sounding like a stuffy old statistical fart: - yes, lme4 *should* give an identical fit either way - it's not terribly surprising that a model with 11 parameters fitted to 48 observations is numerically unstable ... - there don't seem to be any _substantive_ differences in the estimate ... cheers Ben Bolker https://github.com/lme4/lme4/issues/449
On 2018-08-29 12:31 PM, Stefan Th. Gries wrote:
Hi all
I have a question about how the ordering of variable names in the
random effects structure of an lmer model leads to different results.
These are the data:
###############
x <- structure(list(OVERLAParcsine = c(0.232077682862713, 0.656060590924923,
0.546850950695944, 0.668742703202372, 0.631058840778021, 0.433445320069886,
0.315193032440724, 0.656060590924923, 0.389796296474261, 0.455598673395823,
0.500654712404588, 0.477995198518952, 0.304692654015398, 0.631058840778021,
0.489290778014116, 0.694498265626556, 0.656060590924923, 0.466765339047296,
0.411516846067488, 0.582364237868743, 0.33630357515398, 0.36826789343664,
0.489290778014116, 0.582364237868743, 0.283794109208328, 0.631058840778021,
0.33630357515398, 0.606505855213087, 0.512089752934148, 0.150568272776686,
0.273393031467473, 0.466765339047296, 0.160690652951911, 0.120289882394788,
0.558600565342801, 0.400631592701372, 0.273393031467473, 0.72081876087009,
0.444492776935819, 0.681553211563117, 0.546850950695944, 0.523598775598299,
0.273393031467473, 0.694498265626556, 0.294226837748982, 0.500654712404588,
0.411516846067488, 0.618728690672251), NAME = structure(c(1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 1L, 2L, 3L, 4L, 5L, 6L, 7L,
8L, 9L, 10L, 11L, 12L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L,
11L, 12L), .Label = c("Anne", "Aran", "Becky", "Carl", "Dominic",
"Gail", "Joel", "John", "Liz", "Nicole", "Ruth", "Warren"), class = "factor"),
PERSON = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("caretaker",
"child"), class = "factor"), PHASE = c(1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L)), class =
"data.frame", row.names = c(NA, -48L))
###############
With the following order of variables in the random-effects structure,
I get convergence warnings,
###############
summary(m1a <- lme4::lmer(OVERLAParcsine ~ 1+PERSON*PHASE +
(1+PERSON+PHASE|NAME), data=x), correlation=F) # warning
logLik(m1a) # 31.89056
###############
but not with this order:
###############
summary(m1b <- lme4::lmer(OVERLAParcsine ~ 1+PERSON*PHASE +
(1+PHASE+PERSON|NAME), data=x), correlation=F) # fine
logLik(m1b) # 31.89128
###############
Why does the order of the random effects matter when PHASE is still
considered numeric? Thanks for any input you may have,
STG
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Thanks. This is a known issue: https://github.com/lme4/lme4/issues/449
Ohh, ok, I had googled a bit on 'order of terms', 'random effects' etc. but hadn't come across this, sorry.
- it's not terribly surprising that a model with 11 parameters fitted to 48 observations is numerically unstable ...
Absolutely, the example is from a workshop and was used only for didactic purposes, and ...
there don't seem to be any _substantive_ differences in the estimate ...
... yes, we only wanted to make sure there wasn't something superobvious but important we had missed. Thanks for the quick feedback!
Dear Stefan, IMHO you shouldn't use an overfitted model for didatic purposes. Teach students that you need a sufficiently large data set depending on the complexity of the model. Best regards, ir. Thierry Onkelinx Statisticus / Statistician Vlaamse Overheid / Government of Flanders INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance thierry.onkelinx at inbo.be Havenlaan 88 bus 73, 1000 Brussel www.inbo.be /////////////////////////////////////////////////////////////////////////////////////////// To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey /////////////////////////////////////////////////////////////////////////////////////////// <https://www.inbo.be> 2018-08-29 20:51 GMT+02:00 Stefan Th. Gries <stgries at gmail.com>:
Thanks. This is a known issue: https://github.com/lme4/lme4/issues/449
Ohh, ok, I had googled a bit on 'order of terms', 'random effects' etc. but hadn't come across this, sorry.
- it's not terribly surprising that a model with 11 parameters fitted to
48 observations is numerically unstable ... Absolutely, the example is from a workshop and was used only for didactic purposes, and ...
there don't seem to be any _substantive_ differences in the estimate ...
... yes, we only wanted to make sure there wasn't something superobvious but important we had missed. Thanks for the quick feedback!
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models