Sorry, I am somewhat late to this conversation. I am responding in this
thread because it fits my comment well, but my reply was initially
triggered by a previous thread, especially Rune Haubo's post there [1]. So I
hope it is ok to continue here.
I have a few comments and questions. For details I refer to an RPub I put
up along with this post [2]. I start with a translation between Rune
Haubo's fm's and the terminology I use in the RPub:
fm1 = y ~ 1 + f + (1 | g)               # minimal LMM (minLMM)
fm3 = y ~ 1 + f + (0 + f || g)          # zero-corr param LMM with 0 in RE (zcpLMM_RE0)
fm4 = y ~ 1 + f + (1 | g) + (1 | f:g)   # LMM w/ fixed x random factor interaction (intLMM)
fm6 = y ~ 1 + f + (1 + f | g)           # maximal LMM (maxLMM)
fm7 = y ~ 1 + f + (1 + f || g)          # zero-corr param LMM with 1 in RE (zcpLMM_RE1)
Notes: f is a fixed factor, g is a group (random) factor; fm1 to fm6 are
in Rune Haubo's post; fm7 is new (added by me). I have not used fm2 and fm5
so far (see below).
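Translated into lme4 calls for concreteness (my sketch, using the Machines data with score as y, Machine as f, and Worker as g), the first, third, and fourth of these would be fit as below. Note that the schematic || notation in fm3 and fm7 does not remove correlations among factor columns in lme4, so those two models need numeric covariates or lme4::dummy(), as discussed in the quoted post below.

```r
library("lme4")
data("Machines", package = "MEMSS")
d <- Machines  # y = score, f = Machine (3 levels), g = Worker

fm1 <- lmer(score ~ 1 + Machine + (1 | Worker), d)                         # minLMM
fm4 <- lmer(score ~ 1 + Machine + (1 | Worker) + (1 | Worker:Machine), d)  # intLMM
fm6 <- lmer(score ~ 1 + Machine + (1 + Machine | Worker), d)               # maxLMM
```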
(I) The post was triggered by the question of whether intLMM is nested under
zcpLMM. I had included this LRT in my older RPub cited in the thread, but I
stand corrected and agree with Rune Haubo that intLMM is not nested under
zcpLMM. For example, in the new RPub I show that slightly modified
Machines data yield a smaller deviance for intLMM than for zcpLMM, despite
the additional model parameter in the latter. Thanks for the critical reading.
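This kind of non-nesting can be checked numerically: fit both models by ML to the same data and compare deviances. A sketch (the modified data themselves are in the RPub [2]; here I use the unmodified Machines data, and I implement zcpLMM with indicator variables via lme4::dummy(), which is one of the specifications debated in the quoted post below):

```r
library("lme4")
data("Machines", package = "MEMSS")
d <- Machines  # the RPub uses a slightly modified version of these data

# intLMM: 2 variance components
fm_int <- lmer(score ~ Machine + (1 | Worker) + (1 | Worker:Machine),
               d, REML = FALSE)

# zcpLMM via indicator variables: 3 variance components, one parameter more
fm_zcp <- lmer(score ~ Machine +
                 (0 + dummy(Machine, "A") | Worker) +
                 (0 + dummy(Machine, "B") | Worker) +
                 (0 + dummy(Machine, "C") | Worker), d, REML = FALSE)

# If intLMM were nested under zcpLMM, deviance(fm_zcp) <= deviance(fm_int)
# would have to hold for every dataset; the modified data in [2] violate it.
c(int = deviance(fm_int), zcp = deviance(fm_zcp))
```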
(II) Here are Rune Haubo's sequences (left, resorted), augmented with my
translation (right):
(1) fm6 -> fm5 -> fm4 -> fm1 # maxLMM_RE1 -> fm5 -> intLMM -> minLMM
(2) fm6 -> fm5 -> fm4 -> fm2 # maxLMM_RE1 -> fm5 -> intLMM -> fm2
(3) fm6 -> fm5 -> fm3 -> fm2 # maxLMM_RE1 -> fm5 -> zcpLMM_RE0 -> fm2
and here are the sequences I came up with (left), augmented with the
translation into RH's fm's:
(1) maxLMM_RE1 -> intLMM -> minLMM # fm6 -> fm4 -> fm1
(3) maxLMM_RE0 -> zcpLMM_RE0 # fm6 -> fm3
(4) maxLMM_RE1 -> zcpLMM_RE1 -> minLMM # fm6 -> fm7 -> fm1 (new sequence)
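The new sequence (4) would be tested with a chain of LRTs. A self-contained sketch for the Machines data; since (1 + f || g) with a factor does not literally remove the correlations in lme4, I implement zcpLMM_RE1 here with numeric contrast covariates (cf. m2a in the quoted post), which is exactly the specification under debate:

```r
library("lme4")
data("Machines", package = "MEMSS")
d <- Machines
c1 <- model.matrix(~ Machine, d)[, 2]  # the two contrast columns for the
c2 <- model.matrix(~ Machine, d)[, 3]  # 3-level factor Machine

fm6 <- lmer(score ~ Machine + (1 + Machine | Worker), d)  # maxLMM_RE1
fm7 <- lmer(score ~ Machine + (1 | Worker) +
              (0 + c1 | Worker) + (0 + c2 | Worker), d)   # zcpLMM_RE1 (as m2a)
fm1 <- lmer(score ~ Machine + (1 | Worker), d)            # minLMM

anova(fm1, fm7, fm6)  # LRTs down the sequence; anova() refits with ML
```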
(III) I have questions about fm2 and fm5.
fm2: fm2 redefines the levels of the group factor (e.g., in the cake
data there are 45 groups in fm2 compared to 15 in the other models). Why is
fm2 nested under fm3 and fm6? It looks to me as if fm2 includes the f:g
interaction without the g main effect (relative to fm4). This looks
like an interesting model; I would appreciate a bit more conceptual support
for its interpretation in the model hierarchy.
fm5: fm5 specifies 4 variance components (VCs), but the factor has only
3 levels, so to me it looks like there is redundancy built into the
model. In support of this intuition, for the cake data one of the VCs is
estimated as 0. However, for the Machines data the model was not
degenerate, so I am not sure. In other words, if the factor levels are A,
B, and C, and the two contrasts are c1 and c2, I thought one could specify
either (1 + c1 + c2) or (0 + A + B + C). fm5 specifies (1 + A + B + C),
which is rank deficient in the fixed-effects part, but not necessarily in
the random-effects term. What am I missing here?
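The rank-deficiency point can be checked directly on the model matrices; a small illustration of my own, using the Machines data:

```r
data("Machines", package = "MEMSS")
d <- Machines

# Three parameterizations of the 3-level factor Machine:
X1 <- model.matrix(~ 1 + Machine, d)  # intercept + 2 contrasts: 3 columns
X2 <- model.matrix(~ 0 + Machine, d)  # 3 indicator columns, no intercept
X3 <- model.matrix(~ 1 + Machine, d,  # intercept + 3 indicators: 4 columns
                   contrasts.arg = list(Machine = contrasts(d$Machine,
                                                            contrasts = FALSE)))

qr(X1)$rank  # 3
qr(X2)$rank  # 3
qr(X3)$rank  # 3, although X3 has 4 columns: rank deficient
```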
[1] https://stat.ethz.ch/pipermail/r-sig-mixed-models/2018q2/026775.html
[2] http://rpubs.com/Reinhold/391027
Best,
Reinhold Kliegl
On Thu, May 17, 2018 at 12:43 PM, Maarten Jung <Maarten.Jung at mailbox.tu-
dresden.de> wrote:
Dear list,
When one wants to specify a lmer model including variance components but
no correlation parameters for categorical predictors (factors), afaik one
has to convert the factors to numeric covariates or use lme4::dummy(). Until
recently I thought m2a (or, equivalently, m2b using the double-bar syntax)
would be the correct way to specify such a zero-correlation parameter model.
But in this thread [1] Rune Haubo Bojesen Christensen pointed out that this
model does not make sense to him. Instead, he suggests m3 as an alternative
model.
I think this is a *highly relevant difference* for everyone who uses
factors in lmer, and therefore I'm bringing up this issue again. But maybe
I'm mistaken and just don't get what is quite obvious to more experienced
mixed modelers.
Please note that the question is on CrossValidated [2], but some consider
it off-topic there and I don't think there will be an answer any time soon.
So here are my questions:
How should one specify an LMM without correlation parameters for factors,
and what are the differences between m2a and m3?
Is there a preferred model for a model comparison with m4 (this model is
discussed here [3])?
library("lme4")
data("Machines", package = "MEMSS")
d <- Machines
contrasts(d$Machine) # default coding: contr.sum
m1 <- lmer(score ~ Machine + (Machine | Worker), d)
c1 <- model.matrix(m1)[, 2]
c2 <- model.matrix(m1)[, 3]
m2a <- lmer(score ~ Machine + (1 | Worker) + (0 + c1 | Worker) + (0 + c2 | Worker), d)
m2b <- lmer(score ~ Machine + (c1 + c2 || Worker), d)
VarCorr(m2a)
 Groups   Name        Std.Dev.
 Worker   (Intercept) 5.24354
 Worker.1 c1          2.58446
 Worker.2 c2          3.71504
 Residual             0.96256
m3 <- lmer(score ~ Machine + (1 | Worker) +
             (0 + dummy(Machine, "A") | Worker) +
             (0 + dummy(Machine, "B") | Worker) +
             (0 + dummy(Machine, "C") | Worker), d)
VarCorr(m3)
 Groups   Name                Std.Dev.
 Worker   (Intercept)         3.78595
 Worker.1 dummy(Machine, "A") 1.94032
 Worker.2 dummy(Machine, "B") 5.87402
 Worker.3 dummy(Machine, "C") 2.84547
 Residual                     0.96158
m4 <- lmer(score ~ Machine + (1 | Worker) + (1 | Worker:Machine), d)
[1] https://stat.ethz.ch/pipermail/r-sig-mixed-models/2018q2/026775.html
[2] https://stats.stackexchange.com/q/345842/136579
[3] https://stats.stackexchange.com/q/304374/136579
Best regards,
Maarten