Sorry, I am somewhat late to this conversation. I am responding in this
thread because it fits my comment well, but my reply was initially
triggered by a previous thread, especially Rune Haubo's post there [1]. So I
hope it is ok to continue here.
I have a few comments and questions. For details I refer to an RPub I put
up along with this post [2]. I start with a translation between Rune
Haubo's fm's and the terminology I use in the RPub:
fm1 = y ~ 1 + f + (1 | g)               # minimal LMM (minLMM)
fm3 = y ~ 1 + f + (0 + f || g)          # zero-corr param LMM with 0 in RE (zcpLMM_RE0)
fm4 = y ~ 1 + f + (1 | g) + (1 | f:g)   # LMM w/ fixed x random factor interaction (intLMM)
fm6 = y ~ 1 + f + (1 + f | g)           # maximal LMM (maxLMM)
fm7 = y ~ 1 + f + (1 + f || g)          # zero-corr param LMM with 1 in RE (zcpLMM_RE1)
Notes: f is a fixed factor, g is a group (random) factor; fm1 to fm6 are
in Rune Haubo's post; fm7 is new (added by me). I have not used fm2 and fm5
so far (see below).
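Translated into lme4 calls for concreteness (my sketch, using the Machines data with score as y, Machine as f, and Worker as g), the first, third, and fourth of these would be fit as below. Note that the schematic || notation in fm3 and fm7 does not remove correlations among factor columns in lme4, so those two models need numeric covariates or lme4::dummy(), as discussed in the quoted post below.

```r
library("lme4")
data("Machines", package = "MEMSS")
d <- Machines  # y = score, f = Machine (3 levels), g = Worker

fm1 <- lmer(score ~ 1 + Machine + (1 | Worker), d)                         # minLMM
fm4 <- lmer(score ~ 1 + Machine + (1 | Worker) + (1 | Worker:Machine), d)  # intLMM
fm6 <- lmer(score ~ 1 + Machine + (1 + Machine | Worker), d)               # maxLMM
```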
(I) The post was triggered by the question of whether intLMM is nested under
zcpLMM. I had included this LRT in my older RPub cited in the thread, but I
stand corrected and agree with Rune Haubo that intLMM is not nested under
zcpLMM. For example, in the new RPub I show that slightly modified
Machines data yield a smaller deviance for intLMM than for zcpLMM, despite
the additional model parameter in the latter. Thanks for the critical reading.
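This kind of non-nesting can be checked numerically: fit both models by ML to the same data and compare deviances. A sketch (the modified data themselves are in the RPub [2]; here I use the unmodified Machines data, and I implement zcpLMM with indicator variables via lme4::dummy(), which is one of the specifications debated in the quoted post below):

```r
library("lme4")
data("Machines", package = "MEMSS")
d <- Machines  # the RPub uses a slightly modified version of these data

# intLMM: 2 variance components
fm_int <- lmer(score ~ Machine + (1 | Worker) + (1 | Worker:Machine),
               d, REML = FALSE)

# zcpLMM via indicator variables: 3 variance components, one parameter more
fm_zcp <- lmer(score ~ Machine +
                 (0 + dummy(Machine, "A") | Worker) +
                 (0 + dummy(Machine, "B") | Worker) +
                 (0 + dummy(Machine, "C") | Worker), d, REML = FALSE)

# If intLMM were nested under zcpLMM, deviance(fm_zcp) <= deviance(fm_int)
# would have to hold for every dataset; the modified data in [2] violate it.
c(int = deviance(fm_int), zcp = deviance(fm_zcp))
```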
(II) Here are Rune Haubo's sequences (left, resorted), augmented with my
translation (right):
(1) fm6 -> fm5 -> fm4 -> fm1 # maxLMM_RE1 -> fm5 -> intLMM -> minLMM
(2) fm6 -> fm5 -> fm4 -> fm2 # maxLMM_RE1 -> fm5 -> intLMM -> fm2
(3) fm6 -> fm5 -> fm3 -> fm2 # maxLMM_RE1 -> fm5 -> zcpLMM_RE0 -> fm2
and here are the sequences I came up with (left), augmented with the
translation into RH's fm's:
(1) maxLMM_RE1 -> intLMM -> minLMM # fm6 -> fm4 -> fm1
(3) maxLMM_RE0 -> zcpLMM_RE0 # fm6 -> fm3
(4) maxLMM_RE1 -> zcpLMM_RE1 -> minLMM # fm6 -> fm7 -> fm1 (new sequence)
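The new sequence (4) would be tested with a chain of LRTs. A self-contained sketch for the Machines data; since (1 + f || g) with a factor does not literally remove the correlations in lme4, I implement zcpLMM_RE1 here with numeric contrast covariates (cf. m2a in the quoted post), which is exactly the specification under debate:

```r
library("lme4")
data("Machines", package = "MEMSS")
d <- Machines
c1 <- model.matrix(~ Machine, d)[, 2]  # the two contrast columns for the
c2 <- model.matrix(~ Machine, d)[, 3]  # 3-level factor Machine

fm6 <- lmer(score ~ Machine + (1 + Machine | Worker), d)  # maxLMM_RE1
fm7 <- lmer(score ~ Machine + (1 | Worker) +
              (0 + c1 | Worker) + (0 + c2 | Worker), d)   # zcpLMM_RE1 (as m2a)
fm1 <- lmer(score ~ Machine + (1 | Worker), d)            # minLMM

anova(fm1, fm7, fm6)  # LRTs down the sequence; anova() refits with ML
```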
(III) I have questions about fm2 and fm5.
fm2: fm2 redefines the levels of the group factor (e.g., in the cake
data there are 45 groups in fm2 compared to 15 in the other models). Why is
fm2 nested under fm3 and fm6? It looks to me as if fm2 includes the f:g
interaction without the g main effect (relative to fm4). This looks
like an interesting model; I would appreciate a bit more conceptual support
for its interpretation in the model hierarchy.
fm5: fm5 specifies 4 variance components (VCs), but the factor has only
3 levels, so to me it looks like there is redundancy built into the
model. In support of this intuition, for the cake data one of the VCs is
estimated as 0. However, for the Machines data the model was not
degenerate, so I am not sure. In other words, if the factor levels are A,
B, and C, and the two contrasts are c1 and c2, I thought one could specify
either (1 + c1 + c2) or (0 + A + B + C). fm5 specifies (1 + A + B + C),
which is rank deficient in the fixed-effects part, but not necessarily in
the random-effects term. What am I missing here?
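The rank-deficiency point can be checked directly on the model matrices; a small illustration of my own, using the Machines data:

```r
data("Machines", package = "MEMSS")
d <- Machines

# Three parameterizations of the 3-level factor Machine:
X1 <- model.matrix(~ 1 + Machine, d)  # intercept + 2 contrasts: 3 columns
X2 <- model.matrix(~ 0 + Machine, d)  # 3 indicator columns, no intercept
X3 <- model.matrix(~ 1 + Machine, d,  # intercept + 3 indicators: 4 columns
                   contrasts.arg = list(Machine = contrasts(d$Machine,
                                                            contrasts = FALSE)))

qr(X1)$rank  # 3
qr(X2)$rank  # 3
qr(X3)$rank  # 3, although X3 has 4 columns: rank deficient
```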
[1] https://stat.ethz.ch/pipermail/r-sig-mixed-models/2018q2/026775.html
[2] http://rpubs.com/Reinhold/391027
Best,
Reinhold Kliegl
On Thu, May 17, 2018 at 12:43 PM, Maarten Jung <Maarten.Jung at mailbox.tu-
dresden.de> wrote:
Dear list,
When one wants to specify a lmer model including variance components but
no correlation parameters for categorical predictors (factors), afaik one
has to convert the factors to numeric covariates or use lme4::dummy(). Until
recently I thought m2a (or, equivalently, m2b using the double-bar syntax)
would be the correct way to specify such a zero-correlation parameter model.
But in this thread [1] Rune Haubo Bojesen Christensen pointed out that this
model does not make sense to him. Instead, he suggests m3 as an alternative
model.
I think this is a *highly relevant difference* for everyone who uses
factors in lmer, and therefore I'm bringing up this issue again. But maybe
I'm mistaken and just don't get what is quite obvious to more experienced
mixed modelers.
Please note that the question is on CrossValidated [2], but some consider
it off-topic there and I don't think there will be an answer any time soon.
So here are my questions:
How should one specify an LMM without correlation parameters for factors,
and what are the differences between m2a and m3?
Is there a preferred model for a model comparison with m4 (this model is
discussed here [3])?
library("lme4")
data("Machines", package = "MEMSS")
d <- Machines
contrasts(d$Machine) # default coding: contr.sum
m1 <- lmer(score ~ Machine + (Machine | Worker), d)
c1 <- model.matrix(m1)[, 2]
c2 <- model.matrix(m1)[, 3]
m2a <- lmer(score ~ Machine + (1 | Worker) + (0 + c1 | Worker) + (0 + c2 | Worker), d)
m2b <- lmer(score ~ Machine + (c1 + c2 || Worker), d)
VarCorr(m2a)
 Groups   Name        Std.Dev.
 Worker   (Intercept) 5.24354
 Worker.1 c1          2.58446
 Worker.2 c2          3.71504
 Residual             0.96256
m3 <- lmer(score ~ Machine + (1 | Worker) +
             (0 + dummy(Machine, "A") | Worker) +
             (0 + dummy(Machine, "B") | Worker) +
             (0 + dummy(Machine, "C") | Worker), d)
VarCorr(m3)
 Groups   Name                Std.Dev.
 Worker   (Intercept)         3.78595
 Worker.1 dummy(Machine, "A") 1.94032
 Worker.2 dummy(Machine, "B") 5.87402
 Worker.3 dummy(Machine, "C") 2.84547
 Residual                     0.96158
m4 <- lmer(score ~ Machine + (1 | Worker) + (1 | Worker:Machine), d)
[1] https://stat.ethz.ch/pipermail/r-sig-mixed-models/2018q2/026775.html
[2] https://stats.stackexchange.com/q/345842/136579
[3] https://stats.stackexchange.com/q/304374/136579
Best regards,
Maarten