Skip to content
Prev 632 / 20628 Next

Another question about model specification

On Tue, Mar 4, 2008 at 11:46 AM, Martin Eklund
<martin.eklund at farmbio.uu.se> wrote:
I started writing a kind of a grumpy response to this inquiry because
the documentation for lmer specifically states that terms in the
formula are separated by '+' symbols, so exploring the possible
effects of other types of operators is not likely to be productive.

Nevertheless I will concede that specifying an interaction between a
fixed-effects factor and a random factor is not a transparent
operation.  There are two ways that such an interaction can be
specified.

First, an interaction between a fixed factor and a random factor is a
random effect.  That is, randomness propagates through interactions.
To specify an interaction between a fixed factor and a random factor
you must consider not only the form of the model matrix that results
but also the form of the variance-covariance matrix for the resulting
random effects.  You can specify the interaction as

lmer(weight ~ treatment + (1| family) + (1|family:treatment))

or as

lmer(weight ~ treatment + (treatment | family))

In the first specification you will estimate two variance components,
one for family and one for the family:treatment interaction.  This
assumes that the random effects for each level of the treatment within
a family are independent and have the same variance.

In the second specification you estimate a vector of random effects
for each family where the length of the vector is the number of
treatments.  Let's say there are m treatments.  Then you will estimate
a full m by m variance-covariance matrix for the random effects.  This
becomes problematic when m is large.

That is, the first model is simpler than the second - in fact, it is a
submodel of the second model.

Here is an example using the (probably fictitious) Machines data from
the MEMSS package.
'data.frame':	54 obs. of  3 variables:
 $ Worker : Factor w/ 6 levels "1","2","3","4",..: 1 1 1 2 2 2 3 3 3 4 ...
 $ Machine: Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 1 1 1 1 ...
 $ score  : num  52 52.8 53.1 51.8 52.8 53.1 60 60.2 58.4 51.1 ...
Linear mixed model fit by REML
Formula: score ~ Machine + (1 | Worker) + (1 | Worker:Machine)
   Data: Machines
   AIC   BIC logLik deviance REMLdev
 227.7 239.6 -107.8    225.5   215.7
Random effects:
 Groups         Name        Variance Std.Dev.
 Worker         (Intercept) 22.85892 4.78110
 Worker:Machine (Intercept) 13.90845 3.72940
 Residual                    0.92465 0.96159
Number of obs: 54, groups: Worker, 6; Worker:Machine, 18

Fixed effects:
            Estimate Std. Error t value
(Intercept)   52.356      2.486  21.062
MachineB       7.967      2.177   3.660
MachineC      13.917      2.177   6.393

Correlation of Fixed Effects:
         (Intr) MachnB
MachineB -0.438
MachineC -0.438  0.500
Linear mixed model fit by REML
Formula: score ~ Machine + (Machine | Worker)
   Data: Machines
   AIC   BIC logLik deviance REMLdev
 228.3 248.2 -104.2    216.6   208.3
Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Worker   (Intercept) 16.64051 4.07928
          MachineB    34.54670 5.87764   0.484
          MachineC    13.61398 3.68971  -0.365  0.297
 Residual              0.92463 0.96158
Number of obs: 54, groups: Worker, 6

Fixed effects:
            Estimate Std. Error t value
(Intercept)   52.356      1.681  31.151
MachineB       7.967      2.421   3.291
MachineC      13.917      1.540   9.037

Correlation of Fixed Effects:
         (Intr) MachnB
MachineB  0.463
MachineC -0.374  0.301
Data: Machines
Models:
Mm1: score ~ Machine + (1 | Worker) + (1 | Worker:Machine)
Mm2: score ~ Machine + (Machine | Worker)
      Df     AIC     BIC  logLik  Chisq Chi Df Pr(>Chisq)
Mm1.p  6  237.47  249.40 -112.73
Mm2.p 10  236.61  256.50 -108.31 8.8515      4    0.06492