Skip to content

Another question about model specification

2 messages · Martin Eklund, Douglas Bates

#
Hello again,

Thank you very much for the quick reply on my earlier question!  
Unfortunately, I still have problems to fully understand how to  
specify models in lme4. Specifically, how do you specify interactions  
between factors?

Consider the example data below (also attached for convenience).  
'treatment' and 'family' are factors, 'treatment' is fixed and  
'family' is random. Fitting the models

(i)	lmer(weight ~ treatment + (1|family))
(ii)	lmer(weight ~ treatment * (1|family))
(iii)	lmer(weight ~ (treatment + (1|family))^2)
(iv)	lmer(weight ~ treatment : (1|family))

produce exactly the same results. I assume that this is the expected  
behavior, but to me it is a bit confusing. If we pretend that  
'family' is a fixed factor and use lm to fit the models above we get  
different results (or, to be precise, (ii) = (iii) ? (i) ? (iv)),  
which is the expected behavior to me. So, my question really boils  
down to: if I want to fit the model

weight ~ treatment + (1|family) + (an interaction term between  
treatment and (1|family))

how would I specify that? And in general, how are interactions  
specified?

Thank you!

Best regards,

Martin.

==========
R commands
==========
data <- read.delim(file="/path/to/test.txt", sep="\t", header=TRUE)
data$treatment <- as.factor(data$treatment)
data$family <- as.factor(data$family)
attach(data)
lmer(weight ~ treatment + (1|family))
lmer(weight ~ treatment * (1|family))
lmer(weight ~ (treatment + (1|family))^2)
lmer(weight ~ treatment : (1|family))

===========
Example data
===========
treatment	family	weight
2	1	0.099
3	1	0.099
1	1	0.105
1	1	0.112
1	1	0.113
1	1	0.114
1	1	0.119
3	1	0.121
1	1	0.123
1	1	0.124
3	1	0.130
3	1	0.130
2	1	0.142
2	1	0.143
2	1	0.144
2	1	0.164
2	2	0.086
1	2	0.097
1	2	0.097
1	2	0.105
1	2	0.106
2	2	0.107
1	2	0.109
2	2	0.113
2	2	0.114
1	2	0.114
3	2	0.115
2	2	0.132
3	2	0.141
2	2	0.131

?

========================================
Martin Eklund
PhD Student
Department of Pharmaceutical Biosciences
Uppsala University, Sweden
Ph: +46-18-4714281
========================================
#
On Tue, Mar 4, 2008 at 11:46 AM, Martin Eklund
<martin.eklund at farmbio.uu.se> wrote:
I started writing a kind of a grumpy response to this inquiry because
the documentation for lmer specifically states that terms in the
formula are separated by '+' symbols, so exploring the possible
effects of other types of operators is not likely to be productive.

Nevertheless I will concede that specifying an interaction between a
fixed-effects factor and a random factor is not a transparent
operation.  There are two ways that such an interaction can be
specified.

First, an interaction between a fixed factor and a random factor is a
random effect.  That is, randomness propagates through interactions.
To specify an interaction between a fixed factor and a random factor
you must consider not only the form of the model matrix that results
but also the form of the variance-covariance matrix for the resulting
random effects.  You can specify the interaction as

lmer(weight ~ treatment + (1| family) + (1|family:treatment))

or as

lmer(weight ~ treatment + (treatment | family))

In the first specification you will estimate two variance components,
one for family and one for the family:treatment interaction.  This
assumes that the random effects for each level of the treatment within
a family are independent and have the same variance.

In the second specification you estimate a vector of random effects
for each family where the length of the vector is the number of
treatments.  Let's say there are m treatments.  Then you will estimate
a full m by m variance-covariance matrix for the random effects.  This
becomes problematic when m is large.

That is, the first model is simpler than the second - in fact, it is a
submodel of the second model.

Here is an example using the (probably fictitious) Machines data from
the MEMSS package.
'data.frame':	54 obs. of  3 variables:
 $ Worker : Factor w/ 6 levels "1","2","3","4",..: 1 1 1 2 2 2 3 3 3 4 ...
 $ Machine: Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 1 1 1 1 ...
 $ score  : num  52 52.8 53.1 51.8 52.8 53.1 60 60.2 58.4 51.1 ...
Linear mixed model fit by REML
Formula: score ~ Machine + (1 | Worker) + (1 | Worker:Machine)
   Data: Machines
   AIC   BIC logLik deviance REMLdev
 227.7 239.6 -107.8    225.5   215.7
Random effects:
 Groups         Name        Variance Std.Dev.
 Worker         (Intercept) 22.85892 4.78110
 Worker:Machine (Intercept) 13.90845 3.72940
 Residual                    0.92465 0.96159
Number of obs: 54, groups: Worker, 6; Worker:Machine, 18

Fixed effects:
            Estimate Std. Error t value
(Intercept)   52.356      2.486  21.062
MachineB       7.967      2.177   3.660
MachineC      13.917      2.177   6.393

Correlation of Fixed Effects:
         (Intr) MachnB
MachineB -0.438
MachineC -0.438  0.500
Linear mixed model fit by REML
Formula: score ~ Machine + (Machine | Worker)
   Data: Machines
   AIC   BIC logLik deviance REMLdev
 228.3 248.2 -104.2    216.6   208.3
Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Worker   (Intercept) 16.64051 4.07928
          MachineB    34.54670 5.87764   0.484
          MachineC    13.61398 3.68971  -0.365  0.297
 Residual              0.92463 0.96158
Number of obs: 54, groups: Worker, 6

Fixed effects:
            Estimate Std. Error t value
(Intercept)   52.356      1.681  31.151
MachineB       7.967      2.421   3.291
MachineC      13.917      1.540   9.037

Correlation of Fixed Effects:
         (Intr) MachnB
MachineB  0.463
MachineC -0.374  0.301
Data: Machines
Models:
Mm1: score ~ Machine + (1 | Worker) + (1 | Worker:Machine)
Mm2: score ~ Machine + (Machine | Worker)
      Df     AIC     BIC  logLik  Chisq Chi Df Pr(>Chisq)
Mm1.p  6  237.47  249.40 -112.73
Mm2.p 10  236.61  256.50 -108.31 8.8515      4    0.06492