Another question about model specification

On Tue, Mar 4, 2008 at 11:46 AM, Martin Eklund
Hello again,
 Thank you very much for the quick reply on my earlier question!
 Unfortunately, I still have problems to fully understand how to
 specify models in lme4. Specifically, how do you specify interactions
 between factors?
 Consider the example data below (also attached for convenience).
 'treatment' and 'family' are factors, 'treatment' is fixed and
 'family' is random. Fitting the models
 (i)     lmer(weight ~ treatment + (1|family))
 (ii)    lmer(weight ~ treatment * (1|family))
 (iii)   lmer(weight ~ (treatment + (1|family))^2)
 (iv)    lmer(weight ~ treatment : (1|family))
 produce exactly the same results. I assume that this is the expected
 behavior, but to me it is a bit confusing.
I started writing a kind of a grumpy response to this inquiry because
the documentation for lmer specifically states that terms in the
formula are separated by '+' symbols, so exploring the possible
effects of other types of operators is not likely to be productive.

Nevertheless I will concede that specifying an interaction between a
fixed-effects factor and a random factor is not a transparent
operation.  There are two ways that such an interaction can be
specified.

First, an interaction between a fixed factor and a random factor is a
random effect.  That is, randomness propagates through interactions.
To specify an interaction between a fixed factor and a random factor
you must consider not only the form of the model matrix that results
but also the form of the variance-covariance matrix for the resulting
random effects.  You can specify the interaction as

lmer(weight ~ treatment + (1| family) + (1|family:treatment))

or as

lmer(weight ~ treatment + (treatment | family))

In the first specification you will estimate two variance components,
one for family and one for the family:treatment interaction.  This
assumes that the random effects for each level of the treatment within
a family are independent and have the same variance.

In the second specification you estimate a vector of random effects
for each family where the length of the vector is the number of
treatments.  Let's say there are m treatments.  Then you will estimate
a full m by m variance-covariance matrix for the random effects.  This
becomes problematic when m is large.

That is, the first model is simpler than the second - in fact, it is a
submodel of the second model.

Here is an example using the (probably fictitious) Machines data from
the MEMSS package.
data(Machines, package = "MEMSS")
str(Machines)
'data.frame':	54 obs. of  3 variables:
 $ Worker : Factor w/ 6 levels "1","2","3","4",..: 1 1 1 2 2 2 3 3 3 4 ...
 $ Machine: Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 1 1 1 1 ...
 $ score  : num  52 52.8 53.1 51.8 52.8 53.1 60 60.2 58.4 51.1 ...
(Mm1 <- lmer(score ~ Machine + (1|Worker) + (1|Worker:Machine), Machines))
Linear mixed model fit by REML
Formula: score ~ Machine + (1 | Worker) + (1 | Worker:Machine)
   Data: Machines
   AIC   BIC logLik deviance REMLdev
 227.7 239.6 -107.8    225.5   215.7
Random effects:
 Groups         Name        Variance Std.Dev.
 Worker         (Intercept) 22.85892 4.78110
 Worker:Machine (Intercept) 13.90845 3.72940
 Residual                    0.92465 0.96159
Number of obs: 54, groups: Worker, 6; Worker:Machine, 18

Fixed effects:
            Estimate Std. Error t value
(Intercept)   52.356      2.486  21.062
MachineB       7.967      2.177   3.660
MachineC      13.917      2.177   6.393

Correlation of Fixed Effects:
         (Intr) MachnB
MachineB -0.438
MachineC -0.438  0.500
(Mm2 <- lmer(score ~ Machine + (Machine|Worker), Machines))
Linear mixed model fit by REML
Formula: score ~ Machine + (Machine | Worker)
   Data: Machines
   AIC   BIC logLik deviance REMLdev
 228.3 248.2 -104.2    216.6   208.3
Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Worker   (Intercept) 16.64051 4.07928
          MachineB    34.54670 5.87764   0.484
          MachineC    13.61398 3.68971  -0.365  0.297
 Residual              0.92463 0.96158
Number of obs: 54, groups: Worker, 6

Fixed effects:
            Estimate Std. Error t value
(Intercept)   52.356      1.681  31.151
MachineB       7.967      2.421   3.291
MachineC      13.917      1.540   9.037

Correlation of Fixed Effects:
         (Intr) MachnB
MachineB  0.463
MachineC -0.374  0.301
anova(Mm2, Mm1)
Data: Machines
Models:
Mm1: score ~ Machine + (1 | Worker) + (1 | Worker:Machine)
Mm2: score ~ Machine + (Machine | Worker)
      Df     AIC     BIC  logLik  Chisq Chi Df Pr(>Chisq)
Mm1.p  6  237.47  249.40 -112.73
Mm2.p 10  236.61  256.50 -108.31 8.8515      4    0.06492
 If we pretend that
 'family' is a fixed factor and use lm to fit the models above we get
 different results (or, to be precise, (ii) = (iii) ? (i) ? (iv)),
 which is the expected behavior to me. So, my question really boils
 down to: if I want to fit the model

 weight ~ treatment + (1|family) + (an interaction term between
 treatment and (1|family))

 how would I specify that? And in general, how are interactions
 specified?

 Thank you!

 Best regards,

 Martin.

 ==========
 R commands
 ==========
 data <- read.delim(file="/path/to/test.txt", sep="\t", header=TRUE)
 data$treatment <- as.factor(data$treatment)
 data$family <- as.factor(data$family)
 attach(data)
 lmer(weight ~ treatment + (1|family))
 lmer(weight ~ treatment * (1|family))
 lmer(weight ~ (treatment + (1|family))^2)
 lmer(weight ~ treatment : (1|family))

 ===========
 Example data
 ===========
 treatment       family  weight
 2       1       0.099
 3       1       0.099
 1       1       0.105
 1       1       0.112
 1       1       0.113
 1       1       0.114
 1       1       0.119
 3       1       0.121
 1       1       0.123
 1       1       0.124
 3       1       0.130
 3       1       0.130
 2       1       0.142
 2       1       0.143
 2       1       0.144
 2       1       0.164
 2       2       0.086
 1       2       0.097
 1       2       0.097
 1       2       0.105
 1       2       0.106
 2       2       0.107
 1       2       0.109
 2       2       0.113
 2       2       0.114
 1       2       0.114
 3       2       0.115
 2       2       0.132
 3       2       0.141
 2       2       0.131

 ?

 ========================================
 Martin Eklund
 PhD Student
 Department of Pharmaceutical Biosciences
 Uppsala University, Sweden
 Ph: +46-18-4714281
 ========================================

 _______________________________________________
 R-sig-mixed-models at r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Another question about model specification

Thread (2 messages)