
alternative interaction representations

4 messages · Sébastien Bihorel, Reinhold Kliegl

#
Hi,

## With the CO2 data, suppose we want to build a LME model of 'uptake' with
## 'conc' (continuous) and want to know whether there is a change in slope
## at conc=300, with random slopes for plants

library(lme4)     # lmer()
library(lattice)  # xyplot()

CO2new <- subset(CO2, Type == "Quebec" & Treatment == "nonchilled")
CO2new <- within(CO2new, {
    ## The more intuitive way to set up the interaction is to first define
    ## a factor breaking up the 'conc' predictor
    stage1 <- cut(conc, breaks=c(floor(min(conc)), 300,
                          ceiling(max(conc))),
                  labels=c("pre", "post"), include.lowest=TRUE)
    ## Alternative, direct representation of interaction
    stage2 <- ifelse(conc > 300, conc - 300, 0)
    ## We center conc at 300 for interpreting intercept here
    conc <- conc - 300
})
str(CO2new)
xyplot(uptake ~ conc, data=CO2new, groups=Plant, type="b")

## Consider a model with fixed effects for intercept, conc, and varying
## slopes.  Using the more intuitive representation:

(fm1 <- lmer(uptake ~ conc + conc:stage1 + (conc:stage1 | Plant), data=CO2new))

## And using the direct representation of the interaction

(fm2 <- lmer(uptake ~ conc + stage2 + (conc + stage2 | Plant), data=CO2new))

## In this simple case, it doesn't seem to matter which representation is
## used.  In other models that also need an interaction with another
## factor, say Type, the factor-based representation requires a 3-way
## interaction with conc, whereas the direct representation needs only a
## 2-way interaction.  The latter therefore seems to allow for a simpler
## model (which may impact lmer performance).
##
## Is this a fair comparison of using direct representations of
## interactions compared to the more natural factor-based representations?
## Overall, is it preferable to use one rather than the other?

Cheers,
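
One way to compare the two representations is to inspect the design matrices that R builds for each formula. The sketch below assumes the subset and derived columns are constructed exactly as in the post; the names `d`, `X1`, `X2`, `Z1` and `Z2` are introduced only for this illustration. If the columns coincide, or span the same space, the two formulas would be reparametrizations of one model rather than two different models.

```r
## Rebuild the data as in the post (CO2 ships with R's datasets package)
d <- subset(CO2, Type == "Quebec" & Treatment == "nonchilled")
d <- within(d, {
    stage1 <- cut(conc, breaks=c(floor(min(conc)), 300, ceiling(max(conc))),
                  labels=c("pre", "post"), include.lowest=TRUE)
    stage2 <- ifelse(conc > 300, conc - 300, 0)
    conc <- conc - 300
})

## Fixed-effects design matrices for the two representations
X1 <- model.matrix(~ conc + conc:stage1, data=d)
X2 <- model.matrix(~ conc + stage2, data=d)
colnames(X1)
colnames(X2)
## Largest absolute difference between corresponding entries
max(abs(as.vector(X1) - as.vector(X2)))

## Columns implied by the two random-effects terms (grouping ignored here)
Z1 <- model.matrix(~ conc:stage1, data=d)
Z2 <- model.matrix(~ conc + stage2, data=d)
## If both sets of columns span the same space, stacking them side by
## side should not increase the rank
c(qr(Z1)$rank, qr(Z2)$rank, qr(cbind(Z1, Z2))$rank)
```

If the fixed-effects columns turn out to be identical and the stacked rank does not grow, any difference between fm1 and fm2 would come from how the random-effects term is parametrized, not from the model being fitted.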
#
# This representation fits two linear slopes, one below and one after
# conc = 300, splicing them at 0:
# (the comparisons against 300 assume CO2new$conc is still on its
# original scale, i.e. before the centering step conc <- conc - 300)
CO2new$conc1 <- ifelse(CO2new$conc < 300, CO2new$conc - 300, 0)
CO2new$conc2 <- ifelse(CO2new$conc > 300, CO2new$conc - 300, 0)

# Basic LMM
print(LMM <- lmer(uptake ~ conc1 + conc2 + (1 | Plant), data=CO2new), cor=FALSE)
# ... the linear effect of conc on uptake is significant below 300, but
#     no longer significant above 300
# ... the intercept estimates the uptake at conc=300
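
How this two-slope coding relates to the conc + stage2 coding from the first message can be sketched with ordinary least squares. This is only an illustration: lm() stands in for lmer() so that the block needs nothing beyond base R, the random effects for Plant are ignored, and the names `d`, `concc`, `m_two` and `m_change` are made up for this example.

```r
## Original-scale conc (not yet centered), same subset as in the thread
d <- subset(CO2, Type == "Quebec" & Treatment == "nonchilled")
d$conc1  <- ifelse(d$conc < 300, d$conc - 300, 0)  # slope below the knot
d$conc2  <- ifelse(d$conc > 300, d$conc - 300, 0)  # slope above the knot
d$concc  <- d$conc - 300                           # conc centered at the knot
d$stage2 <- d$conc2                                # change-in-slope term

m_two    <- lm(uptake ~ conc1 + conc2, data=d)     # two separate slopes
m_change <- lm(uptake ~ concc + stage2, data=d)    # slope + change in slope

## Both codings describe the same broken line, so the fitted values agree
all.equal(fitted(m_two), fitted(m_change))

## Slope above the knot: conc2 in one coding, concc + stage2 in the other
coef(m_two)[["conc2"]]
coef(m_change)[["concc"]] + coef(m_change)[["stage2"]]
```

In the second coding the stage2 coefficient is the difference between the two slopes, which is the quantity the original question asks about; in the two-slope coding that difference has to be formed as a contrast of conc2 and conc1.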

# To test whether there is significant between-plant variance in
# slopes below and above conc = 300:
# Varying-slopes LMM
print(LMM.conc.1 <- lmer(uptake ~ conc1 + conc2 + (1 | Plant) +
                         (0 + conc1 | Plant), data=CO2new), cor=FALSE)
print(LMM.conc.2 <- lmer(uptake ~ conc1 + conc2 + (1 | Plant) +
                         (0 + conc2 | Plant), data=CO2new), cor=FALSE)
#print(LMM.conc.1.2 <- lmer(uptake ~ conc1 + conc2 + (1 | Plant) +
#                           (0 + conc1 | Plant) + (0 + conc2 | Plant),
#                           data=CO2new), cor=FALSE)
#print(LMM.conc.12 <- lmer(uptake ~ conc1 + conc2 +
#                          (1 + conc1 + conc2 | Plant), data=CO2new), cor=FALSE)

anova(LMM, LMM.conc.1)
anova(LMM, LMM.conc.2)
# Apparently there is not enough information in the data to test the
# between-slope variance.

Reinhold Kliegl
On Sat, Aug 21, 2010 at 7:07 PM, Sebastian P. Luque <spluque at gmail.com> wrote:
#
On Sun, 22 Aug 2010 08:47:47 +0200,
Reinhold Kliegl <reinhold.kliegl-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org> wrote:

            
Thanks Reinhold, it seems as if these piecewise linear splines with one
or two knots are easier to fit and interpret than using a factor.

Cheers,
2 days later
#
On Sun, 22 Aug 2010 14:39:05 -0500,
"Sebastian P. Luque" <spluque-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org> wrote:

            
Actually I'm having some trouble interpreting the intercept
coefficients, although it may have more to do with piecewise functions
in general, rather than with mixed modelling (so apologies for the
slightly off-topic message).

In a similar case, Fitzmaurice et al. (in Applied Longitudinal
Analysis) give the following model for the mean response Yij of
subject i at time Tij, with subjects randomized to 2 groups (G):

B1 + B2*Tij + B3*(Tij-a) + B4*Gi + B5*Tij*Gi + B6*(Tij-a)*Gi

Here Yij is modelled as a linear spline with a single knot at 'a'.
The term (Tij-a) equals Tij-a when Tij > a and zero otherwise.  The 'B's
are the regression coefficients.  Expressing the model in terms of the
two lines for the baseline group (Gi = 0):

B1 + B2*Tij                            (Tij <= a)
(B1 - B3) + (B2 + B3)*Tij              (Tij > a)

In the last case (Tij > a), I can't see why the intercept = (B1 - B3).
B3 is a slope, so why is it playing a role (and subtracted from B1) in
the intercept there?

Thanks for any light on this.
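
The algebra behind the second line can be checked numerically. The sketch below assumes the truncated term is as described above (Tij - a when Tij > a, zero otherwise), considers only the baseline group, and uses made-up coefficient values; the function name `spline_mean` is likewise introduced just for this illustration. Substituting Tij - a for the truncated term gives B1 + B2*Tij + B3*(Tij - a) = (B1 - B3*a) + (B2 + B3)*Tij, so the slope B3 enters the intercept multiplied by the knot location a.

```r
## Mean response for the baseline group (Gi = 0), linear spline with knot a
spline_mean <- function(t, a, B1, B2, B3) {
    B1 + B2 * t + B3 * pmax(t - a, 0)
}

## Arbitrary illustrative values, not taken from any data set
a <- 4; B1 <- 10; B2 <- 1.5; B3 <- -0.5
t_post <- c(5, 6, 8)   # time points after the knot

## Line implied by expanding the truncated term for Tij > a
expanded <- (B1 - B3 * a) + (B2 + B3) * t_post
all.equal(spline_mean(t_post, a, B1, B2, B3), expanded)

## The intercept written as (B1 - B3) gives the same line only if a = 1
quoted <- (B1 - B3) + (B2 + B3) * t_post
isTRUE(all.equal(spline_mean(t_post, a, B1, B2, B3), quoted))
```

With a = 1 the two expressions coincide, so one possible reading is that the factor a was dropped from the intercept when the formula was written out, or that it comes from an example with the knot at 1.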