Message-ID: <BANLkTi=2oqvsvQQnQP5qs0ELMtYqhCHnzQ@mail.gmail.com>
Date: 2011-06-20T19:35:07Z
From: Gabor Grothendieck
Subject: correlation of random intercepts and slopes
In-Reply-To: <BANLkTikvBhPTd_4c+t6boA5WnTgtWDuMsA@mail.gmail.com>
On Mon, Jun 20, 2011 at 2:56 PM, Titus von der Malsburg
<malsburg at gmail.com> wrote:
> Dear list!
>
> I have a situation where lmer reports much higher
> correlations of random slopes and intercepts than I would
> expect. ?I generated fake data for a fictitious experiment
> using the function new.df shown below. ?The experiment has a
> within-factor 'cond' with levels 1 and 2, a set of items and
> subjects which are both measured repeatedly. ?There are
> random intercepts and slopes for items and subjects. ?On top
> there's sampling error. ?The dependent measure 'rt' is some
> kind of reaction time. ?Example:
>
> ?> d <- new.df(cond1.rt=600, effect.size=10, sdev=10,
> ? ? ? ? ? ? ? ?nsubj=49, sdev.int.subj=10, sdev.slp.subj=10,
> ? ? ? ? ? ? ? ?nitems=16, sdev.int.items=10,
> ? ? ? ? ? ? ? ?sdev.slp.items=10)
>
> cond1.rt is the average reaction time in condition 1.
> cond1.rt+effect.size is the rt in condition 2, sdev
> the sampling error, the rest concerns number of subjects and
> items, and the sdevs of their random intercepts and slopes.
> This is the head of the data frame:
>
> ?> head(d)
> ? ?subj item cond ? ? ? rt re.int.subj re.slp.subj re.int.item re.slp.item
> ?1 ? ?1 ? ?1 ? ?1 628.4764 ? ?12.15373 ? ?4.138659 ? -0.932447 ? ?6.619795
> ?2 ? ?1 ? ?1 ? ?2 603.1140 ? ?12.15373 ? -4.138659 ? -0.932447 ? -6.619795
> ?3 ? ?1 ? ?2 ? ?1 617.8467 ? ?12.15373 ? ?4.138659 ? -2.980553 ? ?2.471397
> ?4 ? ?1 ? ?2 ? ?2 606.5777 ? ?12.15373 ? -4.138659 ? -2.980553 ? -2.471397
> ?5 ? ?1 ? ?3 ? ?1 622.8397 ? ?12.15373 ? ?4.138659 ? -2.133049 ? ?5.101761
> ?6 ? ?1 ? ?3 ? ?2 616.4011 ? ?12.15373 ? -4.138659 ? -2.133049 ? -5.101761
>
> The random intercepts and slopes have small correlations:
>
> ?> with(subset(d, cond==1), cor(re.int.subj, re.slp.subj))
> ?[1] 0.2650591
> ?> with(subset(d, cond==1), cor(re.int.item, re.slp.item))
> ?[1] -0.3144838
>
> But when I roll an lmer model, I see much higher correlations
> of slopes and intercepts:
>
> ?> lmer(rt~cond+(cond|item)+(cond|subj), d)
> ?Linear mixed model fit by REML
> ?Formula: rt ~ cond + (cond | item) + (cond | subj)
> ? ? Data: d
> ? ? AIC ? BIC logLik deviance REMLdev
> ? 12050 12098 ?-6016 ? ?12040 ? 12032
> ?Random effects:
> ? Groups ? Name ? ? ? ?Variance Std.Dev. Corr
> ? subj ? ? (Intercept) 349.735 ?18.7012
> ? ? ? ? ? ?cond ? ? ? ? 82.905 ? 9.1052 ?-0.839
> ? item ? ? (Intercept) 201.250 ?14.1863
> ? ? ? ? ? ?cond ? ? ? ? 81.835 9.0463 ?-0.732
> ? Residual ? ? ? ? ? ? ?98.758 ? 9.9377
> ?Number of obs: 1568, groups: subj, 49; item, 16
>
> ?Fixed effects:
> ? ? ? ? ? ? ?Estimate Std. Error t value
> ?(Intercept) ?594.053 ? ? ?4.511 ?131.70
> ?cond ? ? ? ? ? 8.120 ? ? ?2.657 ? ?3.06
>
> ?Correlation of Fixed Effects:
> ? ? ? (Intr)
> ?cond -0.765
>
>
> When I xyplot the data, I can't see any correlation of
> slopes and intercepts. ?Could anybody please enlighten
> me as to why lmer reports these high correlations?
>
> Many thanks!
>
> ?Titus
>
>
> ?new.df <- function(cond1.rt=600, effect.size=10, sdev=10,
> ? ? ? ? ? ? ? ? ? ? sdev.int.subj=10, sdev.slp.subj=10, nsubj=10,
> ? ? ? ? ? ? ? ? ? ? sdev.int.items=10, sdev.slp.items=10, nitems=10) {
>
> ? ?ncond <- 2
>
> ? ?subj <- rep(1:nsubj, each=nitems*ncond)
> ? ?item <- rep(1:nitems, nsubj, each=ncond)
> ? ?cond <- rep(0:1, nsubj*nitems)
> ? ?err ?<- rnorm(nsubj*nitems*ncond, 0, sdev)
> ? ?d ? ?<- data.frame(subj=subj, item=item, cond=cond+1, err=err)
>
> ? ?# Adding random intercepts and slopes for subjects:
> ? ?re.int.subj ? <- rnorm(nsubj, 0, sdev.int.subj)
> ? ?d$re.int.subj <- rep(re.int.subj, each=nitems*ncond)
> ? ?re.slp.subj ? <- rnorm(nsubj, 0, sdev.slp.subj)
> ? ?d$re.slp.subj <- rep(re.slp.subj, each=nitems*ncond) * (cond - 0.5)
>
> ? ?# Adding random intercepts and slopes for items:
> ? ?re.int.item ? <- rnorm(nitems, 0, sdev.int.items)
> ? ?d$re.int.item <- rep(re.int.item, nsubj, each=ncond)
> ? ?re.slp.item ? <- rnorm(nitems, 0, sdev.int.items)
> ? ?d$re.slp.item <- rep(re.slp.item, nsubj, each=ncond) * (cond - 0.5)
>
> ? ?d$rt <- (cond1.rt + cond*effect.size
> ? ? ? ? ? ? + d$re.int.subj + d$re.slp.subj
> ? ? ? ? ? ? + d$re.int.item + d$re.slp.item
> ? ? ? ? ? ? + d$err)
>
> ? ?d
> ?}
>
Try centering cond:
d <- transform(d, cond = cond - mean(cond))
--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com