Skip to content

within subjects, 3 conditions, 3 lines

5 messages · Stanislav Aggerwal, marKo, ONKELINX, Thierry

#
I am trying to figure out mixed models using a simulated example.

I get the essentially the same parameter values and SEs whether I do:
1. lm() ignoring the within subjects nature of the expt
2. lmer()
3. fit a line to each subject's data and find mean and SE of parameters
across subjects

At this point, my feeling is I should just do (1) or possibly (3) above.
Please tell me why I should do anything else. Maybe my example is somehow
weird? It is typical of expts I deal with.

- cond is a factor with levels a, b, and c
- x is a continuous indep var
- each condition has slope and intercept
- each subject receives all conditions; each subject has own slope
and intercept randomly varying about population value

==========================

set.seed(1234)
nsubj<-30
z<-expand.grid(x=1:5,cond=c("a","b","c"),subj=as.factor(1:nsubj))
x<-z$x
cond<-z$cond
subj<-z$subj

a1<- rep(1+rnorm(nsubj,mean=0,sd=.1),5)  #inta=1
a2<- rep(1+rnorm(nsubj,mean=0,sd=.1),5)  #intb-inta=2-1=1
a3<- rep(2+rnorm(nsubj,mean=0,sd=.1),5)  #intc-inta=3-1=2
a4<- rep(3+rnorm(nsubj,mean=0,sd=.1),5)  #slopea=3
a5<- rep(-1+rnorm(nsubj,mean=0,sd=.1),5) #slopeb-slopea
a6<- rep(-2+rnorm(nsubj,mean=0,sd=.1),5) #slopec-slopea

y<- (cond=="a")*(a1 + a4*x) + (cond=="b")*(a1+a2 + (a4+a5)*x) +
  (cond=="c")*(a1+a3 + (a4+a6)*x) + rnorm(5*3*nsubj,mean=0,sd=.5)

#### ignore within subjects design and treat as independent errors
fit1<-lm(y~cond*x)
summary(fit)

## mixed model using lmer()
library(lme4)
fit2<-lmer(y~x*cond + (x*cond|subj))
summary(fit2)

## lme from nlme package
library(nlme)
fit3<-lme(y~x*cond, random=~(x*cond)|subj)
#does not converge

##fit line to each subject; get mean and se of params across subjects
beta<-matrix(0,ncol=6,nrow=nsubj)
for(i in 1:nsubj)
  {
  f<-lm(y[subj==i]~cond[subj==i]*x[subj==i])
  beta[i,]<-coef(f)
  }

colMeans(beta)
apply(beta,2,sd)/sqrt(nsubj)
# 0.9730283  0.3577780  1.8709841  3.0270420 -0.8013505 -1.9840602
# 0.10121264 0.17860228 0.15895837 0.02820369 0.04765495 0.03834946

======================

This last method makes the most sense to me. Getting 95% CIs and p-values
for the params would be straightforward.

Thanks very much for your help.

Stan
1 day later
#
I will attempt an answer to my own question. Possibly the reason I got the
same answer with the 3 methods is that my simulation was wrong. It
simulated the responses as:

y<- (cond=="a")*(a1 + a4*x) + (cond=="b")*(a1+a2 + (a4+a5)*x) +
  (cond=="c")*(a1+a3 + (a4+a6)*x) + rnorm(5*3*nsubj,mean=0,sd=.5)

This contains the intercepts and slopes which are unique to each subject
and randomly sampled from some population values. However the term:
rnorm(5*3*nsubj,mean=0,sd=.5)
is wrong because it says the errors are independent. Instead they should be
correlated within each subject.

I will give that a go.

Stan
#
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I think that you have correctly identified the problem.

Cheers,

Marko
On 01/21/2015 09:16 AM, Stanislav Aggerwal wrote:
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQIcBAEBAgAGBQJUv2OgAAoJEJcj4KySkkQsUbgP/jhOyGuwPpAyS8I6uFElxztg
sQdiGeUaVKjutcJJsHisezEcRxcf84k09hWnR1bnc/fuFQgDqS58qnoUa2JJqiZh
BrnCjIV7Cj///DLC0MLzyzvmZLq21medi6vdUwdAF5gc1GOJHoA9IWgoEACucqEc
SbpWiaqUWpNK2rCOZ3crTEBZNHEHO72qKBq670L5zDTfRfx4IXsjGMyY68Lf/Tz/
jECjDGM3byO2pw5KginLWl3hA+0ChcKNNmti3rY8p9JVZ8c4iEDrT0Hjvx/VN8zn
JPcRSJFR263haozcENATtAgUmBCT66j+IfB/7aQgAOthTdEWWjtnnnVxTbDvBkik
tVB617PXdUL5l7kcmxZYcwiigOQerJrEDcUhoL8SMtL54JluWg9MZIZxXmjGHMlm
mtrvbv6rfU3XzB80ck8zm6SSgPwP1IYoqJZ7QWBx0XTuU057tjyG8rA6F/jkYc3t
Uoe0NwY1Ox9Q62paJQ4sysyf2dgyW2RYRVVQuK1UjlH9ZfmETtq36J6yyEXjuhir
ztVsKeofNgRKo4FrmhcOLcBPofzQBnzTI1RjuqCxcbt9vsr4Wwn1uvup20OqDYb7
VKbzoKr8HR26YDS7Oc3vUiOzjszNEALnKAWFoUfW/BQRQGzIZl0lR1g52SzRphL5
u/shRpd5ZYjTJUgT9IQ+
=/B+Q
-----END PGP SIGNATURE-----
#
I have revised the simulation. Now each subject has autocorrelated errors.
It is a substantial autocorrelation. Still, the 3 methods, including the
obviously "wrong" and simple lm() method, all give the same parameter
estimates and SEs.

Thanks for any insights. Should I really just ignore the repeated measures
design and use lm() for problems like this? (Funny how lme() cannot fit
these data)

Cheers,
Stan

The simulation:

######### previous example assumed indep errors. Now use errors that
# are correlated within each subject
set.seed(1234)
nsubj<-30
z<-expand.grid(x=1:5,cond=c("a","b","c"),subj=as.factor(1:nsubj))
x<-z$x
cond<-z$cond
subj<-z$subj
y<-rep(0,nrow(z))

a1<- rep(1+rnorm(nsubj,mean=0,sd=.1),each=5)  #inta=1
a2<- rep(1+rnorm(nsubj,mean=0,sd=.1),each=5)  #intb-inta=2-1=1
a3<- rep(2+rnorm(nsubj,mean=0,sd=.1),each=5)  #intc-inta=3-1=2
a4<- rep(3+rnorm(nsubj,mean=0,sd=.1),each=5)  #slopea=3
a5<- rep(-1+rnorm(nsubj,mean=0,sd=.1),each=5) #slopeb-slopea
a6<- rep(-2+rnorm(nsubj,mean=0,sd=.1),each=5) #slopec-slopea

y[cond=="a"]<-a1    + a4*x[cond=="a"]
y[cond=="b"]<-a1+a2 + (a4+a5)*x[cond=="b"]
y[cond=="c"]<-a1+a3 + (a4+a6)*x[cond=="c"]

autocorrelated errors
for(i in 1:nsubj)
  {
  y[subj==i]<-y[subj==i] +

as.numeric(filter(rnorm(5*3,mean=0,sd=.5),filter=0.5,method="recursive"))
  }

plot(x[cond=="a"],y[cond=="a"],xlab="x", ylab="y")
points(x[cond=="b"],y[cond=="b"],col='red')
points(x[cond=="c"],y[cond=="c"],col='green')

## ignore within subjects design and treat as independent errors
fit<-lm(y~cond*x)
summary(fit)
#            Estimate Std. Error t value Pr(>|t|)
#(Intercept)  1.00355    0.13628   7.364 8.75e-13 ***
#condb        0.84320    0.19273   4.375 1.51e-05 ***
#condc        2.03207    0.19273  10.544  < 2e-16 ***
#x            2.99975    0.04109  73.005  < 2e-16 ***
#condb:x     -0.95972    0.05811 -16.516  < 2e-16 ***
#condc:x     -1.98477    0.05811 -34.156  < 2e-16 ***

b<-coef(fit)
abline(b[1],b[4])  #line for cond a
abline(b[1]+b[2],b[4]+b[5],col='red')  #line for cond b
abline(b[1]+b[3],b[4]+b[6],col='green')  #line for cond c

## use Linear Mixed Effects to take account of repeated measures
# lmer() from lme4 package
library(lme4)
fit2<-lmer(y~x*cond + (x*cond|subj))
summary(fit2)
# no p-values. Same param estimates and ses as fit()
#Fixed effects:
#            Estimate Std. Error t value
#(Intercept)  1.00355    0.12632    7.94
#x            2.99975    0.03463   86.62
#condb        0.84320    0.21619    3.90
#condc        2.03207    0.18783   10.82
#x:condb     -0.95972    0.05558  -17.27
#x:condc     -1.98477    0.04671  -42.50

## lme from nlme package
#library(nlme)
#fit3<-lme(y~x*cond, random=~(x*cond)|subj)
##does not converge

##fit line to each subject; get mean and sd of params across subjects

beta<-matrix(0,ncol=6,nrow=nsubj)
for(i in 1:nsubj)
  {
  f<-lm(y[subj==i]~cond[subj==i]*x[subj==i])
  beta[i,]<-coef(f)
  }

m<-colMeans(beta)
s<-apply(beta,2,sd)/sqrt(nsubj)
z<-m/s
pval<-pnorm(abs(z),lower.tail=F)*2 #two-tailed
#pval<-pt(abs(z),df=nsubj-1,lower.tail=F)*2 #two-tailed
cbind(m,s,z,pval)
#              m          s          z         pval
#[1,]  1.0035461 0.12250006   8.192209 2.564748e-16
#[2,]  0.8432030 0.21703074   3.885178 1.022551e-04
#[3,]  2.0320664 0.19155457  10.608290 2.726940e-26
#[4,]  2.9997546 0.03330906  90.058219 0.000000e+00
#[5,] -0.9597171 0.05593306 -17.158318 5.446869e-66
#[6,] -1.9847668 0.05045750 -39.335421 0.000000e+00
#
Dear Stan,

You get the right answer for the wrong question. Because you have a bug in the code that simulates the data. You added a second noise term instead of a random intercept. Hint: set all parameters not related to the random intercept to 0 in a1 to a6. Then a1 to a6 should be equal.

Best regards,

Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
+ 32 2 525 02 51
+ 32 54 43 61 85
Thierry.Onkelinx at inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey


-----Oorspronkelijk bericht-----
Van: R-sig-mixed-models [mailto:r-sig-mixed-models-bounces at r-project.org] Namens Stanislav Aggerwal
Verzonden: woensdag 21 januari 2015 11:07
Aan: r-sig-mixed-models at r-project.org
Onderwerp: Re: [R-sig-ME] within subjects, 3 conditions, 3 lines

I have revised the simulation. Now each subject has autocorrelated errors.
It is a substantial autocorrelation. Still, the 3 methods, including the obviously "wrong" and simple lm() method, all give the same parameter estimates and SEs.

Thanks for any insights. Should I really just ignore the repeated measures design and use lm() for problems like this? (Funny how lme() cannot fit these data)

Cheers,
Stan

The simulation:

######### previous example assumed indep errors. Now use errors that # are correlated within each subject
set.seed(1234)
nsubj<-30
z<-expand.grid(x=1:5,cond=c("a","b","c"),subj=as.factor(1:nsubj))
x<-z$x
cond<-z$cond
subj<-z$subj
y<-rep(0,nrow(z))

a1<- rep(1+rnorm(nsubj,mean=0,sd=.1),each=5)  #inta=1
a2<- rep(1+rnorm(nsubj,mean=0,sd=.1),each=5)  #intb-inta=2-1=1
a3<- rep(2+rnorm(nsubj,mean=0,sd=.1),each=5)  #intc-inta=3-1=2
a4<- rep(3+rnorm(nsubj,mean=0,sd=.1),each=5)  #slopea=3
a5<- rep(-1+rnorm(nsubj,mean=0,sd=.1),each=5) #slopeb-slopea
a6<- rep(-2+rnorm(nsubj,mean=0,sd=.1),each=5) #slopec-slopea

y[cond=="a"]<-a1    + a4*x[cond=="a"]
y[cond=="b"]<-a1+a2 + (a4+a5)*x[cond=="b"]
y[cond=="c"]<-a1+a3 + (a4+a6)*x[cond=="c"]

autocorrelated errors
for(i in 1:nsubj)
  {
  y[subj==i]<-y[subj==i] +

as.numeric(filter(rnorm(5*3,mean=0,sd=.5),filter=0.5,method="recursive"))
  }

plot(x[cond=="a"],y[cond=="a"],xlab="x", ylab="y")
points(x[cond=="b"],y[cond=="b"],col='red')
points(x[cond=="c"],y[cond=="c"],col='green')

## ignore within subjects design and treat as independent errors
fit<-lm(y~cond*x)
summary(fit)
#            Estimate Std. Error t value Pr(>|t|)
#(Intercept)  1.00355    0.13628   7.364 8.75e-13 ***
#condb        0.84320    0.19273   4.375 1.51e-05 ***
#condc        2.03207    0.19273  10.544  < 2e-16 ***
#x            2.99975    0.04109  73.005  < 2e-16 ***
#condb:x     -0.95972    0.05811 -16.516  < 2e-16 ***
#condc:x     -1.98477    0.05811 -34.156  < 2e-16 ***

b<-coef(fit)
abline(b[1],b[4])  #line for cond a
abline(b[1]+b[2],b[4]+b[5],col='red')  #line for cond b
abline(b[1]+b[3],b[4]+b[6],col='green')  #line for cond c

## use Linear Mixed Effects to take account of repeated measures # lmer() from lme4 package
library(lme4)
fit2<-lmer(y~x*cond + (x*cond|subj))
summary(fit2)
# no p-values. Same param estimates and ses as fit() #Fixed effects:
#            Estimate Std. Error t value
#(Intercept)  1.00355    0.12632    7.94
#x            2.99975    0.03463   86.62
#condb        0.84320    0.21619    3.90
#condc        2.03207    0.18783   10.82
#x:condb     -0.95972    0.05558  -17.27
#x:condc     -1.98477    0.04671  -42.50

## lme from nlme package
#library(nlme)
#fit3<-lme(y~x*cond, random=~(x*cond)|subj) ##does not converge

##fit line to each subject; get mean and sd of params across subjects

beta<-matrix(0,ncol=6,nrow=nsubj)
for(i in 1:nsubj)
  {
  f<-lm(y[subj==i]~cond[subj==i]*x[subj==i])
  beta[i,]<-coef(f)
  }

m<-colMeans(beta)
s<-apply(beta,2,sd)/sqrt(nsubj)
z<-m/s
pval<-pnorm(abs(z),lower.tail=F)*2 #two-tailed
#pval<-pt(abs(z),df=nsubj-1,lower.tail=F)*2 #two-tailed
cbind(m,s,z,pval)
#              m          s          z         pval
#[1,]  1.0035461 0.12250006   8.192209 2.564748e-16
#[2,]  0.8432030 0.21703074   3.885178 1.022551e-04
#[3,]  2.0320664 0.19155457  10.608290 2.726940e-26 #[4,]  2.9997546 0.03330906  90.058219 0.000000e+00 #[5,] -0.9597171 0.05593306 -17.158318 5.446869e-66 #[6,] -1.9847668 0.05045750 -39.335421 0.000000e+00


_______________________________________________
R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Disclaimer<https://www.inbo.be/nl/disclaimer-mailberichten-van-het-inbo>