Different random effects variances for outcomes and groups

Tue, Sep 25, 2007 6:05 AM

Hi David,

Thanks for your reply.  I comment on your various questions below.

David Afshartous wrote:

Yes, you are right here (apart from a typo - there are 100 x 2 x 3 x 2 = 
1200 observations).  The groups do contain different children.

I think you're right here and this corresponds with what you say below.  
I think my model should be instead written as

y_{ir}(t) = \beta_{0r} + \beta_{1r} group_i + \beta_{2r} t + \beta_{3r}
group_i : t  + group_i * b_{irp} + \epsilon_{ir}(t),

where p=0,1 represents the value of 'group_i' and is distributed as you 
suggest.  This seems to make more sense than everyone having a random 
intercept from one distribution and group 1 having an additional one.  
This way they are completely separate.

Yes, I should have considered this more carefully.  Certainly the random 
effects for coordinates 1 and 2 for a particular individual should have 
been simulated from a multivariate normal distribution rather than from 
two separate normals.  I don't think that there is so much a group 
component and a coordinate component, as a coordinate component that 
differs depending on which group the individual comes from.  So, for 
example, instead of the randeff statement below we could have (repeated 
across times and inserted in the correct positions for each group):

randeff.gp0 <- mvrnorm(n.subj/2, c(0,0), matrix(c(100,50,50,400), 
nc=2))  # coordinates 1 and 2 (respectively) for individuals in group 0 #
randeff.gp1 <- mvrnorm(n.subj/2, c(0,0), matrix(c(225,70,70,625), 
nc=2))  # coordinates 1 and 2 (respectively) for individuals in group 1 #

I think the group itself should only be a fixed effect because, while 
individuals are randomly sampled from two populations of cleft-lip 
patients and healthy controls, the groups are fixed (not sampled from a 
population of groups).  What do you think?  This has been one point that 
I have wondered long and hard about.

I do agree and I'm not sure now what model the first lmer2 model 
statement is actually fitting, because I would have expected at least 
some of the coord1:group and coord2:group random effects to be zero.  
The second lmer2 statement gives more sensible answers and corresponds 
with the rewritten model above.  Also when I check the estimated random 
effects variances and covariances against the actual values, there is 
good correspondence so I think I will proceed with this parameterisation.

Many thanks for your help,
Sarah




On 9/19/07 8:28 AM, "Sarah Barry" <sarah at stats.gla.ac.uk> wrote:

Dear all,

I wonder if someone could give me some insight on coding lmer.  I have
facial shape data on children in two groups at four time points (3,6,12
and 24 months).  Each child has a set of coordinate positions measured
on their face at each time point (the set of coordinates is the same
across individuals and times).  Take coordinates 1 and 2 only for now
(reproducible code at the bottom of this email for simulated data).

If I plot the trends for coordinates 1 and 2 for each individual over
time, there is a different amount of variance amongst the individuals
(at least in the intercept, and maybe in the slope) for the two
coordinates and also within the two groups, with group 1 (cleft) having
higher variation than group 0 (control).  I want to allow for these
sources of variation in the model.  The other thing is that I would
expect coordinate positions within an individual to be correlated so I
also want to allow for this.  The model, therefore, would be (for
coordinate r=1,2 measured on individual i at time t, group_i an
indicator variable taking value one for group 1 and zero otherwise):

y_{ir}(t) = \beta_{0r} + \beta_{1r} group_i + \beta_{2r} t + \beta_{3r}
group_i : t  + b_{ir0} + group_i * b_{ir1} + \epsilon_{ir}(t),

where \epsilon_{ir}(t) ~ N(0, \sigma2) and the random effect b_{irp} ~
N(0, \sigma_{rp}2), for p=0,1.   I think the following code is
appropriate (model 1):

lmer2(y~-1+coord1+coord2+coord1:(time+group+time:group)+coord2:(time+group+tim
e:group) 
+ (0+coord1+coord2+coord1:group+coord2:group|ID), data=simdata)

where coord1 and coord2 are indicator variables for coordinates 1 and 2,
respectively, time is continuous and group is an indicator variable
taking value one for the cleft group and zero for the controls.  Does
the random effects part make sense?  I'm especially unsure about
allowing correlations between all of the random effects terms, although
I think that's it's appropriate under this parameterisation because each
person has a value for both coordinates 1 and 2, and the group effect is
additional. 


An alternative parameterisation is (model 2):

lmer2(y~-1+coord1+coord2+coord1:(time+group+time:group)+coord2:(time+group+tim
e:group) 
+ (0+gp0:coord1+gp0:coord2|ID)+(0+coord1:group+coord2:group|ID),
data=simdata),

where gp0 is an indicator variable taking value one if the individual is
in group 0 and zero otherwise.  It seems to me that this should be
equivalent to model 1, but it doesn't appear to be (perhaps this just
comes down to fewer correlations estimated in model 2).

If a correlation between random effects is estimated to be 1 or -1, is
this generally because the model is over-parameterised?

Reproducible code is below.

set.seed(100)
n.subj <- 200
n.times <- 3
n.coords <- 2
simdata <- data.frame(coord1=c(rep(1,n.subj*n.times),
rep(0,n.subj*n.times)))
simdata$coord2 <- c(rep(0,n.subj*n.times), rep(1,n.subj*n.times))
simdata$coord <- ifelse(simdata$coord1==1, 1, 2)
simdata$ID <- rep(rep(1:n.subj, each=n.times),2)
simdata$time <- rep(1:n.times, n.subj*n.coords)
simdata$group <- rep(c(1,0,1,0), each=n.subj*n.times/2)
simdata$gp0 <- 1-simdata$group
simdata$y <- rep(NA, dim(simdata)[1])
randeff <- c(rep(rnorm(n.subj/2, 0, 15),each=n.times),
rep(rnorm(n.subj/2, 0, 10), each=n.times), rep(rnorm(n.subj/2, 0, 25),
each=n.times), rep(rnorm(n.subj/2, 0, 20), each=3))
for (i in 1:dim(simdata)[1]) simdata$y[i] <- rnorm(1,
randeff[i]+simdata$time[i]+simdata$coord[i]+10*simdata$group[i], 16)

lmer2(y~-1+coord1+coord2+coord1:(time+group+time:group)+coord2:(time+group+tim
e:group) 
+ (0+coord1+coord2+coord1:group+coord2:group|ID), data=simdata)
lmer2(y~-1+coord1+coord2+coord1:(time+group+time:group)+coord2:(time+group+tim
e:group) 
+ (0+gp0:coord1+gp0:coord2|ID)+(0+coord1:group+coord2:group|ID),
data=simdata),

Many thanks for any help.

Best regards,
Sarah

Sarah Barry, MSc
Department of Statistics
University of Glasgow
Tel: +44 (0)141 330 2474
Fax: +44 (0)141 330 4814
www.stats.gla.ac.uk/~sarah

Different random effects variances for outcomes and groups

Thread (9 messages)