what kind of analysis should be used when there are several correlated dependent variables?

Fri, Dec 18, 2009 10:51 AM

Liliana--

One route would be to fit a multivariate mixed-effects model, in which 
the multiple outcomes are jointly treated as a DV.  Neither lme() nor 
lmer() have a way of explicitly specifying a multivariate model (ie, 
MCMCglmm can "cbind" multiple outcomes together), but it is possible to 
fit at least some basic multivariate models with some data manipulation.

The main "trick" here is to reshape your data by stacking all DVs into a 
single DV column and creating a factor that identifies which values in 
the DV correspond to your 4 separate DVs.

I recently used lme() to do this with a study of daily positive mood, 
negative mood, and fatigue in women with breast cancer (each woman had 
30 days of data).

Here was the code to reshape the data (where "id" identifies the 
individual, and "day" identifies -- shockingly -- day):

tmp.long.df <- reshape(my.df,
                       idvar=c("id","day"),
                       varying = c("posemo","negemo","fatigue"),
                       v.names = "dv",
                       times = 1:3,
                       direction="long")
names(tmp.long.df)[8] <- "dv.f" # rename times
tmp.long.df$dv.f <- factor(tmp.long.df$dv.f, 1:3, 
c("posemo","negemo","fatigue"))

So, your separate columns of DVs are specified on "varying", and the new 
data.frame has a column called "times" that identifies which DV is which 
in your new "dv" column.  I change times (which just happened to be the 
8th column in my data, you'll likely need to change that) to "dv.f" and 
then change that to factor with levels identifying separate DVs.

Then, here is an example of a call to lme() that I used:

mult.lme3.2 <- lme(dv ~ -1 + dv.f/(supp.sat.c*(emot.iss.c + phys.iss.c + 
comm.iss.c)),
				 data = mult.df, na.action = na.omit, control = list(msVerbose = TRUE),
				 random = list(id = ~ -1 + dv.f),
				 weights = varIdent(form = ~ 1 | dv.f),
				 correlation = corAR1(form = ~ 1 | id/dv.f))

So, a couple comments:

-- With this set-up, getting rid of the intercept makes the models a bit 
easier to interpret (for me) as you get separate intercepts for each DV.

-- Similarly, the "dv.f/..." fits the covariates nested within each DV, 
which again makes things a bit easier to interpret.

-- The random statement fits separate random-intercepts by DV that are 
correlated.

-- The weights statement allows different residual error terms by DV.

-- For our data, we fit an AR1 correlation model because of 30 days of data.

Hope that helps; I think David Afshartous has some postings about 
multivariate models in lme which you can probably find in the archive 
(if memory serves).

[And, I am again impressed at how flexible the mixed-effects tools are 
in R -- kudos to Jose and Doug for the great software.]

cheers, Dave

Dave Atkins, PhD
Research Associate Professor
Center for the Study of Health and Risk Behaviors
Department of  Psychiatry and Behavioral Science
University of Washington
1100 NE 45th Street, Suite 300
Seattle, WA  98105
206-616-3879
datkins at u.washington.edu

what kind of analysis should be used when there are several correlated dependent variables?

Thread (4 messages)