what kind of analysis should be used when there are several correlated dependent variables?

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20091218/96435e24/attachment.pl>
Liliana--

One route would be to fit a multivariate mixed-effects model, in which 
the multiple outcomes are jointly treated as a DV.  Neither lme() nor 
lmer() have a way of explicitly specifying a multivariate model (ie, 
MCMCglmm can "cbind" multiple outcomes together), but it is possible to 
fit at least some basic multivariate models with some data manipulation.

The main "trick" here is to reshape your data by stacking all DVs into a 
single DV column and creating a factor that identifies which values in 
the DV correspond to your 4 separate DVs.

I recently used lme() to do this with a study of daily positive mood, 
negative mood, and fatigue in women with breast cancer (each woman had 
30 days of data).

Here was the code to reshape the data (where "id" identifies the 
individual, and "day" identifies -- shockingly -- day):

tmp.long.df <- reshape(my.df,
                       idvar=c("id","day"),
                       varying = c("posemo","negemo","fatigue"),
                       v.names = "dv",
                       times = 1:3,
                       direction="long")
names(tmp.long.df)[8] <- "dv.f" # rename times
tmp.long.df$dv.f <- factor(tmp.long.df$dv.f, 1:3, 
c("posemo","negemo","fatigue"))

So, your separate columns of DVs are specified on "varying", and the new 
data.frame has a column called "times" that identifies which DV is which 
in your new "dv" column.  I change times (which just happened to be the 
8th column in my data, you'll likely need to change that) to "dv.f" and 
then change that to factor with levels identifying separate DVs.

Then, here is an example of a call to lme() that I used:

mult.lme3.2 <- lme(dv ~ -1 + dv.f/(supp.sat.c*(emot.iss.c + phys.iss.c + 
comm.iss.c)),
				 data = mult.df, na.action = na.omit, control = list(msVerbose = TRUE),
				 random = list(id = ~ -1 + dv.f),
				 weights = varIdent(form = ~ 1 | dv.f),
				 correlation = corAR1(form = ~ 1 | id/dv.f))

So, a couple comments:

-- With this set-up, getting rid of the intercept makes the models a bit 
easier to interpret (for me) as you get separate intercepts for each DV.

-- Similarly, the "dv.f/..." fits the covariates nested within each DV, 
which again makes things a bit easier to interpret.

-- The random statement fits separate random-intercepts by DV that are 
correlated.

-- The weights statement allows different residual error terms by DV.

-- For our data, we fit an AR1 correlation model because of 30 days of data.

Hope that helps; I think David Afshartous has some postings about 
multivariate models in lme which you can probably find in the archive 
(if memory serves).

[And, I am again impressed at how flexible the mixed-effects tools are 
in R -- kudos to Jose and Doug for the great software.]

cheers, Dave
Dave Atkins, PhD
Research Associate Professor
Center for the Study of Health and Risk Behaviors
Department of  Psychiatry and Behavioral Science
University of Washington
1100 NE 45th Street, Suite 300
Seattle, WA  98105
206-616-3879
datkins at u.washington.edu
An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20091219/463e4fc7/attachment.pl>
Dear David,
Thank you very much for the help! I managed to reshape my data and fit a 
multi-variate mixed effects model the way you advised me to do (at least 
I hope this is what I've done).

risuvane2 = reshape (risuvane, idvar = c("subj", "stim"), varying = 
c("cext_linear", "dir0_dist1", "prec01", "fol01"), v.names = "dv4", 
times = 1:4, direction = "long")
risuvane2$time = factor(risuvane2$time, 1:4, c("cext_linear", 
"dir0_dist1", "prec01", "fol01"))

I used your lme-call and replaced your variable names with mine. 
Unfortunately, my understanding of statistics and mathematics is not so 
Honestly, I would recommend trying to do some background reading and 
seek consultation from someone local.  The model that I suggested (and 
you fit) is quite complex.

The Pinheiro and Bates book is the "bible" for lme and will discuss 
model syntax, including weights and correlation.

The following article discusses multivariate models, and I am sure there 
are others:

MacCallum, R. C., Kim, C., Malarkey, W. B., & Kiecolt-Glaser, J. K. 
(1997). Studying multivariate change using
multilevel models and latent curve models. Multivariate Behavioral 
Research, 32, 215-253.

[Also, assuming that the data reshaping and model were specified 
correctly, there appears to be *no* correlation between DVs.  Though, 
there are huge differences across variances; I do not know your data, 
but this strikes me as odd]

All the best, Dave
good as I'd want it to be, so the part about weights and correlations is 
still a mystery to me. I managed to obtain some results when the 
'weights' statement was not included:

 > mult4.lme = lme (dv4 ~ -1 + time*((verb_complex + movent + ROshape + 
ROdimensionality)^2 ), data = risuvane2, random = list(subj = ~ -1 + 
time), correlation = corAR1(form = ~ 1 | subj/time))
 > anova(mult4.lme)
                                   numDF denDF  F-value p-value
time                                   4  3208 576.6028  <.0001
verb_complex                           6  3208 373.4833  <.0001
movent                                 1    29   0.0055  0.9413
ROshape                                1  3208   0.0067  0.9349
ROdimensionality                       1  3208   3.4778  0.0623
verb_complex:movent                    6  3208   2.4096  0.0251
verb_complex:ROshape                   6  3208   0.2297  0.9671
verb_complex:ROdimensionality          6  3208   0.6016  0.7293
movent:ROshape                         1  3208   1.7516  0.1858
movent:ROdimensionality                1  3208   0.0119  0.9133
ROshape:ROdimensionality               1  3208   0.3397  0.5600
time:verb_complex                     18  3208 372.8322  <.0001
time:movent                            3  3208   0.2730  0.8449
time:ROshape                           3  3208   0.0199  0.9962
time:ROdimensionality                  3  3208   3.5088  0.0147
time:verb_complex:movent              18  3208   2.4822  0.0005
time:verb_complex:ROshape             18  3208   0.2257  0.9997
time:verb_complex:ROdimensionality    18  3208   0.6165  0.8898
time:movent:ROshape                    3  3208   1.7935  0.1462
time:movent:ROdimensionality           3  3208   0.0053  0.9995
time:ROshape:ROdimensionality          3  3208   0.3398  0.7966

 > print(mult4.lme, corr = F)
Linear mixed-effects model fit by REML
  Data: risuvane2
  Log-restricted-likelihood: -15231.59
  Fixed: dv4 ~ -1 + time * ((verb_complex + movent + ROshape + 
ROdimensionality)^2)
....
....
.... 

Random effects:
 Formula: ~-1 + time | subj
 Structure: General positive-definite, Log-Cholesky parametrization
                StdDev       Corr               
timecext_linear 19.011722951 tmcxt_ tmd0_1 tmpr01
timedir0_dist1   0.001260818 0                  
timeprec01       0.001263319 0      0           
timefol01        0.001270533 0      0      0    
Residual        25.169136027                    
Correlation Structure: AR(1)
 Formula: ~1 | subj/time
 Parameter estimate(s):
      Phi
0.1935926
Number of Observations: 3360
Number of Groups: 30

However, each time I tried to include weights, I got an error message, 
even when I reduced the number of fixed effects in the model:

 > mult4.lme = lme (dv4 ~ -1 + time*((verb_complex + movent + ROshape + 
ROdimensionality)^2 ), data = risuvane2, random = list(subj = ~ -1 + 
time), weights = varIdent(form = ~ 1 | time), correlation = corAR1(form 
= ~ 1 | subj/time))
Error in lme.formula(dv4 ~ -1 + time * ((verb_complex + movent + ROshape 
+  :
  nlminb problem, convergence error code = 1
  message = iteration limit reached without convergence (9)

In this case, will it be inappropriate/ dangerous to stick to the 
version without weights? Also, I am not sure how to interpret the 
results, and how is this to be reported. Do you have a paper I can use 
as reference?

best regards

Liliana

------------------------------------------------------------------------
*Fra:* David Atkins <datkins at u.washington.edu>
*Til:* r-sig-mixed-models at r-project.org
*Sendt:* fre, desember 18, 2009 7:51:30 PM
*Emne:* Re: [R-sig-ME] what kind of analysis should be used when there 
are several correlated dependent variables?

Liliana--