I'm trying to recover the correlations in a multivariate dataset. Let me
first start with a trivariate case.
require(MASS); require(lme4)
# trivariate case
r1 <- 0.5 # correlation value to be recovered
ns <- 200 # number of samples
S1 <- matrix(c(1,r1,0, # correlation structure of trivariate data
r1,1,0,
0,0,1), nrow=3, ncol=3)
# simulated trivariate data
dat <- data.frame(f = c(rep(paste0('P',1:ns), 2), paste0('S',1:ns)),
y = c(mvrnorm(n=ns, mu=c(0, 0, 0), Sigma=S1)))
Now I can construct the following model
m1 <- lmer(y ~ 1 + (1|f), data=dat)
Then with the variances from the random effects in summary(m1)
summary(m1)
Random effects:
Groups Name Variance Std.Dev.
f (Intercept) 0.4996 0.7068
Residual 0.4907 0.7005
I can recover the correlation r1:
tmp <- unlist(lapply(VarCorr(m1), `[`, 1))
# recover the correlation r1
tmp/(tmp+attr(VarCorr(m1), "sc")^2)
Let's switch to a pentavariate case:
# pentavariate case
r1 <- 0.2; r2 <- 0.8 # correlation value to be recovered
ns <- 200 # number of samples
S <- matrix(c(1,r1,0,0,0, # correlation structure of pentavariate data
r1,1,0,0,0, # the first and second variables are correlated
0,0,1,r2,0, # the third and fourth variables are correlated
0,0,r2,1,0,
0,0,0,0,1), nrow=5,ncol=5)
# simulated data
dat <- data.frame(f = c(rep(paste0('P',1:ns), 2), rep(paste0('T',1:ns), 2),
paste0('S',1:ns)),
R=c(rep('P',2*ns), rep('T',2*ns), rep('S', ns)),
y = c(mvrnorm(n=ns, mu=rep(0,5), Sigma=S)))
dat$R1 <- as.numeric(dat$R=='P') # dummy code the first and second
variables (r1)
dat$R2 <- as.numeric(dat$R=='T') # dummy code the third and fourth
variables (r2)
I have thought about the following models
m2 <- lmer(y ~ 1 + (0+R1|f) + (0+R2|f), data=dat)
m3 <- lmer(y ~ 1 + (1|f) + (0+R2|f), data=dat)
but I've been struggling to figure out a way to recover the correlations r1
and r2 with the variances from the random effects based on the models m2
and m3.
Of course I could use a workaround solution by reducing the
pentavariate data into two trivariate cases:
# workaround solution
m4 <- lmer(y ~ 1 + (1|f), data=dat[dat$R2!=1,]) # recover r1 like model m1
above
m5 <- lmer(y ~ 1 + (1|f), data=dat[dat$R1!=1,]) # recover r2 like model m1
above
However, I would really like to find a way to recover r1 and r2 directly
using models like m2 and m3. Suggestions?
Thanks,
Gang Chen