Thank you,
Yuhang
On Wed, Dec 13, 2023 at 7:06 AM Viechtbauer, Wolfgang (NP)
<mailto:wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:
Hi Yuhang,
First of all, I would suggest creating two separate columns for the two
variables in each pair, like this:
     study  ri  ni  var1  var2
 1)  1      .1  85  1     2
...
28)  1      .2  85  7     8
Then you can use rcalc() to create the var-cov matrix of the (raw or r-to-z
transformed) correlation coefficients within studies (the 'V' matrix), that is,
if your dataset is called 'dat', you can do:
tmp <- rcalc(ri ~ var1 + var2 | study, ni=ni, data=dat)
V <- tmp$V
dat <- tmp$dat
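A minimal sketch of how the rows for such a single-measure study could be laid out in base R before calling rcalc(). The name 'dat1' is only illustrative, and 'ri' is left as NA because the actual correlations have to come from the study itself:

```r
# All 28 variable pairs for a study with a single measure per variable
pairs <- t(combn(1:8, 2))            # 28 x 2 matrix, one row per pair

dat1 <- data.frame(study = 1,
                   ri    = NA_real_, # placeholder: fill in the reported correlations
                   ni    = 85,       # sample size of the study
                   var1  = pairs[, 1],
                   var2  = pairs[, 2])

head(dat1, 3)
nrow(dat1)
```

Once 'ri' is filled in and the studies are stacked into one data frame, that data frame is what gets passed to rcalc() as shown above.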
Sidenote: For 8 variables, there are 8*7/2 correlations (or generally, for p
variables, p*(p-1)/2 -- this is one of those equations one eventually has
memorized due to using it so often).
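For anyone who has not memorized it yet, the count is easy to check in base R ('n_cor' is just a helper name made up for this illustration):

```r
# Number of distinct correlations among p variables: p*(p-1)/2
n_cor <- function(p) p * (p - 1) / 2

n_cor(8)            # 8 variables, one measure each
choose(8, 2)        # same count via the binomial coefficient
ncol(combn(8, 2))   # same count by enumerating all pairs
```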
For a study (say, study 2) that used multiple measures for one of the variables
(say, variable 8), there are then actually 9 variables and hence 9*8/2 = 36
correlations. The structure then is:
     study  ri  ni  var1  var2  measure1  measure2
 1)  2      .1  78  1     2     a         b
...
 7)  2      .3  78  1     8     a         x
 8)  2      .2  78  1     8     a         y
 9)  2      .4  78  2     3     b         c
...
14)  2      .3  78  2     8     b         x
15)  2      .2  78  2     8     b         y
16)  2      .5  78  3     4     c         d
...
20)  2      .4  78  3     8     c         x
21)  2      .5  78  3     8     c         y
22)  2      .1  78  4     5     d         e
...
25)  2      .0  78  4     8     d         x
26)  2      .1  78  4     8     d         y
27)  2      .3  78  5     6     e         f
...
29)  2      .3  78  5     8     e         x
30)  2      .2  78  5     8     e         y
31)  2      .2  78  6     7     f         g
32)  2      .3  78  6     8     f         x
33)  2      .1  78  6     8     f         y
34)  2      .1  78  7     8     g         x
35)  2      .2  78  7     8     g         y
36)  2      .3  78  8     8     x         y
The actual values used for measure1 and measure2 are irrelevant, as long as you
use them consistently within a study. For studies that only used a single
measure for each variable, you can leave measure1 and measure2 blank. For
studies that used multiple measures for more than one variable, you have to keep
expanding this structure. It just becomes very tedious to construct.
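To take some of the tedium out of it, the skeleton for a study like study 2 can be generated rather than typed. This is only a sketch in base R: the names 'vars', 'meas', and 'dat2' are made up for the illustration, and 'ri' is left as NA to be filled in from the study:

```r
# Study 2: variables 1-7 measured once (a-g), variable 8 measured twice (x, y)
vars <- c(1:8, 8)
meas <- c("a", "b", "c", "d", "e", "f", "g", "x", "y")

idx <- t(combn(length(vars), 2))     # 36 x 2 matrix of index pairs

dat2 <- data.frame(study    = 2,
                   ri       = NA_real_, # placeholder for the reported correlations
                   ni       = 78,
                   var1     = vars[idx[, 1]],
                   var2     = vars[idx[, 2]],
                   measure1 = meas[idx[, 1]],
                   measure2 = meas[idx[, 2]])

dat2[c(7, 8, 36), ]                  # the rows involving the two measures of variable 8
```

The row order matches the listing above, so rows 7 and 8 are the correlations of variable 1 with the two measures of variable 8, and row 36 is the correlation between those two measures.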
Then for rcalc(), you need to paste together var1 and measure1 and var2 and
measure2:
dat$v1m1 <- paste0(dat$var1, ".", dat$measure1)
dat$v2m2 <- paste0(dat$var2, ".", dat$measure2)
and use those in rcalc():
tmp <- rcalc(ri ~ v1m1 + v2m2 | study, ni=ni, data=dat)
V <- tmp$V
dat <- tmp$dat
For the actual model fitted with rma.mv(), you don't use the combination
of v1m1 and v2m2, but the combination of var1 and var2 as the predictor:
dat$var1var2 <- paste0(dat$var1, ".", dat$var2)
since you *want* the model not to give you estimates of 'measure-specific'
pooled correlations, but you want to average over multiple measures for the same
variable. So the model could be:
rma.mv(yi, V, mods = ~ 0 + var1var2, random = ~ var1var2 | study,
       struct="UN", data=dat)
However, this model will need to estimate 28*27/2 = 378 correlations plus 28
variances (tau^2 values) for the random effects, so in total 406 (!!) parameters
(the general equation is p*(p+1)/2), plus the 28 fixed effects. That's a lot of
parameters in the unstructured var-cov matrix of the random effects, so unless
you have a lot of studies (hundreds if not thousands), this is going to be
difficult or essentially impossible. This aside, the model allows for no
heterogeneity when there are multiple correlations for the same var1var2 pair. A
simple way to allow for this is to add another estimate-specific random effect
to the model:
dat$id <- 1:nrow(dat)
rma.mv(yi, V, mods = ~ 0 + var1var2,
       random = list(~ var1var2 | study, ~ 1 | id),
       struct="UN", data=dat)
This is simplistic, since it assumes that the heterogeneity in multiple
correlations for the same pair is the same regardless of the pair. With a
lot of data, one could try:
rma.mv(yi, V, mods = ~ 0 + var1var2,
       random = list(~ var1var2 | study, ~ var1var2 | id),
       struct=c("UN","DIAG"), data=dat)
which would use separate estimate-level random effects for each pair, but this
adds another 28 parameters to the model. But who cares about another 28 if one
already has 406 ...
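The bookkeeping for the random-effects parameters in the three models above can be written out as a quick check. This only counts variance components and correlations (the 28 fixed effects come on top in each case), and the object names are made up for the illustration:

```r
p <- 28                          # number of var1var2 pairs

n_un      <- p * (p + 1) / 2     # struct="UN": 28 variances + 378 correlations
n_un_id   <- n_un + 1            # plus a single variance for ~ 1 | id
n_un_diag <- n_un + p            # plus 28 variances for ~ var1var2 | id with "DIAG"

c(UN = n_un, UN_plus_id = n_un_id, UN_plus_DIAG = n_un_diag)
```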
Realistically, one needs to simplify the random effects structure. On the
opposite end, there is the minimalistic:
res <- rma.mv(yi, V, mods = ~ 0 + var1var2, random = ~ 1 | study/id,
              data=dat)
res
which, due to its overly simplistic nature, really needs to be followed up with:
robust(res, cluster=study, clubSandwich=TRUE)
(could do the same with the models above, but this is less likely to matter if
one actually manages to fit these complex models).
An interesting question is what kind of structures of intermediate complexity
one could consider.
But I'll stop here for now, since this is getting way too long anyway.
Best,
Wolfgang
Of Yuhang Hu via R-sig-meta-analysis
Sent: Wednesday, December 13, 2023 06:21
To: R meta <mailto:r-sig-meta-analysis at r-project.org>
Cc: Yuhang Hu <mailto:yh342 at nau.edu>
Subject: [R-meta] Coding multi-measure correlational studies for multilevel
meta-analysis
Hello Experts,
I'm collecting the correlations between 8 variables from several studies.
If a study has used a single measure for all these 8 variables, I will need
28 rows (assuming no missing) to capture all those correlations i.e.,
var1.var2 = combn(1:8, 2, FUN=\(i)paste(i,collapse = ".")):
     study  ri  var1.var2
 1)  1      .1  1.2
...
28)  1      .2  7.8
But if a study has used, say, two measures (e.g., 1, 2) for two of those 8
variables (e.g., variables "1" and "2" in 'var1.var2'), then, I wonder how
**best** to capture the additional 13 correlations arising due to the
additional measure used for "1" and "2" in that study in my data for
multilevel modeling purposes?
One approach might be to add a single column called, say "measure" to add
just those additional rows in that multi-measure study:
     study  ri  var1.var2  measure
 1)  1      .1  1.2
...
 6)  1      .6  1.7        1
 7)  1      .4  1.7        2
...
12)  1      .8  2.7        1
13)  1      .7  2.7        2
...
But this looks messy. For instance, what should be the value of "measure"
for the var1.var2 rows that have used a single measure (e.g., var1.var2 ==
1.2)? And can "measure" coded this way be used in the random part of the
model (metafor::rma.mv)?
Thanks,
Yuhang