
[R-meta] Multivariate meta-analysis with metafor: Should I adjust sample sizes/variances for multiple groups ('double counting') when combined with multiple endpoints?

3 messages · Emily Finne, James Pustejovsky

#
Dear all,

as I am seemingly the first to post a question on this list, I hope my 
question is not a silly one.

First of all, I'd like to thank Wolfgang Viechtbauer for all the
examples, explanations, and loads of additional online material on how
to conduct different kinds of meta-analyses with metafor.
I've already learned a lot so far!
All these bits of code are really helpful and appreciated, since I am
relatively new to working with R (and to doing meta-analysis).

There is, however, one point I am still confused about. I will first
explain my analysis and then ask the question:

I have 30 RCTs matching our inclusion criteria and I use Hedges' g as
the effect size. The aim is to analyze different intervention techniques
(coded as present or absent) as potential moderators of effect sizes.
All studies included a self-report measure of the outcome, some 
additionally reported results for an objective measure of the same 
outcome. I would like to include both outcomes in a multivariate model.
There are also a few studies with multiple treatment groups all compared 
to the same control condition. Since the groups differ in the techniques 
they used and are therefore of interest, information from all 
intervention groups should be included.

Initially I wanted to compute two separate univariate models for the two
outcome measures (subjective and objective), and because of the shared
control groups within some trials I split the sample size of the
controls (with two interventions compared to the same group of, say, 40
people, I included two comparisons with n=20 each) to avoid double
counting (that's what the Cochrane Handbook recommends in this case).

But after starting to work through the different options, I came to the 
conclusion that the multivariate model would be more appropriate for 
this analysis.
So, the model I want to fit looks like this:

library(metafor)

MA1 <- rma.mv(yi=Hedgesg, V,  random = ~ Outcome | trial, struct="UN", 
data=datMA, test="t", mods=~Outcome)

or for one overall effect size (because the two outcomes did not differ
significantly):

MA2 <- rma.mv(yi=Hedgesg, V,  random = ~ Outcome | trial, struct="UN", 
data=datMA, test="t")

for the overall effect and then for the meta-regression model:

MA3 <- rma.mv(yi=Hedgesg, V,  random = ~ Outcome | trial, struct="UN", 
data=datMA, test="t", mods=~ technique1)

My model is most similar to the example given here: 
http://www.metafor-project.org/doku.php/analyses:berkey1998

V is the variance-covariance matrix based on the variances and estimated 
covariances between the effects of both outcome measures within a study 
(as explained in the linked example above).
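
As a rough illustration of how such a V matrix could be assembled, here is a minimal base-R sketch. The toy data frame, the column name vi, and the within-study correlation r = 0.6 are all assumptions made up for the example and are not taken from the analysis described here; the rows are assumed to be sorted by trial.

```r
# Hypothetical toy data: two trials, each with a subjective and an
# objective outcome (rows must be sorted by trial)
dat <- data.frame(
  trial   = c(1, 1, 2, 2),
  Outcome = c("subjective", "objective", "subjective", "objective"),
  vi      = c(0.040, 0.050, 0.030, 0.045)  # sampling variances of Hedges' g
)
r <- 0.6  # ASSUMED correlation between the two outcomes within a trial

# One block per trial: variances on the diagonal,
# r * sqrt(v1 * v2) as the covariance off the diagonal
blocks <- lapply(split(dat$vi, dat$trial), function(v) {
  S <- r * sqrt(outer(v, v))
  diag(S) <- v
  S
})

# Assemble the blocks into one block-diagonal matrix
V <- matrix(0, nrow(dat), nrow(dat))
pos <- 0
for (B in blocks) {
  idx <- pos + seq_len(nrow(B))
  V[idx, idx] <- B
  pos <- pos + nrow(B)
}
V
```

Effect sizes from different trials get a covariance of zero, so only the blocks within a trial are filled in.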

Trial is the study ID.

BUT besides these 2 outcomes I have these studies with multiple
intervention groups. One trial even has 6 effect sizes (2
outcomes * 3 interventions).

I wonder what to do about splitting up the control groups now. For
the two outcomes measured within the same persons, I am quite sure that
I don't have to adjust any sample sizes (i.e., variances), because the
model 'knows' that both outcomes come from the same persons.
But what about the multiple groups? They are of course also nested
within trials, but I didn't estimate a covariance between these effect
sizes and I did not tell the model anything specific about this
multilevel variant - or did I? (My idea is to additionally use the
robust estimation (with cluster = trial)).

Is it right then to use the original sample size/variance from the
control groups although some were used in multiple comparisons? Or
should the affected CGs be split up within this model as in the
univariate model? Will metafor account for the nesting of different
interventions within a trial when computing an overall pooled effect
size with the specified multivariate model?
Which variant would yield the correct pooled effect size, without
'double counting'?

I think this is mainly a question of how metafor's 'rma.mv' weights the
effect sizes to arrive at the pooled effect when using the random = ~
inner | outer factor argument.


I tried to find out by looking at the results of both variants but I 
couldn't suss it out...


Any help would be appreciated. Many thanks!

Best,
Emily
#
Emily,

I would offer a couple of suggestions for different ways to approach this.
I think the main question is whether, for the studies with multiple
intervention groups, you really care (scientifically, with respect to
your research questions) about the distinction between treatment
conditions. If not, that is, if they're really just a nuisance that you
need to find a way to smooth over, then two simple approaches to handling
them might be attractive:

1. Pick the single condition that best represents the treatment construct
of interest.
2. Average the treatment conditions together, and then take the difference
between the averaged treatment condition and the single control condition.
Say that you have treatment conditions q, r, s, with sample means yq, yr,
ys, sample standard deviations sq, sr, ss, and sample sizes nq, nr, ns.
Calculate the average sample mean y_avg = (nq * yq + nr * yr + ns * ys) /
(nq + nr + ns). Say the control condition has sample mean, sd, and size
given by yc, sc, and nc. You can then calculate a d statistic as

d = (y_avg - yc) / sp,

where sp^2 = ((nq - 1) * sq^2 + (nr - 1) * sr^2 + (ns - 1) * ss^2 + (nc -
1) * sc^2) / (nq + nr + ns + nc - 4). The variance of d is (approximately)
Vd = 1 / (nq + nr + ns) + 1 / nc + d^2 / (2 * (nq + nr + ns + nc - 4)).
You can also use a Hedges-g correction with J(nq + nr + ns + nc - 4), where
J(x) = 1 - 3 / (4 x - 1).
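
The calculation in option (2) could be sketched in base R roughly as follows. The function name pooled_arm_smd and the example numbers are hypothetical. The variance line is an assumption worth checking: it uses the usual large-sample approximation for a sample-size-weighted mean of independent arms, 1 / (sum of treatment n) + 1 / nc + d^2 / (2 * df).

```r
# Sketch: standardized mean difference for averaged treatment arms
# versus a single control arm (option 2).
# means, sds, ns: vectors over the treatment arms; yc, sc, nc: control arm.
pooled_arm_smd <- function(means, sds, ns, yc, sc, nc) {
  n_trt <- sum(ns)
  y_avg <- sum(ns * means) / n_trt            # sample-size-weighted mean
  df    <- n_trt + nc - (length(ns) + 1)      # total N minus number of groups
  sp    <- sqrt((sum((ns - 1) * sds^2) + (nc - 1) * sc^2) / df)
  d     <- (y_avg - yc) / sp
  # ASSUMPTION: large-sample variance of d (see the note above the code)
  Vd    <- 1 / n_trt + 1 / nc + d^2 / (2 * df)
  J     <- 1 - 3 / (4 * df - 1)               # Hedges' small-sample correction
  list(d = d, Vd = Vd, g = J * d, Vg = J^2 * Vd)
}

# Hypothetical trial: three treatment arms and one control arm
res <- pooled_arm_smd(means = c(5.2, 5.6, 4.9), sds = c(2.0, 2.1, 1.9),
                      ns = c(20, 22, 18), yc = 4.5, sc = 2.0, nc = 40)
res
```

With three treatment arms, df equals nq + nr + ns + nc - 4, matching the formulas above.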

Option (2) will give more precise treatment effects (because of increased
sample size), but might muddy the water (or be harder to explain in a
paper) if the treatment conditions are really distinct. But if the
meta-regression model that you want to estimate does not make any
distinction between the treatment conditions, then option (2) is actually
very close or even identical to the more complex option described below.

On the other hand, if you really care about the distinctions between
treatment conditions, as you would if the covariates you are examining have
variation within a given study depending on which treatment condition
you're looking at, then you would probably want to

3. Calculate the full sampling variance-covariance matrix of all
combinations of effects and feed this into metafor as part of the V matrix.

Here's a blog post with the relevant formulas:
http://jepusto.github.io/Correlations-between-SMDs
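
For the shared-control case in option (3), a minimal base-R sketch of one trial's block of V might look like this. It assumes the commonly used large-sample approximation in which every d in the trial is standardized by the SD pooled across all arms, so that the covariance between two effects is about 1 / nc + d1 * d2 / (2 * N), with N the total sample size of the trial. The function name shared_control_vcov and the numbers are hypothetical, and the formulas should be checked against the linked post before use.

```r
# Sketch: sampling variance-covariance block for k treatment arms that
# share one control group.
# ASSUMES each d was standardized by the SD pooled across all arms.
# d, nt: vectors over the treatment arms; nc: full (unsplit) control n.
shared_control_vcov <- function(d, nt, nc) {
  N <- sum(nt) + nc                    # total sample size of the trial
  S <- 1 / nc + outer(d, d) / (2 * N)  # covariances from the shared control
  diag(S) <- 1 / nt + 1 / nc + d^2 / (2 * N)
  S
}

# Hypothetical trial: three treatment arms compared with one control of 40
S <- shared_control_vcov(d = c(0.35, 0.55, 0.20), nt = c(20, 22, 18), nc = 40)
round(cov2cor(S), 2)  # implied correlations between the three effect sizes
```

The control group enters with its original sample size here; the dependence created by reusing it is carried by the off-diagonal terms rather than by splitting the group.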

Cheers,
James


On Fri, Jun 16, 2017 at 7:46 AM, Emily Finne <emily.finne at uni-bielefeld.de>
wrote:

  
  
1 day later
#
Dear James,


thank you so much for your quick reply!

In fact, the distinction between treatment conditions is of interest, 
since my moderators/covariates differ between the different conditions 
within the studies that include multiple groups.

So, I have used the formula(s) from your blog post to modify the V matrix
and will run my analysis again using this matrix (with the original
sample sizes from the control groups).


It seems so simple and logical. However, with my limited math knowledge 
I wouldn't have been able to derive the third formula for myself.

Thanks a lot!


Best,

Emily




On 16.06.2017 at 16:03, James Pustejovsky wrote: