
[R-meta] Meta-Analysis using different correlation coefficients

4 messages · Lena Pollerhoff, Pablo Grassi, Wolfgang Viechtbauer

#
Hello,

We are currently conducting a meta-analysis based on correlation coefficients.
We have received a large number of raw datasets, so for many studies we can compute the effect sizes/correlation coefficients ourselves, while other correlations were extracted from the original publications. Therefore I have a couple of questions:

1. If one variable is dichotomous and the other variable is continuous but not normally distributed, what kind of coefficient should be calculated? We'd go for the point-biserial correlation if the variable is naturally dichotomous (not artificially dichotomized), and for the biserial correlation if the dichotomous variable was artificially dichotomized, but we are worried that both require a normally distributed continuous variable.
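For reference, the two coefficients mentioned above can be sketched in base R. This is a minimal illustration with made-up data, using the standard conversion from the point-biserial to the biserial correlation:

```r
# Minimal sketch (base R, made-up data): point-biserial vs. biserial
# correlation for a dichotomous grouping variable and a continuous outcome.
set.seed(42)
group <- rbinom(200, 1, 0.4)        # naturally dichotomous variable
y     <- 0.5 * group + rnorm(200)   # continuous outcome

# Point-biserial correlation: simply Pearson's r with a 0/1 variable.
r_pb <- cor(group, y)

# Biserial correlation (appropriate if 'group' had been artificially
# dichotomized from an underlying normal variable):
# r_b = r_pb * sqrt(p * (1 - p)) / phi(z_p), where p is the proportion
# in one group and phi(z_p) the standard normal density at its quantile.
p   <- mean(group)
r_b <- r_pb * sqrt(p * (1 - p)) / dnorm(qnorm(p))
```

The biserial estimate is always larger in absolute value than the point-biserial one, since the correction factor sqrt(p*(1-p))/dnorm(qnorm(p)) exceeds 1 for all p.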

2. We are wondering how to best integrate Pearson's product-moment correlation coefficients (both variables continuous and normally distributed), (point-)biserial correlation coefficients (one (artificially) dichotomous and one continuous variable), and Spearman rank correlation coefficients (non-parametric, both variables continuous) in one meta-analysis. Just use the raw values? Or is it better to transform them in a homogeneous way (I've read that Fisher's z makes less sense as a variance-stabilizing transformation for anything other than Pearson's r)? Can Spearman's rho be converted using Fisher's z transformation? I've also read that it is not advisable to include product-moment and point-biserial correlations in one meta-analysis; is there a way to convert the point-biserial correlation to something that can be integrated with Pearson's r and Spearman's rho?
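For what it's worth, the Fisher transformation itself is easy to sketch in base R (a generic illustration, not a claim that it is variance-stabilizing for non-Pearson coefficients; for Spearman's rho a somewhat larger variance such as 1.06/(n-3) is sometimes suggested in the literature):

```r
# Sketch: Fisher's r-to-z transformation and its back-transformation.
r_to_z <- function(r) 0.5 * log((1 + r) / (1 - r))        # same as atanh(r)
z_to_r <- function(z) (exp(2 * z) - 1) / (exp(2 * z) + 1) # same as tanh(z)

r <- 0.45
n <- 60
z  <- r_to_z(r)
vz <- 1 / (n - 3)   # usual sampling variance for a transformed Pearson r
# For Spearman's rho, 1.06 / (n - 3) is sometimes used instead.
```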

3. I have multiple effect sizes within one sample and want to aggregate them; how do I define rho in the aggregate() function from the metafor package? Is it possible to calculate rho based on the raw datasets? Or would it be better to be conservative and assume near-perfect redundancy (i.e., rho = 0.9)?
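If I recall correctly, metafor's aggregate() method for escalc objects takes rho directly and assumes a compound-symmetric correlation structure by default. The computation it performs under that structure can be sketched in base R (all numbers below are made up for illustration):

```r
# Sketch: aggregating dependent effect sizes from one sample under an
# assumed compound-symmetric correlation rho (base R, made-up numbers).
yi  <- c(0.30, 0.45, 0.25)   # three correlated estimates from one sample
vi  <- c(0.02, 0.03, 0.02)   # their sampling variances
rho <- 0.6                   # assumed correlation between the estimates

V <- rho * sqrt(outer(vi, vi))   # covariance matrix under compound symmetry ...
diag(V) <- vi                    # ... with the variances on the diagonal

Vinv    <- solve(V)
agg_var <- 1 / sum(Vinv)               # (1' V^-1 1)^-1
agg_yi  <- agg_var * sum(Vinv %*% yi)  # GLS-weighted average of the estimates
```

The aggregated estimate is a precision-weighted average of the three values, and its variance grows as rho increases (with rho = 0 this reduces to the ordinary inverse-variance weighted mean).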

Thanks in advance for your time and effort!
Best,
Lena
8 days later
#
Dear all,

Maybe some of you can help me out with the following design conundrum. I
am currently performing a series of meta-analyses investigating the
effect of an intervention on closely related outcomes. Most of the
reviewed studies are within-subject designs (one group of participants),
for which the standardized mean differences (SMD; Hedges' g) are
differences of change scores divided by the SD of the difference (i.e.,
change-score standardization), as follows:

SMD = M_diff / SD_diff (for simplicity in this e-mail without the bias-correction factor)

Unfortunately, there is huge variability in the control measurements
used. Roughly following the nomenclature from Morris (2008), I have the
following design cases:

Case 1) Within-subject design, pre-post with control (WS_PPC):

M_diff_ws_ppc = (post_Treatment - pre_Treatment) - (post_Control - pre_Control)

However, some other studies had no baseline pre-intervention measurement
(i.e., they only report post-intervention measurements), i.e., a
post-test-only-with-control design (WS_POWC):

Case 2) M_diff_ws_powc = post_Treatment - post_Control

And a few others just have a pre-post measurement but no control (i.e.,
they only report a change score; single-group pre-post, SGPP), so that:

Case 3) M_diff_ws_sgpp = post_Treatment - pre_Treatment

Then, while the M_diff of Cases 1 and 2 measures more or less the same
effect, their SD_diffs (and thus SMDs) are not really comparable, as
they reflect different things.
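The non-comparability is easy to see numerically. Assuming equal pre and post raw-score SDs, SD_diff = SD * sqrt(2 * (1 - r)), so the denominator of a change-score-standardized SMD shrinks rapidly as the pre-post correlation r grows (a base-R sketch with made-up numbers):

```r
# Sketch: how the change-score SD depends on the pre-post correlation,
# assuming equal raw-score SDs at pre and post (made-up numbers).
SD <- 10
r  <- c(0.0, 0.5, 0.8, 0.95)
SD_diff <- SD * sqrt(2 * (1 - r))
round(SD_diff, 2)   # 14.14 10.00 6.32 3.16
```

Two studies with identical raw-score effects but different pre-post correlations would thus yield very different change-score-standardized SMDs.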

Thus:

  * What would be the "standard" approach to include the within-subject
    design studies (Cases 1, 2, 3) in the same meta-analysis, if this is
    possible at all? (Please consider that most of the publications
    report ONLY the SD of the change scores and NOT the SDs of the pre
    and post conditions separately, or, in Case 2, of the
    post-intervention measurements.)

Best,

Pablo
1 day later
#
Dear Pablo,

I am confused why you call case 2 a 'within-subject design'. That looks like a two-group post-test-only design to me.

In any case, for all three cases, you want to use 'raw score standardization' to make the various effect size measures comparable, at least under some restrictive assumptions, such as that there is no inherent time effect or time by group interaction.

So, for case 2, you compute standard SMDs, which use the SD at a single timepoint (i.e. at post). This is in essence 'raw score standardization'.

Analogously, for case 3, you also use 'raw score standardization', that is, measure "SMCR" in escalc() lingo.

And finally, for case 1, you again want to standardize based on the SD of a single timepoint, not the SD of the change scores. See:

https://www.metafor-project.org/doku.php/analyses:morris2008

for a discussion of how to do this.
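To make the case 3 computation concrete: the "SMCR" measure (standardized mean change, raw-score standardization; Becker, 1988) is available via escalc(measure = "SMCR", m1i = , m2i = , sd1i = , ni = , ri = ), and the underlying formulas can be sketched in base R (numbers below are made up for illustration):

```r
# Sketch of the "SMCR" computation (standardized mean change with
# raw-score standardization), as in escalc(measure = "SMCR").
# All numbers are made up for illustration.
m_pre  <- 20
m_post <- 24    # pre/post means
sd_pre <- 8     # raw-score SD at pre (the standardizer)
n      <- 25    # sample size
r      <- 0.7   # pre-post correlation

cm <- 1 - 3 / (4 * (n - 1) - 1)         # approximate bias correction
yi <- cm * (m_post - m_pre) / sd_pre    # standardized mean change
vi <- 2 * (1 - r) / n + yi^2 / (2 * n)  # its sampling variance
```

Note that the pre-post correlation r enters only the sampling variance here, not the estimate itself, which is why the raw-score SD (not the change-score SD) is the required input.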

If authors do not report the required raw-score SDs, then you will have to get creative in obtaining them (e.g., contacting authors, back-calculating them based on the SD of the change scores and other information provided, guesstimating them).
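One such back-calculation: if one is willing to assume roughly equal pre and post SDs and to guesstimate the pre-post correlation r, the raw-score SD follows from the reported change-score SD via SD_diff^2 = 2 * SD^2 * (1 - r). A rough base-R sketch (the assumed r drives the result, so a sensitivity analysis over plausible r values is advisable):

```r
# Sketch: back-calculating the raw-score SD from a reported change-score
# SD, assuming SD_pre ~= SD_post = SD and a guesstimated correlation r.
sd_raw_from_diff <- function(sd_diff, r) sd_diff / sqrt(2 * (1 - r))

sd_raw_from_diff(sd_diff = 6, r = 0.8)   # ~9.49 with an assumed r of 0.8
```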

I would also code the type of design that was used in each study (and hence the type of effect size measure that was computed) to examine whether there are still systematic differences between these groups, even if the type of standardization was the same across measures.

Best,
Wolfgang
#
Dear Wolfgang,

Thanks for your detailed answer. To clarify, all cases involve the /same 
group of participants/ and are thus paired data ("within-subject designs"):


Case 1 = pre-Treatment and post-Treatment both on day 1, and pre-Control and post-Control both on day 2, in the same participants (order of T and C counterbalanced).

Case 2 = post-Treatment on day 1 vs. post-Control on day 2, in the same participants (order of T and C counterbalanced).

Case 3 = pre-Treatment and post-Treatment on day 1.


So, if I understand you correctly, your suggestion to calculate SMDs 
using "raw score standardization" would be for each case:


For Case 1:

SMD1 = ((MpostT - MpreT) - (MpostC - MpreC)) / SD_pre_pooled

with SD_pre_pooled = sqrt(SDpreT^2 + SDpreC^2 + 2*r1*SDpreT*SDpreC) / 2

and r1 = correlation between the preT and preC values (as T and C are the same participants)

SE1 = 2*(1-r2)/n + SMD^2 / (2n)

with n = number of participants

and r2 = correlation between the individual change scores (postT - preT) and (postC - preC)



Then, Case 2 is not so clear to me (as I believe you assumed treatment and control are different groups of participants):

SMD2 = (MpostT - MpostC) / SD_???

Which SD do you suggest to use?

SD_diff = sqrt(SDpostT^2 + SDpostC^2 - 2*r*SDpostT*SDpostC), or

SD_pooled = sqrt(SDpostT^2 + SDpostC^2 + 2*r*SDpostT*SDpostC) / 2,

or just one of the SDs (SD_T or SD_C)?

And SE2 = ???


And for Case 3, you suggest:

SMD3 = (MpostT - MpreT) / SD_pre

and SE3 (same as SE1) = 2*(1-r2)/n + SMD^2 / (2n)


Is this correct? (I am also a bit unsure about the SE formulas)

And unfortunately, yes, most of the authors do not report any of the
required SDs. Paired t-tests and related values (e.g., the often-reported
effect size Cohen's d_z) would be of no help then, as they use the SD of
the differences/change scores (and would only be of use under
change-score standardization).


Alternative: Wouldn't it be possible to check whether the post-treatment
variance differs from the post-control variance in those studies where
the raw data are available, and then, based on this information, keep or
drop the change-score standardization?
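Since postT and postC come from the same participants here, an ordinary F-test of two variances (var.test()) would be inappropriate, as it assumes independent samples. For paired data, the Pitman-Morgan approach can be used instead: under bivariate normality, equal variances is equivalent to zero correlation between the sums and the differences. A base-R sketch on made-up data:

```r
# Sketch: Pitman-Morgan test of equal variances for paired data
# (var(postT) vs. var(postC) in the same participants).
# H0: var(x) = var(y) is equivalent to H0: cor(x + y, x - y) = 0.
set.seed(1)
postT <- rnorm(40, mean = 12, sd = 3)              # made-up raw data
postC <- 0.5 * postT + rnorm(40, mean = 5, sd = 2) # correlated, smaller SD

pm <- cor.test(postT + postC, postT - postC)
pm$p.value   # p-value for H0 of equal variances in the paired samples
```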


In any case, thanks for your suggestions! I'll definitely add the design
as a moderator in the meta-analysis.


Best,

Pablo
On 18.02.22 11:10, Viechtbauer, Wolfgang (SP) wrote: