[R-meta] Best choice of effect size

5 messages · Luke Martinez, Wolfgang Viechtbauer, James Pustejovsky

#
Dear All,

I'm doing a meta-analysis where the papers report only "mean" and "sd"
of some form of proportion and/or "mean" and "sd" of corresponding raw
frequencies. (For context, the papers ask students to read, find, and
correct the wrong words in a text.)

By some form of proportion, I mean, some papers report actual proportions:

proportion_type1 = # of corrected items / all items needing correction

Some papers report a modified version of proportions:

proportion_type2 = # of corrected items / (all items needing
correction + all wrongly corrected items)

There are other versions of proportions and corresponding raw
frequencies as well. But my question is given that all these studies
only report "mean" and "sd", can I simply use a SMD effect size?

Many thanks,
Luke
#
Dear All,

To further clarify, the proportion types (my previous email) are used
to score each study participant's performance on the text. Then, each
study reports the "mean" and "sd" of a proportion type for control and
experimental groups (to then compare them with t-tests and ANOVAs).

For example, a study using proportion_type1 (see my previous email)
can provide the following for effect size calculation:

          Mean   SD     n
 group1   0.45   0.17   20
 group2   0.17   0.11   19

The same is true for studies that use raw frequencies to score each
study participant's performance on the text. Such studies often report
the "mean" and "sd" of the # of corrected items (the numerator of the
proportions in my previous email) for control and experimental groups
(to then compare them with t-tests and ANOVAs).

For example, a study using (raw) # of corrected items can provide the
following for effect size calculation:

          Mean   SD     n
 group1   4.5    1.12   17
 group2   4.7    1.59   18

My question is: can I calculate SMDs across all such studies, given
that they all intend to measure the same thing?

Thank you,
Luke
On Wed, Sep 29, 2021 at 12:12 PM Luke Martinez <martinezlukerm at gmail.com> wrote:
#
Hi Luke,

Yes, treating the mean proportions as means is ok -- after all, they are means. As long as n is not too small (and the true mean proportion not too close to 0 or 1), then the CLT will also ensure that the sampling distribution of a mean proportion is approximately normal.

We have analyzed such mean proportions in these articles:

McCurdy, M. P., Viechtbauer, W., Sklenar, A. M., Frankenstein, A. N., & Leshikar, E. D. (2020). Theories of the generation effect and the impact of generation constraint: A meta-analytic review. Psychonomic Bulletin & Review, 27(6), 1139-1165. https://doi.org/10.3758/s13423-020-01762-3

Vachon, H., Viechtbauer, W., Rintala, A., & Myin-Germeys, I. (2019). Compliance and retention with the experience sampling method over the continuum of severe mental disorders: Meta-analysis and recommendations. Journal of Medical Internet Research, 21(12), e14475. https://doi.org/10.2196/14475

In these articles, we did not compute standardized mean differences based on the mean proportions, but one could do so.

For your first example (the mean proportions):

library(metafor)
escalc(measure="SMD", m1i=0.45, m2i=0.17, sd1i=0.17, sd2i=0.11, n1i=20, n2i=19)

If I understand you correctly, the second type are means of counts (i.e., there is a count for each subject and for example 4.5 is the mean of those counts). Again, while an individual count might have other distributional properties (e.g., Poisson or negative binomial), once you take the mean, it's a mean and the CLT 'kicks in'. So I would again say: yes, you can treat these as 'regular' means and compute SMDs based on them.

For your second example (the mean counts):

escalc(measure="SMD", m1i=4.5, m2i=4.7, sd1i=1.12, sd2i=1.59, n1i=17, n2i=18)
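
To make the two escalc() calls above less of a black box, here is a minimal base-R sketch of the usual bias-corrected SMD (Hedges' g) computed from summary statistics. The helper name smd_from_summary is made up for illustration and is not part of metafor. The sampling variance uses the common large-sample approximation, so the results should be close to what escalc() returns, but escalc() remains the reference.

```r
# Hypothetical helper (not a metafor function): bias-corrected SMD (Hedges' g)
# from the means, SDs, and sample sizes of two independent groups.
smd_from_summary <- function(m1, m2, sd1, sd2, n1, n2) {
  df <- n1 + n2 - 2
  # Pooled within-group standard deviation
  sd_pooled <- sqrt(((n1 - 1) * sd1^2 + (n2 - 1) * sd2^2) / df)
  d <- (m1 - m2) / sd_pooled
  # Small-sample bias correction factor (exact gamma-function form)
  J <- exp(lgamma(df / 2) - log(sqrt(df / 2)) - lgamma((df - 1) / 2))
  g <- J * d
  # Large-sample approximation to the sampling variance (an assumption here)
  v <- 1 / n1 + 1 / n2 + g^2 / (2 * (n1 + n2))
  c(yi = g, vi = v)
}

# The two examples from this thread
es_prop  <- smd_from_summary(0.45, 0.17, 0.17, 0.11, 20, 19)  # mean proportions
es_count <- smd_from_summary(4.5, 4.7, 1.12, 1.59, 17, 18)    # mean counts
print(rbind(mean_proportion = es_prop, mean_count = es_count))
```

A positive value means group1 scored higher than group2. Because the difference is divided by the pooled SD, the proportion-based and count-based results end up on the same unitless scale.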

I might be inclined to code a moderator that distinguishes these different types, to see if there is some systematic difference between them.
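
One way such a moderator could be coded, as a sketch only: the function below assumes a data frame with columns yi and vi (e.g., from escalc()), a study identifier, and a factor outcome_type. All of these column names, and the function name, are hypothetical. The nested random effects for multiple effect sizes per study are also an assumption, not something specified in this thread.

```r
# Sketch: test whether effect sizes differ systematically by outcome type.
# Assumed (hypothetical) columns in `dat`:
#   yi, vi        effect size and its sampling variance
#   study         study identifier
#   outcome_type  factor, e.g. "proportion_type1", "proportion_type2", "raw_counts"
fit_type_moderator <- function(dat) {
  dat$es_id <- seq_len(nrow(dat))  # one id per effect size (row)
  metafor::rma.mv(yi, vi,
                  mods   = ~ outcome_type,        # moderator for outcome type
                  random = ~ 1 | study / es_id,   # effect sizes nested in studies
                  data   = dat)
}
```

The omnibus test of the outcome_type coefficients in the fitted model's summary would then indicate whether the types differ on average.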

Best,
Wolfgang
#
Dear Wolfgang,

Thank you so much for your response and also the references.

I will compute an SMD from the means and sds of all types of proportions
and the raw counts reported in the papers.

Instead of a moderator, I thought I would add a random effect for the
variation across these types of proportions and raw counts, crossed with
studies (I think), because true effects can be correlated (?) both within
a study and within one of these types of proportions or raw counts,
right?

proportion_type1 = # of corrected items / all items needing correction

proportion_type2 = # of corrected items / (all items needing
correction + all wrongly corrected items)

raw_counts = # of corrected items
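
For concreteness, the crossed random-effects idea described above might look like the following sketch in metafor. The column names (yi, vi, study, outcome_type) and the function name are hypothetical. Note the assumption that outcome_type has only the three levels listed above, in which case its variance component is likely to be estimated very imprecisely.

```r
# Sketch: outcome type as a random effect crossed with study.
# Assumed (hypothetical) columns in `dat`: yi, vi, study, outcome_type.
fit_crossed_random <- function(dat) {
  dat$es_id <- seq_len(nrow(dat))  # one id per effect size (row)
  metafor::rma.mv(yi, vi,
                  random = list(~ 1 | study,          # shared-study effect
                                ~ 1 | es_id,          # effect-size-level heterogeneity
                                ~ 1 | outcome_type),  # shared-outcome-type effect
                  data = dat)
}
```

Giving each grouping variable its own formula in the list is what makes the study and outcome-type effects crossed rather than nested.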



On Thu, Sep 30, 2021, 1:33 AM Viechtbauer, Wolfgang (SP) <
wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:

            

  
  
#
Hi Luke,

To add to Wolfgang's comments, I would suggest that you could also consider
other effect measures besides the SMD. For example, the response ratio is
also a scale-free metric that could work with the proportion outcomes that
you've described, and would also be appropriate for raw frequency counts as
long as the total number possible is the same for the groups being compared
within a given study.
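
As an illustration of the response ratio mentioned above, here is a minimal base-R sketch of the log response ratio and its delta-method sampling variance, applied to the two examples given earlier in this thread. The helper name lnrr_from_summary is made up for illustration. In metafor, the corresponding measure is escalc(measure="ROM", ...).

```r
# Hypothetical helper (not a metafor function): log response ratio (log ratio
# of means) with the usual delta-method approximation to its variance.
lnrr_from_summary <- function(m1, m2, sd1, sd2, n1, n2) {
  stopifnot(m1 > 0, m2 > 0)  # the ratio is only defined for positive means
  yi <- log(m1 / m2)
  vi <- sd1^2 / (n1 * m1^2) + sd2^2 / (n2 * m2^2)
  c(yi = yi, vi = vi)
}

# The two examples from this thread
rr_prop  <- lnrr_from_summary(0.45, 0.17, 0.17, 0.11, 20, 19)  # mean proportions
rr_count <- lnrr_from_summary(4.5, 4.7, 1.12, 1.59, 17, 18)    # mean counts
print(rbind(mean_proportion = rr_prop, mean_count = rr_count))
```

Exponentiating yi gives the ratio of the two group means, which is often easier to interpret than an SMD. For the count example, the ratio is only meaningful under the condition stated above: the total number of possible corrections must be the same in both groups.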

Whether the response ratio would be more appropriate than the SMD is hard
to gauge. One would need to know more about how the proportions were
assessed and how the assessment procedures varied from study to study. For
instance, did some studies use passages with many possible errors to be
corrected while other studies used passages with just a few errors? Did the
difficulty of the passages differ from study to study? Were there very low
or very high mean proportions in any studies? Does there seem to be a
relationship between the means and the variances of the proportions of a
given group?

James

On Thu, Sep 30, 2021 at 2:22 AM Luke Martinez <martinezlukerm at gmail.com>
wrote: