
[R-meta] Dependent variable in Meta Analysis

2 messages · Tarun Khanna, Wolfgang Viechtbauer

Thank you so much for all the insights so far. I am very grateful and looking forward to learning more in the meta analysis course in October.

I wanted to follow up on my question about the dependent variable in meta-analysis. Just to summarize the discussion where we last left it: in the meta-analysis I am doing, there are 4 kinds of studies.


  1.  studies that estimate the equation ln (y) = b0 + b1x + e, where x is a dummy variable that distinguishes two groups (e.g., x = 0 for group 1 and x = 1 for group 2)
  2.  studies that estimate the equation y = b0 + b1x + e, where x is a dummy variable that distinguishes two groups (e.g., x = 0 for group 1 and x = 1 for group 2)
  3.  studies that report mean and standard deviations of the two groups (mean and sd of y for x = 0 and x = 1)
  4.  studies that report the difference between the means of the two groups (i.e., the mean of y at x = 1 minus the mean of y at x = 0) and the pooled standard deviation

For the purpose of our meta-analysis, studies of type 1 are the most useful, because b1*100 has the nice interpretation of the percent change in y when x = 1. Ideally, I would like to transform the other studies so that I can retain this interpretation even for the aggregated effect size.

You had earlier recommended transforming estimates from studies of type 3 to the ROM so that they are comparable to estimates from studies with ln(y) as the dependent variable (Jensen's inequality aside). Could you perhaps also recommend a way to transform studies of types 2 and 4 so that we can retain the interpretation of the overall effect size as the "percentage change in y when x = 1"?

Of course, if that's not possible, I would use the r coefficients to calculate the aggregate effect size.

Thank you for your help and patience.

Best
Tarun



Tarun Khanna

PhD Researcher

Hertie School


Friedrichstraße 180

10117 Berlin · Germany
khanna at hertie-school.org · www.hertie-school.org
6 days later
Hi Tarun,

For 1, (exp(b1) - 1)*100 is the percent change, not b1*100 (exp(b1) is the ratio of the two means, so b1*100 is only a close approximation when b1 is small).
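As a quick numeric illustration of the point above (a sketch in Python rather than R, with a made-up slope b1):

```python
import math

b1 = 0.25  # hypothetical slope from ln(y) = b0 + b1*x + e

# percent change in y when x goes from 0 to 1
pct_change = (math.exp(b1) - 1) * 100  # exact: about 28.4%
approx = b1 * 100                      # 25.0; only close when b1 is small
```

The gap between the two grows quickly as b1 moves away from zero.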

For 2, if you know b0 and b1, then you know the mean of y for x=0 (which is b0) and the mean of y for x=1 (which is b0+b1). You also need the SD for x=0 and the SD for x=1, but these can't be recovered. However, if you know the MSE, then its square root is the pooled within-group SD, so you can use that instead. And you need to know the number of observations where x=0 and where x=1 (i.e., the two group sizes, n0 and n1). Then you have everything needed to compute the ROM and its sampling variance.
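A minimal sketch of that computation, with made-up numbers (in Python; in R one would typically pass the recovered means, SDs, and group sizes to metafor's escalc() with measure="ROM"). The values of b0, b1, mse, n0, and n1 are assumptions, and the variance uses the standard delta-method formula for the log ratio of means:

```python
import math

# Hypothetical inputs recovered from a study of type 2
b0, b1 = 10.0, 2.0   # intercept and slope from y = b0 + b1*x + e
mse = 9.0            # mean squared error of the regression
n0, n1 = 30, 25      # group sizes for x = 0 and x = 1

m0, m1 = b0, b0 + b1   # group means implied by the dummy-variable model
sd = math.sqrt(mse)    # pooled within-group SD

rom = math.log(m1 / m0)                           # log ratio of means
v = sd**2 / (n0 * m0**2) + sd**2 / (n1 * m1**2)   # its sampling variance
```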

If you don't know the MSE but do know the SE of b1 (or t = b1/SE[b1], from which SE[b1] is easily recovered, or the p-value, which can be transformed into t and hence into the SE), then you can easily back-calculate the MSE (assuming you know n0 and n1), since

MSE = SE[b1]^2 * sum((x_i - mean(x))^2)

The second term can be computed if you know n0 and n1, since:

sum((x_i - mean(x))^2) = n0 * (0 - n1/(n0+n1))^2 + n1 * (1 - n1/(n0+n1))^2.

One can simplify this equation further, but this should make it clear that mean(x) is just the proportion of 1's and x_i can only take on two different values here (0 and 1).
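The two formulas above can be sketched like this (Python, with made-up values of se_b1, n0, and n1):

```python
# Hypothetical inputs: SE[b1] reported in the paper (or recovered
# from t or the p-value), plus the two group sizes
se_b1 = 0.8
n0, n1 = 30, 25

N = n0 + n1
mean_x = n1 / N  # mean of the dummy = proportion of 1's

# sum((x_i - mean(x))^2): x_i is 0 for n0 observations, 1 for n1
ssx = n0 * (0 - mean_x)**2 + n1 * (1 - mean_x)**2
# (algebraically this simplifies to n0*n1/(n0+n1))

mse = se_b1**2 * ssx  # back-calculated MSE
```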

For 3, as discussed, you can use ROM.

For 4, you are out of luck. You need the means of the two groups (to compute ROM and its variance), but if you only know their difference, then this is not sufficient.

Best,
Wolfgang