
[R-meta] Dependent variable in Meta Analysis

2 messages · Tarun Khanna, Wolfgang Viechtbauer

Thank you so much for all the insights so far. I am very grateful and looking forward to learning more in the meta analysis course in October.

I wanted to follow up on my question about the dependent variable in meta-analysis. Just to summarize the discussion where we last left it: in the meta-analysis I am doing, there are 4 kinds of studies.


  1.  studies that estimate the equation ln (y) = b0 + b1x + e, where x is a dummy variable that distinguishes two groups (e.g., x = 0 for group 1 and x = 1 for group 2)
  2.  studies that estimate the equation y = b0 + b1x + e, where x is a dummy variable that distinguishes two groups (e.g., x = 0 for group 1 and x = 1 for group 2)
  3.  studies that report mean and standard deviations of the two groups (mean and sd of y for x = 0 and x = 1)
  4.  studies that report the difference between the means of the two groups (i.e., the mean of y at x = 1 minus the mean of y at x = 0) and the pooled standard deviation

For the purpose of our meta-analysis, studies of type 1 are the most useful, because b1*100 has the nice interpretation of the percent change in y when x = 1. Ideally, I would like to transform the other studies so that I can retain this interpretation even for the aggregated effect size.

You had earlier recommended transforming estimates from studies of type 3 to the ROM so that they are comparable to estimates from studies with ln(y) as the dependent variable (Jensen's inequality aside). Could you perhaps also recommend a way to transform studies of types 2 and 4 so that we can retain the interpretation of the overall effect size as the "percentage change in y when x = 1"?

Of course, if that's not possible, I would use the r coefficients to calculate the aggregate effect size.

Thank you for your help and patience.

Best
Tarun



Tarun Khanna

PhD Researcher

Hertie School


Friedrichstraße 180

10117 Berlin · Germany
khanna at hertie-school.org · www.hertie-school.org
6 days later
Hi Tarun,

For 1, (exp(b1) - 1)*100 is the percent change, not b1*100 (exp(b1) is the ratio of the two means, so b1*100 is only a close approximation when b1 is small).
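As a quick numeric illustration of the point above (a sketch in Python rather than R, with a made-up slope b1):

```python
import math

b1 = 0.25  # hypothetical slope from ln(y) = b0 + b1*x + e

# percent change in y when x goes from 0 to 1
pct_change = (math.exp(b1) - 1) * 100  # exact: about 28.4%
approx = b1 * 100                      # 25.0; only close when b1 is small
```

The gap between the two grows quickly as b1 moves away from zero.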

For 2, if you know b0 and b1, then you know the mean of y for x=0 (which is b0) and the mean of y for x=1 (which is b0+b1). You also need the SD for x=0 and the SD for x=1, but these can't be recovered. However, if you know the MSE, then its square root is the pooled within-group SD, so you can use that instead. And you need to know the number of observations where x=0 and where x=1 (i.e., the two group sizes, n0 and n1). Then you have everything needed to compute the ROM and its sampling variance.
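A minimal sketch of that computation, with made-up numbers (in Python; in R one would typically pass the recovered means, SDs, and group sizes to metafor's escalc() with measure="ROM"). The values of b0, b1, mse, n0, and n1 are assumptions, and the variance uses the standard delta-method formula for the log ratio of means:

```python
import math

# Hypothetical inputs recovered from a study of type 2
b0, b1 = 10.0, 2.0   # intercept and slope from y = b0 + b1*x + e
mse = 9.0            # mean squared error of the regression
n0, n1 = 30, 25      # group sizes for x = 0 and x = 1

m0, m1 = b0, b0 + b1   # group means implied by the dummy-variable model
sd = math.sqrt(mse)    # pooled within-group SD

rom = math.log(m1 / m0)                           # log ratio of means
v = sd**2 / (n0 * m0**2) + sd**2 / (n1 * m1**2)   # its sampling variance
```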

If you don't know the MSE but do know the SE of b1 (or t = b1/SE[b1], from which SE[b1] is easily recovered, or the p-value, which can be transformed into t and hence into the SE), then you can easily back-calculate the MSE (assuming you know n0 and n1), since

MSE = SE[b1]^2 * sum((x_i - mean(x))^2)

The second term can be computed if you know n0 and n1, since:

sum((x_i - mean(x))^2) = n0 * (0 - n1/(n0+n1))^2 + n1 * (1 - n1/(n0+n1))^2.

One can simplify this equation further, but this should make it clear that mean(x) is just the proportion of 1's and x_i can only take on two different values here (0 and 1).
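The two formulas above can be sketched like this (Python, with made-up values of se_b1, n0, and n1):

```python
# Hypothetical inputs: SE[b1] reported in the paper (or recovered
# from t or the p-value), plus the two group sizes
se_b1 = 0.8
n0, n1 = 30, 25

N = n0 + n1
mean_x = n1 / N  # mean of the dummy = proportion of 1's

# sum((x_i - mean(x))^2): x_i is 0 for n0 observations, 1 for n1
ssx = n0 * (0 - mean_x)**2 + n1 * (1 - mean_x)**2
# (algebraically this simplifies to n0*n1/(n0+n1))

mse = se_b1**2 * ssx  # back-calculated MSE
```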

For 3, as discussed, you can use ROM.

For 4, you are out of luck. You need the means of the two groups (to compute ROM and its variance), but if you only know their difference, then this is not sufficient.

Best,
Wolfgang