metafor package: study level variation - R-help

Fri, Sep 7, 2012 7:02 AM #

Hello.  A quick question about incorporating variation due to study in the metafor package.  I'm working with a particular data set for meta-analysis where some studies have multiple measurements.  Others do not.  So, let's say the effect I'm looking at is response to two different kinds of drug treatment - let's call their effect sizes T1 and T2.  Some studies have multiple experiments measuring  T1 and T2.  Some have one of each.  Some only have T1 or T2.

Now, in metafor, I've been using

rma(yi = logRatio, vi=varLogRatio, mods=~ Drug.Type, data=mydata)

This works fine.  Out of curiosity, I ran a quickie model in lme4

lmer(logRatio ~ Drug.Type + (1+studyID), data=mydata, weights=varLogRatio)

and I noticed that the results are quite different, and this appears due to some variation due to study (after inspecting ranef - note, I included Drug.Type as a fixed effect as there were only two levels).

So, I went back to metafor and ran

rma(yi = logRatio, vi=varLogRatio, mods=~ Drug.Type+studyID, data=mydata)

which yielded the error

Error in qr.solve(wX, diag(k)) : singular matrix 'a' in solve
In addition: Warning message:
In rma(yi = logRatio, vi = varLogRatio, data = mydata, mods = ~Drug.Type  :
  Cases with NAs omitted from model fitting.

which appears to be due to the unbalanced nature of the dataset (some studies having T1 and T2, some having multiple measures of T1 and T2).

So, is there a way to properly incorporate studyID in metafor using rma?  Is there an argument I'm missing, or should I perhaps be using a different function?

Thanks!

-Jarrett

Michael Dewey

Sat, Sep 8, 2012 6:33 AM #

At 15:02 07/09/2012, Jarrett Byrnes wrote:

I think that you have the situation which is usually dealt with by 
some sort of network meta-analysis and as far as I know, although I 
am sure I shall quickly be told if I am wrong, you have to organise 
this yourself.

If you ignore the studies which only give rise to one effect size you 
could explore whether mvmeta works for your dataset although you may 
need to have information (on the variance-covariance matrix) which I 
suspect you do not have.

Michael Dewey
info at aghmed.fsnet.co.uk
http://www.aghmed.fsnet.co.uk/home.html

Viechtbauer Wolfgang (STAT)

Mon, Sep 10, 2012 1:52 AM #

As usual, Michael was faster than I in responding. Let me add a few thoughts of my own. See comments below in text.

Best,
Wolfgang

--   
Wolfgang Viechtbauer, Ph.D., Statistician   
Department of Psychiatry and Psychology   
School for Mental Health and Neuroscience   
Faculty of Health, Medicine, and Life Sciences   
Maastricht University, P.O. Box 616 (VIJV1)   
6200 MD Maastricht, The Netherlands   
+31 (43) 388-4170 | http://www.wvbauer.com

I assume there is also a control group/condition in each of these studies, so in other words, you have a bunch of studies where some are two-arm studies comparing Trt1 *or* Trt2 to control and some are three-arm studies comparing both Trt1 *and* Trt2 to control.

So, Drug.Type is a dummy variable (either Trt1 or Trt2), and your call rma(yi = logRatio, vi = varLogRatio, mods = ~ Drug.Type, data = mydata) will fit the model:

yij = beta0 + beta1 Trt2 + uij + eij,

where yij is the jth observed outcome in the ith study, beta0 then corresponds to the (average) outcome for Trt1, beta1 indicates how much higher or lower the (average) outcome is for Trt2 compared to Trt1, uij ~ N(0, tau^2), and eij ~ N(0, varLogRatio). This model will treat three-arm studies as if they were two (independent) two-arm studies. Probably not ideal.

On the lmer() model:

1) Did you use (1+studyID) or (1 | studyID)? The latter is probably what you meant/want to use.
2) You need to specify the *inverse* of the variances as weights.
3) This model assumes that the sampling variances are known up to a proportionality constant, not exactly known. You will therefore get what is sometimes called a multiplicative model for heterogeneity, with heterogeneity reflected in a residual variance estimate larger than 1. This model is different from the additive model (which is typically used), where the sampling variances are assumed to be known exactly and we *add* an additional random effect to reflect heterogeneity.

So, with (1 | studyID) and inverse sampling variance weights, you get the model:

yij = beta0 + beta1 Trt2 + ui + eij,

where ui ~ N(0, tau^2), eij ~ N(0, sigma^2 * varLogRatio). Now tau^2 reflects study-level variability and sigma^2 reflects multiplicative heterogeneity.
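
The corrected call described in points 1) and 2) can be sketched as follows. This is only an illustration: the data frame is simulated with made-up values as a stand-in for the poster's mydata, and it assumes the lme4 package is installed (the model fit is skipped otherwise).

```r
# Simulated stand-in for 'mydata' (hypothetical values, not the poster's data)
set.seed(1)
mydata <- data.frame(
  studyID     = factor(c(1, 1, 2, 3, 4, 4, 5, 5, 6, 7)),
  Drug.Type   = factor(c("T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2", "T1", "T2")),
  varLogRatio = runif(10, 0.01, 0.05)
)
study_effect <- rnorm(nlevels(mydata$studyID), sd = 0.2)   # one ui per study
mydata$logRatio <- 0.3 + 0.2 * (mydata$Drug.Type == "T2") +
  study_effect[as.integer(mydata$studyID)] +
  rnorm(nrow(mydata), sd = sqrt(mydata$varLogRatio))       # sampling error eij

if (requireNamespace("lme4", quietly = TRUE)) {
  # Random intercept per study, weights = INVERSE sampling variances
  fit <- lme4::lmer(logRatio ~ Drug.Type + (1 | studyID),
                    data = mydata, weights = 1 / varLogRatio)
  # studyID variance estimates tau^2; the residual variance estimates sigma^2
  print(lme4::VarCorr(fit))
}
```

With so few simulated rows the fit may report a singular or boundary estimate; the point is only the form of the call, not the numbers.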

Regarding the error from the rma() call with mods = ~ Drug.Type + studyID, I would have to see:

with(mydata, model.matrix(~ Drug.Type + studyID))

to figure out what is going on here. The error indicates that you have some linear dependency between the columns of the design matrix. That should not happen based on what you describe. For example, suppose there are 4 studies, where the first and fourth are three-arm studies, the second only examines Trt1, and the third only Trt2. Then:

study <- factor(c(1,1,2,3,4,4))
trt <- factor(c(1,2,1,2,1,2))
model.matrix(~ trt + study)

  (Intercept) trt2 study2 study3 study4
1           1    0      0      0      0
2           1    1      0      0      0
3           1    0      1      0      0
4           1    1      0      1      0
5           1    0      0      0      1
6           1    1      0      0      1

which is of full rank. That should be true regardless of how many studies (of each type) I add.
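
A quick way to check this kind of claim on real data is to compare the numerical rank of the design matrix with its number of columns. The sketch below uses only base R. The first part reproduces the toy layout above; the second part is a hypothetical contrast (not the poster's data) in which no study contains both treatments, which is one way a linear dependency could arise.

```r
# Toy layout: studies 1 and 4 are three-arm, study 2 has only Trt1, study 3 only Trt2
study <- factor(c(1, 1, 2, 3, 4, 4))
trt   <- factor(c(1, 2, 1, 2, 1, 2))
X <- model.matrix(~ trt + study)
qr(X)$rank == ncol(X)          # TRUE means full column rank

# Hypothetical contrast: every study examines only one treatment
study_bad <- factor(c(1, 1, 2, 3, 3))
trt_bad   <- factor(c(1, 1, 1, 2, 2))
X_bad <- model.matrix(~ trt_bad + study_bad)
qr(X_bad)$rank < ncol(X_bad)   # TRUE means rank deficient: trt and study dummies coincide
```

Running the same comparison on model.matrix(~ Drug.Type + studyID, data = mydata), after dropping the rows with missing values, would show whether the actual design matrix is rank deficient.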

As for incorporating studyID via rma(): at the moment, metafor is really only set up for univariate models (that should change in the near? future). The kind of multilevel/multivariate structure you are dealing with will require (at the moment) other tools.

Note that there is an additional issue with your data: If logRatio reflects the difference between Trt1 or Trt2 and Control, then in three-arm studies the two logRatio values are dependent since the data from the control group/condition is used twice. Note that this is statistical dependence over and beyond what is induced by potentially correlated true effects within the three-arm studies. See chapter 19 in the Handbook of Research Synthesis and Meta-Analysis (i.e., the 2nd ed).
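
To make the shared-control dependence concrete, here is a minimal base R sketch. It assumes, which the thread does not confirm, that logRatio is a log ratio of means (treatment mean over control mean) with independent arms, so that the usual delta-method approximation applies. All summary statistics are invented for illustration.

```r
# Hypothetical summary statistics for one three-arm study (illustrative values only)
m_c <- 10; sd_c <- 2; n_c <- 20   # control arm
m_1 <- 12; sd_1 <- 3; n_1 <- 20   # Trt1 arm
m_2 <- 15; sd_2 <- 3; n_2 <- 20   # Trt2 arm

# Delta-method contribution of one arm to the variance of a log mean: sd^2 / (n * mean^2)
piece <- function(m, s, n) s^2 / (n * m^2)

shared <- piece(m_c, sd_c, n_c)        # covariance induced by reusing the control arm
v1 <- piece(m_1, sd_1, n_1) + shared   # sampling variance of log(m_1 / m_c)
v2 <- piece(m_2, sd_2, n_2) + shared   # sampling variance of log(m_2 / m_c)

# 2 x 2 sampling variance-covariance matrix of the two log ratios in this study
V <- matrix(c(v1, shared, shared, v2), nrow = 2,
            dimnames = list(c("T1", "T2"), c("T1", "T2")))
V
cov2cor(V)["T1", "T2"]   # implied sampling correlation between the two log ratios
```

The off-diagonal element is exactly the control arm's contribution, which is why treating the two outcomes as independent ignores part of the information in V.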

As pointed out by Michael, you are essentially in a network meta-analysis type of situation. You may want to take a look at the following article for more details:

Salanti, G., Higgins, J. P. T., Ades, A. E., & Ioannidis, J. P. A. (2008). Evaluation of networks of randomized trials. Statistical Methods in Medical Research, 17(3), 279-301.

Jarrett Byrnes

Mon, Sep 10, 2012 8:08 AM #

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20120910/dfb1989e/attachment.pl>

Michael Dewey

Tue, Sep 11, 2012 3:52 AM #

At 16:08 10/09/2012, Jarrett Byrnes wrote:

That looks more and more like network meta-analysis. As far as I know 
you have to write your own code for BUGS or JAGS and then use R to 
tidy up the consequences. I think you would benefit from following up 
the reference Wolfgang suggested.

If you have a univariate meta-analysis and need to account for 
clustering you could look at Mike Cheung's metaSEM package. He has 
cast meta-analysis in the structural equation framework. You would 
need to install OpenMx first and then install his package from his 
webpage. Neither OpenMx nor metaSEM is on CRAN.

http://openmx.psyc.virginia.edu/
http://courses.nus.edu.sg/course/psycwlm/Internet/

I'm curious, what other packages in R would be useful for this?  I 
can also change these from log ratios to log odds ratios - or at 
least obtain the number with a desirable response vs. the number with 
a negative response for all treatments - perhaps this would be another 
way to go about it using metabin in the meta package - again, looking 
at each metric separately.

Michael Dewey
info at aghmed.fsnet.co.uk
http://www.aghmed.fsnet.co.uk/home.html