Skip to content

[R-meta] Meta-analysis in R when there is no sampling variances

5 messages · Thapa, Resham - ARS, Resham Thapa, Viechtbauer Wolfgang (STAT) +1 more

#
Dear All,

I am conducting a meta-analysis in environmental research. Most of the literatures in my search did not report sufficient information to calculate sampling variances.  I have also seen meta-analysis conducted in my area of research with similar issues; they performed non-parametric bootstrapping procedure to calculate bootstrapped CIs for the weighted effect size. In these papers, they used Metawin software to weight individual effect size following Adams et al., 1997. Resampling tests for meta-analysis for ecological data. Ecology 78(5):1277-1283; where weights (wi) is given as:

Wi=(Nt*Nc)/(Nt+Nc); where Nt and Nc are replicates for treatment and control groups.

I want to use R and perform meta-analysis by following similar approach. Any helps and suggestions will be highly appreciated.

Thanks'
Resham




This electronic message contains information generated by the USDA solely for the intended recipients. Any unauthorized interception of this message or the use or disclosure of the information it contains may violate the law and subject the violator to civil or criminal penalties. If you believe you have received this message in error, please notify the sender and delete the email immediately.
#
Dear Resham,

I approved your post as a non-member, but please sign up for the list: https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis 

What kind of outcome/effect size measure are you dealing with? I will venture a guess and assume you are dealing with (log-transformed) ratios of means. The actual sampling variance is then computed with:

vi = sd1i^2/(n1i*m1i^2) + sd2i^2/(n2i*m2i^2)

where m1i and m2i are the means of the first and second group, sd1i and sd2i are the SDs, and n1i and n2i are the group sizes. And the problem is that the SDs are not known (but you do know the means -- as otherwise you would not be able to compute the outcomes in the first place -- and you do know the group sizes).

Here are some thoughts:

Note that the sampling variance can also be written as:

vi = cv1i^2 / n1i + cv2i^2 / n2i

where cv1i and cv2i are the coefficient of variation for the first and second group. Maybe you could make an educated guess how large the CVs are and use that as a rough approximation to the actual sampling variances. Some people have used:

vi = 1 / n1i + 1 / n2i

which actually just implies that we assume that the CVs = 1 within each group.

Then fit the model as usual (e.g., rma(yi, vi) where yi are the log-transformed ratios of means and vi the approximate variances). Then you could follow this up with bootstrapping. Code for this can be found here: http://www.metafor-project.org/doku.php/tips:bootstrapping_with_ma

And/or you could consider using robust estimates of the standard errors for the fixed effects. That can be done using the robust() function (see help(robust) for details).

Best,
Wolfgang
#
Dear Dr. Wolfgang,

Thank you for your quick response.

In my dataset, I have sample size (replications) and means for both
treatment and control groups. With that, I am calculating the log response
ratio (lrr) as an effect size. Yes, most of the articles in my search have
not reported SD or CV. I found that many researchers in my field prefer to
weight effect sizes using Wi=(Nt*Nc)/(Nt+Nc); where Nt and Nc are
replicates for treatment and control groups. Here is one link of a paper:
https://www.researchgate.net/publication/315379724_
Increased_plant_uptake_of_native_soil_nitrogen_
following_fertilizer_addition_-_not_a_priming_effect ( Increased plant
uptake of native soil nitrogen following fertilizer addition ? not a
priming effect?).

My understanding is:
1. We can replace 'vi' with 'wi' and conduct the meta-analysis. Also, I
think multiple effect sizes from the study and multiple studies from the
same site will be more correlated with each other. To account for this, I
need to perform multi-level meta-analysis using random statement as
outlined in your website. So, my code looks like:

library(metafor)

dat <- read.csv('C:/Users/resham.thapa/Desktop/data_eg.csv') dat

# Calculating weighted effect size with multiple, nested
location/study/year as random effects.

res <- rma.mv(lrr, wi, random = ~ 1|site_id/study_id/site_year/com_control,
data=dat, method="REML")

res


*Is this correct?*


2.  Then, I need to perform non-parametric bootstrapping by slightly
modifying the codes from your website. However, I always get error while
running it. Any suggestions.


library(boot)

boot.func <- function(dat, indices) {

  res <- try(rma.mv(lrr, wi, random = ~
1|site_id/study_id/site_year/com_control,
data=dat, method="REML", subset=indices), silent=TRUE)

  if (is.element("try-error", class(res))) {

     NA

  } else {

     c(coef(res), vcov(res), res$sigma)

  }

}



res.boot <- boot(dat, boot.func, R=4999)

*## There were 50 or more warnings (use warnings() to see the first 50)*

warnings()

*# 'V' appears to be not positive definite.*

boot.ci(res.boot, index=1:2)

I have attached a subset of my data to illustrate about the nature of my
data-set more clearly. Thank you in advance.


Regards'
Resham


On Fri, Jun 30, 2017 at 11:54 AM, Viechtbauer Wolfgang (SP) <
wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:

            
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://stat.ethz.ch/pipermail/r-sig-meta-analysis/attachments/20170630/308c1e24/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: data_eg.csv
Type: text/csv
Size: 5667 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-meta-analysis/attachments/20170630/308c1e24/attachment-0001.csv>
#
You cannot just replace 'vi' with 'wi'. Sampling variances and weights are different things.

As I suggested, calculate approximate sampling variances if that's the best thing that can be done. Use those for the 'V' argument (the second argument -- the one that you are currently misusing for specifying weights).

Since you have a multilevel structure, the bootstrapping needs to account for this. It would require considerable thought to implement this correctly.

An alternative is (still) to use robust(), using 'site_id' as the clustering variable.

Best,
Wolfgang

-----Original Message-----
From: Resham Thapa [mailto:reshambt1 at gmail.com] 
Sent: Friday, June 30, 2017 19:39
To: Viechtbauer Wolfgang (SP)
Cc: r-sig-meta-analysis at r-project.org
Subject: Re: [R-meta] Meta-analysis in R when there is no sampling variances

ATTACHMENT(S) REMOVED: data_eg.csv 

Dear Dr. Wolfgang,

Thank you for your quick response.

In my dataset, I have sample size (replications) and means for both treatment and control groups. With that, I am calculating the log response ratio (lrr) as an effect size. Yes, most of the articles in my search have not reported SD or CV. I found that many researchers in my field prefer to weight effect sizes using?Wi=(Nt*Nc)/(Nt+Nc); where Nt and Nc are replicates for treatment and control groups. Here is one link of a paper:?
https://www.researchgate.net/publication/315379724_Increased_plant_uptake_of_native_soil_nitrogen_following_fertilizer_addition_-_not_a_priming_effect?(?Increased plant uptake of native soil nitrogen following fertilizer addition ? not a priming effect?).

My understanding?is:
1. We can replace 'vi' with 'wi' and conduct the meta-analysis. Also, I think multiple effect sizes from the study and multiple studies from the same site will be more correlated with each other. To account for this, I need to perform multi-level meta-analysis using random statement as outlined in your website. So, my code looks like:

library(metafor)
dat <- read.csv('C:/Users/resham.thapa/Desktop/data_eg.csv') dat
# Calculating weighted effect size with multiple, nested location/study/year as random effects.
res <- rma.mv(lrr, wi, random = ~ 1|site_id/study_id/site_year/com_control, data=dat, method="REML")
res

Is this correct?

2. ?Then, I need to perform non-parametric bootstrapping by slightly modifying the codes from your website. However, I always get error while running it. Any suggestions.

library(boot)
boot.func <- function(dat, indices) {
??res <- try(rma.mv(lrr, wi, random = ~ 1|site_id/study_id/site_year/com_control, data=dat, method="REML", subset=indices), silent=TRUE)
??if (is.element("try-error", class(res))) {
?????NA
??} else {
?????c(coef(res), vcov(res), res$sigma)
??}
}
?
res.boot <- boot(dat, boot.func, R=4999)
## There were 50 or more warnings (use warnings() to see the first 50)
warnings()
#?'V' appears to be not positive definite.
boot.ci(res.boot, index=1:2)
I have attached a subset of my data to illustrate about the nature of my data-set more clearly. Thank you in advance.

Regards'
Resham?
#
An alternative approach to analyzing these data might be to use a
multi-level model with unknown variances at the lowest level, which are
assumed to be proportional to the inverse of the weights you've described.
Models like this could be fit using the nlme or lme4 packages. Since the
assumption that the sampling variances are proportional to the weights
might not be entirely reasonable, you might also consider using robust
standard errors (clustering on site_id) for the regression coefficient
estimates, as Wolfgang suggested. Cluster-robust variance estimation is
available in the clubSandwich package, so long as the model is fit using
nlme. Unfortunately clubSandwich doesn't yet worrk for models fit in lme4.

Or---even quicker and dirtier---you could fit a simple regression model
with lm() and weights as you've described, then cluster the standard errors
by side_id (again using clubSandwich).

James

On Fri, Jun 30, 2017 at 1:13 PM, Viechtbauer Wolfgang (SP) <
wolfgang.viechtbauer at maastrichtuniversity.nl> wrote: