[R-meta] simulating three-level meta-analysis
Hi Filippo, To add to Wolfgang's notes, here is a blog post where I walk through simulating a meta-analysis of standardized mean differences: https://www.jepusto.com/simulating-correlated-smds/ It covers a more general case than the 3-level meta-analysis, in which effect size estimates from a given study may be correlated at the sampling level, so it might be overkill for what you're doing. With Wolfgang's notes, your code is a reasonable way to simulate a three-level meta-analysis, especially if the purpose of your simulation is exploratory or pedagogical. There are a few ways that I think it could be improved in terms of its realism. These might be important if the simulation is for methodological research--as David Hoaglin (2015; https://doi.org/10.1002/jrsm.1146) argues, for research simulations, it's important to consider the ways in which the meta-analysis model is a simplification or idealization of plausible data-generating models. So, in that spirit, here are a few suggestions: * Drawing the standard errors from a uniform distribution on (0, 0.3) implies a certain distribution of study sizes, which might or might not be plausible for the context that you're thinking about. For a meta-analysis of group differences with equally sized groups, a standard error of 0.3 implies a total sample size of about 44. Using a uniform distribution implies that you will have many quite large studies: study_sizes <- 4 / runif(1000, 0, 0.3)^2 summary(study_sizes) Min. 1st Qu. Median Mean 3rd Qu. Max. 45 77 170 22732 830 11996245 An alternative approach would be to draw the standard errors from the empirical distribution of standard errors in a previous conducted synthesis that resembles the context you're interested in. Or, if there are studies on the distribution of sample sizes or typical power levels of studies in your area, you could use that to find a more realistic distribution of sample sizes/standard errors from which to simulate. * Your simulation involves generating a structure where every study has the same number of effect sizes. This "balanced" structure is a best-case scenario for the performance of the 3-level meta-analysis model. Simulating studies with varying numbers of effect sizes would be more realistic and might be important for performance. A simple way to do that is just studies_per_paper <- sample(1:10, size = ni, replace = TRUE) dat <- data.frame(paper = rep(1:ni, studies_per_paper), study = 1:sum(studies_per_paper)) table(dat$paper) One step better would be to use a distribution that is motivated by empirical data. You could do this by looking at the distribution of effect sizes per study from one or more previously conducted syntheses that have open data. * Your simulation generates effect size estimates that are exactly normally distributed. For most effect size metrics, this is a simplification that works reasonably well when sample sizes are large but less well when sample sizes are small, and which sometimes also depends on other features of the studies (e.g., for odds ratios, the adequacy of the normality approximation depends on the number of positive outcomes, not just the sample size). If you're in a context with small studies or effect sizes that aren't truly normally distributed, it would be more realistic to generate effect size estimates following a three-step process: 1) simulate the true effect size parameters by adding up the random effects at the paper level and study level; 2) simulate any other relevant study features (sample sizes, event rates in the control condition); and then 3) simulate the effect size estimates based on a realistic data-generating process, such as by sampling from a t distribution for standardized mean differences or sampling from binomial distributions for log-odds ratios. The blog post linked above demonstrates this process for correlated standardized mean difference effect sizes. Of course, this might be way more technical detail and way more nuance than you need. I offer it just in case it's useful to you or others working on simulations. James On Thu, Jun 16, 2022 at 3:11 AM Filippo Gambarota <
filippo.gambarota at gmail.com> wrote:
Thank you Wolfgang this is very helpful. With that sentence about sampling variances, I meant that (if I remember correctly) for tau estimation I have to use effect size weights (thus sampling variances) so I was afraid that I need to sample values according to some criteria (and not runif(...) as in my simulation) in order to recover my parameters. On Wed, 15 Jun 2022 at 19:36, Viechtbauer, Wolfgang (NP) < wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:
Well, it doesn't want to go through, at least so far both of my replies are not showing up on the mailing list. No idea why. In case you reply, please still include the mailing list. Best, Wolfgang
-----Original Message----- From: Viechtbauer, Wolfgang (NP) Sent: Wednesday, 15 June, 2022 17:33 To: R meta Cc: 'Filippo Gambarota' Subject: RE: [R-meta] simulating three-level meta-analysis (I don't know why, but occasionally my posts to the mailing list do not
go
through. This seems to be happening again, so let me try to resend
this.)
Hi Filippo, Not sure what you mean by "sampling variances that will be associated
with a
certain value of tau2 (level 2) and tau3 (level 3)". But in any case,
your code
isn't quite simulating data according to the three level model. This
does:
dat <- data.frame(paper = rep(1:ni, each=nj), study = 1:nj) dat$b0_i <- rep(rnorm(ni, 0, taui), each=nj) dat$b0_j <- rnorm(nrow(dat), 0, tauj) dat$vi <- runif(nrow(dat), 0, 0.3) dat$yi <- b0 + dat$b0_i + dat$b0_j + rnorm(nrow(dat), 0, sqrt(dat$vi)) fit <- rma.mv(yi, vi, random = ~ study | paper, data = dat) fit Best, Wolfgang
-----Original Message----- From: R-sig-meta-analysis [mailto:
r-sig-meta-analysis-bounces at r-project.org] On
Behalf Of Filippo Gambarota
Sent: Wednesday, 15 June, 2022 13:44
To: R meta
Subject: [R-meta] simulating three-level meta-analysis
Hello!
I'm trying to simulate a three-level meta-analysis model but I'm stuck
on understanding how to generate sampling variances that will be
associated with a certain value of tau2 (level 2) and tau3 (level 3).
My approach so far is setting values for population parameters and
generating sampling variances for the effect ij:
```
library(metafor)
b0 <- 0.3 # effect
ni <- 100 # number of papers
nj <- 5 # number of effects within each paper
taui <- 0.3 # level 3 tau
tauj <- 0.1 # level 2 tau
icc <- taui^2 / sum(taui^2 + tauj^2) # real ICC
# i-level data
dati <- data.frame(paper = 1:ni,
b0_i = rnorm(ni, 0, taui))
# j-level data
datj <- data.frame(paper = rep(1:ni, nj),
study = 1:sum(nj),
b0_j = rnorm(sum(nj), 0, tauj))
dat <- merge(dati, datj, by = "paper") # combine
dat$vi <- runif(nrow(dat), 0, 0.3) # sampling variances at level 1
dat$yi <- b0 + dat$b0_i + dat$b0_j + rnorm(nrow(dat), 0, dat$vi)
fit <- rma.mv(yi, vi, random = ~study|paper, data = dat)
```
However, I'm sure that I'm doing wrong because taus are estimated
using vi and the vis are not sampled with specific criteria.
Does someone have some hints? Or in general suggestions about to setup
a simulation like this?
Thanks!
--
Filippo Gambarota
PhD Student - University of Padova
Department of Developmental and Social Psychology
Website: filippogambarota
Research Group: Colab Psicostat
-- *Filippo Gambarota* PhD Student - University of Padova Department of Developmental and Social Psychology Website: filippogambarota <https://filippogambarota.netlify.app/> Research Group: Colab <http://colab.psy.unipd.it/> Psicostat <https://psicostat.dpss.psy.unipd.it/> [[alternative HTML version deleted]]
_______________________________________________ R-sig-meta-analysis mailing list @ R-sig-meta-analysis at r-project.org To manage your subscription to this mailing list, go to: https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis