[R-meta] Multi-level model accounting for within-cluster correlation
Hi Guido,

The difference between your res1 and res2 is the assumption about the sampling correlation of effect size estimates nested within clusters. Say that you've got effect size estimate ES_ij for effect i nested in cluster j, which is an estimate of the true effect delta_ij. Your res1 makes the assumption that cor(ES_hj - delta_hj, ES_ij - delta_ij) = 0.6 for every pair of distinct effects h, i in a given cluster, and that this holds in every cluster. Your res2 makes the assumption that this correlation is zero. We can think of these as two different "working models," neither of which is necessarily correct.

Here is my understanding of the properties of the estimators from each working model:

1. Under either assumption, the point estimate of the average effect across clusters will be unbiased. The point estimate of the average effect will be more efficient to the extent that the working model is closer to the true data-generating process.

2. The model-based standard errors generated by metafor might be off (to some extent) if the working model is incorrect, but this can be fixed by using robust variance estimation methods:

    robust(res1, cluster = cluster_id, clubSandwich = TRUE)
    robust(res2, cluster = cluster_id, clubSandwich = TRUE)

3. The variance component estimates of either model can be biased to some extent if the working model is incorrect. Roughly speaking, the variance component estimates will be less biased when the working model is closer to correct.

In light of the above, it seems to me that the thing to do is pick whichever working model you find more plausible as a representation of the real data-generating process. If it's more plausible that there should be positive sampling correlation between the effect size estimates than that there is no correlation, then I would go with res1 plus robust variance estimation.

All that said, it might be good to explain a bit more about the data structure here. It looks like you've got effect size estimates nested in study IDs nested in cluster IDs.
Are there multiple effect size estimates per study ID? Or only one? If only one, then why would you expect there to be correlation between effect size estimates from different studies (even if from the same cluster)?

James

On Fri, Jun 16, 2023 at 9:57 AM Dr. Guido Schwarzer via R-sig-meta-analysis <r-sig-meta-analysis at r-project.org> wrote:

Hi,

In statistical consulting, a Master's student asked me whether the following R code is correct to conduct a multi-level meta-analysis:

    ## assume that the effect sizes within studies are correlated with rho = 0.6
    V <- vcalc(vi, cluster = cluster_id, obs = study_id, data = dat, rho = 0.6)

    ## fit multilevel model using this approximate V matrix
    res1 <- rma.mv(yi, V, random = ~ 1 | cluster_id / study_id, data = dat)

To my understanding, the advantage of a multi-level model is that no assumption on the within-cluster correlation is required, i.e., the correlation does not need to be specified, and the model would be

    res2 <- rma.mv(yi, vi, random = ~ 1 | cluster_id / study_id, data = dat)

Am I correct? And, if so, does the above model using the block diagonal covariance matrix V make any sense?

Best,
Guido
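
As a rough illustration of the two working models contrasted in the reply, the sketch below builds the sampling covariance matrix for a single cluster by hand in base R. The three variances in vi are made-up numbers (they are not taken from dat), and the rule cov = rho * sqrt(v_h * v_i) is an assumption about what a constant-correlation working model implies. It is meant only to show the structure of the V matrix, not to replace the vcalc() call.

```r
## Hypothetical sampling variances for three effect size estimates in one cluster
vi  <- c(0.04, 0.09, 0.16)
rho <- 0.6  # sampling correlation assumed by the res1 working model

## res1-style working model: cov(ES_hj, ES_ij) = rho * sqrt(v_hj * v_ij) for h != i
sd_i   <- sqrt(vi)
V_res1 <- rho * outer(sd_i, sd_i)
diag(V_res1) <- vi  # the diagonal holds the sampling variances themselves

## res2-style working model: sampling errors are uncorrelated
V_res2 <- diag(vi)

## The implied correlation matrices differ only off the diagonal
print(cov2cor(V_res1))
print(cov2cor(V_res2))
```

Under these assumptions the two matrices share the same diagonal, so the working models disagree only about the off-diagonal covariances within a cluster. With several clusters, the full V would be block diagonal with one such block per cluster.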