Hi Filippo,

Please keep the listserv cc'd. Responses below.

James

On Tue, Jul 6, 2021 at 3:57 PM Filippo Gambarota <filippo.gambarota at gmail.com> wrote:
Thank you, James, this is extremely clear. If I understand correctly, my model may in some situations be inappropriate for variance estimation. Do you think that a Bayesian model could partially mitigate the variance estimation problem, especially for the within-cluster variance?
A Bayesian model would still require making some assumption about the correlation between effect size estimates, so it doesn't necessarily solve the problem (although it could be a good approach for all the usual reasons that Bayesian inference is useful).
In general, yes. I have, let's say, 10 studies in which the same sample is tested on several response variables. The three-level model seemed the most appropriate to me (especially compared to the multivariate approach).
To get improved estimates of variance components, I think the best thing to do would be to try and collect information about the correlations between outcomes and use that to specify a variance-covariance matrix for effect size estimates, as has been discussed in several previous exchanges on the listserv. Short of that, you could specify an approximate variance-covariance matrix, making some simplified assumption about the unknown correlations, and then do sensitivity analysis across a range of correlations.
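A minimal sketch of such a sensitivity analysis, under stated assumptions: it uses `dat.assink2016` (a dataset shipped with metafor, with `study` and `esid` identifiers) purely as a stand-in for your data, a hypothetical grid of assumed correlations, and `impute_covariance_matrix()` from the clubSandwich package to build the approximate variance-covariance matrix (`metafor::vcalc()` is an alternative in recent metafor versions). The helper name `fit_one` is made up for illustration.

```r
library(metafor)
library(clubSandwich)

# Stand-in data (assumption): several effect sizes (esid) nested within studies.
dat <- dat.assink2016

# Hypothetical grid of assumed within-study correlations between sampling errors.
# Replace with values that are plausible for your outcomes.
r_grid <- c(0.0, 0.2, 0.4, 0.6, 0.8)

fit_one <- function(r) {
  # Approximate V matrix: constant correlation r among estimates from the same study.
  V <- impute_covariance_matrix(vi = dat$vi, cluster = dat$study, r = r)
  # Same random-effects structure as the 3LMA, but with correlated sampling errors.
  fit <- rma.mv(yi, V, random = ~ 1 | study / esid, data = dat)
  data.frame(
    r           = r,
    estimate    = coef(fit)[[1]],  # average effect size
    between_var = fit$sigma2[1],   # between-study variance component
    within_var  = fit$sigma2[2]    # within-study variance component
  )
}

# One row per assumed correlation.
sensitivity <- do.call(rbind, lapply(r_grid, fit_one))
print(sensitivity)
```

Comparing the rows shows how much the two variance components move as the assumed correlation changes; if they are stable across the plausible range, the unknown correlations matter less for your conclusions.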
Do you suggest the robust variance estimation approach?
RVE can be helpful for improving the coverage properties of confidence intervals and the calibration of hypothesis tests *for average effect sizes*, especially when there is concern about potential model mis-specification. Fernandez-Castilla and colleagues (citation below) also found that it can be helpful to use RVE in combination with 3LMA for this purpose. But RVE does not help with estimation of variance components.

James
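As an illustration of combining RVE with the 3LMA, here is a minimal sketch, assuming the clubSandwich package and again using `dat.assink2016` from metafor as stand-in data. The object name `fit_3l` is hypothetical, and the choice of the "CR2" small-sample correction is an assumption rather than something prescribed above.

```r
library(metafor)
library(clubSandwich)

# Stand-in data (assumption): several effect sizes (esid) nested within studies.
dat <- dat.assink2016

# Three-level model, treating sampling errors as independent.
fit_3l <- rma.mv(yi, vi, random = ~ 1 | study / esid, data = dat)

# Cluster-robust test and confidence interval for the average effect size,
# clustering on study, with CR2 correction and Satterthwaite degrees of freedom.
robust_test <- coef_test(fit_3l, vcov = "CR2", cluster = dat$study)
robust_ci   <- conf_int(fit_3l, vcov = "CR2", cluster = dat$study)

print(robust_test)
print(robust_ci)
```

The variance component estimates in `fit_3l$sigma2` are unchanged by this step; only the standard error, test, and interval for the average effect are adjusted.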
On Tue, Jul 6, 2021, 11:11 PM James Pustejovsky <jepusto at gmail.com> wrote:
Hi Filippo,

To add to Wolfgang's response, one note of caution regarding interpreting the variance components in the three-level meta-analysis (3LMA) model is that the variance component estimates are somewhat sensitive to assumptions. If your data structure involves multiple, correlated effect size estimates (i.e., estimates based on the same sample of participants, so that the sampling errors of the estimates are correlated), then the 3LMA model involves some degree of model mis-specification.

Currently available evidence suggests that the 3LMA may be fairly robust with respect to inferences about *average effect sizes*---that is, even though the model is mis-specified, hypothesis tests and confidence intervals based on the model still have calibration rates that are close to correct. This robustness property does NOT extend to estimation of variance components. If the model is mis-specified, then there will generally be some degree of systematic bias in the variance component estimates.

For instance, say that the true correlation between effect size estimates from the same sample is around r = 0.6. Using the 3LMA is equivalent to assuming r = 0.0. As far as I understand, this will lead to estimates of within-study heterogeneity that are systematically *too small* and estimates of between-study heterogeneity that are systematically *too large*. How strong the biases are depends on the structure of your data, so it's hard to say much further here.

To your other question:
Do we interpret it as an average variability within each cluster among clusters? Or are we assuming that each cluster has the same within-cluster variability?
I would say that the answer is "both." As formulated, the 3LMA model does make the assumption that each cluster has the same within-cluster variance component (i.e., homogeneity of variance within clusters). But even if this assumption is incorrect, the estimated within-cluster variance will be some sort of weighted average of the within-cluster variances, at least at an approximate level.

In principle, you could estimate cluster-specific variances using the following (assuming that every value of outcome is unique across studies):

```
rma.mv(yi, vi, random = list(~ 1 | study, ~ study | outcome), struct = "DIAG")
```

But this probably isn't a good idea unless you have a lot of estimates from every cluster. And the comments above regarding model mis-specification apply here as well.

Kind Regards,
James