[R-meta] RVE with rma: different heterogeneity for different values of the constant correlation
Dear Andreas,

It is always useful to indicate if the same question was asked elsewhere -- in this case: https://stats.stackexchange.com/q/645446/1934

That way, those responding know what has already been said. Please see below for my responses (similar to what I answered on CV).

Best,
Wolfgang
-----Original Message-----
From: R-sig-meta-analysis <r-sig-meta-analysis-bounces at r-project.org> On Behalf Of Andreas Voldstad via R-sig-meta-analysis
Sent: Wednesday, April 24, 2024 14:59
To: r-sig-meta-analysis at r-project.org
Cc: Andreas Voldstad <andreas.voldstad at kellogg.ox.ac.uk>
Subject: [R-meta] RVE with rma: different heterogeneity for different values of the constant correlation

Dear meta-analysts,

I am conducting a meta-analysis of standardised mean differences with many sources of dependency (multiple outcomes measuring the same construct, multiple control groups, outcomes from dyads (e.g., separate patient-carer and husband-wife scores), and multiple timepoints (post and followups)).

I used metafor::rma.mv to construct 3-level models (random = ~1|study/effectsize) in combination with RVE (metafor::robust).

I have conducted sensitivity analyses with different assumptions for the constant correlation rho. As expected, the pooled effect and standard error is nearly identical across a range of values of rho, with the same inference of a significant effect. However, the variance components and heterogeneity are not affected by RVE and are different for different values of the correlation.

For the full dataset, which includes an extreme outlier study, which also has extreme differences between the effect sizes within the study, total I2 was practically identical across values of rho, but the levels varied. I2 level 3 was inversely proportional to rho, and varied from 80.45 to 88.42.

Removing the outlier, total I2 varied from 40.73, 95% CI = [3.47, 73.86] to 55.92 [24.04, 79.38], with I2 level 3 from 6.81 to 55.92. Q varied, with some models showing highly significant heterogeneity (e.g., p<.001) and some models showing non-significant heterogeneity (p>.05).

Based on this, I have the following questions:

Question 1: It seems I do not know how large the proportion of heterogeneity actually is, and it is sensitive to my imputed constant correlation.
I was wondering if you have any suggestions regarding cluster-robust inferences on heterogeneity.
Indeed, the variance components are not affected by cluster-robust inference methods; these methods only affect how the SEs of the fixed effects are computed.
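A minimal sketch of this point follows. It uses the dat.assink2016 example data that ships with metafor (via metadat) as a stand-in, with an assumed rho of 0.6; neither the data nor the value of rho comes from this thread. It fits a three-level model and checks that robust() leaves the variance components untouched.

```r
library(metafor)  # also attaches metadat, which provides dat.assink2016

dat <- dat.assink2016  # stand-in example data, not the poster's dataset

# approximate V matrix, assuming a constant correlation of 0.6 within studies
V <- vcalc(vi, cluster = study, obs = esid, rho = 0.6, data = dat)

# three-level model: effect sizes nested within studies
res <- rma.mv(yi, V, random = ~ 1 | study/esid, data = dat)

# cluster-robust inference for the fixed effects
rob <- robust(res, cluster = dat$study)

# the variance components are the same before and after robust()
all.equal(res$sigma2, rob$sigma2)

# only the standard error of the pooled effect is recomputed
c(model_based = res$se, robust = rob$se)
```

Any I2-type statistic derived from sigma2 would therefore also be the same with and without robust().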
Question 2: I was wondering if the ICC that can be calculated after fitting the model can be used as an indication of how "right" the initial guess of the constant correlation is? E.g., rma_mv_model$sigma2[1]/ sum(rma_mv_model$sigma2) or the "rho" value that is produced by rma.mv when reparametrizing the model as random = ~factor(effectsize)|Study
No, because the 'constant correlation' (which is presumably used in constructing some approximate V matrix) has nothing to do with the ICC of the model.
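To illustrate the distinction with made-up numbers (all values below are hypothetical and chosen only for illustration): the imputed rho fills the off-diagonal of V, which describes the sampling errors, while the ICC is computed from the estimated variance components, which describe the true effects.

```r
# hypothetical sampling variances of three effect sizes from one study
vi  <- c(0.04, 0.05, 0.06)
rho <- 0.7  # imputed constant correlation between the sampling errors

# block of V for this study: cov_ij = rho * sqrt(v_i * v_j), variances on the diagonal
V_block <- rho * outer(sqrt(vi), sqrt(vi))
diag(V_block) <- vi

# hypothetical variance components, in the order they appear in model$sigma2
# for random = ~ 1 | study/effectsize
sigma2 <- c(between_study = 0.03, within_study = 0.02)

# ICC: implied correlation between the true effects within the same study
icc <- sigma2[["between_study"]] / sum(sigma2)

cov2cor(V_block)[1, 2]  # correlation of the sampling errors (the imputed rho)
icc                     # correlation of the true effects (a different quantity)
```

The two correlations sit in different parts of the model, so there is no reason for them to agree.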
Based on a suggestion from James Pustejovsky, I compared the log likelihood and other information criteria of models fitted with different values for rho. For the full dataset with the extreme outlier study, I got the curious result that the loglikelihood was better at lower values of rho and the best-fitting model was the one with the smallest rho tested (.05). Removing the outliers, a reasonable value of rho (.5) gave the best fit. This was smaller than my initial guess of .7. However, that guess was based on known correlations between partners in dyads, correlations between outcomes, correlations between timepoints.
There is presumably a lot of uncertainty attached to this estimate. Also, assuming a constant correlation of course does not reflect reality, but there is often little else that can be done.
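The kind of sensitivity analysis described above could be sketched as follows. The example again uses dat.assink2016 as a stand-in dataset, and the grid of rho values is an arbitrary assumption; the code refits the same three-level model for each rho and collects the log-likelihood, variance components, and pooled estimate.

```r
library(metafor)

dat  <- dat.assink2016                  # stand-in example data
rhos <- c(0.05, 0.3, 0.5, 0.7, 0.9)     # assumed grid of constant correlations

# refit the three-level model once per value of rho
fits <- lapply(rhos, function(r) {
  V <- vcalc(vi, cluster = study, obs = esid, rho = r, data = dat)
  rma.mv(yi, V, random = ~ 1 | study/esid, data = dat)
})

# collect fit statistics and estimates side by side
sens <- data.frame(
  rho           = rhos,
  logLik        = sapply(fits, function(f) as.numeric(logLik(f))),
  sigma2_study  = sapply(fits, function(f) f$sigma2[1]),
  sigma2_within = sapply(fits, function(f) f$sigma2[2]),
  estimate      = sapply(fits, function(f) as.numeric(coef(f)))
)
sens
```

A flat log-likelihood profile across the grid would be one informal sign of how little the data say about rho.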
There are other pairs of effects that might be expected to be less correlated than this (e.g., cases such as: the correlation between partner A's effect size in comparison with control group A at time 2, and partner B's effect size in comparison to control group B at time 3.)
One could try to fine-tune the construction of the V matrix to reflect this, but this is often more trouble than it is worth.
Question 3: As mentioned, some correlations are known, at least from some large studies in the dataset. For studies with multiple control groups, I know there are ways to calculate the covariance between effect sizes using the sample size. However, I am not sure how to build the whole covariance matrix based on this information and using clubSandwich::pattern_covariance_matrix or impute_covariance_matrix, since the dataset contains all of these sources of dependency at the same time, and sometimes within the same study (e.g., example at the end of question 2). Is there any guidance available for this situation?
The vcalc() function from metafor provides more flexibility: https://wviechtb.github.io/metafor/reference/vcalc.html so you could check that out.
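As a rough sketch of what this could look like when several sources of dependency occur together: the data frame below is entirely hypothetical (column names, sample sizes, and variances are made up), and the values rho = 0.7 and phi = 0.9 are assumptions, not recommendations. The arguments type, time1, grp1, grp2, w1, and w2 are documented on the vcalc() help page linked above, which should be consulted for how they are combined.

```r
library(metafor)

# hypothetical layout: study 1 has two outcomes, two time points, and two
# treatment arms sharing one control group; study 2 has a single effect size
dat <- data.frame(
  study   = c(1, 1, 1, 1, 1, 1, 2),
  outcome = c("A", "B", "A", "B", "A", "A", "A"),
  time    = c(1, 1, 2, 2, 1, 2, 1),
  trt     = c("t1", "t1", "t1", "t1", "t2", "t2", "t1"),
  ctrl    = "c",
  n_trt   = c(40, 40, 40, 40, 35, 35, 60),
  n_ctrl  = c(38, 38, 38, 38, 38, 38, 55),
  vi      = c(0.052, 0.055, 0.053, 0.056, 0.056, 0.057, 0.035)
)

# rho: assumed correlation between different outcomes at the same time point
# phi: assumed autocorrelation across time points
# grp1/grp2 with w1/w2: account for overlap due to shared groups
V <- vcalc(vi, cluster = study, type = outcome, time1 = time,
           grp1 = trt, grp2 = ctrl, w1 = n_trt, w2 = n_ctrl,
           rho = 0.7, phi = 0.9, data = dat)

round(cov2cor(V), 2)  # implied correlations between the sampling errors
```

The resulting V can then be passed to rma.mv() in place of a matrix built under a single constant correlation.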
Best wishes, Andreas Voldstad (he/him) PhD student in Psychiatry University of Oxford