Dear List Members,
We employed the rma.mv function from the metafor package to perform a
meta-analysis where effect sizes were nested within samples, and samples
were nested within countries. The total number of effect sizes exceeded
8,000. Below, I provide a toy example, in which I randomly sampled 626
effect sizes from 351 samples across 87 countries.
We specified a variance-covariance matrix (vcov_mat) to account for the
dependence among the sampling errors of the effect sizes observed within
each sample. The corresponding code was as follows:
M1 <- rma.mv(yi = Corrz, V = vcov_mat, data = tmp_es_dat,
             random = list(~ 1 | COUNTRY / SampleID / ESID), sparse = FALSE)
Here are the results:
Multivariate Meta-Analysis Model (k = 626; method: REML)
  logLik    Deviance         AIC         BIC        AICc
728.1443  -1456.2886  -1448.2886  -1430.5376  -1448.2241
Variance Components:
            estim    sqrt  nlvls  fixed                 factor
sigma^2.1  0.0042  0.0648     87     no                COUNTRY
sigma^2.2  0.0037  0.0610    351     no       COUNTRY/SampleID
sigma^2.3  0.0021  0.0459    626     no  COUNTRY/SampleID/ESID
Test for Heterogeneity:
Q(df = 625) = 23584.2025, p-val < .0001
Model Results:
estimate      se      zval    pval    ci.lb    ci.ub
 -0.2620  0.0085  -30.7263  <.0001  -0.2788  -0.2453  ***
In addition to I^2 and the variance components at various levels (effect
sizes, samples, and countries), we used the Q-test statistic to assess the
heterogeneity of effect sizes.
An expert reviewer of our meta-analysis pointed out potential ambiguities in
how we interpreted the Q-test statistic. Specifically, the reviewer said
that the Q-test statistic is "the test of the between-clusters variation
(whatever the clusters are in the model)."
However, I am unsure how to apply this interpretation to the Q-test
statistic included in the metafor output. I learned from the help section of
the rma.mv function that the Q "is the generalized/weighted least squares
extension of Cochran's Q-test, which tests whether the variability in the
observed effect sizes or outcomes is larger than one would expect based on
sampling variability (and the given covariances among the sampling errors)
alone. A significant test suggests that the true effects/outcomes are
heterogeneous."
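In matrix notation, the statistic described in that help-page passage can be written as follows. This is a sketch in generic notation, not a quotation from the documentation: y is the vector of the k observed effect sizes, V is the given sampling variance-covariance matrix, and X is the design matrix with p columns (a single column of ones in an intercept-only model such as M1).

```latex
\hat{\beta} = (X^\top V^{-1} X)^{-1} X^\top V^{-1} y, \qquad
Q = (y - X\hat{\beta})^\top V^{-1} (y - X\hat{\beta}), \qquad
Q \sim \chi^2_{k-p} \quad \text{under } H_0
```

Here H_0 states that sampling variability (as described by V) is the only source of variation in the observed effect sizes. With k = 626 and p = 1, this gives the 625 degrees of freedom reported in the output.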
In our case, the Q suggests that the observed effect sizes vary
significantly (p < .0001) around the average effect size (r = -0.26).
Furthermore, the Q provided by metafor points to statistically significant
heterogeneity, with heterogeneity referring to the total variance
encompassing all potential sources of variance, including effect sizes,
samples, and countries. However, I am unsure whether this is what the
reviewer meant by interpreting the Q as "between-clusters variation."
I would highly appreciate any help in clarifying the interpretation of the
Q-test statistic.
Thank you!
Best regards,
Martin
PS: I apologize for the poor formatting of the metafor output, but my email
program does not support better formatting options.
[R-meta] Interpretation of the Q-test statistic in a multilevel meta-analysis
3 messages · Wolfgang Viechtbauer, Prof. Dr. Martin Brunner
Dear Martin,

First of all: Over 8,000 effect sizes?!? Wow, you might be breaking some kind of record there.

A sidenote: Given the model below, I would suspect that 'sparse=TRUE' would help to speed up model fitting.

Now for your actual question: No, the Q-test does not test for "between-clusters variation" (at least not in the sense that it tests for variation between the units of the highest level in the multilevel structure, which seems to be what the reviewer is implying). The docs, which you read (thanks!), correctly spell out what the Q-test is testing. In essence, it is testing the given model against one without any random effects. In your case, this would be:

M1 <- rma.mv(yi = Corrz, V = vcov_mat, data = tmp_es_dat, random = ~ 1 | COUNTRY / SampleID / ESID)
M0 <- rma.mv(yi = Corrz, V = vcov_mat, data = tmp_es_dat)
anova(M0, M1)

except that this will give you a likelihood ratio test of the random effects, while the Q-test is comparing M0 against a model where every effect size is allowed to have its own fixed effect. So the test statistics are not the same, but conceptually, the two approaches are comparable.

If you want to test for between-country variation, then one can do a LRT comparing model M1 above against one where the country-level variance component is constrained to 0:

M0a <- rma.mv(yi = Corrz, V = vcov_mat, data = tmp_es_dat, random = ~ 1 | COUNTRY / SampleID / ESID, sigma2 = c(0, NA, NA))
anova(M0a, M1)

Model M0a assumes that there is no between-country variation, but it does allow for between-sample (within country) variation and between-effect-size (within sample) variation. So this is quite different from what the Q-test does (and hence the comparison between M0 and M1).

I hope this clarifies things.

Best,
Wolfgang
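As a rough illustration of the point made in the reply above, the following base-R sketch computes the generalized Q statistic by hand on simulated data. Everything about the data is an assumption made for illustration and is not taken from the thread: the object names `dat`, `vi`, and `V`, the 20 x 3 x 2 design, the size of the variance components, and the within-sample error correlation of 0.5. The last step runs only if metafor is installed and compares the hand-computed value with the `QE` element of a fitted `rma.mv` object.

```r
set.seed(42)

# Hypothetical nested design: 20 countries x 3 samples x 2 effect sizes
dat <- expand.grid(es = 1:2, sample = 1:3, country = 1:20)
dat$COUNTRY  <- factor(dat$country)
dat$SampleID <- factor(paste(dat$country, dat$sample, sep = "_"))
dat$ESID     <- factor(seq_len(nrow(dat)))
k <- nrow(dat)

# Assumed sampling variances and a within-sample error correlation of 0.5
vi  <- runif(k, 0.005, 0.02)
sid <- as.character(dat$SampleID)
R   <- outer(sid, sid, "==") * 0.5
diag(R) <- 1
V   <- R * outer(sqrt(vi), sqrt(vi))

# Simulated true effects varying at all three levels, plus correlated sampling errors
u_country <- rnorm(nlevels(dat$COUNTRY), 0, 0.07)[as.integer(dat$COUNTRY)]
u_sample  <- rnorm(nlevels(dat$SampleID), 0, 0.06)[as.integer(dat$SampleID)]
u_es      <- rnorm(k, 0, 0.05)
e         <- drop(t(chol(V)) %*% rnorm(k))
dat$Corrz <- -0.26 + u_country + u_sample + u_es + e

# Q by hand: weighted residual sum of squares around the GLS estimate
# of a model with no random effects at all
y <- dat$Corrz
X <- matrix(1, nrow = k, ncol = 1)   # intercept only
W <- solve(V)
P <- W - W %*% X %*% solve(t(X) %*% W %*% X) %*% t(X) %*% W
Q_hand <- drop(t(y) %*% P %*% y)
p_hand <- pchisq(Q_hand, df = k - ncol(X), lower.tail = FALSE)
print(c(Q = Q_hand, df = k - ncol(X), p = p_hand))

# Optional check against metafor (skipped when the package is not installed)
if (requireNamespace("metafor", quietly = TRUE)) {
  M1 <- metafor::rma.mv(yi = Corrz, V = V, data = dat,
                        random = ~ 1 | COUNTRY / SampleID / ESID)
  print(c(Q_hand = Q_hand, Q_metafor = M1$QE))
}
```

Note that the nesting structure never enters the computation of `Q_hand`: only the effect sizes, `V`, and the fixed-effects design are used. This fits the description of Q as a test of total heterogeneity, not of variation at any particular level.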
-----Original Message-----
From: R-sig-meta-analysis <r-sig-meta-analysis-bounces at r-project.org> On Behalf Of Martin Brunner via R-sig-meta-analysis
Sent: Wednesday, September 11, 2024 10:23
To: r-sig-meta-analysis at r-project.org
Cc: Martin Brunner <martin.brunner at uni-potsdam.de>
Subject: [R-meta] Interpretation of the Q-test statistic in a multilevel meta-analysis
Dear Wolfgang,

thank you so much for this enlightening clarification and the further suggestions to test key assumptions of our model.

Best,
Martin

On Wed, 11 Sep 2024 10:51:42 +0000, Viechtbauer, Wolfgang (NP) <wolfgang.viechtbauer at maastrichtuniversity.nl> wrote: