
[R-meta] Descriptive Multilevel Meta-analysis with C-Statistics

3 messages · Mika Manninen, Wolfgang Viechtbauer

#
Dear meta-analysis community,

We aim to explore whether, on average across various settings and
samples, Random Forests, XGBoost, and Logistic Regression differ in
their ability to discriminate the same medical outcome. In our example
dataset, the c-statistics (AUCs) have been logit-transformed (yi) and
are accompanied by their variances (vi).
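To illustrate the transformation (with made-up numbers, not from our dataset): the c-statistic is logit-transformed with qlogis(), and its sampling variance follows from the delta method, assuming a standard error for the AUC is available:

```r
# illustrative only (hypothetical values, not from our data):
# logit-transform one c-statistic and obtain its variance via
# the delta method, assuming the AUC's standard error is known
auc    <- 0.75   # c-statistic (AUC)
se.auc <- 0.03   # its standard error (assumed)
yi <- qlogis(auc)                       # logit-transformed AUC
vi <- se.auc^2 / (auc * (1 - auc))^2    # delta-method variance on the logit scale
round(c(yi = yi, vi = vi), 4)           # -> yi = 1.0986, vi = 0.0256
```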

Specifically:

a) We are pooling internally validated c-statistics.

b) We are comparing different machine learning model types (e.g.,
Logistic Regression, Random Forest, XGBoost), typically applied to the
same sample within each study.

c) We are analyzing highly nested data, where predictor sets may
differ across models, even within the same study (e.g., some models
may include blood markers while others do not).

d) We are not comparing identical models across studies, and within
studies, model comparisons do not always use the same predictor sets.

First, would this sort of descriptive meta-analysis be sensible given
the complexity of the data? Second, is the current modelling approach
sensible, or should it in principle be made simpler or more complex
(e.g., subsetting the outcomes or adding more levels of nesting)?

Thank you very much in advance,
Mika Manninen

***

Data:

dat.ml <- structure(list(cluster_id = c(1.1, 10.1, 11.1, 11.2, 12.1, 12.2,
12.3, 13.1, 13.1, 13.1, 14.1, 14.1, 14.1, 15.1, 16.1, 16.1, 16.1,
16.2, 16.2), row_id = c(1, 95, 96, 97, 98, 101, 104, 107, 108,
110, 113, 114, 116, 119, 120, 121, 122, 128, 129), study_id = c(1,
10, 11, 11, 12, 12, 12, 13, 13, 13, 13, 13, 13, 14, 15, 15, 15,
15, 15), sample_id = c(1, 10, 11, 11, 12, 12, 12, 13, 13, 13,
14, 14, 14, 15, 16, 16, 16, 16, 16), predictor_id = c(1, 1, 1,
2, 1, 2, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2), model_id = c(1,
1, 1, 2, 1, 1, 1, 1, 2, 4, 1, 2, 4, 1, 1, 2, 3, 1, 2), model_type =
c("Logistic Regression",
"Logistic Regression", "Logistic Regression", "XGBoost", "XGBoost",
"XGBoost", "XGBoost", "Random Forest", "Logistic Regression",
"XGBoost", "Random Forest", "Logistic Regression", "XGBoost",
"Logistic Regression", "Logistic Regression", "Random Forest",
"XGBoost", "Logistic Regression", "Random Forest"), yi = c(0.78,
0.94, 1.07, 1.09, 0.90,
0.63, 0.94, 0.10, -0.61,
0.20, 0.55, 0.28, 1.00,
1.11, 0.71, 1.13, 1.14,
1.62, 2.48), vi = c(0.02,
0.002, 0.073, 0.011, 0.00055,
0.00055, 0.056, 0.13,
0.15, 0.13, 0.12, 0.11,
0.15, 0.012, 0.0036, 0.005,
0.009, 0.02, 0.023)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -19L))

R-code:

# 9.1 vcalc()
library(metafor)

# construct correlation matrix for Logistic Regression, Random Forest,
# and XGBoost (fake correlations here)

A <- matrix(0.7, nrow=3, ncol=3)
A[3,1] <- A[1,3] <- 0.8
A[2,3] <- A[3,2] <- 0.9
diag(A) <- 1
rownames(A) <- colnames(A) <- c("Logistic Regression", "Random Forest", "XGBoost")
A

# construct the variance-covariance matrix
V <- vcalc(vi = vi, cluster = cluster_id, type = model_type,
           data = dat.ml, rho = A)
V

# multivariate meta-analysis (perhaps should be mods = ~ factor(model_type) - 1,
# random = ~ 1 | sample_id/predictor_id, data = dat.ml)

res.ml <- rma.mv(yi, V, mods = ~ factor(model_type) - 1,
                 random = ~ factor(model_type) | cluster_id,
                 struct = "UN", data = dat.ml, method = "REML")
res.ml
#
Dear Mika,

I think this is quite sensible. Some things to consider:

1) I would not only consider 'model_type' as a predictor, but also what predictors were included in the models. I think this relates to your 'predictor_id' variable.

2) As has been discussed before on this list, there are two types of dependencies to consider. If the same data are analyzed using different methods/models, then the sampling errors of the corresponding yi values are not independent. I think your 'sample_id' variable indicates whether this is the case. This kind of dependency should ideally be reflected in the V matrix, and one can also add random effects accordingly. However, even if two yi values are based on different samples (and hence their sampling errors are independent), if they come from the same study, then there may be dependency, because results reported for the same study may be more alike than results reported from different studies. I think 'study_id' is the relevant variable here. This kind of dependency is accounted for via random effects.

So in your construction of the V matrix, I would consider using:

V <- vcalc(vi = vi, cluster = sample_id, type = model_type, data = dat.ml, rho = A)

But this induces a correlation of 1 for yi values analyzed with the same model but different predictor_id values (which applies to sample_id values 12 and 16, hence you get a warning about that).

So you could expand the A matrix to also consider the predictor_id values, and your 'type' variable is then a combination of model_type and predictor_id. Something like:

A1 <- matrix(0.7, nrow=3, ncol=3)
A1[3,1] <- A1[1,3] <- 0.8
A1[2,3] <- A1[3,2] <- 0.9
diag(A1) <- 1
rownames(A1) <- colnames(A1) <- c("Logistic Regression", "Random Forest", "XGBoost")
A1

A2 <- matrix(0.9, nrow=3, ncol=3)
diag(A2) <- 1
rownames(A2) <- colnames(A2) <- 1:3

A <- A1 %x% A2
rownames(A) <- colnames(A) <- sapply(rownames(A1), function(x) paste0(x, "_", rownames(A2)))
A

dat.ml$model_type_predictor_id <- paste0(dat.ml$model_type, "_", dat.ml$predictor_id)

V <- vcalc(vi = vi, cluster = cluster_id, type = model_type_predictor_id,  data = dat.ml, rho = A)
V
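As a small aside on what %x% (the Kronecker product) does here: the assumed correlation between the (model i, predictor set k) estimate and the (model j, predictor set l) estimate is simply A1[i,j] * A2[k,l]. A toy 2x2 version (arbitrary values):

```r
# toy illustration of the Kronecker product (values are arbitrary)
B1 <- matrix(c(1, 0.8, 0.8, 1), nrow=2)  # between-model correlations
B2 <- matrix(c(1, 0.9, 0.9, 1), nrow=2)  # between-predictor-set correlations
B  <- B1 %x% B2                          # 4x4 combined correlation matrix
B[2, 3]                                  # (model 1, set 2) vs (model 2, set 1)
B1[1, 2] * B2[2, 1]                      # the same value: 0.8 * 0.9 = 0.72
```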

Of course construction of A and hence V is guesswork, but we always have cluster-robust inference methods in the end to account for the fact that V (and hence our model) is just an approximation.

As for the random effects structure, it depends a bit on how much data you have. A start may be:

res.ml <- rma.mv(yi, V, mods = ~ factor(model_type_predictor_id)-1,
                 random = ~ 1 | study_id/sample_id/model_type/predictor_id, data = dat.ml)
res.ml

This implicitly assumes a constant correlation for sample_id within study_id (which is sensible), a different but constant correlation for model_type within sample_id (which is a bit less sensible), and a different but constant correlation for predictor_id within model_type (again a bit overly simplistic). To relax the constant correlation part, one would have to go to 'inner ~ outer' random effects, but the number of parameters quickly increases then.
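A sketch of what the 'inner | outer' alternative could look like (likely overparameterized unless there is plenty of data, and it may not converge with few studies; V and dat.ml as constructed above):

```r
# sketch only: unstructured correlation for model_type within sample_id,
# on top of study-level random effects
library(metafor)
res.un <- rma.mv(yi, V, mods = ~ factor(model_type_predictor_id) - 1,
                 random = list(~ 1 | study_id,
                               ~ model_type | sample_id),
                 struct = "UN", data = dat.ml)
res.un
```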

In the end, you want to do:

robust(res.ml, cluster=study_id, clubSandwich=TRUE)

anyway.

Best,
Wolfgang
#
Dear Wolfgang,

Thank you very much for your insightful reply, it is greatly appreciated.

The A <- A1 %x% A2 construction -- the Kronecker product of two
matrices -- is something that I had not thought about at all.

Out of curiosity: we have reasonable estimates for constructing the
matrices, but when the model includes the product of two matrices, are
there any guidelines for conducting sensitivity analyses --
particularly regarding how many alternative correlation structures
should be tested -- or how would you think about this issue in general?

Kind regards,
Mika

On Mon, 19 May 2025 at 12:44, Viechtbauer, Wolfgang (NP)
(wolfgang.viechtbauer at maastrichtuniversity.nl) wrote: