Dear all,
I have a basic question about the output of my (gu)estimation of the
variance-covariance matrix. I have extracted results from very
heterogeneous studies with OR as effect size (sample sizes between 20
and 300,000). Since some of the results come from the same study, I
decided to try to use the VCOV as an input and estimated values
according to the following formula
V_mat <- vcalc(vi=vi, cluster=shared_variance, data=df_complete, rho=.7)
res_meta <- rma.mv(yi, vi, V=V_mat,
random = ~ 1 | number, mods = ~ hospitalbeds +
ltcbeds, verbose=TRUE, data=df_complete)
Interestingly, in this case the weighting is reversed, so that most of
the weight is given to studies with the smallest sample size; something
that does not happen when using this formula:
res_meta <- rma(yi, vi,
random = ~ 1 | number, mods = ~ hospitalbeds +
ltcbeds, verbose=TRUE, data=df_complete)
I have tried to understand what is going on, but I am kind of lost.
Could someone please give me some advice?
Thanks in advance,
David
[R-meta] Inverse weighting after estimation of VCOV
4 messages · David Pedrosa, James Pustejovsky
Hi David,
I don't entirely understand the models that you're looking at, so
clarifying the following would help in getting good feedback:
* What is the variable `shared_variance` used in the vcalc call?
* What is the variable `number` used in the random effects argument of rma.mv?
* How are these variables related?
Additionally, it would be good to check that the vcov matrix created by
vcalc() is as you intend it to be. Could you pull out the blocks of this
matrix for a few studies and just verify that they give you covariance
matrices with a correlation of 0.7? I mean something like:
vcov_study_k <- V_mat[i:j, i:j]
cov2cor(vcov_study_k)
where the indices i:j are the rows in your data corresponding to a given
study k.
James
On Fri, May 24, 2024 at 10:00 AM David Pedrosa via R-sig-meta-analysis
<r-sig-meta-analysis at r-project.org> wrote:
Dear all,
I have a basic question about the output of my (gu)estimation of the
variance-covariance matrix. I have extracted results from very
heterogeneous studies with OR as effect size (sample sizes between 20
and 300,000). Since some of the results come from the same study, I
decided to try to use the VCOV as an input and estimated values
according to the following formula
V_mat <- vcalc(vi=vi, cluster=shared_variance, data=df_complete, rho=.7)
res_meta <- rma.mv(yi, vi, V=V_mat,
random = ~ 1 | number, mods = ~ hospitalbeds +
ltcbeds, verbose=TRUE, data=df_complete)
Interestingly, in this case the weighting is reversed, so that most of
the weight is given to studies with the smallest sample size; something
that does not happen when using this formula:
res_meta <- rma(yi, vi,
random = ~ 1 | number, mods = ~ hospitalbeds +
ltcbeds, verbose=TRUE, data=df_complete)
I have tried to understand what is going on, but I am kind of lost.
Could someone please give me some advice?
Thanks in advance,
David
_______________________________________________
R-sig-meta-analysis mailing list
R-sig-meta-analysis at r-project.org
To manage your subscription to this mailing list, go to:
https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
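The block check James describes can be sketched in base R without the real data. The three variances and the hand-built block below are illustrative assumptions, not output of vcalc(); the point is only to show what cov2cor() should return when a block really carries a constant correlation of 0.7.

```r
# Sampling variances of three hypothetical estimates from one cluster
# (made-up values, not taken from V_mat)
vi  <- c(0.047, 0.005, 0.013)
rho <- 0.7

# Build a covariance block with cov_ij = rho * sqrt(vi_i * vi_j)
# and the sampling variances on the diagonal
R_block <- matrix(rho, nrow = 3, ncol = 3)
diag(R_block) <- 1
V_block <- diag(sqrt(vi)) %*% R_block %*% diag(sqrt(vi))

# Converting back to correlations should recover rho off the diagonal
cor_block <- cov2cor(V_block)
cor_block
```

Applied to the real matrix, the same cov2cor() call on V_mat[i:j, i:j] should show 0.7 in every off-diagonal cell if the block was built as intended.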
2 days later
Hi James,
apologies, my question was not seasoned enough.
I have a dataframe with 16 studies, all of which provide some odds
ratios for hospitalisation. 8 studies are from the same publication but
in different countries. To me there is still reason to believe they
"share more variance" than the rest. Besides, I want to weight by the
total number of subjects from each of the studies. To make it a bit more
complex, we have dug out the number of hospital beds and long-term
beds for every country, both of which we consider potential moderators.
I ran the random effects model
res_metaRE <- rma(yi, vi,
random = ~ 1 | number, mods = ~ hospitalbeds +
ltcbeds, verbose=TRUE, data=df_complete)
to which weights(res_metaRE) provides accurate results. If I try to
estimate the VCOV matrix, the results show correct diagonal values, that
is, identical to df_complete$vi. But passing the resulting V_mat
V_mat <- vcalc(vi=vi, cluster=shared_variance, data=df_complete, rho=.7)
to rma.mv provides results that are too high, and in particular the
studies with lower numbers of subjects are weighted more heavily. I am
assuming that it's just somehow inverted but I cannot understand if I'm
missing something or if there is some other mistake in the way I'm
estimating the VCOV. Number is just the study id.
I'm not entirely sure I understand your point with the subsection of the
matrix.
Thanks for your help!
Best,
David
P.S.: Here are the relevant parts of df_complete
structure(list(number = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15), author = c("Aamodt", "Ceylan", "Krause", "Kumar",
"Moens, Belgium", "Moens, France", "Moens, Italy", "Moens, Canada",
"Moens, Mexiko", "Moens, New Zeeland", "Moens, Spain", "Moens, South Corea",
"Moens, Czech Rep.", "Moens, Hungary", "Moens, USA"), year = c(2023,
2022, 2021, 2021, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015,
2015, 2015, 2015), n_ges = c(53279, 27, 40, 346141, 837, 4599,
4034, 1381, 1062, 202, 352, 1565, 92, 241, 20065), OR = c(1.06,
1.43, 8.25, 1.454, 2.3, 1.5, 1.4, 1.7, 0.95, 1.97, 1.09, 0.95,
0.97, 1.44, 1.4), hospitalbeds = c(2.77, 3.02, 7.76, 2.77, 5.47,
5.65, 3.12, 2.58, 1, 2.57, 2.96, 12.77, 6.66, 6.79, 2.77), ltcbeds =
c(32.3,
9.5, 54.2, 53.9, 66.8, 47.4, 21.3, 46.7, 0, 50.4, 43.4, 25, 34.9,
42.6, 28.9), p_values = c(0.106809128205467, 0.706331045003814,
0.0281267337718951, 0, 2.43772276381116e-05, 2.76746355676653e-22,
1.01260208850919e-05, 1.19251123951374e-10, 0.772759462747246,
0.0741077696800058, 0.74088983860122, 0.68164335922065, 1,
0.183303852299051,
3.20176730771634e-26), shared_variance = c(0, 0, 0, 0, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1), yi = structure(c(0.0582689081239758,
0.357674444271816, 2.11021320034659, 0.374318379111328, 0.832909122935104,
0.405465108108164, 0.336472236621213, 0.53062825106217,
-0.0512932943875506,
0.678033542749897, 0.0861776962410524, -0.0512932943875506,
-0.0304592074847086,
0.364643113587909, 0.336472236621213), ni = c(53279, 27, 40,
346141, 837, 4599, 4034, 1381, 1062, 202, 352, 1565, 92, 241,
20065), measure = "GEN"), vi = c(0.000835840725678602, 0.638632983584221,
0.604067037193667, 0.000435509388232691, 0.0467214213223696,
0.00468347897652763, 0.00538603813506437, 0.0132951153208062,
0.0214123920152818, 0.142112789690683, 0.0489441998392354,
0.0138688993962097,
0.186242249276727, 0.0702159732616764, 0.00133268716433697)), row.names
= c(NA,
-15L), class = c("escalc", "data.frame"), yi.names = "yi", vi.names =
"vi", digits = c(est = 4,
se = 4, test = 4, pval = 4, ci = 4, var = 4, sevar = 4, fit = 4,
het = 4))
--
Prof. Dr. David Pedrosa
Senior Consultant, Department of Neurology
Head of the Section for Movement Disorders and Neuromodulation
Universitätsklinikum Gießen und Marburg
Tel. (+49) 6421-58 65299
Fax. (+49) 6421-58 67055
Address. Baldingerstr., 35043 Marburg
Web. https://www.ukgm.de/ugm_2/deu/umr_neu/index.html
Web. https://www.uni-marburg.de/de/fb20/bereiche/kopfz/neurologie/forschung/agbun
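One thing that may be worth checking in the rma.mv() call from the original post is how R matches arguments. R binds named arguments first and then fills the remaining formals in order with the unnamed ones. To my knowledge the first three formals of rma.mv() are yi, V and W, so in rma.mv(yi, vi, V=V_mat, ...) the unnamed vi might end up in W (user-defined weights) rather than in V. The function toy_fit() below is a hypothetical stand-in that only mimics that argument order; it is not metafor code, and the numbers are rounded from the posted data.

```r
# Hypothetical stand-in mimicking only the order of the first three
# formal arguments (yi, V, W); it does no model fitting at all.
toy_fit <- function(yi, V, W) {
  list(
    V_given = !missing(V),
    W_given = !missing(W),
    W       = if (missing(W)) NULL else W
  )
}

yi    <- c(0.06, 0.36, 2.11)     # effect sizes (rounded)
vi    <- c(0.0008, 0.64, 0.60)   # sampling variances (rounded)
V_mat <- diag(vi)                # placeholder for the vcalc() result

# Same call pattern as in the post: vi is unnamed, V is named.
# V is matched by name first, so the unnamed vi falls through to W.
out <- toy_fit(yi, vi, V = V_mat)
out$W_given
out$W
```

If the same matching happens in rma.mv(), the sampling variances would act as weights, which would be consistent with the least precise studies receiving the most weight. Dropping the unnamed vi, i.e. calling rma.mv(yi, V=V_mat, ...), would be the thing to try.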
3 days later
Hi David,
Thanks for clarifying your data structure. Based on what you've
described, I don't think it makes sense to use vcalc(). The point of
vcalc() is to build in covariance between the sampling errors of the
effect size estimates. For your one publication that reports 8 studies,
each effect size estimate is based on a separate sample of participants
(because each estimate comes from a different country). So there's no
reason to expect that there would be covariance in the sampling errors.
Instead, one might suspect that there would be covariance between the
country-specific effect size parameters (i.e., the "true" effect sizes)
from this publication. This would be plausible if the same operational
procedures (e.g., same recruitment approach, same measurement
instrumentation, same follow-up window) were used across the samples in
this publication. The conventional way to model this would be to 1)
specify effect size estimates as independent but 2) include
publication-level random effects in the model to capture shared
operational variance within publications. The syntax would be something
like:
res_metaRE <- rma.mv(
  yi, V = vi,
  random = ~ 1 | publicationID / number,
  mods = ~ hospitalbeds + ltcbeds,
  verbose = TRUE,
  data = df_complete,
  sparse = TRUE
)
You'll need to create a publicationID variable if you don't already have
that on the data.
The difficulty with this approach in your case is that there's only one
publication that has multiple samples nested within it, so there's not a
lot of information available to parse out the variance at the
publication level from the variance at the sample level (across
countries). You could try using the model fit statistics to compare the
model above versus a model that only has random effects at the sample
level.
James
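A publicationID variable of the kind James mentions could be derived from the author labels in the posted data. This is a minimal sketch that assumes everything before the first comma identifies the publication; that rule is an assumption made for illustration and would need checking against the actual sources.

```r
# Author labels as posted; the country suffix marks samples that come
# from the same paper
author <- c("Aamodt", "Ceylan", "Krause", "Kumar",
            "Moens, Belgium", "Moens, France", "Moens, Italy",
            "Moens, Canada", "Moens, Mexiko", "Moens, New Zeeland",
            "Moens, Spain", "Moens, South Corea", "Moens, Czech Rep.",
            "Moens, Hungary", "Moens, USA")

# Assumed rule: the text before the first comma is the publication
publicationID <- sub(",.*$", "", author)

# Number of samples nested within each publication
samples_per_pub <- table(publicationID)
samples_per_pub
```

The table makes James's caveat concrete: only one publication contributes more than one sample, so the publication-level variance component rests on very little information. The vector could be stored as df_complete$publicationID before fitting the nested model.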