Dear Metafor's users,

This is the first time that I post on this mailing list.

Our team is currently planning an individual patient data (IPD) meta-analysis of prospective cohorts. We adopt an IPD approach because no prospective study has yet reported this association, while many should have the data to assess it (i.e., they have access to the targeted variables but did not report the association). Unfortunately, we anticipate that most of the included studies will not have ethics committee approval to share their data with us. To overcome this, we plan to ask the authors of the included studies to perform the analyses on their own data and to share only the results of the analyses with us.

Our dependent variable is binary (DV: yes/no) and our independent variable is an ordered variable (IV: a 12-point scale, from 1 to 12) treated as a continuous variable. We ask authors to perform a logistic regression. Based on their results (log odds ratio and associated variance), we adopt a classic two-stage approach. I show the R code for a particular study to highlight our approach.

# R code for study 1
study1<-glm(DV~IV, data=datastudy1)
yi1<- summary(study1)$coefficients[2,1] # extract the log odds ratio
vi1<- summary(study1)$coefficients[2,2]^2 # extract the squared standard error
# then, we repeat the same process for each included study
# Once all the effect sizes and their variances are collected, we can store them in a dataset and run a standard two-stage meta-analysis
dat<-data.frame(
yi=c(yi1, yi2, yi3...),
vi=c(vi1, vi2, vi3...),
study=c(1, 2, 3...))
model<-rma(yi,vi, dat)

This is the code for our primary analysis. In an exploratory analysis, we would like to perform a moderation analysis with a patient-level moderator. We understand how to perform a moderation analysis for a study-level moderator, but we are not sure how to implement it with a patient-level moderator.

The aim of this moderation analysis will be to obtain the estimated average effects for each level of a moderator. I describe here the approach we have envisaged:

# example of R code for study 1
# let VM denote a participant-level moderator with 3 categories (a,b,c)
study1<-glm(DV~IV*VM, data=datastudy1)
EM1<-emmeans::emtrends(study1, ~VM, var=" IV")
yi1<- as.data.frame(EM1$emtrends)[,2] # extract log odds ratio for each level of the VM for study 1 (contains 3 values)
V1<-vcov(EM1) # extract the variance/covariance matrix for study 1 (a 3x3 matrix)
# then, we can build a dataset which will look like this...
dat<-data.frame(
yi=c(yi1, yi2, yi3...),
VM=c(a,b,c,a,b,c,a,b,c...),
study=c(1,1,1, 2,2,2, 3,3,3...))
# ...and a variance-covariance matrix using the bldiag function
V<-bldiag(list(V1,V2,V3...))
# Last, we plan to fit a multivariate model in which we leave out the model intercept and in which we use an unstructured variance structure (if the model converges).
model<-rma.mv(yi, V, mods = ~ VM-1, random=~VM|study, struct="UN", data=dat)

We were wondering if you could give us some feedback on the correctness of our approach. We have read in several textbooks that two-stage meta-analyses are not designed to assess patient-level moderators but, given that asking for raw data would probably decrease the likelihood of getting a response from the authors of primary studies, we would prefer to stay with a two-stage approach.

Thank you very much for your help and for this amazing mailing list!

Corentin J Gosling, Charlotte Pinabiaux, Serge Caparos, Richard Delorme, Samuele Cortese
[R-meta] Moderation analysis in IPD meta-analysis
6 messages · Wolfgang Viechtbauer, Michael Dewey, Gerta Ruecker +1 more
Dear Corentin,
Overall, your approach seems sound. But a few notes:
1) study1<-glm(DV~IV, data=datastudy1) is not logistic regression. You need:
study1 <- glm(DV ~ IV, data=datastudy1, family=binomial)
2) I've only played around with emmeans a little bit, so can't comment on that part. But I don't think you even need it. You can just fit the model in such a way that you directly get the three log odds ratios for the three levels of VM. In fact, the estimates of the three log odds ratios are independent, so one could even just fit three simple logistic regression models that will give you the same results. An example:
dat <- data.frame(DV = c(1,0,0,1,0,1,1,1,0,1,1,1),
                  IV = c(1,3,2,3,5,3,7,7,4,9,6,3),
                  VM = rep(c("a","b","c"),each=4))
# parameterize logistic regression model so we get the three log odds ratios directly
res <- glm(DV ~ VM + IV:VM - 1, data=dat, family=binomial)
summary(res)
# the covariance between the three estimates is 0
round(vcov(res), 5)
# show that the simple logistic regression model for a subset gives the same results
res.a <- glm(DV ~ IV, data=dat, family=binomial, subset=VM=="a")
summary(res.a)
So actually the V matrix corresponding to the three log odds ratios is diagonal. But you still would want to account for potential dependency in the underlying true log odds ratios, so the model
model <- rma.mv(yi, V, mods = ~ VM-1, random=~VM|study, struct="UN", data=dat)
is still appropriate (with V being diagonal, so you can also just pass a vector with the sampling variances to rma.mv).
3) The statement that 2-stage approaches cannot be used to analyze patient-level moderators isn't quite true. If one actually analyzes the patient-level moderator in stage 1 (as you describe), then the 2-stage approach definitely allows you to examine such a patient-level moderator.
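For readers following the thread, the stage-1 step described in point 2 could be sketched roughly as follows. This is only an illustration under stated assumptions: the data are simulated, the true slopes are made up, and `simulate_study()` and `stage1()` are hypothetical helper names rather than functions from metafor or any other package. It uses base R only.

```r
# Illustrative sketch only: all data are simulated, and simulate_study() and
# stage1() are hypothetical helper names, not functions from any package.
set.seed(42)

# Simulate one study with a 12-point IV and a 3-level patient-level moderator VM.
simulate_study <- function(n_per_level = 200) {
  VM <- rep(c("a", "b", "c"), each = n_per_level)
  IV <- sample(1:12, length(VM), replace = TRUE)
  slope <- c(a = 0.10, b = 0.20, c = 0.30)[VM]  # assumed true log odds ratios per unit of IV
  DV <- rbinom(length(VM), size = 1, prob = plogis(-1 + slope * IV))
  data.frame(DV = DV, IV = IV, VM = VM)
}

# Stage 1 (run by each study's authors): one slope of IV per level of VM.
stage1 <- function(dat, study) {
  fit <- glm(DV ~ VM + IV:VM - 1, data = dat, family = binomial)
  idx <- grep("IV", names(coef(fit)), fixed = TRUE)  # the three slope terms
  V   <- vcov(fit)[idx, idx]
  est <- data.frame(study = study,
                    VM    = levels(factor(dat$VM)),
                    yi    = unname(coef(fit)[idx]),
                    vi    = unname(diag(V)))
  list(est = est, V = V)
}

res <- lapply(1:3, function(i) stage1(simulate_study(), study = i))

# Stage 2 input: one row per study and moderator level.
dat <- do.call(rbind, lapply(res, function(r) r$est))

# Largest off-diagonal element of any within-study V (zero up to rounding).
max_offdiag <- max(sapply(res, function(r) max(abs(r$V[upper.tri(r$V)]))))

print(dat)
print(max_offdiag)
```

Because the within-study covariances are zero, the stacked `dat$vi` vector can stand in for the block-diagonal V in the rma.mv call quoted above, i.e. `rma.mv(yi, vi, mods = ~ VM-1, random = ~ VM|study, struct="UN", data=dat)`. With only a handful of studies the unstructured model may not converge, as the original post already anticipates.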
Best,
Wolfgang
-----Original Message-----
From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces at r-project.org] On Behalf Of GOSLING Corentin
Sent: Friday, 07 August, 2020 20:03
To: r-sig-meta-analysis at r-project.org
Subject: [R-meta] Moderation analysis in IPD meta-analysis
Dear Prof. Viechtbauer,

Thank you very much for your answer!

1) Sorry about the family argument; I accidentally removed it when I copied and pasted the code.
2) I was not aware of this solution in the glm function. I have compared it with the initial solution using emmeans and it gives similar results. Since your solution is definitely more elegant, we are going to use it.
3) Great, it is very reassuring to have your confirmation! We had the feeling that this was feasible, but we were afraid that we had missed a reason preventing us from applying our approach to a patient-level moderator.

Thank you so much for your help!

Best,
Corentin J Gosling

(In reply to the message from Viechtbauer, Wolfgang (SP) <wolfgang.viechtbauer at maastrichtuniversity.nl>, Fri, 7 Aug 2020, 20:34.)
Dear Corentin,

I have not investigated this in detail, but there are two packages on CRAN you might want to look at:

https://cran.r-project.org/package=multinma claims to be able to integrate IPD and aggregate data in one analysis.

https://cran.r-project.org/package=metagam claims to let the other researchers run the analysis and share the results with you when they are not allowed to share the data.

As I say, I have not looked at either of them in detail, so this may be wide of the mark, but they are worth a brief look.

Michael

(In reply to the message from GOSLING Corentin, 08/08/2020 09:14.)
_______________________________________________ R-sig-meta-analysis mailing list R-sig-meta-analysis at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
Dear Corentin,

you may also be interested in the DataSHIELD project; see their website: https://www.datashield.ac.uk/

Best,
Gerta

(In reply to the message from Michael Dewey, 08.08.2020, 12:21.)
Hi everyone,

Thank you so much for taking the time to reply to us.

@Prof. Dewey
We had never heard of the 'multinma' package. I will take a closer look at it in the next few weeks. If it allows us to fit both a one-stage and a two-stage meta-analysis in our situation, I will follow up on this thread so that every member of the list has the information.

I had already heard of metagam (thanks to this mailing list, if I remember correctly). However, when I read the R code on GitHub underlying the critical function that handles individual patient data, I found (if I understood it correctly) that it extracts only the parts of the model that contain aggregated data. Therefore, I think that our approach is almost identical to metagam's, except that we use a 'home-made' function to extract the critical information and that we target glm rather than gam objects. Again, if I have misunderstood something, do not hesitate to correct me. Thank you very much for your advice!

@Prof. Rücker
Thank you very much for sharing this initiative; none of us knew about it. I am going to present this approach to the DPO of my university, but it looks extremely promising for our study: the authors of the primary studies would not have to run the analyses themselves, and this could encourage them to participate in our project.

Thank you all for your help!

Best,
Corentin

(In reply to the message from Dr. Gerta Rücker <ruecker at imbi.uni-freiburg.de>, Sat, 8 Aug 2020, 12:42.)
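For illustration, a 'home-made' extraction function of the kind mentioned above might look roughly like the sketch below. It is not the authors' actual function and not metagam's code; `extract_aggregate()` is a hypothetical name, and the choice of which quantities to return is an assumption. The point is only that the returned object holds aggregate quantities (coefficients, their covariance matrix, sample size, model description) and none of the patient-level rows. The toy data are the ones from Wolfgang's example earlier in the thread.

```r
# Hypothetical helper (not from any package): keep only aggregate quantities
# from a fitted glm so that no patient-level data leave the study site.
extract_aggregate <- function(fit) {
  stopifnot(inherits(fit, "glm"))
  list(
    coef    = coef(fit),              # estimated coefficients (log odds scale)
    vcov    = vcov(fit),              # their variance-covariance matrix
    n       = nobs(fit),              # number of observations used in the fit
    family  = family(fit)$family,     # e.g. "binomial"
    link    = family(fit)$link,       # e.g. "logit"
    formula = deparse(formula(fit))   # model formula as text
  )
}

# Toy data from the example earlier in the thread.
dat <- data.frame(DV = c(1,0,0,1,0,1,1,1,0,1,1,1),
                  IV = c(1,3,2,3,5,3,7,7,4,9,6,3),
                  VM = rep(c("a","b","c"), each = 4))

fit <- glm(DV ~ VM + IV:VM - 1, data = dat, family = binomial)
out <- extract_aggregate(fit)

str(out)  # a plain list; the fitted object and its data are not included
```

Study authors could then save `out` (for example with `saveRDS()`) and send that file instead of the fitted model object, since a glm object itself stores a copy of the data.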
Dear Corentin, you may also be interested in the DATAShield project, see their website https://www.datashield.ac.uk/ Best, Gerta Am 08.08.2020 um 12:21 schrieb Michael Dewey:
Dear Corentin I have not investigated this in detail but there are two packages on CRAN you might want to look at. https://cran.r-project.org/package=multinma claims to be able to integrate IPD and aggregate data in one analysis https://cran.r-project.org/package=metagam which claims to be able to get the other researchers to run the analysis and share it with you when they are not allowed to share the data. As I say I have not looked at any of them in detail so this may be wide of the mark but worth a brief look. Michael On 08/08/2020 09:14, GOSLING Corentin wrote:
Dear Pr Viechtbauer,

Thank you very much for your answer!

1) Sorry about the family argument; I accidentally deleted it when I copied and pasted the code.

2) I was not aware of this way of specifying the model in the glm function. I have compared it with the initial solution using emmeans, and it gives similar results. Since your solution is definitely more elegant, we are going to use it.

3) Great, it is very reassuring to have your confirmation! We had the feeling that this was feasible, but we were afraid that we had missed a reason that would prevent us from applying our approach to a patient-level moderator.

Thank you so much for your help!

Best,
Corentin J Gosling

On Fri, 7 Aug 2020 at 20:34, Viechtbauer, Wolfgang (SP) <wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:
Dear Corentin,
Overall, your approach seems sound. But a few notes:
1) study1<-glm(DV~IV, data=datastudy1) is not logistic regression (without a family argument, glm() defaults to a Gaussian model, i.e., ordinary linear regression). You need:

study1 <- glm(DV ~ IV, data=datastudy1, family=binomial)
2) I've only played around with emmeans a little bit, so can't comment on that part. But I don't think you even need it. You can just fit the model in such a way that you directly get the three log odds ratios for the three levels of VM. In fact, the estimates of the three log odds ratios are independent, so one could even just fit three simple logistic regression models that will give you the same results. An example:
dat <- data.frame(DV = c(1,0,0,1,0,1,1,1,0,1,1,1),
                  IV = c(1,3,2,3,5,3,7,7,4,9,6,3),
                  VM = rep(c("a","b","c"), each=4))

# parameterize logistic regression model so we get the three log odds ratios directly
res <- glm(DV ~ VM + IV:VM - 1, data=dat, family=binomial)
summary(res)

# the covariance between the three estimates is 0
round(vcov(res), 5)

# show that the simple logistic regression model for a subset gives the same results
res.a <- glm(DV ~ IV, data=dat, family=binomial, subset=VM=="a")
summary(res.a)
So actually the V matrix corresponding to the three log odds ratios is diagonal. But you still would want to account for potential dependency in the underlying true log odds ratios, so the model

model <- rma.mv(yi, V, mods = ~ VM-1, random=~VM|study, struct="UN", data=dat)

is still appropriate (with V being diagonal, so you can also just pass a vector with the sampling variances to rma.mv).
3) The statement that 2-stage approaches cannot be used to analyze patient-level moderators isn't quite true. If one actually analyzes the patient-level moderator in stage 1 (as you describe), then the 2-stage approach definitely allows you to examine such a patient-level moderator.
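
[Editorial sketch, not part of the original reply.] Stage 1 for a single study could look roughly as follows. The data are simulated, and the object names (datastudy1, true_slope, L) and all numeric values are assumptions made for illustration only. The contrast matrix L is a base-R stand-in for the emmeans::emtrends step of the original plan, so the example runs without extra packages; it is used to check that the interaction model and the direct parameterization give the same three log odds ratios and the same sampling variances.

```r
# Simulated (hypothetical) data for one primary study; nothing here comes
# from a real cohort.
set.seed(42)
n  <- 600
VM <- factor(sample(c("a", "b", "c"), n, replace = TRUE))
IV <- sample(1:12, n, replace = TRUE)
true_slope <- c(a = 0.10, b = 0.20, c = 0.30)   # assumed log odds ratios per unit of IV
DV <- rbinom(n, 1, plogis(-1.5 + unname(true_slope[as.character(VM)]) * IV))
datastudy1 <- data.frame(DV, IV, VM)

# Direct parameterization: one intercept and one IV slope per level of VM
fit <- glm(DV ~ VM + IV:VM - 1, data = datastudy1, family = binomial)
is_slope <- grepl("IV", names(coef(fit)))       # pick the three slope terms
yi <- coef(fit)[is_slope]                       # log odds ratios for a, b, c
V  <- vcov(fit)[is_slope, is_slope]             # their covariance matrix (diagonal)

# Interaction model: the same slopes as linear combinations of coefficients
fit_int <- glm(DV ~ IV * VM, data = datastudy1, family = binomial)
b <- coef(fit_int)
L <- matrix(0, nrow = 3, ncol = length(b),
            dimnames = list(levels(datastudy1$VM), names(b)))
L[, "IV"] <- 1            # slope in reference level a
L["b", "IV:VMb"] <- 1     # slope in b = IV + IV:VMb
L["c", "IV:VMc"] <- 1     # slope in c = IV + IV:VMc
yi_int <- drop(L %*% b)
V_int  <- L %*% vcov(fit_int) %*% t(L)

# Compare the two routes
print(round(cbind(direct = unname(yi), interaction = unname(yi_int)), 4))
print(round(V, 6))
print(round(V_int, 6))
```

Only yi and the diagonal of V would need to be sent back by each study team; the fitted glm object itself stores the individual-level data and should not be shared.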
Best,
Wolfgang
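
[Editorial sketch, not part of the original reply.] Stage 2 could be assembled along these lines. The effect sizes and sampling variances below are randomly generated placeholders rather than results from any study, and the number of studies (k), the assumed means (mu) and the variable names are hypothetical. The rma.mv() call is the one discussed in the thread, with a vector of sampling variances passed in place of a V matrix; it is only run if the metafor package is installed, and a convergence failure is reported rather than hidden.

```r
set.seed(1)
k  <- 15                                      # hypothetical number of studies
mu <- c(a = 0.10, b = 0.20, c = 0.30)         # assumed average log odds ratios

# Long format: three rows (VM = a, b, c) per study
dat <- expand.grid(VM = c("a", "b", "c"), study = seq_len(k),
                   stringsAsFactors = FALSE)
dat$vi <- runif(nrow(dat), 0.002, 0.010)      # made-up sampling variances
u <- rnorm(k, sd = 0.05)                      # shared study effect (correlated true effects)
dat$yi <- unname(mu[dat$VM]) + u[dat$study] +
  rnorm(nrow(dat), sd = 0.03) +               # level-specific heterogeneity
  rnorm(nrow(dat), sd = sqrt(dat$vi))         # sampling error

if (requireNamespace("metafor", quietly = TRUE)) {
  # No intercept, so the three coefficients are the average effects per VM level
  model <- tryCatch(
    metafor::rma.mv(yi, vi, mods = ~ VM - 1, random = ~ VM | study,
                    struct = "UN", data = dat),
    error = function(e) {
      message("struct = \"UN\" failed: ", conditionMessage(e))
      NULL
    }
  )
  if (!is.null(model)) print(model)
}
```

If the unstructured model does not converge with the real data, a simpler structure such as struct="HCS" or struct="CS" is one possible fallback, at the cost of stronger assumptions about the correlations between the true effects.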
-----Original Message-----
From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces at r-project.org] On Behalf Of GOSLING Corentin
Sent: Friday, 07 August, 2020 20:03
To: r-sig-meta-analysis at r-project.org
Subject: [R-meta] Moderation analysis in IPD meta-analysis
_______________________________________________ R-sig-meta-analysis mailing list R-sig-meta-analysis at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis