Dear all,
I noticed that the results of my rma.mv() model strongly changes when the
dataframe rows are arranged in a different order. This is new to me and I
can?t really understand why. I thought it was an issue related to how the V
matrix get computed (?), but they are actually identical independently from
rows order in the dataframe? how is a dataframe supposed to be arranged
before you can trust rma.mv() results then? Thanks.
EXAMPLE:
calc.v <- function(x) {
v <- matrix((x$sdC_imputed[1]^2 / (x$nC[1] * x$C[1]^2)) , nrow=nrow(x),
ncol=nrow(x))
diag(v) <- x$vi_Bracken
v
}
random = ~ 1 | study/ID
If I run the following:
a<-df%>%filter(TRAIT=="LA")
dat<-a%>%mutate(sdC_imputed=sdC)%>%ungroup()
dat<-metagear::impute_SD(dat, "sdC_imputed", "C", method = "Bracken1992")
V <- metafor::bldiag(lapply(split(dat, dat$ctrl_id), calc.v))
m<-rma.mv(yi~dTf,V,data=dat,random = random)
m
Multivariate Meta-Analysis Model (k = 34; method: REML)
Variance Components:
estim sqrt nlvls fixed factor
sigma^2.1 0.0000 0.0000 7 no study
sigma^2.2 0.0607 0.2464 34 no study/ID
Test for Residual Heterogeneity:
QE(df = 32) = 1024.8720, p-val < .0001
Test of Moderators (coefficient 2):
QM(df = 1) = 1.6018, p-val = 0.2056
Model Results:
estimate se zval pval ci.lb ci.ub
intrcpt -0.0786 0.0538 -1.4617 0.1438 -0.1841 0.0268
dTf -0.0233 0.0184 -1.2656 0.2056 -0.0593 0.0128
If I run the same code but arrange the dataframe by e.g. the ID of the
observation via dplyr::arrange(),
a<-df%>%filter(TRAIT=="LA")%>%arrange(ID)
Then I get the following output? i.e. a completely different result:
Multivariate Meta-Analysis Model (k = 34; method: REML)
Variance Components:
estim sqrt nlvls fixed factor
sigma^2.1 0.0000 0.0000 7 no study
sigma^2.2 0.0336 0.1834 34 no study/ID
Test for Residual Heterogeneity:
QE(df = 32) = 334.3701, p-val < .0001
Test of Moderators (coefficient 2):
QM(df = 1) = 13.9681, p-val = 0.0002
Model Results:
estimate se zval pval ci.lb ci.ub
intrcpt -0.0740 0.0416 -1.7808 0.0749 -0.1554 0.0074 .
dTf -0.0668 0.0179 -3.7374 0.0002 -0.1019 -0.0318 ***
[R-meta] Do the results of rma.mv() depend on how dataframe rows are ordered/arranged?
5 messages · Michael Dewey, Gabriele Midolo, James Pustejovsky
Dear Gabriele In the code you supply it appears you re-ordered the data but not the V matrix. Is that the issue? If not can you post a reproducible example with a minimal data-set and, ideally, using just plain R to manipulate the data-set so we can see more clearly what the code does. Michael
On 22/08/2019 10:29, Gabriele Midolo wrote:
Dear all,
I noticed that the results of my rma.mv() model strongly changes when the
dataframe rows are arranged in a different order. This is new to me and I
can?t really understand why. I thought it was an issue related to how the V
matrix get computed (?), but they are actually identical independently from
rows order in the dataframe? how is a dataframe supposed to be arranged
before you can trust rma.mv() results then? Thanks.
EXAMPLE:
calc.v <- function(x) {
v <- matrix((x$sdC_imputed[1]^2 / (x$nC[1] * x$C[1]^2)) , nrow=nrow(x),
ncol=nrow(x))
diag(v) <- x$vi_Bracken
v
}
random = ~ 1 | study/ID
If I run the following:
a<-df%>%filter(TRAIT=="LA")
dat<-a%>%mutate(sdC_imputed=sdC)%>%ungroup()
dat<-metagear::impute_SD(dat, "sdC_imputed", "C", method = "Bracken1992")
V <- metafor::bldiag(lapply(split(dat, dat$ctrl_id), calc.v))
m<-rma.mv(yi~dTf,V,data=dat,random = random)
m
Multivariate Meta-Analysis Model (k = 34; method: REML)
Variance Components:
estim sqrt nlvls fixed factor
sigma^2.1 0.0000 0.0000 7 no study
sigma^2.2 0.0607 0.2464 34 no study/ID
Test for Residual Heterogeneity:
QE(df = 32) = 1024.8720, p-val < .0001
Test of Moderators (coefficient 2):
QM(df = 1) = 1.6018, p-val = 0.2056
Model Results:
estimate se zval pval ci.lb ci.ub
intrcpt -0.0786 0.0538 -1.4617 0.1438 -0.1841 0.0268
dTf -0.0233 0.0184 -1.2656 0.2056 -0.0593 0.0128
If I run the same code but arrange the dataframe by e.g. the ID of the
observation via dplyr::arrange(),
a<-df%>%filter(TRAIT=="LA")%>%arrange(ID)
Then I get the following output? i.e. a completely different result:
Multivariate Meta-Analysis Model (k = 34; method: REML)
Variance Components:
estim sqrt nlvls fixed factor
sigma^2.1 0.0000 0.0000 7 no study
sigma^2.2 0.0336 0.1834 34 no study/ID
Test for Residual Heterogeneity:
QE(df = 32) = 334.3701, p-val < .0001
Test of Moderators (coefficient 2):
QM(df = 1) = 13.9681, p-val = 0.0002
Model Results:
estimate se zval pval ci.lb ci.ub
intrcpt -0.0740 0.0416 -1.7808 0.0749 -0.1554 0.0074 .
dTf -0.0668 0.0179 -3.7374 0.0002 -0.1019 -0.0318 ***
[[alternative HTML version deleted]]
_______________________________________________ R-sig-meta-analysis mailing list R-sig-meta-analysis at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis --- This email has been checked for viruses by AVG. https://www.avg.com
Dear Michael, That shouldn't be the issue, because I re-compute V after I sorted the rows and before I fitted the model (not showed in my previous email). Please find attached a minimal data-set and reproducible R script illustrating what is going on... Thank you very much, Gabri. PS: yi and vi_Bracken are the effect size and sampling variance of log-transformed response ratios obtained via escalc(), respectively.
On Thu, 22 Aug 2019 at 13:07, Michael Dewey <lists at dewey.myzen.co.uk> wrote:
Dear Gabriele In the code you supply it appears you re-ordered the data but not the V matrix. Is that the issue? If not can you post a reproducible example with a minimal data-set and, ideally, using just plain R to manipulate the data-set so we can see more clearly what the code does. Michael On 22/08/2019 10:29, Gabriele Midolo wrote:
Dear all, I noticed that the results of my rma.mv() model strongly changes when
the
dataframe rows are arranged in a different order. This is new to me and I can?t really understand why. I thought it was an issue related to how
the V
matrix get computed (?), but they are actually identical independently
from
rows order in the dataframe? how is a dataframe supposed to be arranged
before you can trust rma.mv() results then? Thanks.
EXAMPLE:
calc.v <- function(x) {
v <- matrix((x$sdC_imputed[1]^2 / (x$nC[1] * x$C[1]^2)) ,
nrow=nrow(x),
ncol=nrow(x))
diag(v) <- x$vi_Bracken
v
}
random = ~ 1 | study/ID
If I run the following:
a<-df%>%filter(TRAIT=="LA")
dat<-a%>%mutate(sdC_imputed=sdC)%>%ungroup()
dat<-metagear::impute_SD(dat, "sdC_imputed", "C", method = "Bracken1992")
V <- metafor::bldiag(lapply(split(dat, dat$ctrl_id), calc.v))
m<-rma.mv(yi~dTf,V,data=dat,random = random)
m
Multivariate Meta-Analysis Model (k = 34; method: REML)
Variance Components:
estim sqrt nlvls fixed factor
sigma^2.1 0.0000 0.0000 7 no study
sigma^2.2 0.0607 0.2464 34 no study/ID
Test for Residual Heterogeneity:
QE(df = 32) = 1024.8720, p-val < .0001
Test of Moderators (coefficient 2):
QM(df = 1) = 1.6018, p-val = 0.2056
Model Results:
estimate se zval pval ci.lb ci.ub
intrcpt -0.0786 0.0538 -1.4617 0.1438 -0.1841 0.0268
dTf -0.0233 0.0184 -1.2656 0.2056 -0.0593 0.0128
If I run the same code but arrange the dataframe by e.g. the ID of the
observation via dplyr::arrange(),
a<-df%>%filter(TRAIT=="LA")%>%arrange(ID)
Then I get the following output? i.e. a completely different result:
Multivariate Meta-Analysis Model (k = 34; method: REML)
Variance Components:
estim sqrt nlvls fixed factor
sigma^2.1 0.0000 0.0000 7 no study
sigma^2.2 0.0336 0.1834 34 no study/ID
Test for Residual Heterogeneity:
QE(df = 32) = 334.3701, p-val < .0001
Test of Moderators (coefficient 2):
QM(df = 1) = 13.9681, p-val = 0.0002
Model Results:
estimate se zval pval ci.lb ci.ub
intrcpt -0.0740 0.0416 -1.7808 0.0749 -0.1554 0.0074 .
dTf -0.0668 0.0179 -3.7374 0.0002 -0.1019 -0.0318 ***
[[alternative HTML version deleted]]
_______________________________________________ R-sig-meta-analysis mailing list R-sig-meta-analysis at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis --- This email has been checked for viruses by AVG. https://www.avg.com
-- Michael http://www.dewey.myzen.co.uk/home.html
-------------- next part -------------- An HTML attachment was scrubbed... URL: <https://stat.ethz.ch/pipermail/r-sig-meta-analysis/attachments/20190822/cc95ecbf/attachment-0001.html> -------------- next part -------------- A non-text attachment was scrubbed... Name: dat_LA.csv Type: application/vnd.ms-excel Size: 4309 bytes Desc: not available URL: <https://stat.ethz.ch/pipermail/r-sig-meta-analysis/attachments/20190822/cc95ecbf/attachment-0001.xlb> -------------- next part -------------- A non-text attachment was scrubbed... Name: rma_problem_GM.R Type: application/octet-stream Size: 1020 bytes Desc: not available URL: <https://stat.ethz.ch/pipermail/r-sig-meta-analysis/attachments/20190822/cc95ecbf/attachment-0001.obj>
Gabriele, The problem is that the variance matrix gets re-ordered when you do split() %>% lapply() %>% bldiag(). Try running the following to see the issue: # data frame, not sorted by ID dat <- data.frame(ID = rep(LETTERS[1:3], 4), v = 1:12) split(dat$v, dat$ID) bldiag(lapply(split(dat$v, dat$ID), diag, nrow = 4)) To use the split() %>% lapply() %>% bldiag() technique and have the matrix come out so that the entries of the V matrix correspond to the rows of the data, the data needs to be sorted by the same ID variable that is used in split(). James On Thu, Aug 22, 2019 at 7:30 AM Gabriele Midolo <gabriele.midolo at gmail.com> wrote:
Dear Michael, That shouldn't be the issue, because I re-compute V after I sorted the rows and before I fitted the model (not showed in my previous email). Please find attached a minimal data-set and reproducible R script illustrating what is going on... Thank you very much, Gabri. PS: yi and vi_Bracken are the effect size and sampling variance of log-transformed response ratios obtained via escalc(), respectively. On Thu, 22 Aug 2019 at 13:07, Michael Dewey <lists at dewey.myzen.co.uk> wrote:
Dear Gabriele In the code you supply it appears you re-ordered the data but not the V matrix. Is that the issue? If not can you post a reproducible example with a minimal data-set and, ideally, using just plain R to manipulate the data-set so we can see more clearly what the code does. Michael On 22/08/2019 10:29, Gabriele Midolo wrote:
Dear all, I noticed that the results of my rma.mv() model strongly changes when
the
dataframe rows are arranged in a different order. This is new to me and
I
can?t really understand why. I thought it was an issue related to how
the V
matrix get computed (?), but they are actually identical independently
from
rows order in the dataframe? how is a dataframe supposed to be arranged
before you can trust rma.mv() results then? Thanks.
EXAMPLE:
calc.v <- function(x) {
v <- matrix((x$sdC_imputed[1]^2 / (x$nC[1] * x$C[1]^2)) ,
nrow=nrow(x),
ncol=nrow(x)) diag(v) <- x$vi_Bracken v } random = ~ 1 | study/ID If I run the following: a<-df%>%filter(TRAIT=="LA") dat<-a%>%mutate(sdC_imputed=sdC)%>%ungroup() dat<-metagear::impute_SD(dat, "sdC_imputed", "C", method =
"Bracken1992")
V <- metafor::bldiag(lapply(split(dat, dat$ctrl_id), calc.v))
m<-rma.mv(yi~dTf,V,data=dat,random = random)
m
Multivariate Meta-Analysis Model (k = 34; method: REML)
Variance Components:
estim sqrt nlvls fixed factor
sigma^2.1 0.0000 0.0000 7 no study
sigma^2.2 0.0607 0.2464 34 no study/ID
Test for Residual Heterogeneity:
QE(df = 32) = 1024.8720, p-val < .0001
Test of Moderators (coefficient 2):
QM(df = 1) = 1.6018, p-val = 0.2056
Model Results:
estimate se zval pval ci.lb ci.ub
intrcpt -0.0786 0.0538 -1.4617 0.1438 -0.1841 0.0268
dTf -0.0233 0.0184 -1.2656 0.2056 -0.0593 0.0128
If I run the same code but arrange the dataframe by e.g. the ID of the
observation via dplyr::arrange(),
a<-df%>%filter(TRAIT=="LA")%>%arrange(ID)
Then I get the following output? i.e. a completely different result:
Multivariate Meta-Analysis Model (k = 34; method: REML)
Variance Components:
estim sqrt nlvls fixed factor
sigma^2.1 0.0000 0.0000 7 no study
sigma^2.2 0.0336 0.1834 34 no study/ID
Test for Residual Heterogeneity:
QE(df = 32) = 334.3701, p-val < .0001
Test of Moderators (coefficient 2):
QM(df = 1) = 13.9681, p-val = 0.0002
Model Results:
estimate se zval pval ci.lb ci.ub
intrcpt -0.0740 0.0416 -1.7808 0.0749 -0.1554 0.0074 .
dTf -0.0668 0.0179 -3.7374 0.0002 -0.1019 -0.0318 ***
[[alternative HTML version deleted]]
_______________________________________________ R-sig-meta-analysis mailing list R-sig-meta-analysis at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis --- This email has been checked for viruses by AVG. https://www.avg.com
-- Michael http://www.dewey.myzen.co.uk/home.html
_______________________________________________ R-sig-meta-analysis mailing list R-sig-meta-analysis at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
Dear James, So, concerning the code I have put above, one simply need to sort the ctrl_id with ascending order before computing V? Something like: data1<-data1%>%arrange(ctrl_id) # ...using dplyr That way the dataframe should be ordered in the same way as split(data1$vi_Bracken,data1$ctrl_id)...? Thanks for this hint btw - very appreciated. Gabri.
On Thu, 22 Aug 2019 at 15:58, James Pustejovsky <jepusto at gmail.com> wrote:
Gabriele, The problem is that the variance matrix gets re-ordered when you do split() %>% lapply() %>% bldiag(). Try running the following to see the issue: # data frame, not sorted by ID dat <- data.frame(ID = rep(LETTERS[1:3], 4), v = 1:12) split(dat$v, dat$ID) bldiag(lapply(split(dat$v, dat$ID), diag, nrow = 4)) To use the split() %>% lapply() %>% bldiag() technique and have the matrix come out so that the entries of the V matrix correspond to the rows of the data, the data needs to be sorted by the same ID variable that is used in split(). James On Thu, Aug 22, 2019 at 7:30 AM Gabriele Midolo <gabriele.midolo at gmail.com> wrote:
Dear Michael, That shouldn't be the issue, because I re-compute V after I sorted the rows and before I fitted the model (not showed in my previous email). Please find attached a minimal data-set and reproducible R script illustrating what is going on... Thank you very much, Gabri. PS: yi and vi_Bracken are the effect size and sampling variance of log-transformed response ratios obtained via escalc(), respectively. On Thu, 22 Aug 2019 at 13:07, Michael Dewey <lists at dewey.myzen.co.uk> wrote:
Dear Gabriele In the code you supply it appears you re-ordered the data but not the V matrix. Is that the issue? If not can you post a reproducible example with a minimal data-set and, ideally, using just plain R to manipulate the data-set so we can see more clearly what the code does. Michael On 22/08/2019 10:29, Gabriele Midolo wrote:
Dear all, I noticed that the results of my rma.mv() model strongly changes when
the
dataframe rows are arranged in a different order. This is new to me
and I
can?t really understand why. I thought it was an issue related to how
the V
matrix get computed (?), but they are actually identical independently
from
rows order in the dataframe? how is a dataframe supposed to be arranged
before you can trust rma.mv() results then? Thanks.
EXAMPLE:
calc.v <- function(x) {
v <- matrix((x$sdC_imputed[1]^2 / (x$nC[1] * x$C[1]^2)) ,
nrow=nrow(x),
ncol=nrow(x)) diag(v) <- x$vi_Bracken v } random = ~ 1 | study/ID If I run the following: a<-df%>%filter(TRAIT=="LA") dat<-a%>%mutate(sdC_imputed=sdC)%>%ungroup() dat<-metagear::impute_SD(dat, "sdC_imputed", "C", method =
"Bracken1992")
V <- metafor::bldiag(lapply(split(dat, dat$ctrl_id), calc.v))
m<-rma.mv(yi~dTf,V,data=dat,random = random)
m
Multivariate Meta-Analysis Model (k = 34; method: REML)
Variance Components:
estim sqrt nlvls fixed factor
sigma^2.1 0.0000 0.0000 7 no study
sigma^2.2 0.0607 0.2464 34 no study/ID
Test for Residual Heterogeneity:
QE(df = 32) = 1024.8720, p-val < .0001
Test of Moderators (coefficient 2):
QM(df = 1) = 1.6018, p-val = 0.2056
Model Results:
estimate se zval pval ci.lb ci.ub
intrcpt -0.0786 0.0538 -1.4617 0.1438 -0.1841 0.0268
dTf -0.0233 0.0184 -1.2656 0.2056 -0.0593 0.0128
If I run the same code but arrange the dataframe by e.g. the ID of the
observation via dplyr::arrange(),
a<-df%>%filter(TRAIT=="LA")%>%arrange(ID)
Then I get the following output? i.e. a completely different result:
Multivariate Meta-Analysis Model (k = 34; method: REML)
Variance Components:
estim sqrt nlvls fixed factor
sigma^2.1 0.0000 0.0000 7 no study
sigma^2.2 0.0336 0.1834 34 no study/ID
Test for Residual Heterogeneity:
QE(df = 32) = 334.3701, p-val < .0001
Test of Moderators (coefficient 2):
QM(df = 1) = 13.9681, p-val = 0.0002
Model Results:
estimate se zval pval ci.lb ci.ub
intrcpt -0.0740 0.0416 -1.7808 0.0749 -0.1554 0.0074 .
dTf -0.0668 0.0179 -3.7374 0.0002 -0.1019 -0.0318 ***
[[alternative HTML version deleted]]
_______________________________________________ R-sig-meta-analysis mailing list R-sig-meta-analysis at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis --- This email has been checked for viruses by AVG. https://www.avg.com
-- Michael http://www.dewey.myzen.co.uk/home.html
_______________________________________________ R-sig-meta-analysis mailing list R-sig-meta-analysis at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis