Skip to content

[R-meta] Do the results of rma.mv() depend on how dataframe rows are ordered/arranged?

5 messages · Michael Dewey, Gabriele Midolo, James Pustejovsky

#
Dear all,

I noticed that the results of my rma.mv() model strongly changes when the
dataframe rows are arranged in a different order. This is new to me and I
can?t really understand why. I thought it was an issue related to how the V
matrix get computed (?), but they are actually identical independently from
rows order in the dataframe? how is a dataframe supposed to be arranged
before you can trust rma.mv() results then? Thanks.


EXAMPLE:

calc.v <- function(x) {

  v <- matrix((x$sdC_imputed[1]^2 / (x$nC[1] * x$C[1]^2)) , nrow=nrow(x),
ncol=nrow(x))

  diag(v) <- x$vi_Bracken

  v

}

random = ~ 1 | study/ID


If I run the following:



a<-df%>%filter(TRAIT=="LA")

dat<-a%>%mutate(sdC_imputed=sdC)%>%ungroup()

dat<-metagear::impute_SD(dat, "sdC_imputed", "C", method = "Bracken1992")

V <- metafor::bldiag(lapply(split(dat, dat$ctrl_id), calc.v))

m<-rma.mv(yi~dTf,V,data=dat,random = random)

m



Multivariate Meta-Analysis Model (k = 34; method: REML)



Variance Components:



            estim    sqrt  nlvls  fixed    factor

sigma^2.1  0.0000  0.0000      7     no     study

sigma^2.2  0.0607  0.2464     34     no  study/ID



Test for Residual Heterogeneity:

QE(df = 32) = 1024.8720, p-val < .0001



Test of Moderators (coefficient 2):

QM(df = 1) = 1.6018, p-val = 0.2056



Model Results:



         estimate      se     zval    pval    ci.lb   ci.ub

intrcpt   -0.0786  0.0538  -1.4617  0.1438  -0.1841  0.0268

dTf       -0.0233  0.0184  -1.2656  0.2056  -0.0593  0.0128



If I run the same code but arrange the dataframe by e.g. the ID of the
observation via dplyr::arrange(),

a<-df%>%filter(TRAIT=="LA")%>%arrange(ID)


Then I get the following output? i.e. a completely different result:

Multivariate Meta-Analysis Model (k = 34; method: REML)



Variance Components:



            estim    sqrt  nlvls  fixed    factor

sigma^2.1  0.0000  0.0000      7     no     study

sigma^2.2  0.0336  0.1834     34     no  study/ID



Test for Residual Heterogeneity:

QE(df = 32) = 334.3701, p-val < .0001



Test of Moderators (coefficient 2):

QM(df = 1) = 13.9681, p-val = 0.0002



Model Results:



         estimate      se     zval    pval    ci.lb    ci.ub

intrcpt   -0.0740  0.0416  -1.7808  0.0749  -0.1554   0.0074    .

dTf       -0.0668  0.0179  -3.7374  0.0002  -0.1019  -0.0318  ***
#
Dear Gabriele

In the code you supply it appears you re-ordered the data but not the V 
matrix. Is that the issue? If not can you post a reproducible example 
with a minimal data-set and, ideally, using just plain R to manipulate 
the data-set so we can see more clearly what the code does.

Michael
On 22/08/2019 10:29, Gabriele Midolo wrote:

  
    
  
#
Dear Michael,

That shouldn't be the issue, because I re-compute V after I sorted the rows
and before I fitted the model (not showed in my previous email). Please
find attached a minimal data-set and reproducible R script illustrating
what is going on...

Thank you very much,
Gabri.

PS: yi and vi_Bracken are the effect size and sampling variance of
log-transformed response ratios obtained via escalc(), respectively.
On Thu, 22 Aug 2019 at 13:07, Michael Dewey <lists at dewey.myzen.co.uk> wrote:

            
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://stat.ethz.ch/pipermail/r-sig-meta-analysis/attachments/20190822/cc95ecbf/attachment-0001.html>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: dat_LA.csv
Type: application/vnd.ms-excel
Size: 4309 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-meta-analysis/attachments/20190822/cc95ecbf/attachment-0001.xlb>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: rma_problem_GM.R
Type: application/octet-stream
Size: 1020 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-meta-analysis/attachments/20190822/cc95ecbf/attachment-0001.obj>
#
Gabriele,

The problem is that the variance matrix gets re-ordered when you do split()
%>% lapply() %>% bldiag(). Try running the following to see the issue:

# data frame, not sorted by ID
dat <- data.frame(ID = rep(LETTERS[1:3], 4), v = 1:12)
split(dat$v, dat$ID)
bldiag(lapply(split(dat$v, dat$ID), diag, nrow = 4))

To use the split() %>% lapply() %>% bldiag() technique and have the matrix
come out so that the entries of the V matrix correspond to the rows of the
data, the data needs to be sorted by the same ID variable that is used in
split().

James

On Thu, Aug 22, 2019 at 7:30 AM Gabriele Midolo <gabriele.midolo at gmail.com>
wrote:

  
  
#
Dear James,

So, concerning the code I have put above, one simply need to sort the
ctrl_id with ascending order before computing V?
Something like:
data1<-data1%>%arrange(ctrl_id) # ...using dplyr
That way the dataframe should be ordered in the same way as
split(data1$vi_Bracken,data1$ctrl_id)...?

Thanks for this hint btw - very appreciated.

Gabri.
On Thu, 22 Aug 2019 at 15:58, James Pustejovsky <jepusto at gmail.com> wrote: