Hi all,

I am fitting a multivariate multilevel meta-analysis in metafor and am having trouble computing outlier and influential case diagnostics (i.e., Cook's distances, per https://wviechtb.github.io/metafor/reference/influence.rma.mv.html). This is a large dataset of 3360 Pearson's correlations (converted to Fisher's z) nested within 600 subsamples, which are nested within 311 studies.

Below is the code I used for the model and for computing Cook's distances. The problem is that it takes a very long time to run (I ran it overnight and it only reached 6%). I assume this is related to the size of the dataset and to the complex model structure, but I am not sure how to speed up the processing. I should note that I am computing the distances based on the simplest possible model (i.e., no moderators and without considering dependencies among effect sizes within clusters).

I was hoping someone could suggest how best to move forward.

Thanks,
Yogev

NoMods <- rma.mv(yi, vi, random = ~ 1 | StudyID/GroupID/EffectSizeID, data=Data, sparse=TRUE)
summary(NoMods)
NoModsCooksDistance <- cooks.distance(NoMods, progbar = T, cluster = StudyID, reestimate=FALSE, parallel="snow")
NoModsCooksDistance
plot(NoModsCooksDistance, type="o", pch=19)

--
Yogev Kivity, Ph.D.
Postdoctoral Fellow
Department of Psychology
The Pennsylvania State University
Bruce V. Moore Building
University Park, PA 16802
Office Phone: (814) 867-2330
[R-meta] Influential case diagnostics in a multivariate multilevel meta-analysis in metafor
2 messages · Yogev Kivity, Wolfgang Viechtbauer
Dear Yogev,

Since you use 'cluster=StudyID', cooks.distance() is doing 311 model fits. But you use 'reestimate=FALSE', which should speed things up a lot. Also, 'sparse=TRUE' probably makes a lot of sense here, since the marginal var-cov structure is probably quite sparse. So, for the most part, you are already using features that should help to speed things up.

But a few things:

1) You used 'cluster = StudyID', but unless you used attach(Data) or have 'StudyID' as a separate object in your workspace, this should not work. It should be 'cluster = Data$StudyID'.

2) If you use 'parallel="snow"', then no progress bar will be shown, so I wonder how you got the '6%' then. Or did you run this once without 'parallel="snow"'?

3) If you use 'parallel="snow"', then this won't give you any speed increase unless you actually make use of multiple cores. You can do this with the 'ncpus' argument. But first check how many cores you actually have available with

parallel::detectCores()

Note that this also counts 'logical' cores. If you are on macOS or Windows, then detectCores(logical=FALSE) is a better indicator of how many cores to specify under 'ncpus'.

Best,
Wolfgang
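For illustration, the three points above could be combined roughly as follows. This is only a minimal sketch, not code from the thread: the toy 'Data' is simulated (20 studies, 2 subsamples each, 3 effect sizes each) so that the example runs on its own, and the cap of 2 cores is an arbitrary choice for the toy data. With the real dataset one would skip the simulation step and pick 'ncpus' from the detectCores() result.

```r
library(metafor)

set.seed(1)

# Hypothetical toy data standing in for 'Data' (NOT the poster's dataset)
Data <- expand.grid(EffectSizeID = 1:3, GroupID = 1:2, StudyID = 1:20)
n    <- nrow(Data)
grp  <- interaction(Data$StudyID, Data$GroupID)   # one level per subsample
Data$vi <- runif(n, 0.01, 0.05)                   # sampling variances
Data$yi <- 0.3 +
  rnorm(20, sd = 0.2)[Data$StudyID] +             # study-level effects
  rnorm(nlevels(grp), sd = 0.1)[as.integer(grp)] + # subsample-level effects
  rnorm(n, sd = 0.1) +                            # effect-size-level effects
  rnorm(n, sd = sqrt(Data$vi))                    # sampling error

# Same model structure as in the original post
NoMods <- rma.mv(yi, vi, random = ~ 1 | StudyID/GroupID/EffectSizeID,
                 data = Data, sparse = TRUE)

# Point 3: count physical cores; detectCores() can return NA
ncores <- parallel::detectCores(logical = FALSE)
if (is.na(ncores)) ncores <- 1
ncores <- max(1, min(ncores, 2))   # arbitrary cap for this toy example

# Point 1: pass the cluster variable explicitly as Data$StudyID
# Point 2: no 'progbar' here, since it is not shown with parallel="snow"
NoModsCooksDistance <- cooks.distance(
  NoMods,
  cluster    = Data$StudyID,
  reestimate = FALSE,
  parallel   = if (ncores > 1) "snow" else "no",
  ncpus      = ncores
)

plot(NoModsCooksDistance, type = "o", pch = 19)
```

The result contains one Cook's distance per study (20 in the toy data, 311 in the original dataset), since the deletion is done per cluster.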
-----Original Message-----
From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces at r-project.org] On Behalf Of Yogev Kivity
Sent: Tuesday, 15 January, 2019 21:20
To: r-sig-meta-analysis at r-project.org
Subject: [R-meta] Influential case diagnostics in a multivariate multilevel meta-analysis in metafor