[R-meta] Importing Correlations from PDF to table format

Thu, Mar 3, 2022 12:46 AM

Hi Kiet,

Thanks for providing this feedback -- this is likely to be quite useful for others looking into automating this process (including me)!

Best,
Wolfgang

-----Original Message-----
From: Kiet Huynh [mailto:kietduchuynh at gmail.com]
Sent: Wednesday, 02 March, 2022 4:23
To: Viechtbauer, Wolfgang (SP)
Cc: James Pustejovsky; R meta
Subject: Re: [R-meta] Importing Correlations from PDF to table format

Hi Wolfgang,

Thank you for your recommendation. Using both the tabulizer package and the rcalc()
function has done exactly what I was hoping for.

I found the tabulizer package to be much more accurate than the pdftools package.
The tabulizer package is mostly accurate, but sometimes it struggles with
correctly identifying negative numbers in the correlation table. So I still have
to do some data cleaning in R to fix incorrect values. Despite these issues, my
process for coding meta-analyses is much more efficient and accurate now.
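
The message above does not say why the negative values came out wrong. One common cause, offered here only as an assumption, is that PDFs often typeset the minus sign as a Unicode minus or dash rather than an ASCII hyphen, so as.numeric() returns NA for those cells. A minimal base R sketch of that clean-up step, using made-up values:

```r
# Hypothetical raw cells as they might come out of a PDF extraction step;
# the second and fourth values use non-ASCII dash characters.
raw <- c("0.31", "\u22120.12", "-0.45", "\u20130.08", " 1.00")

# Replace Unicode minus (U+2212), en dash (U+2013), and em dash (U+2014)
# with an ASCII hyphen-minus, one character at a time.
clean <- raw
for (dash in c("\u2212", "\u2013", "\u2014")) {
  clean <- gsub(dash, "-", clean, fixed = TRUE)
}

# Strip stray whitespace and convert to numeric.
r <- as.numeric(trimws(clean))

# Sanity check: every value should be a valid correlation.
stopifnot(!anyNA(r), all(abs(r) <= 1))
r
```

This only handles mis-encoded signs. A sign that the extraction dropped entirely cannot be recovered this way and still needs to be checked against the original table.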

Big thanks to you and James for your help!

Kiet

On Feb 28, 2022, at 11:04 AM, Viechtbauer, Wolfgang (SP)
<wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:

Hi Kiet,

The rcalc() function from metafor could be used for this. It even computes the
var-cov matrix of the elements in the correlation matrix for you:

library(metafor)
R <- matrix(c(1, .3, .5, .3, 1, .6, .5, .6, 1), 3, 3)
R
rcalc(R, ni=50)

Best,
Wolfgang

-----Original Message-----
From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces at r-project.org] On
Behalf Of Kiet Huynh
Sent: Monday, 28 February, 2022 19:45
To: James Pustejovsky
Cc: R meta
Subject: Re: [R-meta] Importing Correlations from PDF to table format

Hi James,

Thank you for recommending these helpful packages. I was able to import the pdf
correlation table into a dataframe format in R. Are you aware of any R code that
could convert that correlation matrix dataframe into a meta-analysis type
dataframe (i.e., a column for variable 1, a column for variable 2, and a column
for correlation effect size)?
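
For illustration, the reshaping described in this question can be sketched in base R without any extra packages. The matrix, the variable names x1 to x3, and the column name ri are all hypothetical; a data frame imported from a PDF would first need as.matrix() and matching row and column names.

```r
# Hypothetical 3 x 3 correlation matrix with made-up variable names.
vars <- c("x1", "x2", "x3")
R <- matrix(c(1,  .3, .5,
              .3, 1,  .6,
              .5, .6, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(vars, vars))

# Keep each pair once by selecting the lower triangle (diagonal excluded).
idx <- which(lower.tri(R), arr.ind = TRUE)

# One row per pair: the two variable names and their correlation.
long <- data.frame(var1 = colnames(R)[idx[, "col"]],
                   var2 = rownames(R)[idx[, "row"]],
                   ri   = R[idx],
                   stringsAsFactors = FALSE)
long
```

This gives only the pairwise correlations. It does not compute sampling variances or the covariances between correlations from the same sample, which a multivariate meta-analysis would also need.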

Best,

Kiet

On Feb 25, 2022, at 10:58 AM, James Pustejovsky <jepusto at gmail.com> wrote:

The pdftools package might be helpful:
https://github.com/ropensci/pdftools
It has very low-level utilities for extracting text from pdf. You'd still have
to do some data clean-up to get the correlations into the form needed for
analysis.

The tabulizer package is meant to provide tools customized for working with pdf
tables:

https://github.com/ropensci/tabulizer
But it requires Java and it appears to be archived on CRAN. I'm not sure what
its development status is. Caveat emptor, I guess.

James
