
[R-meta] Importing Correlations from PDF to table format

4 messages · Kiet Huynh, James Pustejovsky, Wolfgang Viechtbauer

#
Hello,

I was wondering if anyone knows of a way to automate, in R or any other software, the process of importing correlation values from a PDF into a table format that can be used in a meta-analysis. My process has been to copy the correlations manually, one by one, from the PDF to Excel (which takes a lifetime!) and then import the Excel data into R. I'm sure there must be a better, faster, and less error-prone way to do this.

Thank you,

Kiet

----

Kiet D. Huynh, Ph.D.
Pronouns: he/him
CLEAR Goldblum-Carr Postdoctoral Fellow
Palo Alto University
1791 Arastradero Rd.
Palo Alto, CA 94304
#
The pdftools package might be helpful:
https://github.com/ropensci/pdftools
It has very low-level utilities for extracting text from PDF files. You'd still
have to do some data clean-up to get the correlations into the form needed
for analysis.
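
As a rough illustration of that clean-up step, here is a minimal sketch in base R. It assumes the page text has already been pulled out with pdftools::pdf_text() and that the table is a simple lower-triangular layout; the sample text, the variable names, and the file name in the comment are all invented for illustration, and real tables will usually need more careful handling.

```r
# Hypothetical page text, standing in for pdftools::pdf_text("paper.pdf")[1].
# The layout and the variable names are assumptions made for this sketch.
page <- "Variable     1     2     3
1. Anxiety    -
2. Stress    .45    -
3. Sleep    -.30  -.25    -"

lines  <- strsplit(page, "\n")[[1]][-1]   # split into rows, drop the header
labels <- sub("^\\s*[0-9]+\\.\\s+([A-Za-z]+).*$", "\\1", lines)

k <- length(lines)
R <- diag(k)                              # start with 1s on the diagonal
dimnames(R) <- list(labels, labels)

for (i in seq_len(k)) {
  # pull out tokens such as ".45" or "-.30" from row i
  hits <- regmatches(lines[i], gregexpr("-?\\.[0-9]+", lines[i]))[[1]]
  vals <- as.numeric(hits)
  if (length(vals) > 0) {
    R[i, seq_along(vals)] <- vals         # fill the lower triangle
    R[seq_along(vals), i] <- vals         # mirror into the upper triangle
  }
}
R
```

The pattern only matches values written without a leading zero (".45", not "0.45"), so it would need adjusting for tables that print correlations differently.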

The tabulizer package is meant to provide tools customized for working with
PDF tables:
https://github.com/ropensci/tabulizer
But it requires Java and it appears to be archived on CRAN. I'm not sure
what its development status is. Caveat emptor, I guess.

James
On Fri, Feb 25, 2022 at 12:20 PM Kiet Huynh <kietduchuynh at gmail.com> wrote:

            

  
  
2 days later
#
Hi James,

Thank you for recommending these helpful packages. I was able to import the PDF correlation table into a data frame in R. Are you aware of any R code that could convert that correlation-matrix data frame into a meta-analysis-style data frame (i.e., a column for variable 1, a column for variable 2, and a column for the correlation effect size)?

Best,

Kiet

  
  
#
Hi Kiet,

The rcalc() function from metafor could be used for this. It even computes the var-cov matrix of the elements in the correlation matrix for you:

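library(metafor)
# correlation matrix for 3 variables (must be symmetric)
R <- matrix(c(1, .3, .5, .3, 1, .6, .5, .6, 1), 3, 3)
R
# pairwise correlations plus their var-cov matrix, with n = 50
rcalc(R, ni=50)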
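
If only the three columns asked about are needed (variable 1, variable 2, correlation), the reshaping itself can also be sketched in base R. This is just an illustration, not what rcalc() does internally; the variable names x1 to x3 and the column names var1, var2, and ri are made up for the example.

```r
# Example correlation matrix with (hypothetical) variable names attached
vars <- c("x1", "x2", "x3")
R <- matrix(c( 1, .3, .5,
              .3,  1, .6,
              .5, .6,  1), 3, 3, dimnames = list(vars, vars))

# row/column positions of the elements below the diagonal
idx <- which(lower.tri(R), arr.ind = TRUE)

# one row per unique pair of variables
long <- data.frame(var1 = rownames(R)[idx[, "col"]],
                   var2 = rownames(R)[idx[, "row"]],
                   ri   = R[idx])
long
```

This gives one row per pair but no sampling variances or covariances, which is what rcalc() adds on top.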

Best,
Wolfgang