Changing many csv files using apply?

2 messages · Chang, Emily@OEHHA, R. Michael Weylandt

#
Dear all,

I have many csv files whose contents I want to change a bit en masse. So far, I've written code that can change them in a for loop, like so:

# Subset of files in the folder I want to change
subset = "somestring"
# Retrieve list of files to change
filelist=list.files()
filelist = filelist[grep(subset, filelist)]

for (i in seq_along(filelist)) {
    setwd(readdir)
    temp = read.csv(filelist[i], as.is = TRUE, strip.white = TRUE)
    # ... whatever I want to do to temp ...

    setwd(writedir)
    write.table(temp, file = filelist[i], sep = ",", col.names = NA)
}


It's a little slow though, so I would like to get rid of the for loop but preserve its function. Would it be possible to use sapply() or something similar? Any insight would be appreciated!

Best regards,
Emily
#
sapply() won't be notably faster.

What you might try (no guarantee) is wrapping things up as if you were
going to use sapply(), but instead use mclapply() from the parallel
package -- that will parallelize the work and should be faster by
roughly a factor of however many cores you use. The "no guarantee"
disclaimer comes from my not being able to guarantee that the I/O
parallelizes nicely, having never tried it myself.

(it probably is though, those R folks are pretty good at what they do)
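To make the suggestion concrete, here is one possible sketch of that approach. It is untested advice in the spirit of the reply above: the directory setup, the sample CSV, and the process_file() function are stand-ins for Emily's readdir, writedir, and "whatever I want to do to temp". Note that mc.cores > 1 only takes effect on Unix-alikes; on Windows, mclapply() falls back to serial execution and you would use parLapply() with a cluster instead.

```r
library(parallel)

# Stand-ins for Emily's readdir/writedir (hypothetical example data)
readdir  <- tempfile("read");  dir.create(readdir)
writedir <- tempfile("write"); dir.create(writedir)
write.csv(data.frame(x = 1:3),
          file.path(readdir, "somestring_a.csv"), row.names = FALSE)

# Same file selection as the original loop
filelist <- grep("somestring", list.files(readdir), value = TRUE)

# One iteration of the original loop, wrapped as a function;
# file.path() avoids the repeated setwd() calls
process_file <- function(f) {
    temp <- read.csv(file.path(readdir, f), as.is = TRUE, strip.white = TRUE)
    # ... whatever you want to do to temp ...
    write.table(temp, file = file.path(writedir, f),
                sep = ",", col.names = NA)
    f  # return the filename so you can see what got processed
}

# Drop-in replacement for the for loop; serial on Windows
done <- mclapply(filelist, process_file, mc.cores = 2)
```

Note that if the work per file is dominated by disk I/O rather than the transformation itself, extra cores may not buy much.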

Michael

On Mon, Jun 18, 2012 at 4:38 PM, Chang, Emily at OEHHA
<emily.chang at oehha.ca.gov> wrote: