Reading large files quickly; resolved
Rob Steele wrote:
I'm finding that readLines() and read.fwf() take nearly two hours to work through a 3.5 GB file, even when reading in large (100 MB) chunks. The Unix command wc, by contrast, processes the same file in three minutes. Is there a faster way to read files in R? Thanks!
readChar() is fast. I use strsplit(..., fixed = TRUE) to separate the input data into lines and then use substr() to separate the lines into fields. I do a little light processing and write the result back out with writeChar(). The whole thing takes thirty minutes where read.fwf() took nearly two hours just to read the data. Thanks for the help!
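For anyone who finds this thread later, here is a minimal sketch of the approach described above, not the actual script. The two-field layout (columns 1-5 and 6-10), the sample data, the doubling step, and every variable name are hypothetical and chosen only for illustration. It reads a small temporary file in a single readChar() call rather than in chunks.

```r
# Hypothetical fixed-width layout: columns 1-5 hold an id, columns 6-10 a number.
field_start <- c(1, 6)
field_end   <- c(5, 10)

# Create a small sample file so the sketch is self-contained.
in_path  <- tempfile(fileext = ".txt")
out_path <- tempfile(fileext = ".txt")
writeChar("A0001  1.5\nA0002 22.0\nA0003  3.7\n", in_path, eos = NULL)

# Read the whole file as one string in a single call.
n_bytes  <- file.size(in_path)
raw_text <- readChar(in_path, nchars = n_bytes, useBytes = TRUE)

# Split into lines on a literal newline; fixed = TRUE skips the regex engine.
lines <- strsplit(raw_text, "\n", fixed = TRUE)[[1]]

# substr() is vectorised, so each call extracts one field from every line.
ids    <- substr(lines, field_start[1], field_end[1])
values <- as.numeric(substr(lines, field_start[2], field_end[2]))

# Stand-in for the "light processing": double the numeric field.
out_lines <- paste(ids, values * 2, sep = ",")

# Write the result back out as one string.
out_text <- paste0(paste(out_lines, collapse = "\n"), "\n")
writeChar(out_text, out_path, eos = NULL)
```

For a file too large to hold in memory, the same steps could run in a loop over an open binary connection. Because a chunk boundary can fall in the middle of a line, the trailing partial line of each chunk would need to be carried over and prepended to the next one.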