
read in large data file (tsv) with inline filter?

On Mon, 23 Mar 2009, David Reiss wrote:

            
You certainly don't want to use repeated reads from the start of the file with skip=, but if you set up a file connection
    fileconnection <- file("my.tsv", open="r")
you can read from it incrementally with readLines() or read.delim() without going back to the start each time, because a connection that is already open keeps its position between calls. Call close(fileconnection) when you are done.
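
A minimal sketch of that idea, in case it helps. The function name read_filtered, its arguments keep and chunk_size, and the chunk size itself are made up for illustration. It assumes the first line of the file holds the column names and that the file contains no blank lines.

```r
# Hypothetical helper: read a tab-separated file chunk by chunk through one
# open connection, keeping only the rows for which keep(chunk) is TRUE.
read_filtered <- function(path, keep, chunk_size = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))

  # Assumption: the first line holds the column names.
  header <- strsplit(readLines(con, n = 1L), "\t", fixed = TRUE)[[1]]

  pieces <- list()
  repeat {
    # The connection remembers its position, so this continues where the
    # previous call stopped instead of rereading from the top.
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0L) break

    chunk <- read.delim(text = lines, header = FALSE, col.names = header,
                        stringsAsFactors = FALSE)
    # which() drops rows where the filter evaluates to NA.
    pieces[[length(pieces) + 1L]] <- chunk[which(keep(chunk)), , drop = FALSE]
  }

  result <- do.call(rbind, pieces)   # NULL if the file had no data lines
  if (!is.null(result)) rownames(result) <- NULL
  result
}
```

Something like read_filtered("my.tsv", function(d) d$value > 100) would then keep only the rows of interest in memory, where value is a stand-in for whatever column you filter on. Column types are guessed separately for each chunk, so passing colClasses to read.delim() may be safer for real data.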

The speed of this approach should be within a reasonable constant factor of anything else, since reading the file once is unavoidable and should be the bottleneck.

       -thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle