SQLite: When reading a table, a "\r" is padded onto the last column. Why?
Prof Brian Ripley <ripley at stats.ox.ac.uk> writes:
I would be surprised if read.table used carefully took a significant part of the time of a total analysis. I hesitate to do timings without knowing the sort of table you are discussing: does it have many columns or many rows or both, and what variable types? Having some real-life examples to think about would be very helpful (as it would be for some of the efficiency issues we have been working on with data frames).
We've been working with annotation data for Affymetrix Mapping arrays (SNP chips). This translates to many rows (6M) and a handful of columns (6-10) with a mix of integer, double, and character columns. As you wrote, if one needs to do any manipulation of the data before loading into the DB, then read.table will most likely not be the bottleneck.
a. use SQLite directly and skip R.
b. use R and make a system call to the sqlite command line.
Or
c. Send a suitable SQL query from the R package. I've done that in RMySQL and RODBC in the past (e.g. using LOAD DATA INFILE in MySQL).
Unfortunately, AFAIK, SQLite does not provide SQL syntax for that. I think it did at one time, and then it went away in one of the newer versions.

+ seth
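Without a SQL-level bulk-load statement, the loading has to happen on the client side. Below is a minimal sketch of that idea, written in Python rather than R only so that it is self-contained. The table name snp_annot, its two columns, and the helper bulk_load are made up for illustration and are not from this thread. It also touches on the subject line: one plausible cause of a trailing "\r" in the last column (not confirmed here) is a file with Windows "\r\n" line endings being split on "\n" alone, which a CSV parser avoids.

```python
import csv
import io
import sqlite3


def bulk_load(conn, table, columns, lines, sep="\t"):
    """Insert delimited text rows into an existing table in one transaction.

    `table` and `columns` are interpolated into the SQL text, so they must
    come from trusted code, never from user input.
    """
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    # csv.reader accepts both "\n" and "\r\n" as row terminators, so the
    # last field does not keep a stray "\r".
    reader = csv.reader(lines, delimiter=sep)
    non_empty = (row for row in reader if row)  # skip blank lines
    with conn:  # one commit for the whole batch, rollback on error
        cur = conn.executemany(sql, non_empty)
    return cur.rowcount


if __name__ == "__main__":
    # Hypothetical two-column annotation table, held in memory.
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE snp_annot (pos INTEGER, snp_id TEXT)")
    # newline="" keeps the "\r\n" endings intact, as in a file opened
    # with open(path, newline="").
    text = io.StringIO("101\tSNP_A-1\r\n202\tSNP_A-2\r\n", newline="")
    bulk_load(demo, "snp_annot", ["pos", "snp_id"], text)
    print(demo.execute("SELECT pos, snp_id FROM snp_annot").fetchall())
```

Wrapping all the inserts in a single transaction is the main design choice: committing once per batch instead of once per row usually matters far more for load time than how the file is parsed.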