SQLite: When reading a table, a "\r" is padded onto the last column. Why?

ronggui <ronggui.huang at gmail.com> writes:
You are right that an na.strings argument is missing.  You will find
that if you use '\N' in your text files, it will be recognized as NA.

This file import feature is implemented by reading the file in C and
borrows heavily from the SQLite command line tool's .import command.
With this implementation, changes such as adding a flexible na.strings
argument will not be trivial to implement.
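
For comparison, here is a minimal sketch of what a flexible na.strings looks like on the R side, using only base R's read.table(). The sample data and the choice of markers are made up for illustration; nothing here touches the C import code.

```r
# Made-up sample input: both "\N" and "NA" stand for a missing value.
lines <- c("id\tscore",
           "1\t3.5",
           "2\t\\N",
           "3\tNA")

df <- read.table(text = lines, sep = "\t", header = TRUE,
                 na.strings = c("\\N", "NA"), stringsAsFactors = FALSE)

# Both markers are read as NA, so the score column is parsed as numeric.
missing_rows <- which(is.na(df$score))
```

Because the markers are just a character vector, supporting several of them costs nothing extra when the parsing is done by read.table().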

Now that dbWriteTable (using data.frame) is more efficient, it can be
used in a straightforward way to load very large text files.  I
prefer this approach.  A possibly easier patch is to refactor
dbWriteTable (file path) so that it does something like the code
below (and to remove the C code entirely):

(untested, approx code)

    con <- file(fname, open="r")
    on.exit(close(con))

    df <- read.table(con, sep=sep, stringsAsFactors=FALSE, nrows=10,
                     na.strings=na.strings, header=TRUE)
    # use DBI helper function here instead
    header <- gsub(".", "_", names(df), fixed=TRUE) 
    names(df) <- header

    dbWriteTable(db, tablename, df)

    ## Now do the rest in batches
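    repeat {
        ## read.table() raises an error ("no lines available in input")
        ## once the connection is exhausted, so treat that as end of file
        df <- tryCatch(read.table(con, sep=sep, stringsAsFactors=FALSE,
                                  nrows=batch_size, na.strings=na.strings,
                                  header=FALSE),
                       error=function(e) NULL)
        if (is.null(df) || nrow(df) == 0)
            break
        names(df) <- header
        dbWriteTable(db, tablename, df, append=TRUE)
        ## a short batch means the file has been fully read
        if (nrow(df) < batch_size)
            break
    }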
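
The batching idea can be tried without a database. The sketch below is a self-contained variant in base R: read_in_batches() and its write_batch argument are hypothetical names, with write_batch standing in for the dbWriteTable() calls, and the sample file is made up. It assumes that any read error after the first batch means end of input, which is a simplification.

```r
# Hypothetical helper: reads a delimited file in batches and passes each
# batch to write_batch(), a stand-in for dbWriteTable(db, tablename, df, ...).
read_in_batches <- function(path, write_batch, sep = "\t",
                            na.strings = "\\N", batch_size = 2) {
    con <- file(path, open = "r")
    on.exit(close(con))

    # First batch: column names come from the header line.
    df <- read.table(con, sep = sep, header = TRUE, nrows = batch_size,
                     na.strings = na.strings, stringsAsFactors = FALSE)
    header <- gsub(".", "_", names(df), fixed = TRUE)
    names(df) <- header
    write_batch(df)

    repeat {
        # Assumption: an error here means the connection is exhausted.
        # A malformed line would also be swallowed by this handler.
        df <- tryCatch(read.table(con, sep = sep, header = FALSE,
                                  nrows = batch_size,
                                  na.strings = na.strings,
                                  stringsAsFactors = FALSE),
                       error = function(e) NULL)
        if (is.null(df) || nrow(df) == 0)
            break
        names(df) <- header
        write_batch(df)
        if (nrow(df) < batch_size)
            break
    }
    invisible(NULL)
}

# Made-up input: one header line and five data rows, "\N" marks a missing value.
path <- tempfile(fileext = ".txt")
writeLines(c("a.b\tc", "1\tx", "2\t\\N", "3\ty", "4\tz", "5\tw"), path)

# Collect the batches in a list instead of writing them to a table.
batches <- list()
read_in_batches(path, write_batch = function(df) {
    batches[[length(batches) + 1]] <<- df
})
result <- do.call(rbind, batches)
```

One thing this sketch does not address: read.table() infers column types per batch, so a real refactoring would probably want to fix the types from the first batch (for example via colClasses) before appending later ones.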

+ seth
