Skip to content

Compress string memCompress/Decompress

7 messages · Matt Shotwell, Seth Falcon, Erik Wright

#
Hello,

I would like to compress a long string (character vector), store the compressed string in the text field of a SQLite database (using RSQLite), and then load the text back into memory and decompress it back into the the original string.  My character vector can be compressed considerably using standard gzip/bzip2 compression.  In theory it should be much faster for me to compress/decompress a long string than to write the whole string to the hard drive and then read it back (not to mention the saved hard drive space).

I have tried accomplishing this task using memCompress() and memDecompress() without success.  It seems memCompress can only convert a character vector to raw type which cannot be treated as a string.  Does anyone have ideas on how I can go about doing this, especially using the standard base packages?

Thanks!,
Erik
R version 2.11.0 (2010-04-22) 
x86_64-apple-darwin9.8.0 

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] tools_2.11.0
#
Erik, 

Can you store the data as a blob? For example:
Loading required package: DBI
NULL
+ string.sqlite, ");", sep="")
NULL
[1] "gzip this string, store as blob in SQLite database"


-Matt
On Fri, 2010-07-09 at 12:51 -0400, Erik Wright wrote:
#
Hi Matt,

This works great, thanks!

At first I got an error message saying BLOB is not implemented in RSQLite.  When I updated to the latest version it worked.

Is there any reason the string needs to be stored as type BLOB?  It seems to work the same when I swap "BLOB" with "TEXT" in the CREATE TABLE command.

Thanks again!,
Erik
On Jul 9, 2010, at 3:21 PM, Matt Shotwell wrote:

            
1 day later
#
On Fri, 2010-07-09 at 20:02 -0400, Erik Wright wrote:
SQLite began to support BLOBs from version 3.0.
SQLite has a dynamic-type system. That is, data types are associated
with values rather than with their container (column). This means that
most columns in a table can store more than just the type (or
'affinity') it is declared with. I think that's what happens when you
use TEXT rather than BLOB. If you use something like x'FFFFA9' to insert
data into a column with TEXT affinity, I believe it is stored as a BLOB
regardless.

-Matt
#
On Sun, Jul 11, 2010 at 11:31 AM, Matt Shotwell <shotwelm at musc.edu> wrote:
And RSQLite began supporting BLOBs only just recently :-)
See the NEWS file for details.

Below is a minimal example of how you might use BLOBs:

    db <- dbConnect(SQLite(), dbname = ":memory:")
    dbGetQuery(db, "CREATE TABLE t1 (name TEXT, data BLOB)")

    z <- paste("hello", 1:10)
    df <- data.frame(a = letters[1:10],
                     z = I(lapply(z, charToRaw)))
    dbGetPreparedQuery(db, "insert into t1 values (:a, :z)", df)
    a <- dbGetQuery(db, "select name from t1")
    checkEquals(10, nrow(a))
    a <- dbGetQuery(db, "select data from t1")
    checkEquals(10, nrow(a))
    a <- dbGetQuery(db, "select * from t1")
    checkEquals(10, nrow(a))
    checkEquals(2, ncol(a))

    checkEquals(z, sapply(a$data, rawToChar))
    dbDisconnect(db)
#
Hi Seth,

Can you recreate the example below using dbWriteTable?

Thanks!,
Erik
On Jul 11, 2010, at 6:13 PM, Seth Falcon wrote:

            
#
On Mon, Jul 12, 2010 at 9:17 AM, Erik Wright <eswright at wisc.edu> wrote:
Not sure if that is possible with the current dbWriteTable code (don't
have time to explore that right now).  You are welcome to poke around.
You could wrap the example in a helper function to provide your own
BLOB respecting write table function if you can't get dbWriteTable to
work for your case.

+ seth