Someone solved the problem of saving R-objects to a database or can give me an answer like "this is not possible at the moment"? All my attempts runned into problems with conversion between blob, raw and character representations. Saving the R-object to a file and then getting it into the db leads to similar problems. Is saving to the file system the only practicable solution? In my special case I'm using a MySQL database with RMySQL package, but suggestions for other combinations are welcome. Greetings, Christian Ruckert
Saving R-objects to a database
9 messages · Christian Ruckert, Sean Davis, Brian Ripley +2 more
On Wed, Oct 1, 2008 at 5:53 AM, Christian Ruckert
<cruckert at uni-muenster.de> wrote:
Someone solved the problem of saving R-objects to a database or can give me an answer like "this is not possible at the moment"? All my attempts runned into problems with conversion between blob, raw and character representations. Saving the R-object to a file and then getting it into the db leads to similar problems. Is saving to the file system the only practicable solution?
Look at the serialize function. You can serialize any R object which can be stored into the database of your choice. Give that a try and let us know with a code example if you have problems. Sean
In my special case I'm using a MySQL database with RMySQL package, but suggestions for other combinations are welcome. Greetings, Christian Ruckert
_______________________________________________ R-sig-DB mailing list -- R Special Interest Group R-sig-DB at stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-sig-db
Someone solved the problem of saving R-objects to a database or can give me an answer like "this is not possible at the moment"? All my attempts runned into problems with conversion between blob, raw and character representations. Saving the R-object to a file and then getting it into the db leads to similar problems. Is saving to the file system the only practicable solution?
Look at the serialize function. You can serialize any R object which can be stored into the database of your choice. Give that a try and let us know with a code example if you have problems.
This is what I posted on r-mailing list before:
> ser = serialize(obj, NULL)
> dbSendQuery(link, paste("insert into table values(1, '",ser,"')",sep=''))
>
> The field to save the object in the MySQL Database is of type blob.
>
> result = dbGetQuery(link,"select * from table where id=1")
>
> unser = unserialize(result[1,"object_column"], NULL)
> Error in unserialize(result[1, "object_column"], NULL) : unknown
input format
> In addition: Warning message:
> In unserialize(result[1, "object_column"], NULL) :
> unserialize()from a character string is deprecated and will be
withdrawn in R 2.8.0
Brian Ripley answered:
"DBI needs to convert blobs to raw not character."
So saving to the db works fine, but function dbGetQuery seems to convert
the result to a character string. Can I avoid this convertion or is this
a problem with DBI package and will it get fixed?
On Wed, Oct 1, 2008 at 6:42 AM, Christian Ruckert
<cruckert at uni-muenster.de> wrote:
Someone solved the problem of saving R-objects to a database or can give me an answer like "this is not possible at the moment"? All my attempts runned into problems with conversion between blob, raw and character representations. Saving the R-object to a file and then getting it into the db leads to similar problems. Is saving to the file system the only practicable solution?
Look at the serialize function. You can serialize any R object which can be stored into the database of your choice. Give that a try and let us know with a code example if you have problems.
This is what I posted on r-mailing list before:
ser = serialize(obj, NULL)
dbSendQuery(link, paste("insert into table values(1, '",ser,"')",sep=''))
The field to save the object in the MySQL Database is of type blob.
result = dbGetQuery(link,"select * from table where id=1")
unser = unserialize(result[1,"object_column"], NULL)
Error in unserialize(result[1, "object_column"], NULL) : unknown input
format
In addition: Warning message:
In unserialize(result[1, "object_column"], NULL) :
unserialize()from a character string is deprecated and will be withdrawn
in R 2.8.0
Brian Ripley answered: "DBI needs to convert blobs to raw not character." So saving to the db works fine, but function dbGetQuery seems to convert the result to a character string. Can I avoid this convertion or is this a problem with DBI package and will it get fixed?
What happens if you serialize using "ascii=TRUE"? Then, you could probably save into a text field. Admittedly, I haven't tried this. Sean
On Wed, 1 Oct 2008, Sean Davis wrote:
On Wed, Oct 1, 2008 at 6:42 AM, Christian Ruckert <cruckert at uni-muenster.de> wrote:
Someone solved the problem of saving R-objects to a database or can give me an answer like "this is not possible at the moment"? All my attempts runned into problems with conversion between blob, raw and character representations. Saving the R-object to a file and then getting it into the db leads to similar problems. Is saving to the file system the only practicable solution?
Look at the serialize function. You can serialize any R object which can be stored into the database of your choice. Give that a try and let us know with a code example if you have problems.
This is what I posted on r-mailing list before:
ser = serialize(obj, NULL)
dbSendQuery(link, paste("insert into table values(1, '",ser,"')",sep=''))
The field to save the object in the MySQL Database is of type blob.
result = dbGetQuery(link,"select * from table where id=1")
unser = unserialize(result[1,"object_column"], NULL)
Error in unserialize(result[1, "object_column"], NULL) : unknown input
format
In addition: Warning message:
In unserialize(result[1, "object_column"], NULL) :
unserialize()from a character string is deprecated and will be withdrawn
in R 2.8.0
Brian Ripley answered: "DBI needs to convert blobs to raw not character." So saving to the db works fine, but function dbGetQuery seems to convert the result to a character string. Can I avoid this convertion or is this a problem with DBI package and will it get fixed?
What happens if you serialize using "ascii=TRUE"? Then, you could probably save into a text field. Admittedly, I haven't tried this.
The result is still a raw vector. You need to encode the raw vector as text before saving to the DBMS, and unencode after retrieval. AFAIK the problem referred to is specific to raw vectors and DBI, not general to all R-DBMS interfaces.
Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
On Wed, Oct 1, 2008 at 8:54 AM, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
On Wed, 1 Oct 2008, Sean Davis wrote:
On Wed, Oct 1, 2008 at 6:42 AM, Christian Ruckert <cruckert at uni-muenster.de> wrote:
Someone solved the problem of saving R-objects to a database or can give me an answer like "this is not possible at the moment"? All my attempts runned into problems with conversion between blob, raw and character representations. Saving the R-object to a file and then getting it into the db leads to similar problems. Is saving to the file system the only practicable solution?
Look at the serialize function. You can serialize any R object which can be stored into the database of your choice. Give that a try and let us know with a code example if you have problems.
This is what I posted on r-mailing list before:
ser = serialize(obj, NULL)
dbSendQuery(link, paste("insert into table values(1,
'",ser,"')",sep=''))
The field to save the object in the MySQL Database is of type blob.
result = dbGetQuery(link,"select * from table where id=1")
unser = unserialize(result[1,"object_column"], NULL)
Error in unserialize(result[1, "object_column"], NULL) : unknown input
format
In addition: Warning message:
In unserialize(result[1, "object_column"], NULL) :
unserialize()from a character string is deprecated and will be
withdrawn
in R 2.8.0
Brian Ripley answered: "DBI needs to convert blobs to raw not character." So saving to the db works fine, but function dbGetQuery seems to convert the result to a character string. Can I avoid this convertion or is this a problem with DBI package and will it get fixed?
What happens if you serialize using "ascii=TRUE"? Then, you could probably save into a text field. Admittedly, I haven't tried this.
The result is still a raw vector. You need to encode the raw vector as text before saving to the DBMS, and unencode after retrieval. AFAIK the problem referred to is specific to raw vectors and DBI, not general to all R-DBMS interfaces.
Thanks for the clarification, again. Christian, it might be easier (and likely faster) to store the URI in the database and the object in Rda format on disk with a UUID as a file name. In the database world, I think the general consensus is that it is best not to store objects like this in the database as the overhead of backup, storage, retrieval, etc. are higher than storing on the file system. Sean Sean
Sean Davis schrieb:
On Wed, Oct 1, 2008 at 8:54 AM, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
On Wed, 1 Oct 2008, Sean Davis wrote:
On Wed, Oct 1, 2008 at 6:42 AM, Christian Ruckert <cruckert at uni-muenster.de> wrote:
Someone solved the problem of saving R-objects to a database or can give me an answer like "this is not possible at the moment"? All my attempts runned into problems with conversion between blob, raw and character representations. Saving the R-object to a file and then getting it into the db leads to similar problems. Is saving to the file system the only practicable solution?
Look at the serialize function. You can serialize any R object which can be stored into the database of your choice. Give that a try and let us know with a code example if you have problems.
This is what I posted on r-mailing list before:
ser = serialize(obj, NULL)
dbSendQuery(link, paste("insert into table values(1,
'",ser,"')",sep=''))
The field to save the object in the MySQL Database is of type blob.
result = dbGetQuery(link,"select * from table where id=1")
unser = unserialize(result[1,"object_column"], NULL)
Error in unserialize(result[1, "object_column"], NULL) : unknown input
format
In addition: Warning message:
In unserialize(result[1, "object_column"], NULL) :
unserialize()from a character string is deprecated and will be
withdrawn
in R 2.8.0
Brian Ripley answered: "DBI needs to convert blobs to raw not character." So saving to the db works fine, but function dbGetQuery seems to convert the result to a character string. Can I avoid this convertion or is this a problem with DBI package and will it get fixed?
What happens if you serialize using "ascii=TRUE"? Then, you could probably save into a text field. Admittedly, I haven't tried this.
The result is still a raw vector. You need to encode the raw vector as text before saving to the DBMS, and unencode after retrieval. AFAIK the problem referred to is specific to raw vectors and DBI, not general to all R-DBMS interfaces.
Thanks for the clarification, again. Christian, it might be easier (and likely faster) to store the URI in the database and the object in Rda format on disk with a UUID as a file name. In the database world, I think the general consensus is that it is best not to store objects like this in the database as the overhead of backup, storage, retrieval, etc. are higher than storing on the file system. Sean Sean
(Un-)encode with rawToChar now worked for a small object, but for bigger objects the string is truncated. Thanks for your efforts, I've received an impression whats possible and what not. I think I'll give the file system solution a try. Christian
On 1 Oct 2008, at 11:42, Christian Ruckert wrote:
Someone solved the problem of saving R-objects to a database or can give me an answer like "this is not possible at the moment"? All my attempts runned into problems with conversion between blob, raw and character representations. Saving the R-object to a file and then getting it into the db leads to similar problems. Is saving to the file system the only practicable solution?
Look at the serialize function. You can serialize any R object which can be stored into the database of your choice. Give that a try and let us know with a code example if you have problems.
This is what I posted on r-mailing list before:
ser = serialize(obj, NULL)
dbSendQuery(link, paste("insert into table values(1,
'",ser,"')",sep=''))
The field to save the object in the MySQL Database is of type blob. result = dbGetQuery(link,"select * from table where id=1") unser = unserialize(result[1,"object_column"], NULL) Error in unserialize(result[1, "object_column"], NULL) : unknown
input format
In addition: Warning message: In unserialize(result[1, "object_column"], NULL) : unserialize()from a character string is deprecated and will be
withdrawn in R 2.8.0 Brian Ripley answered: "DBI needs to convert blobs to raw not character." So saving to the db works fine, but function dbGetQuery seems to convert the result to a character string. Can I avoid this convertion or is this a problem with DBI package and will it get fixed?
we have used something like this (these are snippets which will not
work as is... but should give you a starting point)
To Save:
s_rdata <- if(is.null(rdata)) "" else serialize(rdata,
connection=NULL, ascii=TRUE)
if(class(s_rdata) == "raw") s_rdata <- rawToChar(s_rdata)
sql <- paste("INSERT INTO my_table (dataset_id, rdata) ", "VALUES
(", dataset_id, ", E'", s_rdata, "')", sep="")
dbSendQuery(dbh, sql)
To Load:
sql <- paste("SELECT * FROM my_table WHERE dataset_id =
",dataset_id, sep="")
dataset <- dbGetResult(dbSendQuery(dbh, sql))
for(n in 1:length(dataset$rdata_type)) {
rdata_type <- dataset$rdata_type[n]
s_rdata <- dataset$rdata[n]
rdata <- if(nchar(s_rdata) > 0) unserialize(s_rdata) else NULL
}
(this was for use with PostgreSQL by the way, hence the E'' syntax
above)
HTH
adam
Hi Christian,
Christian Ruckert wrote:
[...]
(Un-)encode with rawToChar now worked for a small object, but for bigger objects the string is truncated.
You need to serialize with ascii=TRUE otherwise you get a raw vector with
lots of nul bytes that you won't be able to turn into a text without losing
part of it:
- rawToChar() will truncate the text at the first nul byte
- intToUtf8(as.integer()) will silently ignore those nul bytes.
This is because embedded nuls are not supported in character vectors anymore
(which is a good thing).
I'm not 100% sure but I think you won't get any nul bytes if you use
serialize( , ascii=TRUE). Try this:
objToText <- function(object)
rawToChar(serialize(object, NULL, ascii=TRUE))
textToObj <- function(text)
unserialize(charToRaw(text))
> x0 <- list(1, 2, 3)
> objToText(x0)
[1] "A\n2\n133120\n131840\n19\n3\n14\n1\n1\n14\n1\n2\n14\n1\n3\n"
> textToObj(objToText(x0))
[[1]]
[1] 1
[[2]]
[1] 2
[[3]]
[1] 3
I've not tried this on big objects though.
Note that, despite serialize() man page claiming that ascii=FALSE will produce
a more compact binary representation of the object, my experience turns out
to be the opposite.
Cheers,
H.