Writing character vectors with embedded nulls to a connection

3 messages · Jeffrey Horner, Brian Ripley

#
Is this possible? I've tried both writeChar() and writeBin() to no avail.

My goal is to serialize(ascii=FALSE) an object to a connection, but 
determine the size of the serialized object beforehand:

sobject <- serialize(object,NULL,ascii=FALSE)
len <- nchar(sobject)
#
# run some code here to notify listener on other end of connection
# how many bytes I'm getting ready to send
#
writeChar(sobject,con)

The other option is to serialize twice:

len <- nchar(serialize(object,NULL,ascii=FALSE))
#
# run some code here to notify listener on other end of connection
# how many bytes I'm getting ready to send
#
serialize(object,con,ascii=FALSE)

Object stores, like memcache (http://danga.com/memcached/), need to know 
object sizes before storing. RDBMSs which support large objects (CLOBs 
or BLOBs) don't necessarily need to know object sizes beforehand, but 
they do have maximum column size limits which must be honored.

BTW, readChar() can read strings with embedded nulls; I figured 
writeChar() should be able to write them.
#
I think you should be using a raw type to hold such data in R.  It is not 
intentional that readChar handles embedded nuls (and in fact it might not 
in an MBCS).

As ?serialize says

      For 'serialize', 'NULL' unless 'connection=NULL', when the result
      is stored in the first element of a character vector (but is not a
      normal character string unless 'ascii = TRUE' and should not be
      processed except by 'unserialize').

so you have been told this is not intended to work as you tried.

serialize predates the raw type, or it would have made use of it.  In 
these days of MBCS character strings it is increasingly unsafe to use them 
to hold anything other than valid character data.
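
To illustrate the distinction between raw vectors and character strings, here is a minimal sketch. It assumes a current version of R, where embedded nuls in character strings are rejected outright; the byte values are arbitrary and chosen only for illustration.

```r
# A raw vector stores arbitrary bytes, including 0x00, without complaint.
bytes <- as.raw(c(0x61, 0x00, 0x62))
length(bytes)

# Converting the same bytes to a character string fails in current R
# because of the embedded nul; capture the error message instead of stopping.
res <- tryCatch(rawToChar(bytes), error = function(e) conditionMessage(e))
res
```

The length of a raw vector is its byte count, so length() gives the size directly, with no dependence on how nchar() treats the contents.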
On Thu, 30 Mar 2006, Jeffrey Horner wrote:
#
The following approach

sobject <- charToRaw(serialize(object,NULL))
len <- length(sobject)
writeBin(sobject, outcon)

would appear to work.  As from 2.3.0 you will then be able to do

unserialize(readBin(incon, "raw", n=len))
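
For anyone trying this on a current version of R: serialize(object, NULL) now returns a raw vector directly, so the charToRaw() step is not needed there. The following is only a sketch of the length-then-payload exchange under that assumption. The rawConnection stands in for a real connection, and the 4-byte big-endian length header is an illustrative choice of mine, not something memcached or any other listener prescribes; object is a made-up example value.

```r
# Hypothetical object to send.
object <- list(a = 1:3, b = "text")

# In current R this is already a raw vector, so length() is the byte count.
payload <- serialize(object, NULL)
len <- length(payload)

# Sender side: an in-memory raw connection stands in for a socket.
out <- rawConnection(raw(0), open = "w")
writeBin(len, out, size = 4L, endian = "big")  # assumed 4-byte length header
writeBin(payload, out)                         # then the serialized bytes
wire <- rawConnectionValue(out)
close(out)

# Receiver side: read the header, then exactly that many bytes.
inp <- rawConnection(wire, open = "r")
n <- readBin(inp, "integer", n = 1L, size = 4L, endian = "big")
restored <- unserialize(readBin(inp, "raw", n = n))
close(inp)

stopifnot(identical(restored, object))
```

With a real socket the reading side may need to loop until all n bytes have arrived; the sketch ignores that because everything is already in memory.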
On Fri, 31 Mar 2006, Prof Brian Ripley wrote: