
corruption of data with serialize(ascii=TRUE)

3 messages · Brian Ripley, Roger D. Peng

#
I noticed the following peculiarity with `serialize()' when `ascii = TRUE' is 
used.  In today's (svn r37299) R-devel, I get

 > set.seed(10)
 > x <- rnorm(10)
 >
 > a <- serialize(x, con = NULL, ascii = TRUE)
 > b <- unserialize(a)
 >
 > identical(x, b)  ## FALSE
[1] FALSE
 > x - b
  [1] -3.469447e-18  2.775558e-17 -4.440892e-16  0.000000e+00  5.551115e-17
  [6] -5.551115e-17 -4.440892e-16  0.000000e+00  2.220446e-16 -5.551115e-17


I expected `x' and `b' to be identical, which is what I get when `ascii = FALSE':

 > a <- serialize(x, con = NULL, ascii = FALSE)
 > b <- unserialize(a)
 >
 > identical(x, b)  ## TRUE
[1] TRUE


The same phenomenon occurs with `.saveRDS(ascii = TRUE)':

 > .saveRDS(x, file = "asdf", ascii = TRUE)
 > d <- .readRDS("asdf")
 >
 > identical(x, d)  ## FALSE
[1] FALSE
 >

Has anyone noticed this before?  I didn't see anything in the docs for 
`serialize()' that would indicate this behavior should be expected.

I'm on Linux Fedora Core 4.

-roger
#
This is known: it happens with save() too, and did in earlier save formats.
Nothing particularly clever is done (the format is "%.16g\n"), and
similarly as.character/parse are not inverses.
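
For illustration, the effect of a "%.16g" format can be reproduced without serialize() at all. This is only a sketch: sprintf() and as.numeric() are used here as stand-ins for the C-level write and read, which is an assumption and not a claim about what serialize.c literally calls.

```r
set.seed(10)
x <- rnorm(10)

# Stand-in for the ascii writer: 16 significant digits per double.
s16 <- sprintf("%.16g", x)

# Stand-in for the reader: parse the decimal strings back to doubles.
b16 <- as.numeric(s16)

identical(x, b16)            # may be FALSE: 16 digits do not pin down every double
all.equal(x, b16)            # TRUE: differences are far below the default tolerance
max(abs(x - b16) / abs(x))   # largest relative change across the vector
```

Which elements change, and by how much, can depend on the platform's number parsing, so the exact differences are not shown here.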

Perhaps more relevant is
  [1]  0.000000e+00 -1.110223e-16  2.220446e-16  0.000000e+00  0.000000e+00
  [6]  2.220446e-16  4.440892e-16  0.000000e+00  2.220446e-16  0.000000e+00

so the error (on my system) is about what you would expect from 
floating-point computations.
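
One way to check that the discrepancy is rounding-sized is to express it relative to machine epsilon. The following is a sketch under assumptions: the factor of 10 is an arbitrary loose threshold chosen for illustration, and the exact per-element values may differ between systems and R versions.

```r
set.seed(10)
x <- rnorm(10)

# Round trip through the ascii serialization format.
b <- unserialize(serialize(x, connection = NULL, ascii = TRUE))

eps <- .Machine$double.eps    # 2^-52 for IEEE 754 doubles
rel <- abs(x - b) / abs(x)    # relative change per element

round(rel / eps, 2)           # changes expressed in units of eps
all(rel < 10 * eps)           # loose, assumed bound for "rounding-sized"
all.equal(x, b)               # tolerance-based comparison, unlike identical()
```

For code that needs to compare such round-tripped values, all.equal() is the tolerance-aware alternative to identical().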

There is a comment in serialize.c

 	    /* 16: full precision; 17 gives 999, 000 &c */

which suggests that the format is optimized for size not maximal possible 
accuracy.
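
The trade-off in that comment can be seen with sprintf(), which is used here only as a convenient way to apply the same C-style formats from R; the values 0.1 and 0.3 are illustrative choices, not taken from the sources.

```r
# 16 significant digits: short, "clean" output.
s16 <- sprintf("%.16g", 0.1)   # "0.1"
t16 <- sprintf("%.16g", 0.3)   # "0.3"

# 17 significant digits: the trailing 000...1 and 999...9 patterns appear.
s17 <- sprintf("%.17g", 0.1)   # "0.10000000000000001"
t17 <- sprintf("%.17g", 0.3)   # "0.29999999999999999"

nchar(s16)   # 3
nchar(s17)   # 19
```

The 17-digit strings are much longer for common values, which is consistent with a format chosen for compactness.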

Really all you have said is `floating point operations are subject to 
rounding error'.
On Wed, 8 Feb 2006, Roger D. Peng wrote:
#
Okay, I just wasn't sure of the source of the changes.  In retrospect, character 
and other vectors did serialize/unserialize to the original objects.

-roger
Prof Brian Ripley wrote: