Reading 64-bit integers
Dear Simon, Thank you for the response.
On 29 March 2011 15:06, Simon Urbanek <simon.urbanek at r-project.org> wrote:
On Mar 29, 2011, at 8:46 AM, Jon Clayden wrote:
Dear all, I see from some previous threads that support for 64-bit integers in R may be an aim for future versions, but in the meantime I'm wondering whether it is possible to read in integers of greater than 32 bits at all. Judging from ?readBin, it should be possible to read 8-byte integers to some degree, but it is clearly limited in practice by R's internally 32-bit integer type:
x <- as.raw(c(0,0,0,0,1,0,0,0))
(readBin(x,"integer",n=1,size=8,signed=F,endian="big"))
[1] 16777216
x <- as.raw(c(0,0,0,1,0,0,0,0))
(readBin(x,"integer",n=1,size=8,signed=F,endian="big"))
[1] 0
For values that fit into 32 bits it works fine, but for larger values it fails. (I'm a bit surprised by the zero - should the value not be NA if it is out of range?
No, it's not out of range - int is only 4 bytes, so only the 4 least significant bytes (after respecting the requested endianness) are used.
The fact remains that I ask for the value of an 8-byte integer and don't get it. Pretending that it's really only four bytes because of the limits of R's integer type isn't all that helpful. Perhaps a warning should be put out if the cast will affect the value of the result? It looks like the relevant lines in src/main/connections.c are 3689-3697 in the current alpha:
#if SIZEOF_LONG == 8
    case sizeof(long):
        INTEGER(ans)[i] = (int)*((long *)buf);
        break;
#elif SIZEOF_LONG_LONG == 8
    case sizeof(_lli_t):
        INTEGER(ans)[i] = (int)*((_lli_t *)buf);
        break;
#endif
) The value can be represented as a double, though:
4294967296
[1] 4294967296
I wouldn't expect readBin() to return a double if an integer was requested, but is there any way to get the correct value out of it?
Trivially (for your unsigned big-endian case):
y <- readBin(x, "integer", n=length(x)/4L, endian="big")
y <- ifelse(y < 0, 2^32 + y, y)
i <- seq(1,length(y),2)
y <- y[i] * 2^32 + y[i + 1L]
Thanks for the code, but I'm not sure I would call that trivial, especially if one needs to cater for little-endian and signed cases as well! This is what I meant by reconstructing the number manually...
All the best,
Jon
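For reference, the little-endian and signed cases mentioned above could be covered by building each value directly from its raw bytes rather than going through readBin()'s integer conversion. What follows is only a minimal sketch under stated assumptions: read_int64_as_double is a hypothetical helper name, not part of base R, and because the result is a double, values larger than 2^53 in magnitude will be rounded rather than returned exactly.

```r
# Hypothetical helper (not part of base R): decode 8-byte integers held in a
# raw vector and return them as doubles.
read_int64_as_double <- function(x, signed = TRUE, endian = c("big", "little")) {
  endian <- match.arg(endian)
  stopifnot(is.raw(x), length(x) > 0L, length(x) %% 8L == 0L)

  # One column per 8-byte value; as.integer() maps each raw byte to 0..255.
  b <- matrix(as.integer(x), nrow = 8L)

  # Reorder so that row 1 always holds the most significant byte.
  if (endian == "little") b <- b[8:1, , drop = FALSE]

  # Combine bytes into unsigned upper and lower 32-bit halves (as doubles).
  weights <- 256^(3:0)
  hi <- colSums(b[1:4, , drop = FALSE] * weights)
  lo <- colSums(b[5:8, , drop = FALSE] * weights)

  # Two's complement: a set top bit means the upper half is negative.
  if (signed) hi <- ifelse(hi >= 2^31, hi - 2^32, hi)

  # Exact only while the magnitude stays within 2^53.
  hi * 2^32 + lo
}

x <- as.raw(c(0, 0, 0, 1, 0, 0, 0, 0))
read_int64_as_double(x, signed = FALSE, endian = "big")          # 4294967296
read_int64_as_double(rev(x), signed = FALSE, endian = "little")  # 4294967296
read_int64_as_double(as.raw(rep(255, 8)), signed = TRUE)         # -1
```

Working from individual bytes also sidesteps the 32-bit word 0x80000000, which readBin() would return as NA because R reserves that bit pattern for NA_integer_.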