Sorry for the simple question, but I am trying to read an "unsigned long long" using the R readBin() function. Can someone point me in the right direction, or am I better off using C for such things? The file that I am reading will have been produced on the same machine that is doing the reading. Thanks, Sean
Reading an "unsigned long long" using R readBin()
6 messages · Sean Davis, Simon Urbanek, Brian Ripley +2 more
On May 29, 2008, at 2:41 PM, Sean Davis wrote:
Sorry for the simple question, but I am trying to read an "unsigned long long" using the R readBin() function. Can someone point me in the right direction, or am I better off using C for such things? The file that I am reading will have been produced on the same machine that is doing the reading.
R has no data type that can hold 64-bit integers (long long), so there is no (lossless) way to read such a field in R. If you know the endianness of the machine you can read two integers and combine the result as a float to get an approximate value. Otherwise C is your friend (and easy to call from R) for 64-bit calculations, bitwise operations and other tricks that are hard to do in R. Cheers, Simon
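The "read two integers and combine them" idea above can be sketched in C as follows. This is only an illustration under stated assumptions: the data is taken to be little-endian (low word first), and the function names u64_words_to_double and decode_u64_le are made up for this example. The result is exact only while the value stays below 2^53.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Combine the two 32-bit words of one 64-bit field into a double:
   value = high * 2^32 + low. Exact while the value is below 2^53. */
double u64_words_to_double(uint32_t low, uint32_t high)
{
    return (double)high * 4294967296.0 + (double)low;
}

/* Decode n 8-byte little-endian fields from buf into out[].
   Little-endian layout is an assumption of this sketch. */
void decode_u64_le(const unsigned char *buf, int n, double *out)
{
    assert(buf != NULL && out != NULL);
    for (int i = 0; i < n; i++) {
        uint32_t low = 0, high = 0;
        /* Byte 0 is the least significant, so assemble from byte 3 down. */
        for (int b = 3; b >= 0; b--) {
            low  = (low  << 8) | buf[8 * i + b];
            high = (high << 8) | buf[8 * i + 4 + b];
        }
        out[i] = u64_words_to_double(low, high);
    }
}
```

The same arithmetic, high * 2^32 + low, applies on the R side once both words have been converted to non-negative doubles.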
Well, R has no unsigned quantities, so ultimately you can't actually do this. But using what="int" and an appropriate 'size' (likely to be 8) should read the numbers, wrapping around very large ones to be negative. (The usual trick of storing integers in numeric will lose accuracy, but might be better than nothing.)
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
On 5/30/2008 1:55 PM, Prof Brian Ripley wrote:
Well, R has no unsigned quantities, so ultimately you can't actually do this. But using what="int" and an appropriate 'size' (likely to be 8) should read the numbers, wrapping around very large ones to be negative. (The usual trick of storing integers in numeric will lose accuracy, but might be better than nothing.)
I think reading size 8 integers on 32 bit Windows returns signed 32 bit integers, with values outside that range losing the high order bits, not just accuracy. At least that's what I see when I write the numbers 1:10 out as 4 byte integers, and read them as 8 byte integers: I get 1 3 5 7 9. Duncan Murdoch
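The "1 3 5 7 9" observation can be reproduced outside R with a small C sketch. It assumes a little-endian byte layout, which is written out explicitly so the result does not depend on the host, and the helper names write_i32_le and read_i64_low32 are invented for this illustration. Each 8-byte field spans two adjacent 4-byte integers, and keeping only the low 32 bits returns the first of each pair.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Store count 32-bit integers start, start+1, ... in buf, little-endian. */
void write_i32_le(unsigned char *buf, uint32_t start, int count)
{
    assert(buf != NULL);
    for (int i = 0; i < count; i++) {
        uint32_t v = start + (uint32_t)i;
        for (int b = 0; b < 4; b++)
            buf[4 * i + b] = (unsigned char)(v >> (8 * b));
    }
}

/* Read field i as an 8-byte little-endian integer and keep its low 32 bits. */
uint32_t read_i64_low32(const unsigned char *buf, int i)
{
    uint64_t v = 0;
    for (int b = 7; b >= 0; b--)
        v = (v << 8) | buf[8 * i + b];
    return (uint32_t)(v & 0xffffffffu);
}
```

Writing 1 to 10 with write_i32_le(buf, 1, 10) into a 40-byte buffer and reading fields 0 to 4 with read_i64_low32 gives 1, 3, 5, 7, 9.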
Yes, that's true for even larger ones. So to clarify: up to 2^31-1 should work, thereafter you will get the lower 32 bits and hence possibly a signed number.
Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
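The rule stated in the message above (values up to 2^31-1 survive, larger ones keep only their lower 32 bits and may come out negative) can be written down as a short C sketch. The names low32_as_signed and truncate_all are hypothetical, and this models the described behaviour rather than R's actual source code.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Keep the low 32 bits of v and reinterpret them as a signed
   two's-complement 32-bit integer, avoiding implementation-defined casts. */
int32_t low32_as_signed(uint64_t v)
{
    uint32_t low = (uint32_t)(v & 0xffffffffu);
    if (low <= 0x7fffffffu)
        return (int32_t)low;                      /* fits as is */
    return (int32_t)(low - 0x80000000u) + INT32_MIN;  /* wraps negative */
}

/* Apply the truncation to n values. */
void truncate_all(const uint64_t *in, int n, int32_t *out)
{
    assert(in != NULL && out != NULL);
    for (int i = 0; i < n; i++)
        out[i] = low32_as_signed(in[i]);
}
```

For example, 2^31+1 maps to -2147483647 and 2^32-1 maps to -1, while 2^32+1 collapses to 1.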
When we wrote a version of readBin() for Splus 8.0 we added an
extra argument, output=, that specifies the type of S object
to put the result into. The what= argument says what sort
of data is in the input file and by default output=what.
output="double" can be useful in this case, as a double can
store a 53 bit signed or unsigned integer without loss of
precision. If the integer is bigger than 2^53-1, the double
stores its most significant 53 bits, which may be better
than truncating the thing.
E.g., I wrote a C program to write some unsigned long longs to
a file:
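#include <stdio.h>

/* Write seven unsigned long long test values to stdout in native byte order. */
int main(int argc, char *argv[])
{
    unsigned long long data[7], one = 1ULL ;
    data[0] = one ;
    data[1] = (one<<31) - 1 ;   /* largest value that fits a signed 32-bit int */
    data[2] = (one<<31) + 1 ;   /* fits only an unsigned 32-bit int */
    data[3] = (one<<32) - 1 ;   /* largest unsigned 32-bit value */
    data[4] = (one<<32) + 1 ;   /* needs more than 32 bits */
    data[5] = (one<<52) + 1 ;   /* still exact in a double */
    data[6] = (one<<54) + 1 ;   /* beyond the 53-bit precision of a double */
    (void)fwrite((void *)data, sizeof(data[0]), sizeof(data)/sizeof(data[0]), stdout) ;
    return 0 ;
}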
od shows what it writes, as unsigned, signed, and hex
8 byte integers:
% ./a.out | od --format u8
0000000                    1           2147483647
0000020           2147483649           4294967295
0000040           4294967297     4503599627370497
0000060    18014398509481985
0000070
% ./a.out | od --format d8
0000000                    1           2147483647
0000020           2147483649           4294967295
0000040           4294967297     4503599627370497
0000060    18014398509481985
0000070
% ./a.out | od --format x8
0000000 0000000000000001 000000007fffffff
0000020 0000000080000001 00000000ffffffff
0000040 0000000100000001 0010000000000001
0000060 0040000000000001
0000070
and in 32-bit Splus I can read it with:
> z<-readBin(pipe("./a.out", open="br"), what="integer", n=7,
     size=8, signed=FALSE, output="double")
> print(z, digits=16)
[1]                 1        2147483647        2147483649        4294967295
[5]        4294967297  4503599627370497 18014398509481984
Note that it loses precision in z[7], which exceeds 2^53: 2^54+1 comes back as 2^54.
Without output="double", the numbers > 2^32 would be
truncated and the signs would be wrong on the ones between 2^31
and 2^32:
> readBin(pipe("./a.out", open="br"), what="integer", n=7,
     size=8, signed=FALSE)
[1]           1  2147483647 -2147483647          -1           1           1
[7]           1
(That one gives the same result in R and Splus.)
What do folks think about having this option in R?
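For anyone who wants to experiment with the idea, the conversion that an output="double" option would have to perform can be sketched in C. This is not the Splus implementation, just a minimal illustration: the function name u64_to_double is hypothetical, and it assumes IEEE 754 doubles with the default round-to-nearest mode. It converts each unsigned 64-bit value to a double and counts how many could not be represented exactly.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Convert n unsigned 64-bit integers to doubles.
   Returns the number of values that changed in the conversion,
   which can only happen for values above 2^53. */
int u64_to_double(const uint64_t *in, int n, double *out)
{
    assert(in != NULL && out != NULL);
    int inexact = 0;
    for (int i = 0; i < n; i++) {
        out[i] = (double)in[i];
        /* A result of 2^64 cannot be cast back, so treat it as inexact. */
        if (out[i] >= 18446744073709551616.0 || (uint64_t)out[i] != in[i])
            inexact++;
    }
    return inexact;
}
```

Applied to the seven test values from the C program above, only the last one, 2^54+1, is reported as inexact; it is stored as 18014398509481984.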
----------------------------------------------------------------------------
Bill Dunlap
Insightful Corporation
bill at insightful dot com
"All statements in this message represent the opinions of the author and do
not necessarily reflect Insightful Corporation policy or position."