read.table() and precision?

7 messages · Wojciech Gryc, Peter Dalgaard, Knut Krueger +2 more

#
Hi,

I'm currently working with data that has values as large as 99,000,000
but is accurate to 6 decimal places. Unfortunately, when I load the
data using read.table(), it rounds everything to the nearest integer.
Is there any way for me to preserve the information or work with
arbitrarily large floating point numbers?

Thank you,
Wojciech
#
Wojciech Gryc wrote:
Are you sure?

To my knowledge, read.table doesn't round anything, except when running
out of bits to store the values, and a value like 99,000,000.123456 has
about 14 significant digits, which fits comfortably in an ordinary
double-precision variable (roughly 15-16 significant digits).

Printing the result is another matter. Try playing with
print(mydata, digits = 15) and the like.
#
If x is the result of your read.table, it is a double
precision number (matrix, data.frame, etc.), but by
default only up to 7 significant digits of x are printed,
so you do not see the rest of x.
Try, for example,
options(digits = 15)
and see how your x looks then.
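A minimal sketch of the point above, using a value on the same scale as the original data (the variable name is illustrative):

```r
## The value is stored at full double precision; the default print
## shows only 7 significant digits, which hides the decimals.
x <- 99000000.123456
print(x)               # default: decimal part not shown
print(x, digits = 15)  # full value visible

## Or change the session-wide default:
options(digits = 15)
x
```

Nothing was lost in storage; only the printed representation was truncated.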
--- Wojciech Gryc <wojciech at gmail.com> wrote:

#
Dear List,

Following the question in this thread, I have a question of my
own:
Suppose that I have large matrices which are produced
sequentially and must be used sequentially in the
reverse order. I do not have enough memory to store
them and so I would like to write them to disk and
then read them. This raises two questions:
1) What is the fastest (and most economical, space-wise)
way to do this?
2) Functions like write, write.table, etc. write the
data the way it is printed, and this may result in a
loss of accuracy. Is there any way to prevent this,
other than setting the "digits" option to a higher
value or using format prior to writing the data? Is it
possible to write binary files (similar to Fortran)?

Any suggestion will be greatly appreciated.
--- Wojciech Gryc <wojciech at gmail.com> wrote:

#
On Mon, 17 Dec 2007, Moshe Olshansky wrote:

Using save/load is the simplest.  Don't worry about finding better 
solutions until you know those are not good enough.  (serialize / 
unserialize is another interface to the same underlying idea.)
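A minimal sketch of the save/load approach for the sequential-matrix problem (the matrix contents, loop bounds, and file names here are illustrative):

```r
## Write each matrix to disk as it is produced, then read them back
## in reverse order.  save() stores full binary precision.
tmpdir <- tempdir()
n <- 3
for (i in seq_len(n)) {
  m <- matrix(rnorm(4), 2, 2)   # stand-in for the real computation
  save(m, file = file.path(tmpdir, sprintf("mat%03d.RData", i)))
}
for (i in rev(seq_len(n))) {
  load(file.path(tmpdir, sprintf("mat%03d.RData", i)))  # restores 'm'
  ## ... use m here ...
}
```

Only one matrix is held in memory at a time, and no precision is lost to a text round trip.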
Do please read the help before making false claims. ?write.table says

      Real and complex numbers are written to the maximal possible
      precision.

OTOH, ?write says it is a wrapper for cat, whose help says

      'cat' converts numeric/complex elements in the same way as 'print'
      (and not in the same way as 'as.character' which is used by the S
      equivalent), so 'options' '"digits"' and '"scipen"' are relevant.
      However, it uses the minimum field width necessary for each
      element, rather than the same field width for all elements.

so this hints that as.character() might be a useful preprocessor.
See ?writeBin.  save/load write binary files by default, but using
writeBin can be faster (though less flexible).
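A minimal writeBin/readBin sketch; the round trip preserves every bit of each double, since no text conversion is involved:

```r
## Raw binary I/O: doubles are written as 8-byte values.
fname <- tempfile()
x <- c(99000000.123456, pi)
writeBin(x, fname)                         # full binary precision
y <- readBin(fname, what = "double", n = length(x))
identical(x, y)                            # TRUE: nothing lost
```

Note that readBin needs to be told the type and count of values to read, which is part of what makes it less flexible than save/load.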
Somehow you have missed a great deal of information about R I/O.
Try help.start() and reading the sections the search engine shows you 
that look relevant.
#
Thank you for your response!

'write.table' writes up to 15 significant digits, which
is not quite full machine (double) precision but is
close to it - sorry for the misleading comments!

After all I found a way to do what I needed without
using disk or much memory and doing only twice as much
work as I could with unlimited memory, so I will stick
to this approach.
--- Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote: