seq() function accuracy inacceptable (PR#8779)

4 messages · Johannes.Prix at wu-wien.ac.at, Henrik Bengtsson, Don MacQueen +1 more

#
Full_Name: Johannes Prix
Version: 2.1.1
OS: WinXP, SuSE Linux
Submission from: (NULL) (137.208.41.195)



The seq command produces unnecessarily inaccurate results, which can be extremely
annoying.  I absolutely do not see the necessity for numerical garbage to appear
in the following simple case.  E.g. try this:
digits=2 )

Output looks like:

 [1]  0.000000e+00 -7.105427e-15  0.000000e+00  0.000000e+00 -7.105427e-15 -7.105427e-15  0.000000e+00  0.000000e+00
 [9] -7.105427e-15  0.000000e+00  0.000000e+00  0.000000e+00 -7.105427e-15  0.000000e+00  0.000000e+00 -7.105427e-15
[17] -7.105427e-15  0.000000e+00  0.000000e+00 -7.105427e-15  0.000000e+00  0.000000e+00 -7.105427e-15 -7.105427e-15
[25]  0.000000e+00  0.000000e+00 -7.105427e-15  0.000000e+00  0.000000e+00 -7.105427e-15 -7.105427e-15  0.000000e+00
[33]  0.000000e+00 -7.105427e-15  0.000000e+00  0.000000e+00  0.000000e+00 -7.105427e-15  0.000000e+00  0.000000e+00
[41] -7.105427e-15 -7.105427e-15  0.000000e+00  0.000000e+00 -7.105427e-15  0.000000e+00

It is particularly dangerous to use such seq()-constructed lists (without
rounding) when e.g. trying to find the first time a given number appears in the
list and the number is given without numerical garbage.
#
On 4/18/06, Johannes.Prix at wu-wien.ac.at <Johannes.Prix at wu-wien.ac.at> wrote:
Hmmm... I guess I'm the first on this one.  It is not garbage, it is
just the nearest number the computer can calculate given your
instructions.  It has everything to do with numerical precision and the
representation of numbers in a computer, e.g. 0.1 in base 10 cannot be
represented exactly in binary format.  For example, how do you write
1/3 (in base 10) on a piece of paper without using a rational
representation?  In base 3 it is exactly 0.1!  Look in the r-help
archive and you'll find tons of questions like yours and even more
answers.
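
A minimal sketch of the representation point above, not taken from the original message: printing 0.1 with more digits than R shows by default exposes the stored binary approximation. The variable names and the choice of 17 digits are assumptions made for illustration.

```r
# 0.1 has no finite binary expansion, so the stored double is only close to 0.1.
x <- 0.1
print(x)                  # the default 7 significant digits hide the error
print(x, digits = 17)     # more digits expose the stored approximation

# Sums of such approximations need not match the "obvious" decimal result.
sum_is_exact <- (0.1 + 0.2) == 0.3
print(sum_is_exact)       # FALSE with IEEE 754 doubles

# Fractions of the form k / 2^n are stored exactly, so this comparison holds.
dyadic_is_exact <- (0.25 + 0.5) == 0.75
print(dyadic_is_exact)    # TRUE
```

The same effect is at work in seq(): each element is computed from values that are already approximations.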

For this reason, don't compare doubles using "==", that is very risky!
Instead check whether they are close enough, e.g. abs(x-y) < eps or
isTRUE(all.equal(x,y)) or similar.
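
A hedged sketch of the tolerance-based check, applied to the "find the first occurrence" case from the report. The sequence bounds, the target value and the tolerance eps are assumptions chosen for illustration; the original command in the report did not survive.

```r
# Hypothetical example values; the original command in the report was lost.
x <- seq(61.55, 62, by = 0.01)
target <- 61.56

# Exact matching may or may not find the target, depending on rounding error.
exact_hit <- which(x == target)

# Tolerance-based matching: accept anything within a small absolute distance.
eps <- 1e-8                                  # assumed tolerance, far below the 0.01 step
near_hit <- which(abs(x - target) < eps)[1]  # first position close to the target
print(near_hit)

# all.equal() returns TRUE or a description of the difference,
# so wrap it in isTRUE() when a plain logical is needed.
same <- isTRUE(all.equal(x[near_hit], target))
print(same)
```

The tolerance should be small relative to the step of the sequence, otherwise neighbouring elements would also match.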

/Henrik
--
Henrik Bengtsson
Mobile: +46 708 909208 (+2h UTC)
#
Another thing to notice (rather than, it would seem, assume) is that
using round() doesn't do any "better":

### without rounding
[1] 61.549999999999997 61.559999999999995 61.570000000000000 61.579999999999998
[5] 61.589999999999996

### with rounding
[1] 61.549999999999997 61.560000000000002 61.570000000000000 61.579999999999998
[5] 61.590000000000003

As Thomas and Henrik said, the sequence has to be calculated, and
such calculations are not, and cannot be, exact.
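
A sketch of how such a comparison might be run. The commands behind the two outputs above were lost, so the seq() arguments and the 17-digit print width here are assumptions, not a reconstruction of what was actually typed.

```r
# Assumed inputs; the original commands did not survive in the archive.
x <- seq(61.55, 61.59, by = 0.01)

print(x, digits = 17)             # without rounding
print(round(x, 2), digits = 17)   # with rounding: still not "clean" decimals

# Rounding only moves each value to a double near the two-decimal number,
# so it shifts x by a tiny amount rather than making it exact.
max_shift <- max(abs(round(x, 2) - x))
print(max_shift)
```

round() returns doubles as well, so a value such as 61.56 still cannot be stored exactly after rounding.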

-Don
At 6:19 PM +0200 4/18/06, Johannes.Prix at wu-wien.ac.at wrote:

  
    
#
This is related to this FAQ:

http://stat.cmu.edu/R/CRAN/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f

What you can do is create the sequence of integers from 6155 to 6200
rather than using floats and then divide by 100 in your subsequent
calculation.  Until the point you divide by 100, your numbers will be
exact.
[1] 6155 6156 6157 6158 6159 6160 6161 6162 6163 6164 6165 6166 6167 6168 6169
[16] 6170 6171 6172 6173 6174 6175 6176 6177 6178 6179 6180 6181 6182 6183 6184
[31] 6185 6186 6187 6188 6189 6190 6191 6192 6193 6194 6195 6196 6197 6198 6199
[46] 6200
[1] "integer"
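
A minimal sketch of the integer approach described above. The variable names and the lookup with match() are assumptions added for illustration; only the range 6155 to 6200 and the division by 100 come from the message.

```r
# Build the sequence in integers, where every value is exact.
i <- 6155:6200
print(class(i))          # "integer"

# Divide only when the decimal values are actually needed.
x <- i / 100

# Search and compare on the integer scale instead of on the doubles.
pos <- match(6156L, i)   # position of 61.56 in the sequence
print(pos)
print(x[pos])
```

Keeping the lookup on the integer vector avoids any tolerance choice, since integer comparison with "==" or match() is exact.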