bug when subtracting decimals?

16 messages · wolfgang.siewert, Henrique Dallazuanna, Dimitris Rizopoulos +10 more

wolfgang.siewert

Mon, Apr 20, 2009 6:07 AM #

Try this:

0.7-0.3==0.4
(We get FALSE)
0.7-0.3<0.4
(We get TRUE)

but
0.8-0.3==0.5
(TRUE)
0.8-0.3<0.5
(FALSE)

Funny, eh?

There is a way around: 
round(0.7-0.3,1)==0.4
(TRUE)

Obviously there is a problem with some combinations of decimal subtractions
that, we feel, should be solved.

Best regards
Sven & Wolfgang

View this message in context: http://www.nabble.com/bug-when-subtracting-decimals--tp23136337p23136337.html
Sent from the R help mailing list archive at Nabble.com.

Henrique Dallazuanna

Mon, Apr 20, 2009 6:31 AM #

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20090420/c9eb3412/attachment-0001.pl>

Dimitris Rizopoulos

Mon, Apr 20, 2009 6:34 AM #

this is a (very) Frequently Asked Question; check:

http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
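A minimal sketch of the point behind that FAQ entry, using the numbers from the original post. Printing 17 significant digits and wrapping all.equal() in isTRUE() are illustrative choices made here, not something quoted from the FAQ:

```r
# Show the doubles that are actually stored; 17 significant digits are
# enough to tell any two distinct doubles apart.
x <- 0.7 - 0.3
sprintf("%.17g", x)     # the difference as stored
sprintf("%.17g", 0.4)   # the literal 0.4 as stored

# Exact comparison fails because the two stored values differ slightly.
exact <- (x == 0.4)

# Comparison with a tolerance (all.equal's default is about 1.5e-8).
close_enough <- isTRUE(all.equal(x, 0.4))

print(c(exact = exact, close_enough = close_enough))
```

Neither 0.7, 0.3 nor 0.4 has an exact binary representation, so the result of the subtraction need not be the same double as the literal 0.4.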


Best,
Dimitris

wolfgang.siewert wrote:

Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014

Dieter Menne

Mon, Apr 20, 2009 10:01 AM #

wolfgang.siewert <wolfgang.siewert <at> gmail.com> writes:

Oh no, not that one again! This was lecture two in my first computer
course in 1968, but it seems to have gone the way of the dodo since then.

Dieter

Gavin Simpson

Mon, Apr 20, 2009 12:13 PM #

Dieter Menne wrote:

What makes you think that the average useR has had any exposure to 
computer science courses? ;-)

I bemoan the apparent inability of those asking such questions to use 
the resources provided to solve these problems for themselves...

Gavin

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Stephan Kolassa

Mon, Apr 20, 2009 12:32 PM #

Hi,

Gavin Simpson wrote:

Looking at all the people who quite obviously do NOT "read the posting 
guide and provide commented, minimal, self-contained, reproducible 
code", I wonder whether the mailing list could be configured to reply to 
each new mail (not replies in a thread) with an automated mail like this:

"Have you read the posting guide and the FAQs? If you do not get a reply 
within two days, you may want to look at both and think about 
reformulating your query. Oh, and while you are at it, look through the 
archives, a lot of questions have already been asked and answered before."

Just a thought,
Stephan

PIKAL Petr

Tue, Apr 21, 2009 12:48 AM #

Hi

r-help-bounces at r-project.org wrote on 20.04.2009 19:01:46:

Maybe that is because Excel is so widespread now and gives the expected
results (it probably silently rounds all decimal numbers before
calculation).
Regards
Petr

Dieter Menne

Tue, Apr 21, 2009 1:00 AM #

Petr PIKAL <petr.pikal <at> precheza.cz> writes:

Marc Schwartz already reminded me of that one, and it's a good point
to explicitly mention in lectures. 

I suggest extending R by introducing %==% as "being Excellently equal".
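Taken at face value, such an operator could be sketched as below. Only the name %==% comes from the suggestion; the body, built on all.equal() with its default tolerance, is an assumption about what "equal enough" might mean and is not part of R:

```r
# Hypothetical operator: "equal up to a tolerance", built on all.equal().
# The name %==% is from the suggestion above; the definition is assumed.
`%==%` <- function(a, b) isTRUE(all.equal(a, b))

# %...% operators bind tighter than binary minus, hence the parentheses.
(0.7 - 0.3) %==% 0.4   # TRUE
(0.7 - 0.3) %==% 0.5   # FALSE
```

As written it returns a single TRUE or FALSE for its two arguments as a whole, so it is not an element-wise replacement for ==.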

Dieter

PIKAL Petr

Tue, Apr 21, 2009 1:19 AM #

r-help-bounces at r-project.org wrote on 21.04.2009 10:00:06:

It helps but not in all cases

[1] FALSE

[1] TRUE

There could always be different issues with the inexact representation of
decimals. So an educated user or internal rounding could help, but I am not
sure the latter is desired.

Regards
Petr

Duncan Murdoch

Tue, Apr 21, 2009 3:55 AM #

On 21/04/2009 3:48 AM, Petr PIKAL wrote:

I don't have Excel, but I expect OpenOffice duplicates its bugs pretty 
well.  And in OpenOffice I see all sorts of bugs due to this, e.g. 
examples where x = y and y = z but x != z, cases where I can calculate a 
number like 1 + 4.e-15 and get something different from 1, but if I 
enter it directly as 1.000000000000004, it gets changed to 1.

So it only gives expected results in some tests, not others.
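One way results like "x = y and y = z but x != z" can arise is any comparison that silently applies a tolerance or rounding. The sketch below shows the effect in R with a hypothetical helper, near(); it is an assumption about the kind of mechanism involved, not a description of what OpenOffice or Excel actually do:

```r
# Hypothetical helper: treat two numbers as equal if they differ by
# less than an absolute tolerance.
near <- function(a, b, tol = 1e-9) abs(a - b) < tol

x <- 0
y <- 0.6e-9
z <- 1.2e-9

near(x, y)  # TRUE
near(y, z)  # TRUE
near(x, z)  # FALSE: "equality" with a tolerance is not transitive
```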

Duncan Murdoch

Marc Schwartz

Tue, Apr 21, 2009 6:06 AM #

On Apr 21, 2009, at 5:55 AM, Duncan Murdoch wrote:

As Dieter noted from our offlist exchange, this had been discussed  
previously back in 2003. Just to refresh memories:

   https://stat.ethz.ch/pipermail/r-help/2003-June/034565.html

   https://stat.ethz.ch/pipermail/r-help/2003-June/034860.html


OO.org has replicated Excel's behavior to a fault.  Thus:

   Spreadsheet Use -> Brain to Porridge


Just to update OO.org's behavior using version 3.0.1 on OSX:

   Formula: =4.145 * 100 + 0.5     Result: 415.00000000000000000000

   Formula: =0.5 - 0.4 - 0.1       Result: 0.00000000000000000000

   Formula: =(0.5 - 0.4 - 0.1)     Result: 0.00000000000000000000

So nothing has changed in OO.org in five years.  Somebody with Excel  
2007 might want to try the 2nd and 3rd formula examples to see if  
using parens still makes a difference in the result as compared to the  
formula without the parens.


FWIW, now that I am on OSX, I can add the following output using  
Numbers '09:

   Formula: =4.145 * 100 + 0.5     Result: 415.00000000000000000000

   Formula: =0.5 - 0.4 - 0.1       Result: -2.77556E-17

   Formula: =(0.5 - 0.4 - 0.1)     Result: -2.77556E-17


It does look like R's behavior has changed since then. Using:

   R version 2.9.0 Patched (2009-04-18 r48348)

on OSX:

# This first example has changed.
# Prior result was 414.99999999999994
 > print(4.145 * 100 + 0.5, digits = 20)
[1] 415

 > formatC(4.145 * 100 + 0.5, format = "E", digits = 20)
[1] "4.14999999999999943157E+02"

 > print(0.5 - 0.4 - 0.1, digits = 20)
[1] -2.77555756156289e-17

 > formatC(0.5 - 0.4 - 0.1, format = "E", digits = 20)
[1] "-2.77555756156289135106E-17"


What is interesting is that:

 > 4.145 * 100 + 0.5 == 415
[1] FALSE

 > (4.145 * 100 + 0.5) - 415
[1] -5.684342e-14

 > all.equal(4.145 * 100 + 0.5, 415, 0)
[1] "Mean relative difference: 1.369721e-16"


So it would appear that in the first R example above, the print()  
function has changed in a material fashion.
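For anyone wanting to repeat the comparison, here is a short sketch that separates the stored value from how it is displayed. The "%.17g" format is an illustrative choice, and the exact digits shown may differ by platform and R version:

```r
x <- 4.145 * 100 + 0.5

sprintf("%.17g", x)               # more digits than print() shows by default
x == 415                          # exact comparison of the stored doubles
all.equal(x, 415)                 # default tolerance (about 1.5e-8)
all.equal(x, 415, tolerance = 0)  # zero tolerance reports the difference
```

With the default tolerance all.equal() treats the two values as equal; only the zero-tolerance call and == expose the difference in the last bits.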

HTH,

Marc Schwartz

Luis Iván Ortiz Valencia

Tue, Apr 21, 2009 6:38 AM #

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20090421/594a7a15/attachment-0001.pl>

Hadley Wickham

Tue, Apr 21, 2009 9:25 AM #

As I say every time someone brings this up, there are currently ~130
printed pages of FAQs.  Reading all that seems a rather large burden
on the novice poster.

Hadley

http://had.co.nz/

Sarah Goslee

Tue, Apr 21, 2009 9:51 AM #

I'd be happy if everyone made a minimal effort to solve their problems
before posting. There may be 130 pages of printed FAQs, but the
html file is fully searchable. It would be easier to browse with a good ToC,
but it's not that overwhelming to _look to see if there's a FAQ related to
your problem_.

I think pointing people to Rseek is probably more useful than sending
them directly to the FAQs.

Sarah

On Tue, Apr 21, 2009 at 12:25 PM, hadley wickham <h.wickham at gmail.com> wrote:

Sarah Goslee
http://www.functionaldiversity.org

Gavin Simpson

Wed, Apr 22, 2009 1:49 AM #

On Tue, 2009-04-21 at 11:25 -0500, hadley wickham wrote:

There is a ToC and HTML pages are searchable.

Your point raises an important issue though; R is the product of
voluntary offers of time and expertise and despite the large array of
talented, clever individuals from a wealth of backgrounds contributing
untold improvements and extensions to R, we (the R community) haven't
really `solved` this documentation issue and the problem of getting
people to actually read the damned stuff before firing off an email
claiming x or wanting help with y, or developers to buy in and provide
information in new ways.

Either people don't have the time to implement a better system or we
haven't come up with a better system? Maybe this is just human nature
and we have to live with it?

I hope Stephan's suggestion was in jest - flooding the world with yet
more email traffic seems counter-productive.

Instead of the list going on about these issues every couple of months
or so, is there any scope for interested parties getting together and
coming up with suggestions for how to improve/change things, *and*, more
importantly, offers to help get it implemented? I know there is ongoing
discussion and work to improve the design of the R website and
CRAN. Now might be a good time to try and effect improvements that will
help the community. How do we raise the profile of resources already out
there - several people feel the Task Views aren't as well advertised as
they could be for example? Do we really need a new 'type' of
documentation or resource? How do we improve what we already have?

G

%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


Martin Maechler

Wed, Apr 22, 2009 2:49 AM #

    MS> So it would appear that in the first R example above, the print()  
    MS> function has changed in a material fashion.

Yes  ((though not with *my* vote...)).
However, be aware that such calculations *are* platform
dependent, and IIUC, you are now using OS X whereas you've used
another platform previously, so some of the differences you see
may not be from changes in R, but from changes in the platform
you use.

Back to the topic of print():
Actually, format(<numeric>) has also changed similarly, to my chagrin.
In older versions of R, you could ask it to give "too many" digits,
but now it gives "too few" even for maximal 'digits'.
{There is a good reason - which I don't recall - for the new behavior}

With as.character() it was worse (in older R versions): it sometimes gave
too few digits, sometimes too many, whereas now it
at least consistently gives "too few".
But the effect is that in  ch <- as.character(x) ,
ch may contain duplicated entries even for unique x,
e.g., for x <- c(1, 1 + 4e-16)

BTW, one alternative to {"my"} formatC() is sprintf(),
and if you are really interested: the latest changes (in 2.10.0 R-devel)
ensuring unique factor levels now actually make use of
	 sprintf("%.17g", .)
instead of as.character(.), precisely to ensure that
different numbers necessarily map to different strings.
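A small sketch of that difference, using the x from the example above. What as.character() returns depends on the R version, so no output is claimed for it here:

```r
x <- c(1, 1 + 4e-16)      # two distinct doubles

x[1] == x[2]              # FALSE: the stored values differ
as.character(x)           # may give the same string twice
s <- sprintf("%.17g", x)  # 17 significant digits
s
anyDuplicated(s)          # 0: the two strings are different
```

Seventeen significant digits are enough to give every distinct double a distinct decimal string, which is why "%.17g" suits a use such as unique factor levels.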

BTW, we are way off topic for R-help, being in R-devel realm,
but as this thread has started here, we may keep it...

Martin Maechler, ETH Zurich
