Skip to content

NA_real_ <op> NaN -> NA or NaN, should we care?

5 messages · Martin Maechler, Duncan Murdoch, William Dunlap

#
> On Linux when I compile R 2.10.0(devel) (src/main/arithmetic.c in
    > particular)
    > with gcc 3.4.5 using the flags -g -O2 I get noncommutative behavior when

is this really gcc 3.4.5  (which is quite old) ?

Without being an expert, I'd tend to claim this to be a
compiler (optimization) bug ....  but most probably the ANSI /
ISO  C (and libc ?) standards would not define the exact
behavior of arithmetic with NaNs.

    > adding NA and NaN:
    >> NA_real_ + NaN
    > [1] NaN
    >> NaN + NA_real_
    > [1] NA
    > If I compile src/main/arithmetic.c without optimization (just -g)
    > then both of those return NA.

    > On Windows, using a precompiled R 2.8.1 from CRAN I get
    > NA for both answers.

    > On Linux, after compiling src/main/arithmetic.c with -g -O2 the bit
    > patterns for NA_real_ and as.numeric(NA) are different:
    >> my_numeric_NA <- as.numeric(NA)
    >> writeBin(my_numeric_NA, ptmp<-pipe("od -x", open="wb"));close(ptmp)
    > 0000000 07a2 0000 0000 7ff8
    > 0000010
    >> writeBin(NA_real_, ptmp<-pipe("od -x", open="wb"));close(ptmp)
    > 0000000 07a2 0000 0000 7ff0
    > 0000010 
    > On Linux, after compiling with -g the bit patterns for NA_real_
    > and as.numeric(NA) are identical.
    >> my_numeric_NA <- as.numeric(NA)
    >> writeBin(my_numeric_NA, ptmp<-pipe("od -x", open="wb"));close(ptmp)
    > 0000000 07a2 0000 0000 7ff8
    > 0000010
    >> writeBin(NA_real_, ptmp<-pipe("od -x", open="wb"));close(ptmp)
    > 0000000 07a2 0000 0000 7ff8
    > 0000010

    > On Windows, using precompiled R 2.8.1 and cygwin/bin/od, both of those
    > gave the 7ff8 version.

    > Is this confounding of NA and NaN of concern or does R not promise to
    > keep NA and NaN distinct? 

Hmm, I'd say it *is* of some concern that "+" is not commutative
in the narrow sense, even if I don't know what exactly "R promises".

    > I haven't followed all the macros, but it looks like arithmetic.c just
    > does
    > result[i]=x[i]+y[i]
    > and lets the compiler/floating point unit decide what to do when x[i]
    > and y[i]
    > are different NaN values (NA is a NaN value).  I haven't looked at the C
    > code
    > for the initialization of NA_real_.  Adding explicit tests for NA-ness
    > in the
    > binary operators (as S+ does) adds a fairly significant cost.

Yes, I would be quite reluctant to add such
tests, because such costs are to be expected.

Maybe we ("R" :-) should explicitly state that operations mixing
NA & NaN give a result which is NA in the sense of fulfilling is.na(.) 
but *not* promise anything further.

Martin Maechler, ETH Zurich

    > Bill Dunlap
    > TIBCO Software Inc - Spotfire Division
    > wdunlap tibco.com
#
>> On Linux when I compile R 2.10.0(devel) (src/main/arithmetic.c in
    >> particular)
    >> with gcc 3.4.5 using the flags -g -O2 I get noncommutative behavior when

    MM> is this really gcc 3.4.5  (which is quite old) ?

    MM> Without being an expert, I'd tend to claim this to be a
    MM> compiler (optimization) bug ....  but most probably the ANSI /
    MM> ISO  C (and libc ?) standards would not define the exact
    MM> behavior of arithmetic with NaNs.

    >> adding NA and NaN:
    >>> NA_real_ + NaN
    >> [1] NaN
    >>> NaN + NA_real_
    >> [1] NA
    >> If I compile src/main/arithmetic.c without optimization (just -g)
    >> then both of those return NA.

    >> On Windows, using a precompiled R 2.8.1 from CRAN I get
    >> NA for both answers.

    >> On Linux, after compiling src/main/arithmetic.c with -g -O2 the bit
    >> patterns for NA_real_ and as.numeric(NA) are different:
    >>> my_numeric_NA <- as.numeric(NA)
    >>> writeBin(my_numeric_NA, ptmp<-pipe("od -x", open="wb"));close(ptmp)
    >> 0000000 07a2 0000 0000 7ff8
    >> 0000010
    >>> writeBin(NA_real_, ptmp<-pipe("od -x", open="wb"));close(ptmp)
    >> 0000000 07a2 0000 0000 7ff0
    >> 0000010 
    >> On Linux, after compiling with -g the bit patterns for NA_real_
    >> and as.numeric(NA) are identical.
    >>> my_numeric_NA <- as.numeric(NA)
    >>> writeBin(my_numeric_NA, ptmp<-pipe("od -x", open="wb"));close(ptmp)
    >> 0000000 07a2 0000 0000 7ff8
    >> 0000010
    >>> writeBin(NA_real_, ptmp<-pipe("od -x", open="wb"));close(ptmp)
    >> 0000000 07a2 0000 0000 7ff8
    >> 0000010

    >> On Windows, using precompiled R 2.8.1 and cygwin/bin/od, both of those
    >> gave the 7ff8 version.


I've run a couple of 

     echo 'c(NA+NaN, NaN+NA)' | R --slave --vanilla

on 3 different platforms, all Linux, and many installed versions
of R -- all of which compiled with defaults, i.e. '-O2', 
        and depending on its release date, compiled with
	different versions of gcc {I can still check these
	because I run R from the installation directory, with
	all configure results ..}
on a given platform, and have observed the following:

The result is always "NA NA" (as desired!)  iff the platform is 32-bit,

whereas it is        "NaN NA" for the 64-bit platform and older
	   	     	      versions of R (R <= 2.3.1)

	   	 and "NA NaN" for versions of R (>= 2.4.1)

{{but note: the versions of R are confounded with the versions of gcc ...}}

Further, as the 64-bit platform can also use the 32-bit versions of R,
I've tested a few and found that indeed, the 32-bit version of R
would return "NA NA" also on the 64-bit platform.

Martin


    >> Is this confounding of NA and NaN of concern or does R not promise to
    >> keep NA and NaN distinct? 

    MM> Hmm, I'd say it *is* of some concern that "+" is not commutative
    MM> in the narrow sense, even if I don't know what exactly "R promises".

    >> I haven't followed all the macros, but it looks like arithmetic.c just
    >> does
    >> result[i]=x[i]+y[i]
    >> and lets the compiler/floating point unit decide what to do when x[i]
    >> and y[i]
    >> are different NaN values (NA is a NaN value).  I haven't looked at the C
    >> code
    >> for the initialization of NA_real_.  Adding explicit tests for NA-ness
    >> in the
    >> binary operators (as S+ does) adds a fairly significant cost.

    MM> Yes, I would be quite reluctant to add such
    MM> tests, because such costs are to be expected.

    MM> Maybe we ("R" :-) should explicitly state that operations mixing
    MM> NA & NaN give a result which is NA in the sense of fulfilling is.na(.) 
    MM> but *not* promise anything further.

    MM> Martin Maechler, ETH Zurich

    >> Bill Dunlap
    >> TIBCO Software Inc - Spotfire Division
    >> wdunlap tibco.com

    MM> ______________________________________________
    MM> R-devel at r-project.org mailing list
    MM> https://stat.ethz.ch/mailman/listinfo/r-devel
#
On 5/1/2009 8:14 AM, Martin Maechler wrote:
Here are a few bits of information, not particularly well organized:

I don't know if the IEEE 754 (or the ISO equivalent) specifies the 
behaviour, but the hardware handling of NaNs is inconsistent on Intel 
chips.  If two NaNs are involved in an operation, it returns the one 
with the larger significand when operations are done in the FPU, it 
returns the first one when they are done in the SSE.

Our NA has a bigger significand than the default NaN, but there are NaNs 
that are bigger.  (We're unlikely to see those in the results of 
arithmetic under our control, but there's nothing to stop external code 
from returning one, or getting one via readBin from a file.)

The type can be either quiet or signalling; our NA is a signalling NaN, 
and the one Bill was getting from the as.numeric conversion is a quiet 
NaN.  However, our IsNA test doesn't look at the bits determining that 
status, it only looks at the 07a2 0000 part, so both are displayed as 
NA.  I haven't traced the coercion code to know why they aren't 
identical, but I think it doesn't matter:  the FPU and SSE methods are 
inconsistent for most pairs of operands.

All of which leads me to conclude that we should either say we don't 
guarantee what happens when you mix NA with NaN, or we should add an 
explicit test to every operation.  I'd prefer the "no guarantees" solution.

Duncan Murdoch
#
Yes, it was 3.4.5, but here is a self-contained example of the same
issue using gcc 4.1.3 on an Ubuntu Linux machine:

% gcc -O2 t.c -o a.out ; ./a.out
NA : 7ff00000000007a2
NaN: fff8000000000000
NA+NaN: 7ff80000000007a2
NaN+NA: fff8000000000000
% gcc  t.c -o a.out ; ./a.out
NA : 7ff00000000007a2
NaN: fff8000000000000
NA+NaN: 7ff80000000007a2
NaN+NA: 7ff80000000007a2
% gcc -v
Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --with-gxx-include-dir=/usr/include/c++/4.1.3 --program-suffix=-4.1 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --enable-mpfr --enable-checking=release i486-linux-gnu
Thread model: posix
gcc version 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)
% cat t.c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(int argc, char *argv[])
{
    int64_t NA_int64 = 0x7ff00000000007a2LL ;
    int64_t NaN_int64 = 0xfff8000000000000LL ;
    int64_t sum_int64 ;
    double NA_double, NaN_double, sum_double ;

    memcpy((void*)&NA_double, (void*)&NA_int64, 8) ;
    memcpy((void*)&NaN_double, (void*)&NaN_int64, 8) ;

    NaN_double = 1/0.0 - 1/0.0 ;

    printf("NA : %Lx\n", *(int64_t*)&NA_double);
    printf("NaN: %Lx\n", *(int64_t*)&NaN_double);
    sum_double = NA_double + NaN_double ;
    memcpy((void*)&sum_int64, (void*)&sum_double, 8) ;
    printf("NA+NaN: %Lx\n", sum_int64) ;
    sum_double = NaN_double + NA_double ;
    memcpy((void*)&sum_int64, (void*)&sum_double, 8) ;
    printf("NaN+NA: %Lx\n", sum_int64);
    return 0 ;
}

When I add -Wall to the -O2 then it gives me some warnings about the
*(int64_t)&doubleVal in the printf statements for the inputs, but I used
memcpy() to avoid the warnings when printing the outputs.

% gcc -Wall -O2 t.c -o a.out ; ./a.out
t.c: In function ?main?:
t.c:17: warning: dereferencing type-punned pointer will break strict-aliasing rules
t.c:18: warning: dereferencing type-punned pointer will break strict-aliasing rules
NA : 7ff00000000007a2
NaN: fff8000000000000
NA+NaN: 7ff80000000007a2
NaN+NA: fff8000000000000

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com
#
The optimizer in gcc 4.2 (but not 4.1) on Linux seems to get this
issue "right" on one of my test machines (a 32-bit Ubuntu box).
4.1 fails on a 64-bit Linux but I don't have a newer gcc on any
64-bit Linux machine I can find.

A cleaner version of the test is:
% cat t.c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

void print_double(char *text, double x_double)
{
    int64_t x_int64 ;
    memcpy((void*)&x_int64, (void*)&x_double, 8) ;
    printf("%s: %Lx (%g)\n", text, x_int64, x_double) ;
}
int main(int argc, char *argv[])
{
    int64_t NA_int64 = 0x7ff00000000007a2LL ;
    int64_t NaN_int64 = 0xfff8000000000000LL ;
    double NA_double, NaN_double, sum_double ;

    memcpy((void*)&NA_double, (void*)&NA_int64, 8) ;
    memcpy((void*)&NaN_double, (void*)&NaN_int64, 8) ;

    print_double(" NA", NA_double);
    print_double("NaN", NaN_double);
    sum_double = NA_double + NaN_double ;
    print_double("NA+NaN", sum_double) ;
    sum_double = NaN_double + NA_double ;
    print_double("NaN+NA", sum_double);
    return 0 ;
}

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com