NA in C/C++ - R-help | R Mailing Lists

Sun, May 28, 2000 8:43 AM #

On Sun, 28 May 2000 cstrato at EUnet.at wrote:

It's in Writing R Extensions (sections 3.7.3 and 4.4 in the copy I have to
hand, but it's in the concept index). You cannot assume in R that NA is
represented by an NaN, although on most machines it is.  Conversely, most
NaNs are not NA.

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

cstrato@EUnet.at

Sun, May 28, 2000 9:19 AM #

Dear R-people

Sorry for asking a question only indirectly related to S/R, but since
data containing NA values can so easily be handled in S/R, and you can
write functions for S/R in C, my question is:
How do you handle data containing NA in C/C++?

Although I know that IEEE floating-point arithmetic supports NaN and
Inf, I cannot find any information about this (e.g. in any of my many
C++ books).

Thank you in advance for your help
Christian Stratowa, Vienna



Peter Dalgaard

Sun, May 28, 2000 10:30 AM #

Prof Brian D Ripley <ripley at stats.ox.ac.uk> writes:

...

Perhaps it is necessary to be a little more specific here: The IEEE
NaN is not a single value, but a set of values characterized by having
an all-ones exponent and a non-zero significand (the cases with a zero
significand are +Inf and -Inf). Have a look at

http://www.linuxsupportline.com/~billm/index.html

for the details.

The double NA in R on IEEE-supporting systems is the NaN with
significand 1954 (no, I don't know who was born that year...). 

However integers have no definition of NaN, so NaInt is INT_MIN and
for systems that don't support IEEE, we have some special hacks too.
Have a look in src/include/R_ext/Arith.h and src/main/arithmetic.c.
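
To make the bit layout described above concrete, here is a minimal C sketch.
It assumes 64-bit IEEE 754 doubles. All helper names (double_bits,
double_from_bits, is_ieee_nan, make_na_like, is_na_like, NA_INT_LIKE,
is_na_int_like) are hypothetical and are not taken from Arith.h or
arithmetic.c; the sketch only illustrates the idea of "a NaN with
significand 1954" and "INT_MIN as the integer NA", not R's actual code.

```c
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers; assume 64-bit IEEE 754 doubles. */

/* Copy the raw bits of a double into an integer (memcpy avoids aliasing issues). */
static uint64_t double_bits(double x)
{
    uint64_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Build a double from a raw 64-bit pattern. */
static double double_from_bits(uint64_t u)
{
    double x;
    memcpy(&x, &u, sizeof x);
    return x;
}

/* NaN: exponent field all ones AND non-zero significand.
   A zero significand with that exponent is +Inf or -Inf. */
static int is_ieee_nan(double x)
{
    uint64_t u = double_bits(x);
    uint64_t exponent = (u >> 52) & 0x7FF;
    uint64_t significand = u & 0xFFFFFFFFFFFFFULL; /* low 52 bits */
    return exponent == 0x7FF && significand != 0;
}

/* A NaN whose significand is 1954, the value mentioned above. */
static double make_na_like(void)
{
    return double_from_bits((0x7FFULL << 52) | 1954u);
}

/* This sketch's choice: recognise the marker by the low 32 bits only. */
static int is_na_like(double x)
{
    return is_ieee_nan(x) && (double_bits(x) & 0xFFFFFFFFu) == 1954u;
}

/* Integers have no NaN, so one ordinary value is reserved as the marker. */
#define NA_INT_LIKE INT_MIN

static int is_na_int_like(int i)
{
    return i == NA_INT_LIKE;
}
```

With these definitions, is_ieee_nan() is true for every NaN bit pattern,
while is_na_like() is true only for the one carrying 1954, which is the
sense in which "most NaNs are not NA".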

O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907

cstrato@EUnet.at

Sun, May 28, 2000 11:12 AM #

Dear Prof Ripley

Thank you very much for your fast response. I did not realize that CRAN
now has official documentation.

Section 3.7.3 mentions macros in "Arith.h" for handling NAs.
I assume that these macros can also be used in normal C programs.

not able to deal with NAs?

Best regards
Christian Stratowa



cstrato@EUnet.at

Sun, May 28, 2000 12:19 PM #

Dear Dr. Dalgaard

Thank you, too, for your fast response.

I have checked the web-site you mentioned. There is a function:
    int isnan(floating-type x)
for floating point numbers, which I could use.
For integers I will check Arith.h and arithmetic.c.

Best regards
Christian Stratowa



Brian Ripley

Sun, May 28, 2000 12:48 PM #

On Sun, 28 May 2000 cstrato at EUnet.at wrote:

Unfortunately, we found isnan is not 100% reliable.  R defines

int R_IsNaNorNA(double);

for that purpose, and makes sure it works on all the platforms.
(Note that some platforms that have isnan are marked as non-IEEE
by the configure process.)

My memory is that the real mess came with finite() and Inf/-Inf (at least
one compiler had Inf < 3 true), but we also had problems with isnan
returning a value which differed from true (as in 1 == 1).

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Murray Jorgensen

Sun, May 28, 2000 10:04 PM #

I'm looking for a way to do something like Kernel Density Estimation in R.
I have some very big sets of things like packet interarrival times and
would like to make plots of the estimated density function.

Can anyone help me?


Murray Jorgensen,  Department of Statistics,  U of Waikato, Hamilton, NZ
-----[+64-7-838-4773]---------------------------[maj at waikato.ac.nz]-----
"Doubt everything or believe everything:these are two equally convenient
strategies. With either we dispense with the need to think."
http://www.stats.waikato.ac.nz/Staff/maj.html          - Henri Poincare'


Brian Ripley

Sun, May 28, 2000 11:05 PM #

On Mon, 29 May 2000, Murray Jorgensen wrote:

Well, R itself has density, and the MASS library has bandwidth selection
techniques for it.  That should work well for large datasets (up to the
limits of R, anyway).   A similar and perhaps even more capable
approach is binning as taken by the library KernSmooth, an R version of
which is packaged on CRAN.  logspline and locfit (also on CRAN) have
more sophisticated approaches to density estimation.

There are comparisons and details of what is available in V&R3 and in
particular in our on-line statistics complements.

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


John Chambers

Mon, May 29, 2000 7:46 AM #

A few general comments.

The IEEE 754 floating-point standard is one of the more striking
successes in getting computer hardware to be more useful for those who
program.  There are, of course, glitches, both in non-compliance and
in holes in the standard, but if we can work within the standard as
far as possible, while complaining about the glitches, we'll be better
off, and C/C++ software produced will be more likely to port
gracefully to other environments.

From that view, using C routines that are part of the standard (while
perhaps overriding them on machines that don't conform) has advantages
inside your own C/C++ code, IF that code is not intrinsically R/S
dependent.  So isnan() would be better in that case, R_IsNaNorNA()
better for code that is R-dependent.

Where it makes sense, there is also an advantage to doing the relevant
testing in the S language and passing the result to the C code, either
directly, say as a logical vector argument, or indirectly by doing the
selection outside and leaving the C code to just grind away on the
selected subset of the data.  Within the S language, is.na() is the
best test, because it deals with either floating point or integer
data.
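
The "test in S, pass the result to C" approach can be sketched as follows.
This is only an illustration under stated assumptions: masked_sum is a
hypothetical routine written in the all-pointer style that R's .C
interface expects, and the mask is assumed to have been computed on the
S/R side with is.na() and to arrive in C as an int array (0 = present,
non-zero = missing). The C code itself never has to know how NA is encoded.

```c
/* Hypothetical routine in the style of the .C interface:
   every argument is a pointer, results are written in place. */
void masked_sum(const double *x, const int *is_na, const int *n, double *result)
{
    double total = 0.0;

    for (int i = 0; i < *n; i++) {
        /* Skip entries the caller has flagged as missing. */
        if (!is_na[i])
            total += x[i];
    }
    *result = total;
}
```

From R, a call along the lines of
.C("masked_sum", as.double(x), as.integer(is.na(x)), as.integer(length(x)), result = double(1))
would supply the mask. The alternative mentioned above, selecting
x[!is.na(x)] before the call, removes the need for the mask argument
altogether.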

Anyone interested in the relevance of the standard, or just a read
through some insightful if eccentric ranting about numerical
computation generally should eventually encounter W. Kahan, "the
father of IEEE 754".

There is a directory on the web at the Berkeley CS department:
  www.cs.berkeley.edu/%7Ewkahan/ieee754status/
All the papers in that directory are worth looking at, allowing
for Kahan's legendary rages at all those who failed his standards.
Having had the privilege (well, looking back on it anyway) of taking a
course from Kahan, I can verify that his personality comes across well
in the papers.

John M. Chambers                  jmc at bell-labs.com
Bell Labs, Lucent Technologies    office: (908)582-2681
700 Mountain Avenue, Room 2C-282  fax:    (908)582-3340
Murray Hill, NJ  07974            web: http://www.cs.bell-labs.com/~jmc

Brian Ripley

Mon, May 29, 2000 8:26 AM #

On Mon, 29 May 2000, John Chambers wrote:

Since John may not have read *all* the R code yet, that is precisely what
R_IsNaNorNA is: a wrapper for isnan on machines which have a working
isnan, and something else otherwise. Namely:

#ifdef IEEE_754

int R_IsNaNorNA(double x)
{
/* NOTE: some systems do not return 1 for TRUE. */
    return (isnan(x) != 0);
}
#else
....

where IEEE_754 is only set after testing (somewhat) functionality.

For finite, which is buggier, the 1997 draft revision to the
ANSI C standard promised isfinite and defined it tightly.  It's just that
neither that revision nor isfinite seems to be making any progress.
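
The same wrapping idea can be applied to the finiteness test. The sketch
below is an assumption-laden illustration, not R's code: my_is_finite is
a hypothetical name, the isfinite macro is the one C99 defines in
<math.h>, and the fallback comparison is only valid on a platform whose
comparisons follow IEEE 754 (which, as the Inf < 3 anecdote shows, cannot
be taken for granted).

```c
#include <float.h>
#include <math.h>

/* Hypothetical wrapper: always returns exactly 0 or 1. */
static int my_is_finite(double x)
{
#ifdef isfinite
    /* C99 macro; normalise the result in case it is non-zero but not 1. */
    return isfinite(x) != 0;
#else
    /* Fallback, assuming IEEE 754 comparisons: NaN fails both tests,
       and +Inf / -Inf each fall outside the range of finite doubles. */
    return x >= -DBL_MAX && x <= DBL_MAX;
#endif
}
```

As with R_IsNaNorNA, callers then depend on one function whose behaviour
can be checked per platform, rather than on whatever the local library
provides.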

One thing the R project keeps on teaching me is the importance of
pragmatism here: a large proportion of the bug-fixing time is actually
bug-avoidance over all of an increasing range of machines.  Even so, I was
unprepared for the truth of Inf < 3 on Visual C++ ! I would prefer to be
pragmatic with correct answers than purist with wrong ones.

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


cstrato@EUnet.at

Mon, May 29, 2000 1:31 PM #

Dear experts

Thank you all for this interesting information. I have learned a lot and
hopefully can use it in C++.

A personal note: I liked especially the document:
http://www.cs.berkeley.edu/%7Ewkahan/ieee754status/754story.html
since it mentions SANE as one of the few environments supporting
the IEEE standard. The phonebook edition of "Inside Macintosh" was
my only source until now. Luckily I got from you now a lot of information
on this issue.

Best regards
Christian Stratowa, Vienna

