Benefit of treating NA and NaN differently for numerics

On 31-Dec-09 20:43:43, Saptarshi Guha wrote:
Because they are used to represent different things. Others will be
able to give you a much more comprehensive account than I can of
their uses in R, but essentially:

NaN represents a result which is not valid (i.e. "Not a Number")
in the domain of quantities being evaluated. For example, R does
its arithmetic by default in the domain of "double", i.e. the
machine representation of real numbers. In this domain, sqrt(-1)
does not exist -- it is not a number in the domain of real numbers.
Hence:

  sqrt(-1)
  # [1] NaN
  # Warning message:
  # In sqrt(-1) : NaNs produced

In order to obtain a result which does exist, you need to switch
domain to complex numbers:

  sqrt(as.complex(-1))
  # [1] 0+1i
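
As a further illustration, here is a minimal sketch of two other
operations that have no valid answer in double arithmetic, tested
with is.nan(). The variable names are made up for this example.

```r
# Operations with no valid answer in the real (double) domain give NaN.
indeterminate <- 0 / 0        # 0 divided by 0 has no defined value
difference    <- Inf - Inf    # neither does infinity minus infinity

is.nan(indeterminate)   # TRUE
is.nan(difference)      # TRUE

# By contrast, -1 / 0 is well defined in double arithmetic:
# it is -Inf, not NaN.
is.nan(-1 / 0)          # FALSE
```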

NA, on the other hand, represents a value (in whatever domain:
double, logical, character, ...) which is not known, which is why
it is typically used to represent missing data. It would be a valid
entity in the current domain if its value were known, but the value
is not known. Hence the result of an expression involving NA
quantities is, as a rule, NA: the value of the expression would
depend on the unknown elements, and so is itself unknown. (The
exceptions are cases such as NA & FALSE, where the result is the
same whatever the unknown value might be.)
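
A small sketch of how a missing value behaves in practice, and of
how is.na() and is.nan() tell the two cases apart. The vector
"heights" and its values are hypothetical, invented purely for
illustration.

```r
# Hypothetical measurements; the second one was never recorded.
heights <- c(1.62, NA, 1.80)

is.na(heights)               # FALSE  TRUE FALSE
is.nan(heights)              # FALSE FALSE FALSE: a missing value is not NaN

# The unknown element makes the overall mean unknown as well.
mean(heights)                # NA

# Dropping the missing value gives the mean of what was observed.
mean(heights, na.rm = TRUE)  # 1.71

# is.na() is also TRUE for NaN, so use is.nan() to tell the two apart.
is.na(0 / 0)                 # TRUE
is.nan(NA_real_)             # FALSE
```

Note that is.na() alone cannot separate the two, since it reports
TRUE for NaN as well; is.nan() is the test that distinguishes them.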

This distinction is important and useful, so it should not be done
away with by merging NaN and NA!

Best wishes,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 31-Dec-09                                       Time: 21:05:06
------------------------------ XFMail ------------------------------