Recent changes to as.complex(NA_real_)
Martin Maechler
on Thu, 28 Sep 2023 12:11:27 +0200 writes:
Gregory R Warnes
on Sat, 23 Sep 2023 13:22:35 -0400 writes:
> It sounds like we need to add arguments (with sensible
> defaults) to complex(), Re(), Im(), is.na.complex() etc to
> allow the user to specify the desired behavior.
I don't think I'd like such extra flexibility for all these, ... ;-) and even much less I'd like to be part of the group who then has to *maintain* such behavior ;-)
[..........]
Currently, I'm actually tending to *simplify* things
drastically, also because it means fewer surprises in the long
term and much less code reading / debugging when formatting,
printing and otherwise dealing with complex numbers.
NB: there *is* the re-opened PR#16752,
https://bugs.r-project.org/show_bug.cgi?id=16752
where the investigation of the (C-level) R source is a major reason
for my current thinking ..
What if we decided to really treat complex numbers much more
than currently as pairs of real (i.e. "double") numbers,
notably also when print()ing them?
Consequently, Re() and Im() would continue to return what they
do now (contrary to Hervé's original proposal) also in case of
non-finite numbers.
Of course, *no* change in arithmetic or other Ops (such as '==')
nor is.na(), is.finite(), is.nan(), etc.
The current formatting and printing of complex numbers is
complicated, in some cases unnecessarily inaccurate, and in other
cases unnecessarily *ugly*.
I believe that we should change formatting to basically format
the (vector of) real parts and imaginary parts separately.
E.g., it is really unnecessarily ugly to switch to exponential
format for both Re and Im, in a situation like this:
> (-1):2 + 1i*1e99
[1] 0e+00+1e+99i 0e+00+1e+99i 0e+00+1e+99i 0e+00+1e+99i
It is very ugly to use exponential/scientific format for the Re() even if we'd fix the confusing and inaccurate *joint* rounding of Re and Im.
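A minimal sketch of what "format the two parts separately" could look like at the R level. The helper name format_sep() is hypothetical (it is not part of base R, and not the C-level change under discussion); it only illustrates that format()ing Re and Im as two independent double vectors avoids both the joint rounding and the forced exponential notation for the real parts:

```r
# Hypothetical helper, for illustration only: format the real and the
# imaginary parts as two independent double vectors, then glue them.
format_sep <- function(z) {
  re  <- format(Re(z))                 # common format for the real parts only
  im  <- format(abs(Im(z)))            # common format for the imaginary parts only
  # explicit sign; NA/NaN imaginary parts get a "+" (a simplifying assumption)
  sgn <- ifelse(is.na(Im(z)) | Im(z) >= 0, "+", "-")
  paste0(re, sgn, im, "i")
}

z <- (-1):2 + 1i*1e99
noquote(format_sep(z))
```

With this, the real parts -1, 0, 1, 2 stay in fixed notation, and only the imaginary parts use 1e+99.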
and then, I end with
... and indeed (as discussed here previously): While it makes some sense to print NA identically for logical, integer and double, it often seems confusing *not* to show <Re> + <Im>i in the complex case, whereas that *does* happen for Inf and NaN:
> complex(, NA, ((-1):2))
[1] NA NA NA NA
> complex(, NaN, ((-1):2))
[1] NaN-1i NaN+0i NaN+1i NaN+2i
> complex(, c(-Inf,Inf), ((-1):2))
[1] -Inf-1i  Inf+0i -Inf+1i  Inf+2i
>
where the first of these *does* keep the finite imaginary values, but does not show them:
> (cN <- complex(, NA, ((-1):2))); rbind(Re(cN), Im(cN))
[1] NA NA NA NA
     [,1] [,2] [,3] [,4]
[1,]   NA   NA   NA   NA
[2,]   -1    0    1    2
where really, I think we should keep that behavior (*), at least
for now: Changing it as well *does* have a relatively large
impact, is not back-compatible with (the long history of) S and
R, *and* it complicates documentation and teaching unnecessarily.
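The condition from the footnote (is.na(.) true and is.nan(.) false) can be checked directly. This is only a small sketch; the variable names na_only and cNaN are introduced here for illustration:

```r
# NA real part, finite imaginary parts (same values as cN above)
cN <- complex(real = NA, imaginary = (-1):2)

# The footnote's condition for printing a plain 'NA'
na_only <- is.na(cN) & !is.nan(cN)
na_only

# The finite imaginary parts are still stored, just not printed
Im(cN)

# For contrast: a NaN real part does not meet the condition,
# which matches <Re> + <Im>i being shown in that case
cNaN <- complex(real = NaN, imaginary = (-1):2)
is.na(cNaN) & !is.nan(cNaN)
```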
Experts will know how to differentiate the different complex NAs,
e.g. by using simple utilities such as {"format complex", "print complex"}
fc <- function(z) paste0("(",Re(z), ",", Im(z),")")
pc <- function(z) noquote(fc(z))
which I've used now for testing/"visualizing" different scenarios
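For instance, applied to the vectors from the examples above (fc() and pc() are repeated so the snippet runs on its own; the exact output may depend on the R version, so none is shown):

```r
fc <- function(z) paste0("(", Re(z), ",", Im(z), ")")
pc <- function(z) noquote(fc(z))

cN <- complex(real = NA, imaginary = (-1):2)
pc(cN)                                          # shows the imaginary parts that print(cN) hides
pc(complex(real = NaN, imaginary = (-1):2))     # NaN real part, for comparison
```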
Martin
---
*) simply printing 'NA' in cases where is.na(.) is true and is.nan(.) is false