Revisiting this thread from April:
https://stat.ethz.ch/pipermail/r-devel/2023-April/082545.html
where the decision (not yet backported) was made for as.complex(NA_real_)
to give NA_complex_ instead of complex(r=NA_real_, i=0), to be consistent
with help("as.complex") and as.complex(NA) and as.complex(NA_integer_).
Was any consideration given to the alternative? That is, to changing
as.complex(NA) and as.complex(NA_integer_) to give complex(r=NA_real_, i=0),
consistent with as.complex(NA_real_), then amending help("as.complex")
accordingly?
The principle that Im(as.complex(<real=(double|integer|logical)>)) should
be zero is quite fundamental, in my view, hence the "new" behaviour seems
to really violate the principle of least surprise ...
Another (but maybe weaker) argument is that double->complex coercions happen
more often than logical->complex and integer->complex ones. Changing the
behaviour of the more frequently performed coercion is more likely to affect
code "out there".
Yet another argument is that one expects
identical(as.complex(NA_real_), NA_real_ + (0+0i))
to be TRUE, i.e., that coercing from double to complex is equivalent to
adding a complex zero. The new behaviour makes the above FALSE, since
NA_real_ + (0+0i) gives complex(r=NA_real_, i=0).
Having said that, one might also (but more naively) expect
identical(as.complex(as.double(NA_complex_)), NA_complex_)
to be TRUE. Under my proposal it continues to be FALSE. Well, I'd prefer
if it gave FALSE with a warning "imaginary parts discarded in coercion",
but it seems that as.double(complex(r=a, i=b)) never warns when either of
'a' and 'b' is NA_real_ or NaN, even where "information" {nonzero 'b'} is
clearly lost ...
Whatever decision is made about as.complex(NA_real_), maybe these points
should be weighed before it becomes part of R-release ...
Mikael
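The principle stated in this message can be probed on whatever R build is at hand. The following is only a sketch and not part of the original post; `probe` is a hypothetical helper name. No output is shown for the NA inputs because, as the message explains, the answer differs between R-release and R-devel.

```r
# Hypothetical helper (not from the thread): report the real and
# imaginary parts that as.complex() produces for a given input.
probe <- function(x) {
  z <- as.complex(x)
  c(Re = Re(z), Im = Im(z))
}

# One NA of each "real" type discussed above.
inputs <- list(logical = NA, integer = NA_integer_, double = NA_real_)

# A matrix with one column per input type; rows hold Re and Im.
res <- vapply(inputs, probe, numeric(2))
print(res)

# For non-missing input the imaginary part is zero.
stopifnot(Im(as.complex(c(TRUE, FALSE))) == 0,
          Im(as.complex(1:3)) == 0,
          Im(as.complex(c(-1.5, 2))) == 0)
```

If the Im row of `res` is zero in every column, the build follows the proposal; if it is NA in every column, it follows the R-devel behaviour described above.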
Recent changes to as.complex(NA_real_)
16 messages · Duncan Murdoch, Spencer Graves, Hervé Pagès +3 more
1 day later
Mikael Jagan
on Thu, 21 Sep 2023 00:47:39 -0400 writes:
> Revisiting this thread from April:
> https://stat.ethz.ch/pipermail/r-devel/2023-April/082545.html
> where the decision (not yet backported) was made for
> as.complex(NA_real_) to give NA_complex_ instead of
> complex(r=NA_real_, i=0), to be consistent with
> help("as.complex") and as.complex(NA) and as.complex(NA_integer_).
> Was any consideration given to the alternative?
> That is, to changing as.complex(NA) and as.complex(NA_integer_) to
> give complex(r=NA_real_, i=0), consistent with
> as.complex(NA_real_), then amending help("as.complex")
> accordingly?
Hmm, as I was the one from R-core mostly involved, I have to admit: "no",
to my knowledge the (above) alternative wasn't considered.
> The principle that
> Im(as.complex(<real=(double|integer|logical)>)) should be zero
> is quite fundamental, in my view, hence the "new" behaviour
> seems to really violate the principle of least surprise ...
Of course "least surprise" is somewhat subjective. Still,
I clearly agree that the above would be one desirable property.
I think that any solution will lead to *some* surprise for some
cases, I think primarily because there are *many* different
values z for which is.na(z) is true, and in any case
NA_complex_ is only one of the many.
I also agree with Mikael that we should reconsider the issue
that was raised by Davis Vaughan here ("on R-devel") last April.
> Another (but maybe weaker) argument is that
> double->complex coercions happen more often than
> logical->complex and integer->complex ones. Changing the
> behaviour of the more frequently performed coercion is
> more likely to affect code "out there".
> Yet another argument is that one expects
> identical(as.complex(NA_real_), NA_real_ + (0+0i))
> to be TRUE, i.e., that coercing from double to complex is
> equivalent to adding a complex zero. The new behaviour
> makes the above FALSE, since NA_real_ + (0+0i) gives
> complex(r=NA_real_, i=0).
No! --- To my own surprise (!) --- in current R-devel the above is TRUE,
and
NA_real_ + (0+0i) , the same as
NA_real_ + 0i , really gives complex(r=NA, i=NA) :
Using showC() from ?complex
showC <- function(z) noquote(sprintf("(R = %g, I = %g)", Re(z), Im(z)))
we see (in R-devel) quite consistently
showC(NA_real_ + 0i)
[1] (R = NA, I = NA)
showC(NA + 0i) # NA is 'logical'
[1] (R = NA, I = NA)
whereas in R 4.3.1 and "R-patched" -- *in*consistently
showC(NA_real_ + 0i)
[1] (R = NA, I = 0)
showC(NA + 0i)
[1] (R = NA, I = NA)
.... and honestly, I do not see *where* (and when) we changed
the underlying code (in arithmetic.c !?) in R-devel to *also*
produce NA_complex_ in such complex *arithmetic*
> Having said that, one might also (but more naively) expect
> identical(as.complex(as.double(NA_complex_)), NA_complex_)
> to be TRUE.
as in current R-devel
> Under my proposal it continues to be FALSE.
as in "R-release"
> Well, I'd prefer if it gave FALSE with a warning
> "imaginary parts discarded in coercion", but it seems that
> as.double(complex(r=a, i=b)) never warns when either of
> 'a' and 'b' is NA_real_ or NaN, even where "information"
> {nonzero 'b'} is clearly lost ...
The question of *warning* here is related indeed, but I think
we should try to look at it only *secondary* to your first
proposal.
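Whether as.double() warns for a given complex value can be checked directly. The sketch below is an illustration only; `warns` is a hypothetical helper, not something from the thread. The two finite cases are asserted, while the cases with a missing part are only printed, since they are the ones under discussion.

```r
# Hypothetical helper: TRUE if evaluating `expr` signals a warning.
warns <- function(expr) {
  tryCatch({ expr; FALSE }, warning = function(w) TRUE)
}

# Finite case: a nonzero imaginary part is discarded with a warning,
# a zero imaginary part is dropped silently.
stopifnot(warns(as.double(1+2i)), !warns(as.double(1+0i)))

# Cases from the message: one part missing, the other nonzero.
cases <- list(
  re_na  = complex(real = NA_real_, imaginary = 5),
  im_na  = complex(real = 2, imaginary = NA_real_),
  im_nan = complex(real = 2, imaginary = NaN)
)
for (nm in names(cases)) {
  cat(nm, ": warns =", warns(as.double(cases[[nm]])), "\n")
}
```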
> Whatever decision is made about as.complex(NA_real_),
> maybe these points should be weighed before it becomes part of
> R-release ...
> Mikael
Indeed.
Can we please get other opinions / ideas here?
Thank you in advance for your thoughts!
Martin
---
PS:
Our *print()*ing of complex NA's ("NA" here meaning NA or NaN)
is also unsatisfactory, e.g. in the case where all entries of a
vector are NA in the sense of is.na(.), but their
Re() and Im() are not all NA:
showC <- function(z) noquote(sprintf("(R = %g, I = %g)", Re(z), Im(z)))
z <- complex(, c(11, NA, NA), c(NA, 99, NA))
z
showC(z)
gives
> z
[1] NA NA NA
> showC(z)
[1] (R = 11, I = NA) (R = NA, I = 99) (R = NA, I = NA)
but that (printing of complex) *is* another issue,
in which we have the re-opened bugzilla PR#16752
==> https://bugs.r-project.org/show_bug.cgi?id=16752
on which we also worked during the R Sprint in Warwick three
weeks ago, and where I want to commit changes in any case {but
think we should change even a bit more than we got to during the
Sprint}.
On 2023-09-22 6:38 am, Martin Maechler wrote:
Mikael Jagan
on Thu, 21 Sep 2023 00:47:39 -0400 writes:
> Revisiting this thread from April:
> where the decision (not yet backported) was made for
> as.complex(NA_real_) to give NA_complex_ instead of
> complex(r=NA_real_, i=0), to be consistent with
> help("as.complex") and as.complex(NA) and as.complex(NA_integer_).
> Was any consideration given to the alternative?
> That is, to changing as.complex(NA) and as.complex(NA_integer_) to
> give complex(r=NA_real_, i=0), consistent with
> as.complex(NA_real_), then amending help("as.complex")
> accordingly?
Hmm, as I was the one from R-core mostly involved, I have to admit: "no",
to my knowledge the (above) alternative wasn't considered.
> The principle that
> Im(as.complex(<real=(double|integer|logical)>)) should be zero
> is quite fundamental, in my view, hence the "new" behaviour
> seems to really violate the principle of least surprise ...
Of course "least surprise" is somewhat subjective. Still,
I clearly agree that the above would be one desirable property.
I think that any solution will lead to *some* surprise for some
cases, I think primarily because there are *many* different
values z for which is.na(z) is true, and in any case
NA_complex_ is only one of the many.
I also agree with Mikael that we should reconsider the issue
that was raised by Davis Vaughan here ("on R-devel") last April.
> Another (but maybe weaker) argument is that
> double->complex coercions happen more often than
> logical->complex and integer->complex ones. Changing the
> behaviour of the more frequently performed coercion is
> more likely to affect code "out there".
> Yet another argument is that one expects
> identical(as.complex(NA_real_), NA_real_ + (0+0i))
> to be TRUE, i.e., that coercing from double to complex is
> equivalent to adding a complex zero. The new behaviour
> makes the above FALSE, since NA_real_ + (0+0i) gives
> complex(r=NA_real_, i=0).
No! --- To my own surprise (!) --- in current R-devel the above is TRUE,
and
NA_real_ + (0+0i) , the same as
NA_real_ + 0i , really gives complex(r=NA, i=NA) :
Thank you for the correction - indeed, as.complex(NA_real_) and
NA_real_ + (0+0i) are identical in both R-patched and R-devel, both
giving complex(r=NA_real_, i=0) in R-patched and both giving NA_complex_
in R-devel. I was hallucinating, it seems ...
Using showC() from ?complex
showC <- function(z) noquote(sprintf("(R = %g, I = %g)", Re(z), Im(z)))
we see (in R-devel) quite consistently
showC(NA_real_ + 0i)
[1] (R = NA, I = NA)
showC(NA + 0i) # NA is 'logical'
[1] (R = NA, I = NA)
whereas in R 4.3.1 and "R-patched" -- *in*consistently
showC(NA_real_ + 0i)
[1] (R = NA, I = 0)
showC(NA + 0i)
[1] (R = NA, I = NA)
.... and honestly, I do not see *where* (and when) we changed the underlying code (in arithmetic.c !?) in R-devel to *also* produce NA_complex_ in such complex *arithmetic*
R_binary() in arithmetic.c has always coerced REALSXP->CPLXSXP when encountering one of each. Surely then the changes in coerce.c are the cause and this arithmetic behaviour is just a (bad, IMO) side effect?
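If arithmetic on mixed double and complex operands goes through the same coercion, then adding a complex zero should agree with as.complex() on any one build. A small sketch to check this (illustration only, not from the thread; the variable names are made up):

```r
# Inputs of each "real" type, with and without a missing value.
inputs <- list(NA, NA_integer_, NA_real_, TRUE, 2L, -1.5)

# For each input, does adding a complex zero match explicit coercion?
agree <- vapply(inputs,
                function(x) identical(x + 0i, as.complex(x)),
                logical(1))
print(agree)

# The non-missing inputs (positions 4 to 6) agree.
stopifnot(all(agree[4:6]))
```

The first three entries are the interesting ones: a FALSE there would mean that arithmetic and coercion treat NA differently on that build.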
> Having said that, one might also (but more naively) expect
> identical(as.complex(as.double(NA_complex_)), NA_complex_)
> to be TRUE.
as in current R-devel
> Under my proposal it continues to be FALSE.
as in "R-release"
> Well, I'd prefer if it gave FALSE with a warning
> "imaginary parts discarded in coercion", but it seems that
> as.double(complex(r=a, i=b)) never warns when either of
> 'a' and 'b' is NA_real_ or NaN, even where "information"
> {nonzero 'b'} is clearly lost ...
The question of *warning* here is related indeed, but I think we should try to look at it only *secondary* to your first proposal.
> Whatever decision is made about as.complex(NA_real_),
> maybe these points should be weighed before it becomes part of
> R-release ...
> Mikael
Indeed. Can we please get other opinions / ideas here?
Thank you, Martin, for "reopening". Mikael
We could also question the value of having an infinite number of NA
representations in the complex space. For example all these complex
values are displayed the same way (as NA), are considered NAs by
is.na(), but are not identical or semantically equivalent (from an
Re() or Im() point of view):

    NA_real_ + 0i
    complex(r=NA_real_, i=Inf)
    complex(r=2, i=NA_real_)
    complex(r=NaN, i=NA_real_)

In other words, using a single representation for complex NA (i.e.
complex(r=NA_real_, i=NA_real_)) would avoid a lot of unnecessary
complications and surprises.

Once you do that, whether as.complex(NA_real_) should return
complex(r=NA_real_, i=0) or complex(r=NA_real_, i=NA_real_) becomes a
moot point.

Best,
H.
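The four values listed in this message can be put side by side. This is a sketch for illustration, not part of the original post; the parts of the first element depend on the R version, as discussed earlier in the thread, so only the other three are asserted.

```r
# The three explicitly constructed values from the list above.
z <- complex(real      = c(NA_real_, 2,        NaN),
             imaginary = c(Inf,      NA_real_, NA_real_))

# Prepend NA_real_ + 0i; its parts depend on the R version.
z <- c(NA_real_ + 0i, z)

print(z)                                   # all four print as NA
print(is.na(z))                            # all four are NA to is.na()
print(data.frame(Re = Re(z), Im = Im(z)))  # but the parts differ

stopifnot(all(is.na(z)),
          Im(z)[2] == Inf,     # imaginary part survives next to an NA
          Re(z)[3] == 2,       # real part survives next to an NA
          is.nan(Re(z)[4]))    # NaN is kept distinct from NA
```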
Hervé Pagès
Bioconductor Core Team
hpages.on.github at gmail.com

______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Since the result of is.na(x) is the same on each of those, I don't see a
problem. As long as that is consistent, I don't see a problem. You
shouldn't be using any other test for NA-ness. You should never be
expecting identical() to treat different types as the same (e.g.
identical(NA, NA_real_) is FALSE, as it should be). If you are using a
different test, that's user error.

Duncan Murdoch
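The distinction drawn here between is.na() and identical() can be shown in a few lines. The sketch below is an illustration only; the names `nas`, `a`, `b` and `same_missingness` are made up for the example.

```r
# NA values of different types are all NA according to is.na() ...
nas <- list(NA, NA_integer_, NA_real_, NA_complex_, NA_character_)
stopifnot(all(vapply(nas, is.na, logical(1))))

# ... but identical() also compares type, so it tells them apart.
stopifnot(!identical(NA, NA_real_),
          !identical(NA_real_, NA_complex_))

# Comparing missingness patterns instead of representations:
# the two vectors differ only in how their NA is stored.
a <- c(1+0i, complex(real = NA_real_, imaginary = 0))
b <- c(1+0i, NA_complex_)
same_missingness <- identical(is.na(a), is.na(b))
stopifnot(same_missingness)
```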
Perhaps I shouldn't comment without having read the entire thread, but I
will: I can envision situations where I might want, e.g., 2 from
complex(r=2, i=NA_real_).

Spencer Graves
The problem is that you have things that are **semantically** different
but look exactly the same:

They look the same:

    > x
    [1] NA
    > y
    [1] NA
    > z
    [1] NA

    > is.na(x)
    [1] TRUE
    > is.na(y)
    [1] TRUE
    > is.na(z)
    [1] TRUE

    > str(x)
     cplx NA
    > str(y)
     num NA
    > str(z)
     cplx NA

but they are semantically different e.g.

    > Re(x)
    [1] NA
    > Re(y)
    [1] -0.5  # surprise!

    > Im(x)  # surprise!
    [1] 2
    > Im(z)
    [1] NA

so any expression involving Re() or Im() will produce different results
on input that look the same on the surface.

You can address this either by normalizing the internal representation
of complex NA to always be complex(r=NaN, i=NA_real_), like for
NA_complex_, or by allowing the infinite variations that are currently
allowed and at the same time making sure that both Re() and Im() always
return NA_real_ on a complex NA.

My point is that the behavior of complex NA should be predictable. Right
now it's not. Once it's predictable (with Re() and Im() both returning
NA_real_ regardless of internal representation), then it no longer
matters what kind of complex NA is returned by as.complex(NA_real_),
because they are no longer distinguishable.

H.
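The first option (one normalized representation) can be approximated at user level. The following is a minimal sketch under that assumption, not a proposal for R itself; `normalize_na` is a hypothetical helper name, and it uses NA_complex_ as the single representation.

```r
# Hypothetical helper: collapse every complex NA to NA_complex_.
normalize_na <- function(z) {
  stopifnot(is.complex(z))
  z[is.na(z)] <- NA_complex_
  z
}

z <- complex(real      = c(2,        NA_real_, 1),
             imaginary = c(NA_real_, Inf,      3))
n <- normalize_na(z)

stopifnot(identical(is.na(n), is.na(z)),  # missingness is unchanged
          all(is.na(Re(n)[is.na(n)])),    # no real part survives in an NA
          all(is.na(Im(n)[is.na(n)])),    # nor an imaginary part
          identical(n[3], 1+3i))          # non-missing values are untouched
```

The trade-off is visible in the first element: after normalization the real part 2 is gone, which is exactly the information Spencer Graves says he might want to keep.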
On 9/22/23 13:43, Duncan Murdoch wrote:
Since the result of is.na(x) is the same on each of those, I don't see a problem.? As long as that is consistent, I don't see a problem. You shouldn't be using any other test for NA-ness.? You should never be expecting identical() to treat different types as the same (e.g. identical(NA, NA_real_) is FALSE, as it should be).? If you are using a different test, that's user error. Duncan Murdoch On 22/09/2023 2:41 p.m., Herv? Pag?s wrote:
We could also question the value of having an infinite number of NA representations in the complex space. For example all these complex values are displayed the same way (as NA), are considered NAs by is.na(), but are not identical or semantically equivalent (from an Re() or Im() point of view): ? ??? NA_real_ + 0i ? ??? complex(r=NA_real_, i=Inf) ? ??? complex(r=2, i=NA_real_) ? ??? complex(r=NaN, i=NA_real_) In other words, using a single representation for complex NA (i.e. complex(r=NA_real_, i=NA_real_)) would avoid a lot of unnecessary complications and surprises. Once you do that, whether as.complex(NA_real_) should return complex(r=NA_real_, i=0) or complex(r=NA_real_, i=NA_real_) becomes a moot point. Best, H. On 9/22/23 03:38, Martin Maechler wrote:
Mikael Jagan ????? on Thu, 21 Sep 2023 00:47:39 -0400 writes:
????? > Revisiting this thread from April:
????? > where the decision (not yet backported) was made for
????? > as.complex(NA_real_) to give NA_complex_ instead of
????? > complex(r=NA_real_, i=0), to be consistent with
????? > help("as.complex") and as.complex(NA) and
as.complex(NA_integer_).
????? > Was any consideration given to the alternative?
????? > That is, to changing as.complex(NA) and
as.complex(NA_integer_) to
????? > give complex(r=NA_real_, i=0), consistent with
????? > as.complex(NA_real_), then amending help("as.complex")
????? > accordingly?
Hmm, as, from R-core, mostly I was involved, I admit to say "no",
to my knowledge the (above) alternative wasn't considered.
    > The principle that
    > Im(as.complex(<real=(double|integer|logical)>)) should be zero
    > is quite fundamental, in my view, hence the "new" behaviour
    > seems to really violate the principle of least surprise ...

Of course "least surprise" is somewhat subjective. Still,
I clearly agree that the above would be one desirable property.

I think that any solution will lead to *some* surprise for some
cases, primarily because there are *many* different
values z for which is.na(z) is true, and in any case
NA_complex_ is only one of the many.

I also agree with Mikael that we should reconsider the issue
that was raised by Davis Vaughan here ("on R-devel") last April.
    > Another (but maybe weaker) argument is that
    > double->complex coercions happen more often than
    > logical->complex and integer->complex ones. Changing the
    > behaviour of the more frequently performed coercion is
    > more likely to affect code "out there".

    > Yet another argument is that one expects
    >     identical(as.complex(NA_real_), NA_real_ + (0+0i))
    > to be TRUE, i.e., that coercing from double to complex is
    > equivalent to adding a complex zero. The new behaviour
    > makes the above FALSE, since NA_real_ + (0+0i) gives
    > complex(r=NA_real_, i=0).
No!  --- To my own surprise (!) --- in current R-devel the above is
TRUE, and
    NA_real_ + (0+0i)  , the same as
    NA_real_ + 0i      , really gives  complex(r=NA, i=NA) :

Using showC() from ?complex

    showC <- function(z) noquote(sprintf("(R = %g, I = %g)", Re(z), Im(z)))

we see (in R-devel) quite consistently

    > showC(NA_real_ + 0i)
    [1] (R = NA, I = NA)
    > showC(NA       + 0i)  # NA is 'logical'
    [1] (R = NA, I = NA)

whereas in R 4.3.1 and "R-patched" -- *in*consistently

    > showC(NA_real_ + 0i)
    [1] (R = NA, I = 0)
    > showC(NA + 0i)
    [1] (R = NA, I = NA)

.... and honestly, I do not see *where* (and when) we changed
the underlying code (in arithmetic.c !?) in R-devel to *also*
produce NA_complex_ in such complex *arithmetic*
    > Having said that, one might also (but more naively) expect
    >     identical(as.complex(as.double(NA_complex_)), NA_complex_)
    > to be TRUE.

as in current R-devel

    > Under my proposal it continues to be FALSE.

as in "R-release"

    > Well, I'd prefer if it gave FALSE with a warning
    > "imaginary parts discarded in coercion", but it seems that
    > as.double(complex(r=a, i=b)) never warns when either of
    > 'a' and 'b' is NA_real_ or NaN, even where "information"
    > {nonzero 'b'} is clearly lost ...

The question of *warning* here is related indeed, but I think
we should try to look at it only *secondary* to your first
proposal.

    > Whatever decision is made about as.complex(NA_real_),
    > maybe these points should be weighed before it becomes part of
    > R-release ...
    > Mikael

Indeed.

Can we please get other opinions / ideas here?

Thank you in advance for your thoughts!
Martin
---

PS:

   Our *print()*ing of complex NA's ("NA" here meaning NA or NaN)
   is also unsatisfactory, e.g. in the case where all entries of a
   vector are NA in the sense of is.na(.), but their
   Re() and Im() are not all NA:

     showC <- function(z) noquote(sprintf("(R = %g, I = %g)", Re(z), Im(z)))
     z <- complex(, c(11, NA, NA), c(NA, 99, NA))
     z
     showC(z)

gives

     > z
     [1] NA NA NA
     > showC(z)
     [1] (R = 11, I = NA) (R = NA, I = 99) (R = NA, I = NA)

but that (printing of complex) *is* another issue,
in which we have the re-opened bugzilla PR#16752
     ==> https://bugs.r-project.org/show_bug.cgi?id=16752
on which we also worked during the R Sprint in Warwick three
weeks ago, and where I want to commit changes in any case {but
think we should change even a bit more than we got to during the
Sprint}.
______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Hervé Pagès
Bioconductor Core Team
hpages.on.github at gmail.com
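
The four values listed in the message above can be inspected side by side. The following is a minimal sketch, not code from the thread: the variable name `zs` is introduced here for illustration, and showC() is the helper from ?complex that Martin quotes. What `NA_real_ + 0i` actually stores depends on the R version, as discussed, so results are printed rather than stated.

```r
# Helper from ?complex, as quoted in the thread: show Re() and Im() separately.
showC <- function(z) noquote(sprintf("(R = %g, I = %g)", Re(z), Im(z)))

# The four complex values from the message, collected in one vector.
# 'zs' is a name chosen for this sketch only.
zs <- c(NA_real_ + 0i,
        complex(real = NA_real_, imaginary = Inf),
        complex(real = 2,        imaginary = NA_real_),
        complex(real = NaN,      imaginary = NA_real_))

print(zs)         # printed form of the four values
print(is.na(zs))  # is.na() is TRUE whenever either part is NA or NaN
print(showC(zs))  # the stored real and imaginary parts, shown separately
```

All four elements satisfy is.na(), yet Re() of the third element still returns 2, which is the kind of difference the message is pointing at.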
On 9/22/23 16:55, Hervé Pagès wrote:
> The problem is that you have things that are **semantically**
> different but look exactly the same:
>
> They look the same:
>
>> x
> [1] NA
>> y
> [1] NA
>> z
> [1] NA
>> is.na(x)
> [1] TRUE
>> is.na(y)
> [1] TRUE
>> is.na(z)
> [1] TRUE
>> str(x)
>  cplx NA
>> str(y)
>  num NA

oops, that was supposed to be:

    > str(y)
     cplx NA

but somehow I managed to copy/paste the wrong thing, sorry.

H.

-- 
Hervé Pagès
Bioconductor Core Team
hpages.on.github at gmail.com
Hervé Pagès
on Fri, 22 Sep 2023 16:55:05 -0700 writes:
> The problem is that you have things that are
> **semantically** different but look exactly the same:
> They look the same:
>> x
> [1] NA
>> y
> [1] NA
>> z
> [1] NA
>> is.na(x)
> [1] TRUE
>> is.na(y)
> [1] TRUE
>> is.na(z)
> [1] TRUE
>> str(x)
>  cplx NA
>> str(y)
>  num NA
>> str(z)
>  cplx NA
> but they are semantically different e.g.
>> Re(x)
> [1] NA
>> Re(y)
> [1] -0.5  # surprise!
>> Im(x)  # surprise!
> [1] 2
>> Im(z)
> [1] NA
> so any expression involving Re() or Im() will produce
> different results on input that look the same on the
> surface.
> You can address this either by normalizing the internal
> representation of complex NA to always be complex(r=NaN,
> i=NA_real_), like for NA_complex_, or by allowing the
> infinite variations that are currently allowed and at the
> same time making sure that both Re() and Im() always
> return NA_real_ on a complex NA.
> My point is that the behavior of complex NA should be
> predictable. Right now it's not. Once it's predictable
> (with Re() and Im() both returning NA_real_ regardless of
> internal representation), then it no longer matters what
> kind of complex NA is returned by as.complex(NA_real_),
> because they are no longer distinguishable.
> H.
> On 9/22/23 13:43, Duncan Murdoch wrote:
>> Since the result of is.na(x) is the same on each of
>> those, I don't see a problem. As long as that is
>> consistent, I don't see a problem. You shouldn't be using
>> any other test for NA-ness. You should never be
>> expecting identical() to treat different types as the
>> same (e.g. identical(NA, NA_real_) is FALSE, as it
>> should be). If you are using a different test, that's
>> user error.
>>
>> Duncan Murdoch
>>
Thank you, Hervé.
Your proposition is yet another one,
to declare that all complex NA's should be treated as identical
(almost/fully?) everywhere.
This would be a possibility, but I think a drastic one.
I think there are too many cases where I want to keep the
information of the real part independent of the values of the
imaginary part (e.g. think of the Riemann hypothesis), and
typically vice versa.
With your proposal, for a (potentially large) vector of complex numbers,
after
Re(z) <- 1/2
I could no longer rely on Re(z) == 1/2,
because it would be wrong for those z where (the imaginary part/ the number)
was NA/NaN.
Also, in a similar case, a
Im(z) <- NA
would have to "destroy" all real parts Re(z);
not really typically in memory, but effectively for the user, Re(z)
would be all NA/NaN.
And I think there are quite a few other situations
where looking at Re() and Im() separately makes a lot of sense.
Spencer also made a remark in this direction.
All in all I'd be very reluctant to move in this direction;
but yes, I'm just one person ... let's continue musing and
considering !
Martin
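
The behaviour Martin wants to keep, a real part that stays usable even where the imaginary part is missing, can be illustrated as follows. This is a small sketch with illustrative names `z` and `z2`; the assignment form `Re(z) <- 1/2` is emulated here by rebuilding the vector with complex(), which works regardless of whether a `Re<-` replacement function is available.

```r
# A complex vector whose imaginary parts are partly missing.
z <- complex(real = c(1, 2, 3), imaginary = c(NA, 1, NaN))

# Emulate 'Re(z) <- 1/2': keep the imaginary parts, replace the real parts.
z2 <- complex(real = rep(1/2, length(z)), imaginary = Im(z))

print(is.na(z2))      # elements with a missing imaginary part count as NA
print(Re(z2) == 1/2)  # the real parts are still individually accessible
```

Under a "single complex NA" rule, the second print could no longer be all TRUE, because the first and third elements would lose their real parts.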
On 2023-09-23 9:43 am, Martin Maechler wrote:
> [...]
> And I think there are quite a few other situations
> where looking at Re() and Im() separately makes a lot of sense.
Indeed, and there is no way to "tell" BLAS and LAPACK to treat both the
real and imaginary parts as NA_REAL when either is NA_REAL. Hence the
only reliable way to implement such a proposal would be to post-process
the result of any computation returning a complex type, testing for
NA_REAL and setting both parts to NA_REAL in that case. My expectation
is that such testing would drastically slow down basic arithmetic and
algebraic operations ...

Mikael
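
The post-processing step described here could look roughly like the following. It is only an R-level sketch of the idea, with a hypothetical helper name `normalize_complex_na`; an actual implementation would presumably sit in R's C code and run after every complex-valued operation, which is where the cost concern comes from.

```r
# Hypothetical helper (not part of R): collapse every complex NA to a
# single representation by overwriting both parts wherever is.na() is TRUE.
normalize_complex_na <- function(z) {
  stopifnot(is.complex(z))
  z[is.na(z)] <- NA_complex_  # one extra pass over the whole vector
  z
}

z <- complex(real = c(11, NA, 3), imaginary = c(NA, 99, 4))
zn <- normalize_complex_na(z)

print(is.na(Re(zn)))  # real parts of the NA elements are now missing too
print(is.na(Im(zn)))  # likewise for the imaginary parts
```

The non-missing element is left untouched; only elements already flagged by is.na() are rewritten.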
It sounds like we need to add arguments (with sensible defaults) to
complex(), Re(), Im(), is.na.complex() etc to allow the user to specify
the desired behavior.

-- 
Change your thoughts and you change the world.
--Dr. Norman Vincent Peale
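
One way to read this suggestion is as an opt-in switch on the accessors. The wrapper below is purely hypothetical: neither the name `Re_na` nor the argument `na.both` exists in R, and it only sketches what a default that preserves current behaviour might look like.

```r
# Hypothetical wrapper (not an R API): 'na.both = FALSE' keeps the current
# behaviour; 'na.both = TRUE' reports NA whenever either part is missing.
Re_na <- function(z, na.both = FALSE) {
  out <- Re(z)
  if (na.both) out[is.na(z)] <- NA_real_
  out
}

z <- complex(real = c(11, NA, 3), imaginary = c(NA, 99, 4))
print(Re_na(z))                  # stored real parts, as Re() returns them
print(Re_na(z, na.both = TRUE))  # real part hidden where the element is NA
```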
On Sep 23, 2023, at 12:37 PM, Mikael Jagan <jaganmn2 at gmail.com> wrote: ? On 2023-09-23 9:43 am, Martin Maechler wrote:
Herv? Pag?s
on Fri, 22 Sep 2023 16:55:05 -0700 writes:
> The problem is that you have things that are
> **semantically** different but look exactly the same:
> They look the same:
>> x
> [1] NA
>> y
> [1] NA
>> z
> [1] NA
>> is.na(x)
> [1] TRUE
>> is.na(y)
> [1] TRUE
>> is.na(z)
> [1] TRUE
>> str(x)
> cplx NA
>> str(y)
> num NA
>> str(z)
> cplx NA
> but they are semantically different e.g.
>> Re(x)
> [1] NA
>> Re(y)
> [1] -0.5 # surprise!
>> Im(x) # surprise!
> [1] 2
>> Im(z)
> [1] NA
> so any expression involving Re() or Im() will produce
> different results on input that look the same on the
> surface.
> You can address this either by normalizing the internal
> representation of complex NA to always be complex(r=NaN,
> i=NA_real_), like for NA_complex_, or by allowing the
> infinite variations that are currently allowed and at the
> same time making sure that both Re() and Im() always
> return NA_real_ on a complex NA.
> My point is that the behavior of complex NA should be
> predictable. Right now it's not. Once it's predictable
> (with Re() and Im() both returning NA_real_ regardless of
> internal representation), then it no longer matters what
> kind of complex NA is returned by as.complex(NA_real_),
> because they are no onger distinguishable.
> H.
> On 9/22/23 13:43, Duncan Murdoch wrote:
>> Since the result of is.na(x) is the same on each of
>> those, I don't see a problem. As long as that is
>> consistent, I don't see a problem. You shouldn't be using
>> any other test for NA-ness. You should never be
>> expecting identical() to treat different types as the
>> same (e.g. identical(NA, NA_real_) is FALSE, as it
>> should be). If you are using a different test, that's
>> user error.
>>
>> Duncan Murdoch
>>
>> On 22/09/2023 2:41 p.m., Herv? Pag?s wrote:
>>> We could also question the value of having an infinite
>>> number of NA representations in the complex space. For
>>> example all these complex values are displayed the same
>>> way (as NA), are considered NAs by is.na(), but are not
>>> identical or semantically equivalent (from an Re() or
>>> Im() point of view):
>>>
>>> NA_real_ + 0i
>>>
>>> complex(r=NA_real_, i=Inf)
>>>
>>> complex(r=2, i=NA_real_)
>>>
>>> complex(r=NaN, i=NA_real_)
>>>
>>> In other words, using a single representation for
>>> complex NA (i.e. complex(r=NA_real_, i=NA_real_)) would
>>> avoid a lot of unnecessary complications and surprises.
>>>
>>> Once you do that, whether as.complex(NA_real_) should
>>> return complex(r=NA_real_, i=0) or complex(r=NA_real_,
>>> i=NA_real_) becomes a moot point.
>>>
>>> Best,
>>>
>>> H.
Thank you, Herv?.
Your proposition is yet another one,
to declare that all complex NA's should be treated as identical
(almost/fully?) everywhere.
This would be a possibility, but I think a drastic one.
I think there are too many cases, where I want to keep the
information of the real part independent of the values of the
imaginary part (e.g. think of the Riemann hypothesis), and
typically vice versa.
With your proposal, for a (potentially large) vector of complex numbers,
after
Re(z) <- 1/2
I could no longer rely on Re(z) == 1/2,
because it would be wrong for those z where (the imaginary part/ the number)
was NA/NaN.
Also, in a similar case, a
Im(z) <- NA
would have to "destroy" all real parts Re(z);
not really typically in memory, but effectively for the user, Re(z)
would be all NA/NaN.
And I think there are quite a few other situations
where looking at Re() and Im() separately makes a lot of sense.
Indeed, and there is no way to "tell" BLAS and LAPACK to treat both the real and imaginary parts as NA_REAL when either is NA_REAL. Hence the only reliable way to implement such a proposal would be to post-process the result of any computation returning a complex type, testing for NA_REAL and setting both parts to NA_REAL in that case. My expectation is that such testing would drastically slow down basic arithmetic and algebraic operations ... Mikael
Spencer also made a remark in this direction. All in all I'd be very reluctant to move in this direction; but yes, I'm just one person ... let's continue musing and considering ! Martin
>>> On 9/22/23 03:38, Martin Maechler wrote:
>>>>>>>>> Mikael Jagan on Thu, 21 Sep 2023 00:47:39
>>>>>>>>> -0400 writes:
>>>> > Revisiting this thread from April:
>>>>
>>>>
>>>> > where the decision (not yet backported) was
>>>> made for > as.complex(NA_real_) to give
>>>> NA_complex_ instead of > complex(r=NA_real_,
>>>> i=0), to be consistent with > help("as.complex")
>>>> and as.complex(NA) and as.complex(NA_integer_).
>>>>
>>>> > Was any consideration given to the alternative?
>>>> > That is, to changing as.complex(NA) and
>>>> as.complex(NA_integer_) to > give
>>>> complex(r=NA_real_, i=0), consistent with >
>>>> as.complex(NA_real_), then amending help("as.complex")
>>>> > accordingly?
>>>>
>>>> Hmm, as, from R-core, mostly I was involved, I admit to
>>>> say "no", to my knowledge the (above) alternative
>>>> wasn't considered.
>>>>
>>>> > The principle that >
>>>> Im(as.complex(<real=(double|integer|logical)>)) should
>>>> be zero > is quite fundamental, in my view, hence
>>>> the "new" behaviour > seems to really violate the
>>>> principle of least surprise ...
>>>>
>>>> of course "least surprise" is somewhat subjective.
>>>> Still, I clearly agree that the above would be one
>>>> desirable property.
>>>>
>>>> I think that any solution will lead to *some* surprise
>>>> for some cases, I think primarily because there are
>>>> *many* different values z for which is.na(z) is
>>>> true, and in any case NA_complex_ is only of the
>>>> many.
>>>>
>>>> I also agree with Mikael that we should reconsider the
>>>> issue that was raised by Davis Vaughan here ("on
>>>> R-devel") last April.
>>>>
>>>> > Another (but maybe weaker) argument is that
>>>> > double->complex coercions happen more often
>>>> than > logical->complex and integer->complex
>>>> ones. Changing the > behaviour of the more
>>>> frequently performed coercion is > more likely to
>>>> affect code "out there".
>>>>
>>>> > Yet another argument is that one expects
>>>>
>>>> > identical(as.complex(NA_real_), NA_real_ +
>>>> (0+0i))
>>>>
>>>> > to be TRUE, i.e., that coercing from double to
>>>> complex is > equivalent to adding a complex
>>>> zero. The new behaviour > makes the above FALSE,
>>>> since NA_real_ + (0+0i) gives >
>>>> complex(r=NA_real_, i=0).
>>>>
>>>> No! --- To my own surprise (!) --- in current R-devel
>>>> the above is TRUE, and NA_real_ + (0+0i) , the
>>>> same as NA_real_ + 0i , really gives
>>>> complex(r=NA, i=NA) :
>>>>
>>>> Using showC() from ?complex
>>>>
>>>> showC <- function(z) noquote(sprintf("(R = %g, I =
>>>> %g)", Re(z), Im(z)))
>>>>
>>>> we see (in R-devel) quite consistently
>>>>
>>>>> showC(NA_real_ + 0i)
>>>> [1] (R = NA, I = NA)
>>>>> showC(NA + 0i) # NA is 'logical'
>>>> [1] (R = NA, I = NA) where as in R 4.3.1 and
>>>> "R-patched" -- *in*consistently
>>>>
>>>>> showC(NA_real_ + 0i)
>>>> [1] (R = NA, I = 0)
>>>>> showC(NA + 0i)
>>>> [1] (R = NA, I = NA) .... and honestly, I do not see
>>>> *where* (and when) we changed the underlying code (in
>>>> arithmetic.c !?) in R-devel to *also* produce
>>>> NA_complex_ in such complex *arithmetic*
>>>>
>>>>
>>>> > Having said that, one might also (but more
>>>> naively) expect
>>>>
>>>> >
>>>> identical(as.complex(as.double(NA_complex_)),
>>>> NA_complex_)
>>>>
>>>> > to be TRUE.
>>>>
>>>> as in current R-devel
>>>>
>>>> > Under my proposal it continues to be FALSE.
>>>>
>>>> as in "R-release"
>>>>
>>>> > Well, I'd prefer if it gave FALSE with a
>>>> warning > "imaginary parts discarded in
>>>> coercion", but it seems that >
>>>> as.double(complex(r=a, i=b)) never warns when either of
>>>> > 'a' and 'b' is NA_real_ or NaN, even where
>>>> "information" > {nonzero 'b'} is clearly lost ...
>>>>
>>>> The question of *warning* here is related indeed, but I
>>>> think we should try to look at it only *secondary* to
>>>> your first proposal.
>>>>
>>>> > Whatever decision is made about
>>>> as.complex(NA_real_), > maybe these points should
>>>> be weighed before it becomes part of > R-release
>>>> ...
>>>>
>>>> > Mikael
>>>>
>>>> Indeed.
>>>>
>>>> Can we please get other opinions / ideas here?
>>>>
>>>> Thank you in advance for your thoughts! Martin
>>>>
>>>> ---
>>>>
>>>> PS:
>>>>
>>>> Our *print()*ing of complex NA's ("NA" here meaning
>>>> NA or NaN) is also unsatisfactory, e.g. in the case
>>>> where all entries of a vector are NA in the sense of
>>>> is.na(.), but their Re() and Im() are not all NA:
>>>>
>>>> showC <- function(z) noquote(sprintf("(R = %g, I = %g)", Re(z), Im(z)))
>>>> z <- complex(, c(11, NA, NA), c(NA, 99, NA))
>>>> z
>>>> showC(z)
>>>>
>>>> gives
>>>>
>>>> > z
>>>> [1] NA NA NA
>>>> > showC(z)
>>>> [1] (R = 11, I = NA) (R = NA, I = 99) (R = NA, I = NA)
>>>>
>>>> but that (printing of complex) *is* another issue, in
>>>> which we have the re-opened bugzilla PR#16752
>>>> ==>https://bugs.r-project.org/show_bug.cgi?id=16752
>>>>
>>>> on which we also worked during the R Sprint in Warwick
>>>> three weeks ago, and where I want to commit changes in
>>>> any case {but think we should change even a bit more
>>>> than we got to during the Sprint}.
>>>>
>>>> ______________________________________________
>>>> R-devel at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
> --
> Hervé Pagès
> Bioconductor Core Team hpages.on.github at gmail.com
Hi Martin,
On 9/23/23 06:43, Martin Maechler wrote:
Hervé Pagès
on Fri, 22 Sep 2023 16:55:05 -0700 writes:
> The problem is that you have things that are
> **semantically** different but look exactly the same:
> They look the same:
>> x
> [1] NA
>> y
> [1] NA
>> z
> [1] NA
>> is.na(x)
> [1] TRUE
>> is.na(y)
> [1] TRUE
>> is.na(z)
> [1] TRUE
>> str(x)
>  cplx NA
>> str(y)
>  num NA
>> str(z)
>  cplx NA
> but they are semantically different e.g.
>> Re(x)
> [1] NA
>> Re(y)
> [1] -0.5  # surprise!
>> Im(x)  # surprise!
> [1] 2
>> Im(z)
> [1] NA
> so any expression involving Re() or Im() will produce
> different results on input that look the same on the
> surface.
> You can address this either by normalizing the internal
> representation of complex NA to always be complex(r=NaN,
> i=NA_real_), like for NA_complex_, or by allowing the
> infinite variations that are currently allowed and at the
> same time making sure that both Re() and Im() always
> return NA_real_ on a complex NA.
> My point is that the behavior of complex NA should be
> predictable. Right now it's not. Once it's predictable
> (with Re() and Im() both returning NA_real_ regardless of
> internal representation), then it no longer matters what
> kind of complex NA is returned by as.complex(NA_real_),
> because they are no longer distinguishable.
> H.
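The situation described above can be reproduced with a small sketch. The definitions of x, y and z are not shown in this part of the thread, so the ones below are assumptions chosen only to match the quoted Re() and Im() values.

```r
# Assumed definitions (not taken from the thread) that reproduce the
# Re()/Im() values quoted above.
x <- complex(real = NA_real_, imaginary = 2)
y <- complex(real = -0.5, imaginary = NA_real_)
z <- NA_complex_

# All three count as missing according to is.na() ...
na_flags <- c(x = is.na(x), y = is.na(y), z = is.na(z))
print(na_flags)

# ... but the stored real and imaginary parts differ.
parts <- data.frame(
  name = c("x", "y", "z"),
  re = c(Re(x), Re(y), Re(z)),
  im = c(Im(x), Im(y), Im(z))
)
print(parts)
```

Code that only tests is.na() treats the three values alike; code that reads Re() or Im() does not.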
> On 9/22/23 13:43, Duncan Murdoch wrote:
>> Since the result of is.na(x) is the same on each of
>> those, I don't see a problem. As long as that is
>> consistent, I don't see a problem. You shouldn't be using
>> any other test for NA-ness. You should never be
>> expecting identical() to treat different types as the
>> same (e.g. identical(NA, NA_real_) is FALSE, as it
>> should be). If you are using a different test, that's
>> user error.
>>
>> Duncan Murdoch
>>
>> On 22/09/2023 2:41 p.m., Hervé Pagès wrote:
>>> We could also question the value of having an infinite
>>> number of NA representations in the complex space. For
>>> example all these complex values are displayed the same
>>> way (as NA), are considered NAs by is.na(), but are not
>>> identical or semantically equivalent (from an Re() or
>>> Im() point of view):
>>>
>>>     NA_real_ + 0i
>>>
>>>     complex(r=NA_real_, i=Inf)
>>>
>>>     complex(r=2, i=NA_real_)
>>>
>>>     complex(r=NaN, i=NA_real_)
>>>
>>> In other words, using a single representation for
>>> complex NA (i.e. complex(r=NA_real_, i=NA_real_)) would
>>> avoid a lot of unnecessary complications and surprises.
>>>
>>> Once you do that, whether as.complex(NA_real_) should
>>> return complex(r=NA_real_, i=0) or complex(r=NA_real_,
>>> i=NA_real_) becomes a moot point.
>>>
>>> Best,
>>>
>>> H.
> Thank you, Hervé.
> Your proposition is yet another one,
> to declare that all complex NA's should be treated as identical
> (almost/fully?) everywhere.
>
> This would be a possibility, but I think a drastic one.
>
> I think there are too many cases, where I want to keep the
> information of the real part independent of the values of the
> imaginary part (e.g. think of the Riemann hypothesis), and
> typically vice versa.

Use NaN for that, not NA.

> With your proposal, for a (potentially large) vector of complex numbers,
> after
>     Re(z) <- 1/2
>
> I could no longer rely on Re(z) == 1/2,
> because it would be wrong for those z where (the imaginary part/ the number)
> was NA/NaN.

My proposal is to do this only if the Re and/or Im parts are NAs, not if
they are NaNs.

BTW the difference between how NAs and NaNs are treated in complex
vectors is another issue that adds to the confusion:

  > complex(r=NA, i=2)
  [1] NA
  > complex(r=NaN, i=2)
  [1] NaN+2i

Not displaying the real + imaginary parts in the NA case kind of
suggests that somehow they are gone i.e. that Re(z) and Im(z) are both NA.

Note that my proposal is not to change the display but to change Re()
and Im() to make them consistent with the display.

In your Re(z) <- 1/2 example (which seems to be theoretical only because
I don't see `Re<-` in base R), any NA in 'z' would be replaced with
complex(r=NaN, i=1/2), so you could rely on Re(z) == 1/2.
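As a rough illustration of the proposal (collapse a complex value only when one of its parts is NA, and leave NaN parts alone), here is a minimal sketch. The helper name normalize_complex_na() is hypothetical and not part of base R; it assumes the NA/NaN distinction is still intact in the stored parts, which holds for values built directly with complex() but is not guaranteed after arithmetic.

```r
# Hypothetical helper (not part of base R): collapse every element whose
# real or imaginary part is NA (but not NaN) to NA_complex_.
normalize_complex_na <- function(z) {
  stopifnot(is.complex(z))
  re <- Re(z)
  im <- Im(z)
  # is.na() is TRUE for both NA and NaN; is.nan() is TRUE only for NaN.
  has_na <- (is.na(re) & !is.nan(re)) | (is.na(im) & !is.nan(im))
  z[has_na] <- NA_complex_
  z
}

z <- complex(real = c(11, NA, NaN, 1), imaginary = c(NA, 99, 2, 2))
zn <- normalize_complex_na(z)

# After normalization, Re() and Im() agree with the "NA" display for the
# first two elements, while the NaN+2i element keeps its imaginary part.
print(data.frame(re = Re(zn), im = Im(zn)))
```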
> Also, in a similar case, a
>
>     Im(z) <- NA
>
> would have to "destroy" all real parts Re(z);
> not really typically in memory, but effectively for the user, Re(z)
> would be all NA/NaN.

Yes, setting a value to NA destroys it beyond repair in the sense that
there's no way you can retrieve any original parts of it. I'm fine with
that. I'm not fine with an NA being used to store hidden information.

> And I think there are quite a few other situations
> where looking at Re() and Im() separately makes a lot of sense.

Still doable if the Re or Im parts contain NaNs.

> Spencer also made a remark in this direction.
>
> All in all I'd be very reluctant to move in this direction;
> but yes, I'm just one person ... let's continue musing and
> considering !

I understand the reluctance since this would not be a light move, but
thanks for considering.

Best,
H.

> Martin
--
Hervé Pagès
Bioconductor Core Team
hpages.on.github at gmail.com
1 day later
Hervé Pagès
on Sat, 23 Sep 2023 16:52:21 -0700 writes:
> Hi Martin,
> On 9/23/23 06:43, Martin Maechler wrote:
>> Thank you, Hervé.
>> Your proposition is yet another one,
>> to declare that all complex NA's should be treated as identical
>> (almost/fully?) everywhere.
>>
>> This would be a possibility, but I think a drastic one.
>>
>> I think there are too many cases, where I want to keep the
>> information of the real part independent of the values of the
>> imaginary part (e.g. think of the Riemann hypothesis), and
>> typically vice versa.
> Use NaN for that, not NA.
Aa..h, *that* is your point.
Well, I was on exactly this line till a few years ago.
However, very *sadly* to me, note how example(complex)
nowadays ends :
##----------------------------------------------------------------------------
showC <- function(z) noquote(sprintf("(R = %g, I = %g)", Re(z), Im(z)))
## The exact result of this *depends* on the platform, compiler, math-library:
(NpNA <- NaN + NA_complex_) ; str(NpNA) # *behaves* as 'cplx NA' ..
stopifnot(is.na(NpNA), is.na(NA_complex_), is.na(Re(NA_complex_)), is.na(Im(NA_complex_)))
showC(NpNA)# but does not always show '(R = NaN, I = NA)'
## and this is not TRUE everywhere:
identical(NpNA, NA_complex_)
showC(NA_complex_) # always == (R = NA, I = NA)
##----------------------------------------------------------------------------
Unfortunately --- notably by the appearance of the new (M1, M1 pro, M2, ...)
processors, but not only ---
I (and others, but the real experts) have wrongly assumed that
NA {which on the C-level is *one* of the many possible internal NaN's}
would be preserved in computations, as they are on the R level
-- well, typically, and as long as we've used intel-compatible
chips and gcc-compilers.
But modern speed optimizations (also seen in accelerated
Blas/Lapack ..) have noticed that no official C standard
requires such preservations (i.e., in our case of NA, *the* special NaN),
and -- for speed reasons -- now on these accelerated platforms,
R-level NA's "suddenly" turn into R-level NaN's (all are NaN on
the C level but "with different payload") from quite "trivial" computations.
Consequently, the strict distinction between NA and NaN
even when they are so important for us statisticians / careful data analysts,
nowadays will tend to have to be dismissed eventually.
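The distinction at stake can be sketched in a few lines of R. The helper classify_missing() below is hypothetical and for illustration only. For values created directly the labels are reliable; for the results of arithmetic that mixes NA and NaN the label depends on the platform, so no particular output is claimed for that part.

```r
# is.na() is TRUE for both NA and NaN; is.nan() is TRUE only for NaN.
# classify_missing() is a hypothetical helper, not part of base R.
classify_missing <- function(x) {
  stopifnot(is.double(x))
  ifelse(is.nan(x), "NaN", ifelse(is.na(x), "NA", "value"))
}

# Values created directly keep their identity.
direct <- classify_missing(c(NA_real_, NaN, 1))
print(direct)

# After arithmetic mixing the two, which label comes out is
# platform-dependent; only "NA" or "NaN" can be expected.
mixed <- classify_missing(c(NA_real_ + NaN, NaN + NA_real_))
print(mixed)
```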
... and as I have also mentioned earlier in this thread,
I believe we should also print the complex values of z
fulfilling is.na(z) by their Re & Im, i.e., e.g.
NA+iNA (or NaN+iNA or NA+iNaN or NaN+iNaN),
NA+0i, NaN+1i, 3+iNaN, 4+iNA etc
but note that the exact printing itself should *not* become the topic of this
thread unless by mentioning that I strongly believe the print()ing
of complex vectors in R should change anyway *and* for that reason,
the printing / "looks the same as" / ... should not be strong
reasons in my view for deciding how *coercion*,
notably as.complex(.) should work.
Martin
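To give an idea of what printing both parts could look like, here is a minimal sketch. format_complex_parts() is a hypothetical helper, not the change made for PR#16752 and not base R behaviour, and its output notation ("NA+99i") is only one of the variants mentioned above.

```r
# Hypothetical formatter (not base R behaviour): always show both parts,
# even when is.na(z) is TRUE.
format_complex_parts <- function(z) {
  stopifnot(is.complex(z))
  re <- sprintf("%g", Re(z))
  im <- sprintf("%g", Im(z))
  # Add an explicit "+" unless the imaginary part already carries a "-".
  sep <- ifelse(startsWith(im, "-"), "", "+")
  paste0(re, sep, im, "i")
}

z <- complex(real = c(11, NA, NA, 1), imaginary = c(NA, 99, NA, -2))
out <- format_complex_parts(z)
print(out)
```

With such a formatter the three missing elements no longer look the same, which is the point made about print()ing above.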
On 9/25/23 07:05, Martin Maechler wrote:
> Aa..h, *that* is your point.
> Well, I was on exactly this line till a few years ago.
> However, very *sadly* to me, note how example(complex)
> nowadays ends :
> ##----------------------------------------------------------------------------
> showC <- function(z) noquote(sprintf("(R = %g, I = %g)", Re(z), Im(z)))
> ## The exact result of this *depends* on the platform, compiler, math-library:
> (NpNA <- NaN + NA_complex_) ; str(NpNA) # *behaves* as 'cplx NA' ..
> stopifnot(is.na(NpNA), is.na(NA_complex_), is.na(Re(NA_complex_)), is.na(Im(NA_complex_)))
> showC(NpNA)# but does not always show '(R = NaN, I = NA)'
> ## and this is not TRUE everywhere:
> identical(NpNA, NA_complex_)
> showC(NA_complex_) # always == (R = NA, I = NA)
> ##----------------------------------------------------------------------------
> Unfortunately --- notably by the appearance of the new (M1, M1 pro, M2, ...)
> processors, but not only ---
> I (and others, but the real experts) have wrongly assumed that
> NA {which on the C-level is *one* of the many possible internal NaN's}
> would be preserved in computations, as they are on the R level
> -- well, typically, and as long as we've used intel-compatible
> chips and gcc-compilers.
> But modern speed optimizations (also seen in accelerated
> Blas/Lapack ..) have noticed that no official C standard
> requires such preservations (i.e., in our case of NA, *the* special NaN),
> and -- for speed reasons -- now on these accelerated platforms,
> R-level NA's "suddenly" turn into R-level NaN's (all are NaN on
> the C level but "with different payload") from quite "trivial" computations.
> Consequently, the strict distinction between NA and NaN
> even when they are so important for us statisticians / careful data analysts,
> nowadays will tend to have to be dismissed eventually.
I see. Thanks for pointing that out. This would actually be an argument in favor of preserving the current as.complex(NA_real_) behavior. If the difference between NA and NaN is fading away then there's no reason to change as.complex(NA_real_) to make it behave radically differently from as.complex(NaN).
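The comparison made here can be checked directly with the showC() helper quoted earlier in the thread. This is only a sketch: as discussed above, the imaginary part reported for as.complex(NA_real_) depends on the R version in use, so no output is claimed for that line.

```r
# showC() as quoted earlier in the thread (from ?complex).
showC <- function(z) noquote(sprintf("(R = %g, I = %g)", Re(z), Im(z)))

from_nan <- as.complex(NaN)
from_na  <- as.complex(NA_real_)

# Coercing NaN keeps a zero imaginary part.
print(showC(from_nan))

# The imaginary part shown here depends on the R version.
print(showC(from_na))
```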
> ... and as I have also mentioned earlier in this thread,
> I believe we should also print the complex values of z
> fulfilling is.na(z) by their Re & Im, i.e., e.g.
>
>     NA+iNA (or NaN+iNA or NA+iNaN or NaN+iNaN),
>     NA+0i, NaN+1i, 3+iNaN, 4+iNA etc

oops, I should have paid more attention to your first post in this
thread, sorry for that. And yes, I totally agree with improving the
printing of complexes when Re and/or Im is NA.

Best,
H.

> but note that the exact printing itself should *not* become the topic of this
> thread unless by mentioning that I strongly believe the print()ing
> of complex vectors in R should change anyway *and* for that reason,
> the printing / "looks the same as" / ... should not be strong
> reasons in my view for deciding how *coercion*,
> notably as.complex(.) should work.
>
> Martin
>> With your proposal, for a (potentially large) vector of complex numbers,
>> after
>> Re(z) <- 1/2
>>
>> I could no longer rely on Re(z) == 1/2,
>> because it would be wrong for those z where (the imaginary part/ the number)
>> was NA/NaN.
> My proposal is to do this only if the Re and/or Im parts are NAs, not if
> they are NaNs.
> BTW the difference between how NAs and NaNs are treated in complex
> vectors is another issue that adds to the confusion:
>   > complex(r=NA, i=2)
>   [1] NA
>   > complex(r=NaN, i=2)
>   [1] NaN+2i
> Not displaying the real + imaginary parts in the NA case kind of
> suggests that somehow they are gone i.e. that Re(z) and Im(z) are both NA.
> Note that my proposal is not to change the display but to change Re()
> and Im() to make them consistent with the display.
> In your Re(z) <- 1/2 example (which seems to be theoretical only because
> I don't see `Re<-` in base R), any NA in 'z' would be replaced with
> complex(r=NaN, i=1/2), so you could rely on Re(z) == 1/2.
>> Also, in a similar case, a
>>
>> Im(z) <- NA
>>
>> would have to "destroy" all real parts Re(z);
>> not really typically in memory, but effectively for the user, Re(z)
>> would be all NA/NaN.
> Yes, setting a value to NA destroys it beyond repair in the sense that
> there's no way you can retrieve any original parts of it. I'm fine with
> that. I'm not fine with an NA being used to store hidden information.
>>
>> And I think there are quite a few other situations
>> where looking at Re() and Im() separately makes a lot of sense.
> Still doable if the Re or Im parts contain NaNs.
>>
>> Spencer also made a remark in this direction.
>>
>> All in all I'd be very reluctant to move in this direction;
>> but yes, I'm just one person ... let's continue musing and
>> considering !
> I understand the reluctance since this would not be a light move, but
> thanks for considering.
> Best,
> H.
>>
>> Martin
>>
>> >>> On 9/22/23 03:38, Martin Maechler wrote:
>> >>>>>>>>> Mikael Jagan on Thu, 21 Sep 2023 00:47:39
>> >>>>>>>>> -0400 writes:
>> >>>> > Revisiting this thread from April:
>> >>>>
>> >>>> > https://stat.ethz.ch/pipermail/r-devel/2023-April/082545.html
>> >>>>
>> >>>> > where the decision (not yet backported) was made for
>> >>>> > as.complex(NA_real_) to give NA_complex_ instead of
>> >>>> > complex(r=NA_real_, i=0), to be consistent with
>> >>>> > help("as.complex") and as.complex(NA) and
>> >>>> > as.complex(NA_integer_).
>> >>>>
>> >>>> > Was any consideration given to the alternative?
>> >>>> > That is, to changing as.complex(NA) and
>> >>>> > as.complex(NA_integer_) to give complex(r=NA_real_, i=0),
>> >>>> > consistent with as.complex(NA_real_), then amending
>> >>>> > help("as.complex") accordingly?
>> >>>>
>> >>>> Hmm, as, from R-core, mostly I was involved, I admit to
>> >>>> say "no", to my knowledge the (above) alternative
>> >>>> wasn't considered.
>> >>>>
>> >>>> > The principle that
>> >>>> > Im(as.complex(<real=(double|integer|logical)>)) should
>> >>>> > be zero is quite fundamental, in my view, hence the
>> >>>> > "new" behaviour seems to really violate the principle
>> >>>> > of least surprise ...
>> >>>>
>> >>>> of course "least surprise" is somewhat subjective.
>> >>>> Still, I clearly agree that the above would be one
>> >>>> desirable property.
>> >>>>
>> >>>> I think that any solution will lead to *some* surprise
>> >>>> for some cases, I think primarily because there are
>> >>>> *many* different values z for which is.na(z) is
>> >>>> true, and in any case NA_complex_ is only one of the
>> >>>> many.
>> >>>>
>> >>>> I also agree with Mikael that we should reconsider the
>> >>>> issue that was raised by Davis Vaughan here ("on
>> >>>> R-devel") last April.
>> >>>>
>> >>>> > Another (but maybe weaker) argument is that
>> >>>> > double->complex coercions happen more often than
>> >>>> > logical->complex and integer->complex ones. Changing
>> >>>> > the behaviour of the more frequently performed coercion
>> >>>> > is more likely to affect code "out there".
>> >>>>
>> >>>> > Yet another argument is that one expects
>> >>>>
>> >>>> >     identical(as.complex(NA_real_), NA_real_ + (0+0i))
>> >>>>
>> >>>> > to be TRUE, i.e., that coercing from double to complex
>> >>>> > is equivalent to adding a complex zero. The new
>> >>>> > behaviour makes the above FALSE, since
>> >>>> > NA_real_ + (0+0i) gives complex(r=NA_real_, i=0).
>> >>>>
>> >>>> No! --- To my own surprise (!) --- in current R-devel
>> >>>> the above is TRUE, and NA_real_ + (0+0i), the
>> >>>> same as NA_real_ + 0i, really gives
>> >>>> complex(r=NA, i=NA) :
>> >>>>
>> >>>> Using showC() from ?complex
>> >>>>
>> >>>>     showC <- function(z) noquote(sprintf("(R = %g, I = %g)", Re(z), Im(z)))
>> >>>>
>> >>>> we see (in R-devel) quite consistently
>> >>>>
>> >>>>> showC(NA_real_ + 0i)
>> >>>> [1] (R = NA, I = NA)
>> >>>>> showC(NA + 0i)  # NA is 'logical'
>> >>>> [1] (R = NA, I = NA)
>> >>>>
>> >>>> whereas in R 4.3.1 and "R-patched" -- *in*consistently
>> >>>>
>> >>>>> showC(NA_real_ + 0i)
>> >>>> [1] (R = NA, I = 0)
>> >>>>> showC(NA + 0i)
>> >>>> [1] (R = NA, I = NA)
>> >>>>
>> >>>> .... and honestly, I do not see *where* (and when) we
>> >>>> changed the underlying code (in arithmetic.c !?) in
>> >>>> R-devel to *also* produce NA_complex_ in such complex
>> >>>> *arithmetic*
>> >>>>
>> >>>>
>> >>>> > Having said that, one might also (but more naively)
>> >>>> > expect
>> >>>>
>> >>>> >     identical(as.complex(as.double(NA_complex_)), NA_complex_)
>> >>>>
>> >>>> > to be TRUE.
>> >>>>
>> >>>> as in current R-devel
>> >>>>
>> >>>> > Under my proposal it continues to be FALSE.
>> >>>>
>> >>>> as in "R-release"
>> >>>>
>> >>>> > Well, I'd prefer if it gave FALSE with a warning
>> >>>> > "imaginary parts discarded in coercion", but it seems
>> >>>> > that as.double(complex(r=a, i=b)) never warns when
>> >>>> > either of 'a' and 'b' is NA_real_ or NaN, even where
>> >>>> > "information" {nonzero 'b'} is clearly lost ...
>> >>>>
>> >>>> The question of *warning* here is related indeed, but I
>> >>>> think we should try to look at it only *secondary* to
>> >>>> your first proposal.
>> >>>>
>> >>>> > Whatever decision is made about as.complex(NA_real_),
>> >>>> > maybe these points should be weighed before it becomes
>> >>>> > part of R-release ...
>> >>>>
>> >>>> > Mikael
>> >>>>
>> >>>> Indeed.
>> >>>>
>> >>>> Can we please get other opinions / ideas here?
>> >>>>
>> >>>> Thank you in advance for your thoughts! Martin
>> >>>>
>> >>>> ---
>> >>>>
>> >>>> PS:
>> >>>>
>> >>>> Our *print()*ing of complex NA's ("NA" here meaning
>> >>>> NA or NaN) is also unsatisfactory, e.g. in the case
>> >>>> where all entries of a vector are NA in the sense of
>> >>>> is.na(.), but their Re() and Im() are not all NA:
>> >>>>
>> >>>>     showC <- function(z) noquote(sprintf("(R = %g, I = %g)", Re(z), Im(z)))
>> >>>>     z <- complex(, c(11, NA, NA), c(NA, 99, NA))
>> >>>>     z
>> >>>>     showC(z)
>> >>>>
>> >>>> gives
>> >>>>
>> >>>>     > z
>> >>>>     [1] NA NA NA
>> >>>>     > showC(z)
>> >>>>     [1] (R = 11, I = NA) (R = NA, I = 99) (R = NA, I = NA)
>> >>>>
>> >>>> but that (printing of complex) *is* another issue, in
>> >>>> which we have the re-opened bugzilla PR#16752
>> >>>> ==>https://bugs.r-project.org/show_bug.cgi?id=16752
>> >>>>
>> >>>> on which we also worked during the R Sprint in Warwick
>> >>>> three weeks ago, and where I want to commit changes in
>> >>>> any case {but think we should change even a bit more
>> >>>> than we got to during the Sprint}.
>> >>>>
>> >>>
>> >>
>> > --
>> > Hervé Pagès
>> > Bioconductor Core Team
>> > hpages.on.github at gmail.com
>>
>>
>>
> --
> Hervé Pagès
> Bioconductor Core Team
> hpages.on.github at gmail.com
Hervé Pagès
Bioconductor Core Team
hpages.on.github at gmail.com
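To make the "hidden information" point from the message above concrete, here is a minimal sketch using only base R functions (the variable names are just for illustration). A complex value built with an NA real part counts as NA, yet its imaginary part can still be retrieved:

```r
# A complex number whose real part is NA but whose imaginary part is finite.
z <- complex(real = NA, imaginary = 2)

na_z  <- is.na(z)   # TRUE: z counts as missing
nan_z <- is.nan(z)  # FALSE: neither part is NaN
re_z  <- Re(z)      # NA
im_z  <- Im(z)      # 2, still stored although z prints as NA

print(z)
print(c(Re = re_z, Im = im_z))
```

This is the current behaviour that the proposal would change, by making Re() and Im() consistent with the printed NA.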
2 days later
Gregory R Warnes
on Sat, 23 Sep 2023 13:22:35 -0400 writes:
> It sounds like we need to add arguments (with sensible
> defaults) to complex(), Re(), Im(), is.na.complex() etc to
> allow the user to specify the desired behavior.
I don't think I'd like such extra flexibility for all these,
... ;-) and even much less I'd like to be part of the group who
then has to *maintain* such behavior ;-)
> --
> Change your thoughts and you change the world.
> --Dr. Norman Vincent Peale
( .. *some* hubris from the last century ..)
Currently, I'm actually tending to *simplify* things
drastically, also because it means fewer surprises in the long
term and much less code reading / debugging in formatting /
printing and dealing with complex numbers.
NB: there *is* the re-opened PR#16752,
https://bugs.r-project.org/show_bug.cgi?id=16752
where the investigation of the (C-level) R source is a major reason
for my current thinking ..
What if we decided to really treat complex numbers much more
than currently as pairs of real (i.e. "double") numbers,
notably also when print()ing them?
Consequently, Re() and Im() would continue to return what they
do now (contrary to Hervé's original proposal) also in case of
non-finite numbers.
Of course, *no* change in arithmetic or other Ops (such as '==')
nor is.na(), is.finite(), is.nan(), etc.
The current formatting and printing of complex numbers is
complicated, in some cases unnecessarily inaccurate, and in other
cases unnecessarily *ugly*.
I believe we should change formatting to basically format
the (vector of) real parts and imaginary parts separately.
E.g., it is really unnecessarily ugly to switch to exponential
format for both Re and Im, in a situation like this:
(-1):2 + 1i*1e99
[1] 0e+00+1e+99i 0e+00+1e+99i 0e+00+1e+99i 0e+00+1e+99i
It is very ugly to use exponential/scientific format for the Re() even if we'd fix the confusing and inaccurate *joint* rounding of Re and Im.
... and indeed (as discussed here previously): While it makes some sense to print NA identically for logical, integer and double, it seems often confusing *not* to show <Re> + <Im>i in the complex case; where that *does* happen for Inf and NaN:
> complex(, NA, ((-1):2))
[1] NA NA NA NA
> complex(, NaN, ((-1):2))
[1] NaN-1i NaN+0i NaN+1i NaN+2i
> complex(, c(-Inf,Inf), ((-1):2))
[1] -Inf-1i Inf+0i -Inf+1i Inf+2i
where the first of these *does* keep the finite imaginary values, but does not show them
(cN <- complex(, NA, ((-1):2))); rbind(Re(cN), Im(cN))
[1] NA NA NA NA
[,1] [,2] [,3] [,4]
[1,] NA NA NA NA
[2,] -1 0 1 2
Martin Maechler
on Thu, 28 Sep 2023 12:11:27 +0200 writes:
Gregory R Warnes
on Sat, 23 Sep 2023 13:22:35 -0400 writes:
> It sounds like we need to add arguments (with sensible
> defaults) to complex(), Re(), Im(), is.na.complex() etc to
> allow the user to specify the desired behavior.
I don't think I'd like such extra flexibility for all these, ... ;-) and even much less I'd like to be part of the group who then has to *maintain* such behavior ;-)
[..........]
Currently, I'm actually tending to *simplify* things
drastically, also because it means fewer surprises in the long
term and much less code reading / debugging in formatting /
printing and dealing with complex numbers.
NB: there *is* the re-opened PR#16752,
https://bugs.r-project.org/show_bug.cgi?id=16752
where the investigation of the (C-level) R source is a major reason
for my current thinking ..
What if we decided to really treat complex numbers much more
than currently as pairs of real (i.e. "double") numbers,
notably also when print()ing them?
Consequently, Re() and Im() would continue to return what they
do now (contrary to Hervé's original proposal) also in case of
non-finite numbers.
Of course, *no* change in arithmetic or other Ops (such as '==')
nor is.na(), is.finite(), is.nan(), etc.
The current formatting and printing of complex numbers is
complicated, in some cases unnecessarily inaccurate, and in other
cases unnecessarily *ugly*.
I believe we should change formatting to basically format
the (vector of) real parts and imaginary parts separately.
E.g., it is really unnecessarily ugly to switch to exponential
format for both Re and Im, in a situation like this:
(-1):2 + 1i*1e99
[1] 0e+00+1e+99i 0e+00+1e+99i 0e+00+1e+99i 0e+00+1e+99i
It is very ugly to use exponential/scientific format for the Re() even if we'd fix the confusing and inaccurate *joint* rounding of Re and Im.
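The idea of formatting real and imaginary parts separately could be sketched roughly as follows. This is only an illustration, not the change proposed for R's C-level formatting code: format_parts() is a hypothetical helper name, and treating an NA/NaN imaginary part as "+" for the joining sign is an assumption made here for simplicity.

```r
# Hypothetical helper (not part of base R): format Re and Im independently,
# so a huge imaginary part does not force the real part into e-notation.
format_parts <- function(z) {
  re <- format(Re(z))   # common formatting for the real parts only
  im <- Im(z)
  # Assumption: an NA/NaN imaginary part is joined with "+".
  sgn <- ifelse(!is.na(im) & im < 0, "-", "+")
  paste0(re, sgn, format(abs(im)), "i")
}

z <- (-1):2 + 1i*1e99
out <- format_parts(z)
print(noquote(out))
```

With this sketch the real parts -1, 0, 1, 2 stay visible in fixed notation, while only the imaginary parts use exponential format.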
and then, I end with
... and indeed (as discussed here previously): While it makes some sense to print NA identically for logical, integer and double, it seems often confusing *not* to show <Re> + <Im>i in the complex case; where that *does* happen for Inf and NaN:
> complex(, NA, ((-1):2))
[1] NA NA NA NA
> complex(, NaN, ((-1):2))
[1] NaN-1i NaN+0i NaN+1i NaN+2i
> complex(, c(-Inf,Inf), ((-1):2))
[1] -Inf-1i Inf+0i -Inf+1i Inf+2i
>
where the first of these *does* keep the finite imaginary values, but does not show them
(cN <- complex(, NA, ((-1):2))); rbind(Re(cN), Im(cN))
[1] NA NA NA NA
[,1] [,2] [,3] [,4]
[1,] NA NA NA NA
[2,] -1 0 1 2
where really, I think we should keep that behavior (*), at least
for now: Changing it as well *does* have a relatively large
impact, is not back-compatible with (the long history of) S and
R, *and* it complicates documentation and teaching unnecessarily.
Experts will know how to differentiate the different complex NAs,
e.g. by using simple utilities such as {"format complex", "print complex"}
fc <- function(z) paste0("(",Re(z), ",", Im(z),")")
pc <- function(z) noquote(fc(z))
which I've used now for testing/"visualizing" different scenarios
Martin
---
*) simply printing 'NA' in cases where is.na(.) is true and is.nan(.) is false
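For illustration, the fc() and pc() utilities defined above can be applied to the vector z from the earlier showC() example in this thread; the variable names all_na and parts are only for this sketch.

```r
# Utilities as given above: "format complex" and "print complex".
fc <- function(z) paste0("(", Re(z), ",", Im(z), ")")
pc <- function(z) noquote(fc(z))

# All three entries satisfy is.na(z), but they are different "complex NAs".
z <- complex(, c(11, NA, NA), c(NA, 99, NA))
all_na <- all(is.na(z))
parts  <- fc(z)

print(z)      # default printing does not distinguish the entries
print(pc(z))  # the pairs (Re,Im) make the difference visible
```

Here pc(z) shows the three entries as (11,NA), (NA,99) and (NA,NA), whereas the default print method shows NA for each.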