thr3ads.net - R devel - [Rd] Recent changes to as.complex(NA_real

If this information is useful, please help other people find it:
Share via:

Martin Maechler

2023-Sep-23 13:43 UTC

[Rd] Recent changes to as.complex(NA_real_)

>>>>> Herv? Pag?s 
>>>>>     on Fri, 22 Sep 2023 16:55:05 -0700 writes:
    > The problem is that you have things that are
    > **semantically** different but look exactly the same:

    > They look the same:

    >> x
    > [1] NA
    >> y
    > [1] NA
    >> z
    > [1] NA

    >> is.na(x)
    > [1] TRUE
    >> is.na(y)
    > [1] TRUE
    >> is.na(z)
    > [1] TRUE

    >> str(x)
    >  ?cplx NA
    >> str(y)
    >  ?num NA
    >> str(z)
    >  ?cplx NA

    > but they are semantically different e.g.

    >> Re(x)
    > [1] NA
    >> Re(y)
    > [1] -0.5? # surprise!

    >> Im(x)? # surprise!
    > [1] 2
    >> Im(z)
    > [1] NA

    > so any expression involving Re() or Im() will produce
    > different results on input that look the same on the
    > surface.

    > You can address this either by normalizing the internal
    > representation of complex NA to always be complex(r=NaN,
    > i=NA_real_), like for NA_complex_, or by allowing the
    > infinite variations that are currently allowed and at the
    > same time making sure that both Re() and Im()? always
    > return NA_real_ on a complex NA.

    > My point is that the behavior of complex NA should be
    > predictable. Right now it's not. Once it's predictable
    > (with Re() and Im() both returning NA_real_ regardless of
    > internal representation), then it no longer matters what
    > kind of complex NA is returned by as.complex(NA_real_),
    > because they are no onger distinguishable.

    > H.

    > On 9/22/23 13:43, Duncan Murdoch wrote:
    >> Since the result of is.na(x) is the same on each of
    >> those, I don't see a problem.? As long as that is
    >> consistent, I don't see a problem. You shouldn't be using
    >> any other test for NA-ness.? You should never be
    >> expecting identical() to treat different types as the
    >> same (e.g.  identical(NA, NA_real_) is FALSE, as it
    >> should be).? If you are using a different test, that's
    >> user error.
    >> 
    >> Duncan Murdoch
    >> 
    >> On 22/09/2023 2:41 p.m., Herv? Pag?s wrote:
    >>> We could also question the value of having an infinite
    >>> number of NA representations in the complex space. For
    >>> example all these complex values are displayed the same
    >>> way (as NA), are considered NAs by is.na(), but are not
    >>> identical or semantically equivalent (from an Re() or
    >>> Im() point of view):
    >>> 
    >>> ? ??? NA_real_ + 0i
    >>> 
    >>> ? ??? complex(r=NA_real_, i=Inf)
    >>> 
    >>> ? ??? complex(r=2, i=NA_real_)
    >>> 
    >>> ? ??? complex(r=NaN, i=NA_real_)
    >>> 
    >>> In other words, using a single representation for
    >>> complex NA (i.e.  complex(r=NA_real_, i=NA_real_)) would
    >>> avoid a lot of unnecessary complications and surprises.
    >>> 
    >>> Once you do that, whether as.complex(NA_real_) should
    >>> return complex(r=NA_real_, i=0) or complex(r=NA_real_,
    >>> i=NA_real_) becomes a moot point.
    >>> 
    >>> Best,
    >>> 
    >>> H.

Thank you, Herv?.
Your proposition is yet another one,
to declare that all complex NA's should be treated as identical
(almost/fully?) everywhere.

This would be a possibility, but I think a drastic one.

I think there are too many cases, where I want to keep the
information of the real part independent of the values of the
imaginary part (e.g. think of the Riemann hypothesis), and
typically vice versa.

With your proposal, for a (potentially large) vector of complex numbers,
after
      Re(z)  <-  1/2

I could no longer rely on   Re(z) == 1/2,
because it would be wrong for those z where (the imaginary part/ the number)
was NA/NaN.
Also, in a similar case, a

      Im(z) <- NA

would have to "destroy" all real parts  Re(z);
not really typically in memory, but effectively for the user,  Re(z)
would be all NA/NaN.

And I think there are quite a few other situations
where looking at Re() and Im() separately makes a lot of sense.

Spencer also made a remark in this direction.

All in all I'd be very reluctant to move in this direction;
but yes, I'm just one person ... let's continue musing and
considering !

Martin

    >>> On 9/22/23 03:38, Martin Maechler wrote:
    >>>>>>>>> Mikael Jagan ????? on Thu, 21 Sep 2023
00:47:39
    >>>>>>>>> -0400 writes:
    >>>> ????? > Revisiting this thread from April:
    >>>> 
    >>>>
>https://stat.ethz.ch/pipermail/r-devel/2023-April/082545.html
    >>>> 
    >>>> ????? > where the decision (not yet backported) was
    >>>> made for ????? > as.complex(NA_real_) to give
    >>>> NA_complex_ instead of ????? > complex(r=NA_real_,
    >>>> i=0), to be consistent with ????? >
help("as.complex")
    >>>> and as.complex(NA) and as.complex(NA_integer_).
    >>>> 
    >>>> ????? > Was any consideration given to the alternative?
    >>>> ????? > That is, to changing as.complex(NA) and
    >>>> as.complex(NA_integer_) to ????? > give
    >>>> complex(r=NA_real_, i=0), consistent with ????? >
    >>>> as.complex(NA_real_), then amending
help("as.complex")
    >>>> ????? > accordingly?
    >>>> 
    >>>> Hmm, as, from R-core, mostly I was involved, I admit to
    >>>> say "no", to my knowledge the (above) alternative
    >>>> wasn't considered.
    >>>> 
    >>>> ??? > The principle that ??? >
    >>>> Im(as.complex(<real=(double|integer|logical)>))
should
    >>>> be zero ??? > is quite fundamental, in my view, hence
    >>>> the "new" behaviour ??? > seems to really
violate the
    >>>> principle of least surprise ...
    >>>> 
    >>>> of course "least surprise"? is somewhat
subjective.?
    >>>> Still, I clearly agree that the above would be one
    >>>> desirable property.
    >>>> 
    >>>> I think that any solution will lead to *some* surprise
    >>>> for some cases, I think primarily because there are
    >>>> *many* different values z? for which? is.na(z)? is
    >>>> true,? and in any case NA_complex_? is only of the
    >>>> many.
    >>>> 
    >>>> I also agree with Mikael that we should reconsider the
    >>>> issue that was raised by Davis Vaughan here ("on
    >>>> R-devel") last April.
    >>>> 
    >>>> ????? > Another (but maybe weaker) argument is that
    >>>> ????? > double->complex coercions happen more often
    >>>> than ????? > logical->complex and integer->complex
    >>>> ones. Changing the ????? > behaviour of the more
    >>>> frequently performed coercion is ????? > more likely to
    >>>> affect code "out there".
    >>>> 
    >>>> ????? > Yet another argument is that one expects
    >>>> 
    >>>> ????? >????? identical(as.complex(NA_real_), NA_real_ +
    >>>> (0+0i))
    >>>> 
    >>>> ????? > to be TRUE, i.e., that coercing from double to
    >>>> complex is ????? > equivalent to adding a complex
    >>>> zero.? The new behaviour ????? > makes the above FALSE,
    >>>> since NA_real_ + (0+0i) gives ????? >
    >>>> complex(r=NA_real_, i=0).
    >>>> 
    >>>> No!? --- To my own surprise (!) --- in current R-devel
    >>>> the above is TRUE, and ??????? NA_real_ + (0+0i)? , the
    >>>> same as ??????? NA_real_ + 0i????? , really gives?
    >>>> complex(r=NA, i=NA) :
    >>>> 
    >>>> Using showC() from ?complex
    >>>> 
    >>>> ??? showC <- function(z) noquote(sprintf("(R = %g,
I     >>>> %g)", Re(z), Im(z)))
    >>>> 
    >>>> we see (in R-devel) quite consistently
    >>>> 
    >>>>> showC(NA_real_ + 0i)
    >>>> [1] (R = NA, I = NA)
    >>>>> showC(NA?????? + 0i)? # NA is 'logical'
    >>>> [1] (R = NA, I = NA) where as in R 4.3.1 and
    >>>> "R-patched" -- *in*consistently
    >>>> 
    >>>>> showC(NA_real_ + 0i)
    >>>> [1] (R = NA, I = 0)
    >>>>> showC(NA + 0i)
    >>>> [1] (R = NA, I = NA) .... and honestly, I do not see
    >>>> *where* (and when) we changed the underlying code (in
    >>>> arithmetic.c !?)? in R-devel to *also* produce?
    >>>> NA_complex_? in such complex *arithmetic*
    >>>> 
    >>>> 
    >>>> ????? > Having said that, one might also (but more
    >>>> naively) expect
    >>>> 
    >>>> ????? >????
    >>>> identical(as.complex(as.double(NA_complex_)),
    >>>> NA_complex_)
    >>>> 
    >>>> ????? > to be TRUE.
    >>>> 
    >>>> as in current R-devel
    >>>> 
    >>>> ????? > Under my proposal it continues to be FALSE.
    >>>> 
    >>>> as in "R-release"
    >>>> 
    >>>> ????? > Well, I'd prefer if it gave FALSE with a
    >>>> warning ????? > "imaginary parts discarded in
    >>>> coercion", but it seems that ????? >
    >>>> as.double(complex(r=a, i=b)) never warns when either of
    >>>> ????? > 'a' and 'b' is NA_real_ or NaN,
even where
    >>>> "information" ????? > {nonzero 'b'} is
clearly lost ...
    >>>> 
    >>>> The question of *warning* here is related indeed, but I
    >>>> think we should try to look at it only *secondary* to
    >>>> your first proposal.
    >>>> 
    >>>> ????? > Whatever decision is made about
    >>>> as.complex(NA_real_), ????? > maybe these points should
    >>>> be weighed before it becomes part of ????? > R-release
    >>>> ...
    >>>> 
    >>>> ????? > Mikael
    >>>> 
    >>>> Indeed.
    >>>> 
    >>>> Can we please get other opinions / ideas here?
    >>>> 
    >>>> Thank you in advance for your thoughts!  Martin
    >>>> 
    >>>> ---
    >>>> 
    >>>> PS:
    >>>> 
    >>>> ?? Our *print()*ing? of complex NA's ("NA"
here meaning
    >>>> NA or NaN) ?? is also unsatisfactory, e.g. in the case
    >>>> where all entries of a ?? vector are NA in the sense of
    >>>> is.na(.), but their ?? Re() and Im() are not all NA: ??
    >>>> ??? showC <- function(z) noquote(sprintf("(R = %g,
I     >>>> %g)", Re(z), Im(z))) ??? z <- complex(, c(11, NA,
NA),
    >>>> c(NA, 99, NA)) ??? z ??? showC(z)
    >>>> 
    >>>> gives
    >>>> 
    >>>> ??? > z ??? [1] NA NA NA ??? > showC(z) ??? [1] (R   
>>>> 11, I = NA) (R = NA, I = 99) (R = NA, I = NA)
    >>>> 
    >>>> but that (printing of complex) *is* another issue, in
    >>>> which we have the re-opened bugzilla PR#16752 ?????
    >>>> ==>https://bugs.r-project.org/show_bug.cgi?id=16752
    >>>> 
    >>>> on which we also worked during the R Sprint in Warwick
    >>>> three weeks ago, and where I want to commit changes in
    >>>> any case {but think we should change even a bit more
    >>>> than we got to during the Sprint}.
    >>>> 
    >>>> ______________________________________________
    >>>> R-devel at r-project.org? mailing list
    >>>> https://stat.ethz.ch/mailman/listinfo/r-devel
    >>> 
    >> 
    > -- 
    > Herv? Pag?s

    > Bioconductor Core Team hpages.on.github at gmail.com

Mikael Jagan

2023-Sep-23 16:36 UTC

head link

[Rd] Recent changes to as.complex(NA_real_)

On 2023-09-23 9:43 am, Martin Maechler wrote:>>>>>> Herv? Pag?s
>>>>>>      on Fri, 22 Sep 2023 16:55:05 -0700 writes:
> 
>      > The problem is that you have things that are
>      > **semantically** different but look exactly the same:
> 
>      > They look the same:
> 
>      >> x
>      > [1] NA
>      >> y
>      > [1] NA
>      >> z
>      > [1] NA
> 
>      >> is.na(x)
>      > [1] TRUE
>      >> is.na(y)
>      > [1] TRUE
>      >> is.na(z)
>      > [1] TRUE
> 
>      >> str(x)
>      >  ?cplx NA
>      >> str(y)
>      >  ?num NA
>      >> str(z)
>      >  ?cplx NA
> 
>      > but they are semantically different e.g.
> 
>      >> Re(x)
>      > [1] NA
>      >> Re(y)
>      > [1] -0.5? # surprise!
> 
>      >> Im(x)? # surprise!
>      > [1] 2
>      >> Im(z)
>      > [1] NA
> 
>      > so any expression involving Re() or Im() will produce
>      > different results on input that look the same on the
>      > surface.
> 
>      > You can address this either by normalizing the internal
>      > representation of complex NA to always be complex(r=NaN,
>      > i=NA_real_), like for NA_complex_, or by allowing the
>      > infinite variations that are currently allowed and at the
>      > same time making sure that both Re() and Im()? always
>      > return NA_real_ on a complex NA.
> 
>      > My point is that the behavior of complex NA should be
>      > predictable. Right now it's not. Once it's predictable
>      > (with Re() and Im() both returning NA_real_ regardless of
>      > internal representation), then it no longer matters what
>      > kind of complex NA is returned by as.complex(NA_real_),
>      > because they are no onger distinguishable.
> 
>      > H.
> 
>      > On 9/22/23 13:43, Duncan Murdoch wrote:
>      >> Since the result of is.na(x) is the same on each of
>      >> those, I don't see a problem.? As long as that is
>      >> consistent, I don't see a problem. You shouldn't be
using
>      >> any other test for NA-ness.? You should never be
>      >> expecting identical() to treat different types as the
>      >> same (e.g.  identical(NA, NA_real_) is FALSE, as it
>      >> should be).? If you are using a different test, that's
>      >> user error.
>      >>
>      >> Duncan Murdoch
>      >>
>      >> On 22/09/2023 2:41 p.m., Herv? Pag?s wrote:
>      >>> We could also question the value of having an infinite
>      >>> number of NA representations in the complex space. For
>      >>> example all these complex values are displayed the same
>      >>> way (as NA), are considered NAs by is.na(), but are not
>      >>> identical or semantically equivalent (from an Re() or
>      >>> Im() point of view):
>      >>>
>      >>> ? ??? NA_real_ + 0i
>      >>>
>      >>> ? ??? complex(r=NA_real_, i=Inf)
>      >>>
>      >>> ? ??? complex(r=2, i=NA_real_)
>      >>>
>      >>> ? ??? complex(r=NaN, i=NA_real_)
>      >>>
>      >>> In other words, using a single representation for
>      >>> complex NA (i.e.  complex(r=NA_real_, i=NA_real_)) would
>      >>> avoid a lot of unnecessary complications and surprises.
>      >>>
>      >>> Once you do that, whether as.complex(NA_real_) should
>      >>> return complex(r=NA_real_, i=0) or complex(r=NA_real_,
>      >>> i=NA_real_) becomes a moot point.
>      >>>
>      >>> Best,
>      >>>
>      >>> H.
> 
> Thank you, Herv?.
> Your proposition is yet another one,
> to declare that all complex NA's should be treated as identical
> (almost/fully?) everywhere.
> 
> This would be a possibility, but I think a drastic one.
> 
> I think there are too many cases, where I want to keep the
> information of the real part independent of the values of the
> imaginary part (e.g. think of the Riemann hypothesis), and
> typically vice versa.
> 
> With your proposal, for a (potentially large) vector of complex numbers,
> after
>        Re(z)  <-  1/2
> 
> I could no longer rely on   Re(z) == 1/2,
> because it would be wrong for those z where (the imaginary part/ the
number)
> was NA/NaN.
> Also, in a similar case, a
> 
>        Im(z) <- NA
> 
> would have to "destroy" all real parts  Re(z);
> not really typically in memory, but effectively for the user,  Re(z)
> would be all NA/NaN.
> 
> And I think there are quite a few other situations
> where looking at Re() and Im() separately makes a lot of sense.
Indeed, and there is no way to "tell" BLAS and LAPACK to treat both
the real and
imaginary parts as NA_REAL when either is NA_REAL.  Hence the only reliable way
to implement such a proposal would be to post-process the result of any
computation returning a complex type, testing for NA_REAL and setting both parts
to NA_REAL in that case.  My expectation is that such testing would drastically
slow down basic arithmetic and algebraic operations ...

Mikael
> 
> Spencer also made a remark in this direction.
> 
> All in all I'd be very reluctant to move in this direction;
> but yes, I'm just one person ... let's continue musing and
> considering !
> 
> Martin
> 
>      >>> On 9/22/23 03:38, Martin Maechler wrote:
>      >>>>>>>>> Mikael Jagan ????? on Thu, 21 Sep
2023 00:47:39
>      >>>>>>>>> -0400 writes:
>      >>>> ????? > Revisiting this thread from April:
>      >>>>
>      >>>>
>https://stat.ethz.ch/pipermail/r-devel/2023-April/082545.html
>      >>>>
>      >>>> ????? > where the decision (not yet backported)
was
>      >>>> made for ????? > as.complex(NA_real_) to give
>      >>>> NA_complex_ instead of ????? > complex(r=NA_real_,
>      >>>> i=0), to be consistent with ????? >
help("as.complex")
>      >>>> and as.complex(NA) and as.complex(NA_integer_).
>      >>>>
>      >>>> ????? > Was any consideration given to the
alternative?
>      >>>> ????? > That is, to changing as.complex(NA) and
>      >>>> as.complex(NA_integer_) to ????? > give
>      >>>> complex(r=NA_real_, i=0), consistent with ????? >
>      >>>> as.complex(NA_real_), then amending
help("as.complex")
>      >>>> ????? > accordingly?
>      >>>>
>      >>>> Hmm, as, from R-core, mostly I was involved, I admit
to
>      >>>> say "no", to my knowledge the (above)
alternative
>      >>>> wasn't considered.
>      >>>>
>      >>>> ??? > The principle that ??? >
>      >>>> Im(as.complex(<real=(double|integer|logical)>))
should
>      >>>> be zero ??? > is quite fundamental, in my view,
hence
>      >>>> the "new" behaviour ??? > seems to
really violate the
>      >>>> principle of least surprise ...
>      >>>>
>      >>>> of course "least surprise"? is somewhat
subjective.
>      >>>> Still, I clearly agree that the above would be one
>      >>>> desirable property.
>      >>>>
>      >>>> I think that any solution will lead to *some*
surprise
>      >>>> for some cases, I think primarily because there are
>      >>>> *many* different values z? for which? is.na(z)? is
>      >>>> true,? and in any case NA_complex_? is only of the
>      >>>> many.
>      >>>>
>      >>>> I also agree with Mikael that we should reconsider
the
>      >>>> issue that was raised by Davis Vaughan here ("on
>      >>>> R-devel") last April.
>      >>>>
>      >>>> ????? > Another (but maybe weaker) argument is
that
>      >>>> ????? > double->complex coercions happen more
often
>      >>>> than ????? > logical->complex and
integer->complex
>      >>>> ones. Changing the ????? > behaviour of the more
>      >>>> frequently performed coercion is ????? > more
likely to
>      >>>> affect code "out there".
>      >>>>
>      >>>> ????? > Yet another argument is that one expects
>      >>>>
>      >>>> ????? >????? identical(as.complex(NA_real_),
NA_real_ +
>      >>>> (0+0i))
>      >>>>
>      >>>> ????? > to be TRUE, i.e., that coercing from
double to
>      >>>> complex is ????? > equivalent to adding a complex
>      >>>> zero.? The new behaviour ????? > makes the above
FALSE,
>      >>>> since NA_real_ + (0+0i) gives ????? >
>      >>>> complex(r=NA_real_, i=0).
>      >>>>
>      >>>> No!? --- To my own surprise (!) --- in current
R-devel
>      >>>> the above is TRUE, and ??????? NA_real_ + (0+0i)? ,
the
>      >>>> same as ??????? NA_real_ + 0i????? , really gives
>      >>>> complex(r=NA, i=NA) :
>      >>>>
>      >>>> Using showC() from ?complex
>      >>>>
>      >>>> ??? showC <- function(z) noquote(sprintf("(R
= %g, I >      >>>> %g)", Re(z), Im(z)))
>      >>>>
>      >>>> we see (in R-devel) quite consistently
>      >>>>
>      >>>>> showC(NA_real_ + 0i)
>      >>>> [1] (R = NA, I = NA)
>      >>>>> showC(NA?????? + 0i)? # NA is 'logical'
>      >>>> [1] (R = NA, I = NA) where as in R 4.3.1 and
>      >>>> "R-patched" -- *in*consistently
>      >>>>
>      >>>>> showC(NA_real_ + 0i)
>      >>>> [1] (R = NA, I = 0)
>      >>>>> showC(NA + 0i)
>      >>>> [1] (R = NA, I = NA) .... and honestly, I do not see
>      >>>> *where* (and when) we changed the underlying code (in
>      >>>> arithmetic.c !?)? in R-devel to *also* produce
>      >>>> NA_complex_? in such complex *arithmetic*
>      >>>>
>      >>>>
>      >>>> ????? > Having said that, one might also (but more
>      >>>> naively) expect
>      >>>>
>      >>>> ????? >
>      >>>> identical(as.complex(as.double(NA_complex_)),
>      >>>> NA_complex_)
>      >>>>
>      >>>> ????? > to be TRUE.
>      >>>>
>      >>>> as in current R-devel
>      >>>>
>      >>>> ????? > Under my proposal it continues to be
FALSE.
>      >>>>
>      >>>> as in "R-release"
>      >>>>
>      >>>> ????? > Well, I'd prefer if it gave FALSE with
a
>      >>>> warning ????? > "imaginary parts discarded in
>      >>>> coercion", but it seems that ????? >
>      >>>> as.double(complex(r=a, i=b)) never warns when either
of
>      >>>> ????? > 'a' and 'b' is NA_real_ or
NaN, even where
>      >>>> "information" ????? > {nonzero
'b'} is clearly lost ...
>      >>>>
>      >>>> The question of *warning* here is related indeed, but
I
>      >>>> think we should try to look at it only *secondary* to
>      >>>> your first proposal.
>      >>>>
>      >>>> ????? > Whatever decision is made about
>      >>>> as.complex(NA_real_), ????? > maybe these points
should
>      >>>> be weighed before it becomes part of ????? >
R-release
>      >>>> ...
>      >>>>
>      >>>> ????? > Mikael
>      >>>>
>      >>>> Indeed.
>      >>>>
>      >>>> Can we please get other opinions / ideas here?
>      >>>>
>      >>>> Thank you in advance for your thoughts!  Martin
>      >>>>
>      >>>> ---
>      >>>>
>      >>>> PS:
>      >>>>
>      >>>> ?? Our *print()*ing? of complex NA's
("NA" here meaning
>      >>>> NA or NaN) ?? is also unsatisfactory, e.g. in the
case
>      >>>> where all entries of a ?? vector are NA in the sense
of
>      >>>> is.na(.), but their ?? Re() and Im() are not all NA:
>      >>>> ??? showC <- function(z) noquote(sprintf("(R
= %g, I >      >>>> %g)", Re(z), Im(z))) ??? z <-
complex(, c(11, NA, NA),
>      >>>> c(NA, 99, NA)) ??? z ??? showC(z)
>      >>>>
>      >>>> gives
>      >>>>
>      >>>> ??? > z ??? [1] NA NA NA ??? > showC(z) ??? [1]
(R >      >>>> 11, I = NA) (R = NA, I = 99) (R = NA, I = NA)
>      >>>>
>      >>>> but that (printing of complex) *is* another issue, in
>      >>>> which we have the re-opened bugzilla PR#16752
>      >>>>
==>https://bugs.r-project.org/show_bug.cgi?id=16752
>      >>>>
>      >>>> on which we also worked during the R Sprint in
Warwick
>      >>>> three weeks ago, and where I want to commit changes
in
>      >>>> any case {but think we should change even a bit more
>      >>>> than we got to during the Sprint}.
>      >>>>
>      >>>> ______________________________________________
>      >>>> R-devel at r-project.org? mailing list
>      >>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>      >>>
>      >>
>      > --
>      > Herv? Pag?s
> 
>      > Bioconductor Core Team hpages.on.github at gmail.com
> 
> 
>

Hervé Pagès

2023-Sep-23 23:52 UTC

head link

[Rd] Recent changes to as.complex(NA_real_)

Hi Martin,

On 9/23/23 06:43, Martin Maechler wrote:>>>>>> Herv? Pag?s
>>>>>>      on Fri, 22 Sep 2023 16:55:05 -0700 writes:
>      > The problem is that you have things that are
>      > **semantically** different but look exactly the same:
>
>      > They look the same:
>
>      >> x
>      > [1] NA
>      >> y
>      > [1] NA
>      >> z
>      > [1] NA
>
>      >> is.na(x)
>      > [1] TRUE
>      >> is.na(y)
>      > [1] TRUE
>      >> is.na(z)
>      > [1] TRUE
>
>      >> str(x)
>      >  ?cplx NA
>      >> str(y)
>      >  ?num NA
>      >> str(z)
>      >  ?cplx NA
>
>      > but they are semantically different e.g.
>
>      >> Re(x)
>      > [1] NA
>      >> Re(y)
>      > [1] -0.5? # surprise!
>
>      >> Im(x)? # surprise!
>      > [1] 2
>      >> Im(z)
>      > [1] NA
>
>      > so any expression involving Re() or Im() will produce
>      > different results on input that look the same on the
>      > surface.
>
>      > You can address this either by normalizing the internal
>      > representation of complex NA to always be complex(r=NaN,
>      > i=NA_real_), like for NA_complex_, or by allowing the
>      > infinite variations that are currently allowed and at the
>      > same time making sure that both Re() and Im()? always
>      > return NA_real_ on a complex NA.
>
>      > My point is that the behavior of complex NA should be
>      > predictable. Right now it's not. Once it's predictable
>      > (with Re() and Im() both returning NA_real_ regardless of
>      > internal representation), then it no longer matters what
>      > kind of complex NA is returned by as.complex(NA_real_),
>      > because they are no onger distinguishable.
>
>      > H.
>
>      > On 9/22/23 13:43, Duncan Murdoch wrote:
>      >> Since the result of is.na(x) is the same on each of
>      >> those, I don't see a problem.? As long as that is
>      >> consistent, I don't see a problem. You shouldn't be
using
>      >> any other test for NA-ness.? You should never be
>      >> expecting identical() to treat different types as the
>      >> same (e.g.  identical(NA, NA_real_) is FALSE, as it
>      >> should be).? If you are using a different test, that's
>      >> user error.
>      >>
>      >> Duncan Murdoch
>      >>
>      >> On 22/09/2023 2:41 p.m., Herv? Pag?s wrote:
>      >>> We could also question the value of having an infinite
>      >>> number of NA representations in the complex space. For
>      >>> example all these complex values are displayed the same
>      >>> way (as NA), are considered NAs by is.na(), but are not
>      >>> identical or semantically equivalent (from an Re() or
>      >>> Im() point of view):
>      >>>
>      >>> ? ??? NA_real_ + 0i
>      >>>
>      >>> ? ??? complex(r=NA_real_, i=Inf)
>      >>>
>      >>> ? ??? complex(r=2, i=NA_real_)
>      >>>
>      >>> ? ??? complex(r=NaN, i=NA_real_)
>      >>>
>      >>> In other words, using a single representation for
>      >>> complex NA (i.e.  complex(r=NA_real_, i=NA_real_)) would
>      >>> avoid a lot of unnecessary complications and surprises.
>      >>>
>      >>> Once you do that, whether as.complex(NA_real_) should
>      >>> return complex(r=NA_real_, i=0) or complex(r=NA_real_,
>      >>> i=NA_real_) becomes a moot point.
>      >>>
>      >>> Best,
>      >>>
>      >>> H.
>
> Thank you, Herv?.
> Your proposition is yet another one,
> to declare that all complex NA's should be treated as identical
> (almost/fully?) everywhere.
>
> This would be a possibility, but I think a drastic one.
>
> I think there are too many cases, where I want to keep the
> information of the real part independent of the values of the
> imaginary part (e.g. think of the Riemann hypothesis), and
> typically vice versa.
Use NaN for that, not NA.>
> With your proposal, for a (potentially large) vector of complex numbers,
> after
>        Re(z)  <-  1/2
>
> I could no longer rely on   Re(z) == 1/2,
> because it would be wrong for those z where (the imaginary part/ the
number)
> was NA/NaN.
My proposal is to do this only if the Re and/or Im parts are NAs, not if 
they are NaNs.

BTW the difference between how NAs and NaNs are treated in complex 
vectors is another issue that adds to the confusion:

 ? > complex(r=NA, i=2)
 ? [1] NA
 ? > complex(r=NaN, i=2)
 ? [1] NaN+2i

Not displaying the real + imaginary parts in the NA case kind of 
suggests that somehow they are gone i.e. that Re(z) and Im(z) are both NA.

Note that my proposal is not to change the display but to change Re() 
and Im() to make them consistent with the display.

In your Re(z) <- 1/2 example (which seems to be theoretical only because 
I don't see `Re<-` in base R), any NA in 'z' would be replaced
with
complex(r=NaN, i=1/2), so you could rely on Re(z) == 1/2.
> Also, in a similar case, a
>
>        Im(z) <- NA
>
> would have to "destroy" all real parts  Re(z);
> not really typically in memory, but effectively for the user,  Re(z)
> would be all NA/NaN.Yes, setting a value to NA destroys it beyond repair in the sense that 
there's no way you can retrieve any original parts of it. I'm fine with 
that. I'm not fine with an NA being used to store hidden
information.>
> And I think there are quite a few other situations
> where looking at Re() and Im() separately makes a lot of sense.
Still doable if the Re or Im parts contain NaNs.>
> Spencer also made a remark in this direction.
>
> All in all I'd be very reluctant to move in this direction;
> but yes, I'm just one person ... let's continue musing and
> considering !
I understand the reluctance since this would not be a light move, but 
thanks for considering.

Best,

H.
>
> Martin
>
>      >>> On 9/22/23 03:38, Martin Maechler wrote:
>      >>>>>>>>> Mikael Jagan ????? on Thu, 21 Sep
2023 00:47:39
>      >>>>>>>>> -0400 writes:
>      >>>> ????? > Revisiting this thread from April:
>      >>>>
>      >>>>
>https://stat.ethz.ch/pipermail/r-devel/2023-April/082545.html
>      >>>>
>      >>>> ????? > where the decision (not yet backported)
was
>      >>>> made for ????? > as.complex(NA_real_) to give
>      >>>> NA_complex_ instead of ????? > complex(r=NA_real_,
>      >>>> i=0), to be consistent with ????? >
help("as.complex")
>      >>>> and as.complex(NA) and as.complex(NA_integer_).
>      >>>>
>      >>>> ????? > Was any consideration given to the
alternative?
>      >>>> ????? > That is, to changing as.complex(NA) and
>      >>>> as.complex(NA_integer_) to ????? > give
>      >>>> complex(r=NA_real_, i=0), consistent with ????? >
>      >>>> as.complex(NA_real_), then amending
help("as.complex")
>      >>>> ????? > accordingly?
>      >>>>
>      >>>> Hmm, as, from R-core, mostly I was involved, I admit
to
>      >>>> say "no", to my knowledge the (above)
alternative
>      >>>> wasn't considered.
>      >>>>
>      >>>> ??? > The principle that ??? >
>      >>>> Im(as.complex(<real=(double|integer|logical)>))
should
>      >>>> be zero ??? > is quite fundamental, in my view,
hence
>      >>>> the "new" behaviour ??? > seems to
really violate the
>      >>>> principle of least surprise ...
>      >>>>
>      >>>> of course "least surprise"? is somewhat
subjective.
>      >>>> Still, I clearly agree that the above would be one
>      >>>> desirable property.
>      >>>>
>      >>>> I think that any solution will lead to *some*
surprise
>      >>>> for some cases, I think primarily because there are
>      >>>> *many* different values z? for which? is.na(z)? is
>      >>>> true,? and in any case NA_complex_? is only of the
>      >>>> many.
>      >>>>
>      >>>> I also agree with Mikael that we should reconsider
the
>      >>>> issue that was raised by Davis Vaughan here ("on
>      >>>> R-devel") last April.
>      >>>>
>      >>>> ????? > Another (but maybe weaker) argument is
that
>      >>>> ????? > double->complex coercions happen more
often
>      >>>> than ????? > logical->complex and
integer->complex
>      >>>> ones. Changing the ????? > behaviour of the more
>      >>>> frequently performed coercion is ????? > more
likely to
>      >>>> affect code "out there".
>      >>>>
>      >>>> ????? > Yet another argument is that one expects
>      >>>>
>      >>>> ????? >????? identical(as.complex(NA_real_),
NA_real_ +
>      >>>> (0+0i))
>      >>>>
>      >>>> ????? > to be TRUE, i.e., that coercing from
double to
>      >>>> complex is ????? > equivalent to adding a complex
>      >>>> zero.? The new behaviour ????? > makes the above
FALSE,
>      >>>> since NA_real_ + (0+0i) gives ????? >
>      >>>> complex(r=NA_real_, i=0).
>      >>>>
>      >>>> No!? --- To my own surprise (!) --- in current
R-devel
>      >>>> the above is TRUE, and ??????? NA_real_ + (0+0i)? ,
the
>      >>>> same as ??????? NA_real_ + 0i????? , really gives
>      >>>> complex(r=NA, i=NA) :
>      >>>>
>      >>>> Using showC() from ?complex
>      >>>>
>      >>>> ??? showC <- function(z) noquote(sprintf("(R
= %g, I >      >>>> %g)", Re(z), Im(z)))
>      >>>>
>      >>>> we see (in R-devel) quite consistently
>      >>>>
>      >>>>> showC(NA_real_ + 0i)
>      >>>> [1] (R = NA, I = NA)
>      >>>>> showC(NA?????? + 0i)? # NA is 'logical'
>      >>>> [1] (R = NA, I = NA) where as in R 4.3.1 and
>      >>>> "R-patched" -- *in*consistently
>      >>>>
>      >>>>> showC(NA_real_ + 0i)
>      >>>> [1] (R = NA, I = 0)
>      >>>>> showC(NA + 0i)
>      >>>> [1] (R = NA, I = NA) .... and honestly, I do not see
>      >>>> *where* (and when) we changed the underlying code (in
>      >>>> arithmetic.c !?)? in R-devel to *also* produce
>      >>>> NA_complex_? in such complex *arithmetic*
>      >>>>
>      >>>>
>      >>>> ????? > Having said that, one might also (but more
>      >>>> naively) expect
>      >>>>
>      >>>> ????? >
>      >>>> identical(as.complex(as.double(NA_complex_)),
>      >>>> NA_complex_)
>      >>>>
>      >>>> ????? > to be TRUE.
>      >>>>
>      >>>> as in current R-devel
>      >>>>
>      >>>> ????? > Under my proposal it continues to be
FALSE.
>      >>>>
>      >>>> as in "R-release"
>      >>>>
>      >>>> ????? > Well, I'd prefer if it gave FALSE with
a
>      >>>> warning ????? > "imaginary parts discarded in
>      >>>> coercion", but it seems that ????? >
>      >>>> as.double(complex(r=a, i=b)) never warns when either
of
>      >>>> ????? > 'a' and 'b' is NA_real_ or
NaN, even where
>      >>>> "information" ????? > {nonzero
'b'} is clearly lost ...
>      >>>>
>      >>>> The question of *warning* here is related indeed, but
I
>      >>>> think we should try to look at it only *secondary* to
>      >>>> your first proposal.
>      >>>>
>      >>>> ????? > Whatever decision is made about
>      >>>> as.complex(NA_real_), ????? > maybe these points
should
>      >>>> be weighed before it becomes part of ????? >
R-release
>      >>>> ...
>      >>>>
>      >>>> ????? > Mikael
>      >>>>
>      >>>> Indeed.
>      >>>>
>      >>>> Can we please get other opinions / ideas here?
>      >>>>
>      >>>> Thank you in advance for your thoughts!  Martin
>      >>>>
>      >>>> ---
>      >>>>
>      >>>> PS:
>      >>>>
>      >>>> ?? Our *print()*ing? of complex NA's
("NA" here meaning
>      >>>> NA or NaN) ?? is also unsatisfactory, e.g. in the
case
>      >>>> where all entries of a ?? vector are NA in the sense
of
>      >>>> is.na(.), but their ?? Re() and Im() are not all NA:
>      >>>> ??? showC <- function(z) noquote(sprintf("(R
= %g, I >      >>>> %g)", Re(z), Im(z))) ??? z <-
complex(, c(11, NA, NA),
>      >>>> c(NA, 99, NA)) ??? z ??? showC(z)
>      >>>>
>      >>>> gives
>      >>>>
>      >>>> ??? > z ??? [1] NA NA NA ??? > showC(z) ??? [1]
(R >      >>>> 11, I = NA) (R = NA, I = 99) (R = NA, I = NA)
>      >>>>
>      >>>> but that (printing of complex) *is* another issue, in
>      >>>> which we have the re-opened bugzilla PR#16752
>      >>>>
==>https://bugs.r-project.org/show_bug.cgi?id=16752
>      >>>>
>      >>>> on which we also worked during the R Sprint in
Warwick
>      >>>> three weeks ago, and where I want to commit changes
in
>      >>>> any case {but think we should change even a bit more
>      >>>> than we got to during the Sprint}.
>      >>>>
>      >>>> ______________________________________________
>      >>>>R-devel at r-project.org ? mailing list
>      >>>>https://stat.ethz.ch/mailman/listinfo/r-devel
>      >>>
>      >>
>      > --
>      > Herv? Pag?s
>
>      > Bioconductor Core Teamhpages.on.github at gmail.com
>
>
>-- 
Herv? Pag?s

Bioconductor Core Team
hpages.on.github at gmail.com

	[[alternative HTML version deleted]]

R devel - Sep 2023 - Recent changes to as.complex(NA_real_)

[Rd] Recent changes to as.complex(NA_real_)

[Rd] Recent changes to as.complex(NA_real_)

[Rd] Recent changes to as.complex(NA_real_)