thr3ads.net - R devel - [Rd] 1954 from NA [May 2021]

Adrian Dușa

2021-May-24 09:46 UTC

[Rd] 1954 from NA

On Sun, May 23, 2021 at 10:14 PM Tomas Kalibera <tomas.kalibera at gmail.com> wrote:
> [...]
>
> Good, but unfortunately the delineation between computation and
> non-computation is not always transparent. Even if an operation doesn't
> look like "computation" on the high-level, it may internally involve
> computation - so, really, an R NA can become R NaN and vice versa, at any
> point (this is not a "feature", but it is how things are now).
>
I see.
Well, this is a risk we'll have to consider when the time comes. For the
moment, storing some metadata within the payload seems to work.


> [...]
>
> Ok, then I would probably keep the meta-data on the missing values on the
> side to implement such missing values in such code, and treat them
> explicitly in supported operations.
>
> But, in principle, you can use the floating-point NaN payloads, and you
> can pass such values to R. You just need to be prepared to lose not only
> your payloads/tags, but also the difference between R NA and R NaNs.
> Thanks to value semantics of R, you would not lose the tags in input
> values with proper reference counts (e.g. marked immutable), because those
> values will not be modified.
>
NaNs are fine of course, but then some (social science?) users might get
confused about the difference between NAs and NaNs, and for this reason
only I would still like to preserve the 1954 payload.
If at all possible, however, the extra 16 bits from this payload would make
a whole lot of a difference.

Please forgive my persistence, but would it be possible to use an unsigned
short instead of an unsigned int for the 1954 payload?
That is, if it doesn't break anything, but I don't really see what it
could. The corresponding check function seems to work just fine and it
doesn't need to be changed at all:

int R_IsNA(double x)
{
    if (isnan(x)) {
        ieee_double y;
        y.value = x;
        return (y.word[lw] == 1954);
    }
    return 0;
}
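
For anyone who wants to try the check outside the R sources, here is a minimal, self-contained sketch. It assumes the layout implied by the function above: an R NA is a NaN whose upper 32-bit word is 0x7FF00000 and whose lower 32-bit word is 1954. The helper names make_nan_with_low_word, low_word and is_na_like are hypothetical and not part of R, and memcpy is used instead of the ieee_double union so that the sketch does not depend on the hw/lw endianness indices.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of R): build a NaN whose exponent bits are
 * all ones and whose lower 32-bit word is `low`. `low` must be non-zero,
 * otherwise the bit pattern would encode +Inf rather than a NaN. */
static double make_nan_with_low_word(uint32_t low)
{
    uint64_t bits = ((uint64_t)0x7FF00000u << 32) | (uint64_t)low;
    double x;
    memcpy(&x, &bits, sizeof x);   /* bit-exact copy, no conversion */
    return x;
}

/* Hypothetical helper: extract the lower 32 bits of a double's representation. */
static uint32_t low_word(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return (uint32_t)(bits & 0xFFFFFFFFu);
}

/* Same test as R_IsNA above: a NaN whose lower word equals 1954. */
static int is_na_like(double x)
{
    return isnan(x) && low_word(x) == 1954u;
}
```

With these definitions, is_na_like(make_nan_with_low_word(1954)) is 1, while is_na_like(make_nan_with_low_word(1)) and is_na_like(1.0) are 0. Nothing here guarantees that the lower word survives arithmetic on the value.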

Best wishes,
Adrian

Tomas Kalibera

2021-May-24 10:31 UTC

On 5/24/21 11:46 AM, Adrian Dușa wrote:
> On Sun, May 23, 2021 at 10:14 PM Tomas Kalibera
> <tomas.kalibera at gmail.com> wrote:
>
>     [...]
>
>     Good, but unfortunately the delineation between computation and
>     non-computation is not always transparent. Even if an operation
>     doesn't look like "computation" on the high-level, it may
>     internally involve computation - so, really, an R NA can become R
>     NaN and vice versa, at any point (this is not a "feature", but it
>     is how things are now).
>
>
> I see.
> Well, this is a risk we'll have to consider when the time comes. For
> the moment, storing some metadata within the payload seems to work.
>
>     [...]
>
>     Ok, then I would probably keep the meta-data on the missing values
>     on the side to implement such missing values in such code, and
>     treat them explicitly in supported operations.
>
>     But, in principle, you can use the floating-point NaN payloads,
>     and you can pass such values to R. You just need to be prepared
>     to lose not only your payloads/tags, but also the difference
>     between R NA and R NaNs. Thanks to value semantics of R, you
>     would not lose the tags in input values with proper reference
>     counts (e.g. marked immutable), because those values will not
>     be modified.
>
> NaNs are fine of course, but then some (social science?) users might
> get confused about the difference between NAs and NaNs, and for this 
> reason only I would still like to preserve the 1954 payload.
> If at all possible, however, the extra 16 bits from this payload would 
> make a whole lot of a difference.
>
> Please forgive my persistence, but would it be possible to use an 
> unsigned short instead of an unsigned int for the 1954 payload?
> That is, if it doesn't break anything, but I don't really see what it
> could. The corresponding check function seems to work just fine and it
> doesn't need to be changed at all:
>
> int R_IsNA(double x)
> {
>     if (isnan(x)) {
>         ieee_double y;
>         y.value = x;
>         return (y.word[lw] == 1954);
>     }
>     return 0;
> }

For the reasons I explained, I would be against such a change. Keeping
the data on the side, as also recommended by others on this list, would
allow you to build a reliable implementation. I don't want to support
fragile package code that builds on unspecified R internals, particularly
when, as in this case, the internals themselves have not stood the test
of time and so are at high risk of change.
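
A minimal sketch of what "keeping the data on the side" could look like in plain C, offered only as an illustration: the tagged_vec type and its functions are hypothetical names, not part of R or any package. The values are ordinary doubles, and the reason a value is missing lives in a parallel array, so nothing depends on a NaN payload surviving computation.

```c
#include <math.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical container: values plus a parallel array of missing-value tags. */
typedef struct {
    size_t n;
    double *values;        /* ordinary doubles; NAN where the value is missing */
    unsigned char *tags;   /* 0 = not missing, otherwise a user-defined code */
} tagged_vec;

/* Allocate a zero-initialised vector of length n; returns NULL on failure. */
static tagged_vec *tagged_vec_new(size_t n)
{
    tagged_vec *v = malloc(sizeof *v);
    if (v == NULL) return NULL;
    v->n = n;
    v->values = calloc(n, sizeof *v->values);
    v->tags = calloc(n, sizeof *v->tags);
    if (v->values == NULL || v->tags == NULL) {
        free(v->values);
        free(v->tags);
        free(v);
        return NULL;
    }
    return v;
}

static void tagged_vec_free(tagged_vec *v)
{
    if (v == NULL) return;
    free(v->values);
    free(v->tags);
    free(v);
}

/* Mark element i as missing; the reason is stored beside the value, not in it. */
static void tagged_vec_set_missing(tagged_vec *v, size_t i, unsigned char tag)
{
    v->values[i] = NAN;
    v->tags[i] = tag;
}

/* Return the tag for element i (0 if the element is not missing). */
static unsigned char tagged_vec_tag(const tagged_vec *v, size_t i)
{
    return v->tags[i];
}
```

Because the tag is stored separately, arithmetic on values[i] can turn the NaN into any other NaN without affecting tags[i]; operations that should respect the tags have to consult the parallel array explicitly.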

Best
Tomas
>
> Best wishes,
> Adrian
>
>
>