On Linux, when I compile R 2.10.0(devel) (src/main/arithmetic.c in particular) with gcc 3.4.5 using the flags -g -O2, I get noncommutative behavior when adding NA and NaN:

   > NA_real_ + NaN
   [1] NaN
   > NaN + NA_real_
   [1] NA

If I compile src/main/arithmetic.c without optimization (just -g) then both of those return NA.

On Windows, using a precompiled R 2.8.1 from CRAN, I get NA for both answers.

On Linux, after compiling src/main/arithmetic.c with -g -O2, the bit patterns for NA_real_ and as.numeric(NA) are different:

   > my_numeric_NA <- as.numeric(NA)
   > writeBin(my_numeric_NA, ptmp<-pipe("od -x", open="wb"));close(ptmp)
   0000000 07a2 0000 0000 7ff8
   0000010
   > writeBin(NA_real_, ptmp<-pipe("od -x", open="wb"));close(ptmp)
   0000000 07a2 0000 0000 7ff0
   0000010

On Linux, after compiling with -g, the bit patterns for NA_real_ and as.numeric(NA) are identical:

   > my_numeric_NA <- as.numeric(NA)
   > writeBin(my_numeric_NA, ptmp<-pipe("od -x", open="wb"));close(ptmp)
   0000000 07a2 0000 0000 7ff8
   0000010
   > writeBin(NA_real_, ptmp<-pipe("od -x", open="wb"));close(ptmp)
   0000000 07a2 0000 0000 7ff8
   0000010

On Windows, using precompiled R 2.8.1 and cygwin/bin/od, both of those gave the 7ff8 version.

Is this confounding of NA and NaN of concern, or does R not promise to keep NA and NaN distinct?

I haven't followed all the macros, but it looks like arithmetic.c just does

   result[i]=x[i]+y[i]

and lets the compiler/floating point unit decide what to do when x[i] and y[i] are different NaN values (NA is a NaN value). I haven't looked at the C code for the initialization of NA_real_. Adding explicit tests for NA-ness in the binary operators (as S+ does) adds a fairly significant cost.

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com
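For readers decoding the od -x dumps above: on a little-endian machine od prints the four 16-bit words least significant first, so "07a2 0000 0000 7ff8" is the 64-bit pattern 0x7ff80000000007a2. The following is a minimal sketch of how such a pattern can be taken apart. It uses integer arithmetic only, so no floating point unit ever touches the values. The function names are hypothetical and are not taken from the R sources.

```c
#include <stdint.h>

/* Reassemble a 64-bit pattern from the four 16-bit words that
   `od -x` prints on a little-endian machine (least significant first). */
uint64_t bits_from_od_words(uint16_t w0, uint16_t w1, uint16_t w2, uint16_t w3)
{
    return (uint64_t)w0 | ((uint64_t)w1 << 16) |
           ((uint64_t)w2 << 32) | ((uint64_t)w3 << 48);
}

/* IEEE 754 binary64: a NaN has all 11 exponent bits set
   and a nonzero 52-bit fraction. */
int bits_are_nan(uint64_t b)
{
    uint64_t exponent = (b >> 52) & 0x7ff;
    uint64_t fraction = b & 0x000fffffffffffffULL;
    return exponent == 0x7ff && fraction != 0;
}

/* The top fraction bit (bit 51) is set for a quiet NaN and clear
   for a signaling NaN, following the convention used on x86. */
int bits_are_quiet_nan(uint64_t b)
{
    return bits_are_nan(b) && ((b >> 51) & 1);
}

/* Low 32 bits of the pattern; in the dumps above this is 0x7a2 (1954). */
uint32_t bits_low_word(uint64_t b)
{
    return (uint32_t)(b & 0xffffffffu);
}
```

Applied to the two dumps, 0x7ff80000000007a2 and 0x7ff00000000007a2 are both NaNs with the same low word 1954. They differ only in bit 51, so the 7ff0 version is the signaling form of the same payload.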
Martin Maechler
2009-May-01 12:14 UTC
[Rd] NA_real_ <op> NaN -> NA or NaN, should we care?
>>>>> William Dunlap <wdunlap at tibco.com>
>>>>>     on Thu, 30 Apr 2009 10:51:43 -0700 writes:

    > On Linux when I compile R 2.10.0(devel) (src/main/arithmetic.c in
    > particular)
    > with gcc 3.4.5 using the flags -g -O2 I get noncommutative behavior when

Is this really gcc 3.4.5 (which is quite old)?
Without being an expert, I'd tend to claim this to be a compiler (optimization) bug ... but most probably the ANSI / ISO C (and libc?) standards would not define the exact behavior of arithmetic with NaNs.

    > adding NA and NaN:
    >> NA_real_ + NaN
    > [1] NaN
    >> NaN + NA_real_
    > [1] NA
    > If I compile src/main/arithmetic.c without optimization (just -g)
    > then both of those return NA.

    > On Windows, using a precompiled R 2.8.1 from CRAN I get
    > NA for both answers.

    > On Linux, after compiling src/main/arithmetic.c with -g -O2 the bit
    > patterns for NA_real_ and as.numeric(NA) are different:
    >> my_numeric_NA <- as.numeric(NA)
    >> writeBin(my_numeric_NA, ptmp<-pipe("od -x", open="wb"));close(ptmp)
    > 0000000 07a2 0000 0000 7ff8
    > 0000010
    >> writeBin(NA_real_, ptmp<-pipe("od -x", open="wb"));close(ptmp)
    > 0000000 07a2 0000 0000 7ff0
    > 0000010
    > On Linux, after compiling with -g the bit patterns for NA_real_
    > and as.numeric(NA) are identical.
    >> my_numeric_NA <- as.numeric(NA)
    >> writeBin(my_numeric_NA, ptmp<-pipe("od -x", open="wb"));close(ptmp)
    > 0000000 07a2 0000 0000 7ff8
    > 0000010
    >> writeBin(NA_real_, ptmp<-pipe("od -x", open="wb"));close(ptmp)
    > 0000000 07a2 0000 0000 7ff8
    > 0000010
    > On Windows, using precompiled R 2.8.1 and cygwin/bin/od, both of those
    > gave the 7ff8 version.

    > Is this confounding of NA and NaN of concern or does R not promise to
    > keep NA and NaN distinct?

Hmm, I'd say it *is* of some concern that "+" is not commutative in the narrow sense, even if I don't know what exactly "R promises".
    > I haven't followed all the macros, but it looks like arithmetic.c just
    > does
    >     result[i]=x[i]+y[i]
    > and lets the compiler/floating point unit decide what to do when x[i]
    > and y[i]
    > are different NaN values (NA is a NaN value). I haven't looked at the C
    > code
    > for the initialization of NA_real_. Adding explicit tests for NA-ness
    > in the
    > binary operators (as S+ does) adds a fairly significant cost.

Yes, I would be quite reluctant to add such tests, because such costs are to be expected.

Maybe we ("R" :-) should explicitly state that operations mixing NA & NaN give a result which is NA in the sense of fulfilling is.na(.) but *not* promise anything further.

Martin Maechler, ETH Zurich

    > Bill Dunlap
    > TIBCO Software Inc - Spotfire Division
    > wdunlap tibco.com
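The trade-off discussed in this message can be sketched in C. The sketch assumes that NA is recognized as a NaN whose low 32 bits hold 1954 (the 07a2 word in the dumps earlier in the thread). The names is_na_real, make_na_real, add_plain and add_na_checked are hypothetical illustrations, not functions from R's arithmetic.c.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Copy the raw bits out of a double without type-punned pointers. */
static uint64_t double_bits(double x)
{
    uint64_t b;
    memcpy(&b, &x, sizeof b);
    return b;
}

/* Assumption: NA is a NaN whose low 32 bits are 1954 (0x7a2). */
int is_na_real(double x)
{
    return isnan(x) && (uint32_t)(double_bits(x) & 0xffffffffu) == 1954u;
}

/* Build an NA value, using the quiet-NaN form of the pattern. */
double make_na_real(void)
{
    uint64_t b = 0x7ff80000000007a2ULL;
    double x;
    memcpy(&x, &b, sizeof x);
    return x;
}

/* Unchecked addition: when both operands are NaNs, the compiler and
   floating point unit decide which payload survives. */
double add_plain(double x, double y)
{
    return x + y;
}

/* Checked addition: NA wins regardless of operand order, at the cost
   of up to two extra tests per element. */
double add_na_checked(double x, double y)
{
    if (is_na_real(x)) return x;
    if (is_na_real(y)) return y;
    return x + y;
}
```

With add_na_checked, NA + NaN and NaN + NA both satisfy is_na_real. With add_plain, the only portable guarantee is that the result is some NaN, which is the weaker promise proposed above.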
> From: Martin Maechler [mailto:maechler at stat.math.ethz.ch]
> Sent: Friday, May 01, 2009 5:15 AM
> To: William Dunlap
> Cc: r-devel at r-project.org
> Subject: Re: [Rd] NA_real_ <op> NaN -> NA or NaN, should we care?
>
> >>>>> William Dunlap <wdunlap at tibco.com>
> >>>>>     on Thu, 30 Apr 2009 10:51:43 -0700 writes:
>
>     > On Linux when I compile R 2.10.0(devel) (src/main/arithmetic.c in
>     > particular)
>     > with gcc 3.4.5 using the flags -g -O2 I get noncommutative behavior when
>
> is this really gcc 3.4.5 (which is quite old) ?

Yes, it was 3.4.5, but here is a self-contained example of the same issue using gcc 4.1.3 on an Ubuntu Linux machine:

   % gcc -O2 t.c -o a.out ; ./a.out
   NA : 7ff00000000007a2
   NaN: fff8000000000000
   NA+NaN: 7ff80000000007a2
   NaN+NA: fff8000000000000
   % gcc t.c -o a.out ; ./a.out
   NA : 7ff00000000007a2
   NaN: fff8000000000000
   NA+NaN: 7ff80000000007a2
   NaN+NA: 7ff80000000007a2
   % gcc -v
   Using built-in specs.
   Target: i486-linux-gnu
   Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --with-gxx-include-dir=/usr/include/c++/4.1.3 --program-suffix=-4.1 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --enable-mpfr --enable-checking=release i486-linux-gnu
   Thread model: posix
   gcc version 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)
   % cat t.c
   #include <stdio.h>
   #include <stdint.h>
   #include <string.h>

   int main(int argc, char *argv[])
   {
      int64_t NA_int64 = 0x7ff00000000007a2LL ;
      int64_t NaN_int64 = 0xfff8000000000000LL ;
      int64_t sum_int64 ;
      double NA_double, NaN_double, sum_double ;

      memcpy((void*)&NA_double, (void*)&NA_int64, 8) ;
      memcpy((void*)&NaN_double, (void*)&NaN_int64, 8) ;

      NaN_double = 1/0.0 - 1/0.0 ;

      printf("NA : %Lx\n", *(int64_t*)&NA_double);
      printf("NaN: %Lx\n", *(int64_t*)&NaN_double);

      sum_double = NA_double + NaN_double ;
      memcpy((void*)&sum_int64, (void*)&sum_double, 8) ;
      printf("NA+NaN: %Lx\n", sum_int64) ;

      sum_double = NaN_double + NA_double ;
      memcpy((void*)&sum_int64, (void*)&sum_double, 8) ;
      printf("NaN+NA: %Lx\n", sum_int64);
      return 0 ;
   }

When I add -Wall to the -O2, it gives me some warnings about the *(int64_t*)&doubleVal in the printf statements for the inputs, but I used memcpy() to avoid the warnings when printing the outputs.

   % gcc -Wall -O2 t.c -o a.out ; ./a.out
   t.c: In function 'main':
   t.c:17: warning: dereferencing type-punned pointer will break strict-aliasing rules
   t.c:18: warning: dereferencing type-punned pointer will break strict-aliasing rules
   NA : 7ff00000000007a2
   NaN: fff8000000000000
   NA+NaN: 7ff80000000007a2
   NaN+NA: fff8000000000000

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com
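The strict-aliasing warnings quoted above come from reading a double through an int64_t pointer. As a hedged alternative, here is a sketch that routes every conversion through memcpy() and prints with the PRIx64 macro from <inttypes.h>. The helper names bits_of_double, double_from_bits and format_double_bits are hypothetical and do not appear in t.c.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Read the bit pattern of a double without pointer type-punning. */
uint64_t bits_of_double(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Build a double from a given bit pattern. */
double double_from_bits(uint64_t bits)
{
    double x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Write the pattern as 16 hex digits into buf, which must hold
   at least 17 bytes. Returns buf for convenience. */
char *format_double_bits(double x, char *buf, size_t buflen)
{
    snprintf(buf, buflen, "%016" PRIx64, bits_of_double(x));
    return buf;
}
```

A call such as printf("NA : %s\n", format_double_bits(NA_double, buf, sizeof buf)) would then replace the pointer casts. One caveat: a signaling pattern such as 0x7ff00000000007a2 may come back with its quiet bit set after being passed around as a double on some platforms, so comparing raw bits is only safe for the quiet form.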