Saptarshi Guha
2009-Dec-31 20:43 UTC
[Rd] Benefit of treating NA and NaN differently for numerics
Hello,

I notice in main/arithmetic.c that NA and NaN are encoded differently (since every numeric NA comes from R_NaReal, which is defined via ValueOfNA). What is the benefit of treating these two differently? Why can't NA be a synonym for NaN?

Thank you
Saptarshi
(R-2.9)
(Ted Harding)
2009-Dec-31 21:05 UTC
[Rd] Benefit of treating NA and NaN differently for numerics
On 31-Dec-09 20:43:43, Saptarshi Guha wrote:
> Hello,
> I notice in main/arithmetic.c, that NA and NaN are encoded
> differently(since every numeric NA comes from R_NaReal which is
> defined via ValueOfNA)
> What is the benefit of treating these two differently? Why can't NA
> be a synonym for NaN?
>
> Thank you
> Saptarshi
> (R-2.9)

Because they are used to represent different things. Others will be able to give you a much more comprehensive account than I can of their uses in R, but essentially:

NaN represents a result which is not valid (i.e. "Not a Number") in the domain of quantities being evaluated. For example, R does its arithmetic by default in the domain of "double", i.e. the machine representation of real numbers. In this domain, sqrt(-1) does not exist: it is not a number in the domain of real numbers. Hence:

  sqrt(-1)
  # [1] NaN
  # Warning message:
  # In sqrt(-1) : NaNs produced

In order to obtain a result which does exist, you need to switch domain to complex numbers:

  sqrt(as.complex(-1))
  # [1] 0+1i

NA, on the other hand, represents a value (in whatever domain: double, logical, character, ...) which is not known, which is why it is typically used to represent missing data. It would be a valid entity in the current domain if its value were known, but the value is not known. Hence the result of an expression involving NA quantities is generally NA: the value of the expression would depend on the unknown elements, and hence the value of the expression is unknown.

This distinction is important and useful, so it should not be done away with by merging NaN and NA!

Best wishes,
Ted.

E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
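As a rough illustration of how "unknown" and "not a number" can share one double and still be told apart, here is a minimal C sketch. It assumes 64-bit IEEE 754 doubles and uses the tag 1954 in the low 32 bits, which to my understanding is what ValueOfNA writes in main/arithmetic.c. The helper names (bits_of, make_na, is_na, is_plain_nan) are made up for this example and are not R's API.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(double) == sizeof(uint64_t), "expects 64-bit doubles");

/* Tag stored in the low 32 bits of an NA (assumed to match R's value). */
#define NA_TAG UINT64_C(1954)

/* Copy the bit pattern of a double into an integer. */
static uint64_t bits_of(double x) {
    uint64_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Build an NA-like value: exponent field all ones, low word = tag.
   The nonzero fraction makes it a NaN as far as the hardware cares. */
static double make_na(void) {
    uint64_t u = (UINT64_C(0x7FF00000) << 32) | NA_TAG;
    double x;
    memcpy(&x, &u, sizeof x);
    return x;
}

/* NA: a NaN whose low 32 bits carry the tag. */
static int is_na(double x) {
    return isnan(x) && (bits_of(x) & UINT64_C(0xFFFFFFFF)) == NA_TAG;
}

/* "Plain" NaN: a NaN that is not tagged as NA. */
static int is_plain_nan(double x) {
    return isnan(x) && !is_na(x);
}
```

With these helpers, is_na(make_na()) is true and a NaN produced by an invalid operation such as 0.0 / 0.0 satisfies is_plain_nan() on typical platforms, while isnan() alone cannot tell the two apart. At the R level this corresponds to is.nan(NA) being FALSE even though is.na(NaN) is TRUE.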
Duncan Murdoch
2009-Dec-31 21:10 UTC
[Rd] Benefit of treating NA and NaN differently for numerics
On 31/12/2009 3:43 PM, Saptarshi Guha wrote:
> Hello,
> I notice in main/arithmetic.c, that NA and NaN are encoded
> differently(since every numeric NA comes from R_NaReal which is
> defined via ValueOfNA)
> . What is the benefit of treating these two differently? Why can't NA
> be a synonym for NaN?

I don't know of any cases where a useful distinction is made between NA and NaN, but I suppose it could be useful to know where the bad value came from. R functions rarely generate NaN directly; it usually comes from the hardware or runtime library.

And by the way, as the thread containing this message shows,

http://finzi.psych.upenn.edu/R/R-devel/2009-August/054319.html

there are several different encodings which are displayed as NA, and a huge number (more than 2^50, I seem to recall) of different encodings displayed as NaN.

Duncan Murdoch
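The counts mentioned here can be checked with a little arithmetic on the IEEE 754 double layout (1 sign bit, 11 exponent bits, 52 fraction bits). The C sketch below is only an illustration: it assumes that a value is displayed as NA whenever it is a NaN whose low 32 bits equal 1954, and the function names are hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(double) == sizeof(uint64_t), "expects 64-bit doubles");

/* Reinterpret a 64-bit pattern as a double. */
static double from_bits(uint64_t u) {
    double x;
    memcpy(&x, &u, sizeof x);
    return x;
}

/* NaN patterns: exponent field all ones, fraction nonzero, either sign. */
static uint64_t count_nan_patterns(void) {
    uint64_t nonzero_fractions = (UINT64_C(1) << 52) - 1;
    return 2 * nonzero_fractions;          /* 2^53 - 2 */
}

/* Patterns a low-word check would report as NA: the low 32 bits are
   fixed to the tag, leaving the sign bit and the upper 20 fraction
   bits free. */
static uint64_t count_na_patterns(void) {
    return UINT64_C(2) << 20;              /* 2^21 */
}

/* Assumed check: a NaN whose low 32 bits equal 1954. */
static int has_na_tag(double x) {
    uint64_t u;
    memcpy(&u, &x, sizeof u);
    return isnan(x) && (u & UINT64_C(0xFFFFFFFF)) == 1954;
}
```

Under these assumptions there are 2^53 - 2 NaN bit patterns, comfortably more than 2^50. The patterns 0x7FF00000000007A2 and 0x7FF80000000007A2 differ only in the top fraction bit, yet both pass the tag check, which is one way several encodings can end up displayed as NA.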