On Sun, 28 May 2000 cstrato at EUnet.at wrote:

> Dear R-people
>
> Sorry for asking a question only indirectly related to S/R, but since
> data containing NA values can so easily be handled in S/R, and you can
> write functions for S/R in C, my question is:
> How do you handle data containing NA in C/C++?
>
> Although I know that IEEE floating point arithmetic supports NaN and
> Inf, I cannot find any information about this (e.g. in any of my many
> C++ books).
>
> Thank you in advance for your help
> Christian Stratowa, Vienna

It's in Writing R Extensions (sections 3.7.3 and 4.4 in the copy I have to
hand, but it's in the concept index). You cannot assume in R that NA is
represented by an NaN, although on most machines it is. Conversely, most
NaNs are not NA.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Dear R-people

Sorry for asking a question only indirectly related to S/R, but since
data containing NA values can so easily be handled in S/R, and you can
write functions for S/R in C, my question is:
How do you handle data containing NA in C/C++?

Although I know that IEEE floating point arithmetic supports NaN and
Inf, I cannot find any information about this (e.g. in any of my many
C++ books).

Thank you in advance for your help
Christian Stratowa, Vienna
Prof Brian D Ripley <ripley at stats.ox.ac.uk> writes:

> > Although I know that IEEE floating point arithmetics supports NaN and
> > Inf, I cannot find
> > any information about this (e.g. in any of my many C++ books)
...
> It's in Writing R Extensions (sections 3.7.3 and 4.4 in the copy I have to
> hand, but it's in the concept index). You cannot assume in R that NA is
> represented by an NaN, although on most machines it is. Conversely, most
> NaNs are not NA.

Perhaps it is necessary to be a little more specific here: The IEEE
NaN is not a single value, but a set of values characterized by having
an all-ones exponent and a non-zero significand (the cases with a zero
significand are +Inf and -Inf). Have a look at

http://www.linuxsupportline.com/~billm/index.html

for the details.

The double NA in R on IEEE-supporting systems is the NaN with
significand 1954 (no, I don't know who was born that year...).

However integers have no definition of NaN, so NaInt is INT_MIN and
for systems that don't support IEEE, we have some special hacks too.
Have a look in src/include/R_ext/Arith.h and src/main/arithmetic.c.

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
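To make the bit-level description above concrete, here is a minimal C
sketch. It assumes a 64-bit IEEE 754 double and reads "significand 1954"
as "the low 32 bits of the significand hold 1954"; that reading, and every
name below (double_bits, make_na, is_na_bits, NA_INT_SKETCH), is an
illustrative assumption and not R's actual code, which lives in the files
mentioned above.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

#define EXP_MASK   UINT64_C(0x7FF0000000000000)  /* 11 exponent bits    */
#define FRAC_MASK  UINT64_C(0x000FFFFFFFFFFFFF)  /* 52 significand bits */
#define NA_PAYLOAD UINT64_C(1954)                /* assumed NA marker   */

/* Copy the 64 bits of a double into an integer (no aliasing tricks). */
uint64_t double_bits(double x)
{
    uint64_t bits;
    assert(sizeof bits == sizeof x);  /* sketch assumes 64-bit doubles */
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Inverse of double_bits: build a double from a raw bit pattern. */
double bits_double(uint64_t bits)
{
    double x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* NaN: all-ones exponent and a non-zero significand. */
int is_nan_bits(double x)
{
    uint64_t b = double_bits(x);
    return (b & EXP_MASK) == EXP_MASK && (b & FRAC_MASK) != 0;
}

/* +Inf or -Inf: all-ones exponent and a zero significand. */
int is_inf_bits(double x)
{
    uint64_t b = double_bits(x);
    return (b & EXP_MASK) == EXP_MASK && (b & FRAC_MASK) == 0;
}

/* Hypothetical NA: a NaN whose low 32 bits are 1954. */
double make_na(void)
{
    return bits_double(EXP_MASK | NA_PAYLOAD);
}

/* Every NA is a NaN, but only NaNs carrying the marker count as NA. */
int is_na_bits(double x)
{
    return is_nan_bits(x) &&
           (double_bits(x) & UINT64_C(0xFFFFFFFF)) == NA_PAYLOAD;
}

/* Integers have no NaN, so the smallest int is reserved as the marker. */
#define NA_INT_SKETCH INT_MIN
```

With these definitions, make_na() is reported as both NaN and NA, while an
ordinary quiet NaN such as bits_double(0x7FF8000000000000) is NaN but not
NA, which is the "most NaNs are not NA" point made earlier in the thread.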
Dear Prof Ripley

Thank you very much for your fast response.
I had not realized that CRAN now has official documents.

Section 3.7.3 mentions macros in "Arith.h" for handling NAs.
I assume that these macros can also be used in normal C programs.

From your answer, I assume that C/C++ libraries such as the STL are
normally not able to deal with NAs?

Best regards
Christian Stratowa
Dear Dr. Dalgaard

Thank you, too, for your fast response.

I have checked the web-site you mentioned. There is a function:
   int isnan(floating-type x)
for floating point numbers, which I could use.
For integers I will check Arith.h and arithmetic.c.

Best regards
Christian Stratowa
On Sun, 28 May 2000 cstrato at EUnet.at wrote:

> Dear Dr. Dalgaard
>
> Thank you, too, for your fast response.
>
> I have checked the web-site you mentioned. There is a function:
>    int isnan(floating-type x)
> for floating point numbers, which I could use.
> For integers I will check Arith.h and Arithmetic.c

Unfortunately, we found isnan is not 100% reliable. R defines

    int R_IsNaNorNA(double);

for that purpose, and makes sure it works on all the platforms.
(Note that some platforms that have isnan are marked as non-IEEE
by the configure process.)

My memory is that the real mess came with finite() and Inf/-Inf (at least
one compiler had Inf < 3 true), but we also had problems with isnan
returning a value which differed from true (as in 1 == 1).

-- 
Brian D. Ripley
I'm looking for a way to do something like Kernel Density Estimation in R.
I have some very big sets of things like packet interarrival times and
would like to make plots of the estimated density function.

Can anyone help me?

Murray Jorgensen,  Department of Statistics,  U of Waikato, Hamilton, NZ
-----[+64-7-838-4773]---------------------------[maj at waikato.ac.nz]-----
"Doubt everything or believe everything: these are two equally convenient
strategies. With either we dispense with the need to think."
http://www.stats.waikato.ac.nz/Staff/maj.html          - Henri Poincaré
On Mon, 29 May 2000, Murray Jorgensen wrote:

> I'm looking for a way to do something like Kernel Density Estimation in R.
> I have some very big sets of things like packet interarrival times and
> would like to make plots of the estimated density function.
>
> Can anyone help me?

Well, R itself has density, and the MASS library has bandwidth selection
techniques for it. That should work well for large datasets (up to the
limits of R, anyway).

A similar and perhaps even more capable approach is binning as taken by the
library KernSmooth, an R version of which is packaged on CRAN.

logspline and locfit (also on CRAN) have more sophisticated approaches
to density estimation.

There are comparisons and details of what is available in V&R3 and in
particular in our on-line statistics complements.

-- 
Brian D. Ripley
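For readers curious why binning helps with very big data sets, the idea can
be sketched in C: first reduce the n observations to counts on a grid of m
points, then smooth the counts, so the smoothing cost depends on m rather
than on n. This is only an illustration under stated assumptions and not
the code of density or KernSmooth: the nearest-grid-point binning, the
Epanechnikov kernel (chosen so that no maths library is needed), and the
names bin_counts and binned_density are all hypothetical choices made here.

```c
#include <assert.h>
#include <stddef.h>

/* Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere. */
double epanechnikov(double u)
{
    return (u < -1.0 || u > 1.0) ? 0.0 : 0.75 * (1.0 - u * u);
}

/* Step 1: assign each observation to its nearest of m equally spaced
 * grid points on [lo, hi]. Values outside the range are skipped; the
 * test is written so that NaN values are skipped as well. */
void bin_counts(const double *x, size_t n, double lo, double hi,
                double *counts, size_t m)
{
    assert(m >= 2 && hi > lo);
    double delta = (hi - lo) / (double)(m - 1);
    for (size_t j = 0; j < m; j++)
        counts[j] = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (!(x[i] >= lo && x[i] <= hi))
            continue;
        size_t j = (size_t)((x[i] - lo) / delta + 0.5);
        if (j >= m)
            j = m - 1;
        counts[j] += 1.0;
    }
}

/* Step 2: kernel estimate at every grid point from the counts alone:
 *   dens[g] = 1/(n*h) * sum_j counts[j] * K((grid_g - grid_j) / h)
 * where n is the number of observations and h the bandwidth. */
void binned_density(const double *counts, size_t m, double lo, double hi,
                    double h, size_t n, double *dens)
{
    assert(m >= 2 && hi > lo && h > 0.0 && n > 0);
    double delta = (hi - lo) / (double)(m - 1);
    for (size_t g = 0; g < m; g++) {
        double sum = 0.0;
        for (size_t j = 0; j < m; j++) {
            if (counts[j] != 0.0) {
                double u = ((double)g - (double)j) * delta / h;
                sum += counts[j] * epanechnikov(u);
            }
        }
        dens[g] = sum / ((double)n * h);
    }
}
```

The bandwidth h is taken as given here; choosing it well is the job of the
bandwidth selectors mentioned above, and in practice the R functions named
in the reply are the sensible route.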
A few general comments.

The IEEE 754 floating-point standard is one of the more striking
successes in getting computer hardware to be more useful for those who
program. There are, of course, glitches, both in non-compliance and
in holes in the standard, but if we can work within the standard as
far as possible, while complaining about the glitches, we'll be better
off, and C/C++ software produced will be more likely to port
gracefully to other environments.

From that view, using C routines that are part of the standard (while
perhaps overriding them on machines that don't conform) has advantages
inside your own C/C++ code, IF that code is not intrinsically R/S
dependent. So isnan() would be better in that case, R_IsNaNorNA()
better for code that is R-dependent.

Where it makes sense, there is also an advantage to doing the relevant
testing in the S language and passing the result to the C code, either
directly, say as a logical vector argument, or indirectly by doing the
selection outside and leaving the C code to just grind away on the
selected subset of the data. Within the S language, is.na() is the
best test, because it deals with either floating point or integer
data.

Anyone interested in the relevance of the standard, or just in a read
through some insightful if eccentric ranting about numerical
computation generally, should eventually encounter W. Kahan, "the
father of IEEE 754".

There is a directory on the web at the Berkeley CS department:
  www.cs.berkeley.edu/%7Ewkahan/ieee754status/
All the papers in that directory are worth looking at, allowing
for Kahan's legendary rages at all those who failed his standards.
Having had the privilege (well, looking back on it anyway) of taking a
course from Kahan, I can verify that his personality comes across well
in the papers.

-- 
John M. Chambers                  jmc at bell-labs.com
Bell Labs, Lucent Technologies    office: (908)582-2681
700 Mountain Avenue, Room 2C-282  fax:    (908)582-3340
Murray Hill, NJ  07974            web: http://www.cs.bell-labs.com/~jmc
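The "test in S, compute in C" suggestion can be sketched as follows. The C
routine never looks for NA itself; it receives a 0/1 vector computed on the
S side and works only on the flagged elements. The function name
sum_selected and its argument layout are hypothetical; the all-pointer
signature is chosen on the assumption that it would be called through the
.C interface, where logical vectors arrive as int.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the elements of x whose flag in keep is non-zero.
 * x      : data vector (may contain NA/NaN in unflagged positions)
 * keep   : 0/1 flags computed by the caller, e.g. from !is.na(x)
 * n      : pointer to the common length of x and keep
 * result : pointer to the output value */
void sum_selected(const double *x, const int *keep, const int *n,
                  double *result)
{
    assert(x != NULL && keep != NULL && n != NULL && result != NULL);
    assert(*n >= 0);
    double total = 0.0;
    for (int i = 0; i < *n; i++) {
        if (keep[i])
            total += x[i];   /* unflagged values are never touched */
    }
    *result = total;
}
```

On the S/R side the call might look something like
.C("sum_selected", as.double(x), as.integer(!is.na(x)),
as.integer(length(x)), result = double(1)); treat that line as an untested
illustration of the idea, not as a recipe.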
On Mon, 29 May 2000, John Chambers wrote:

> A few general comments.
>
> The IEEE 754 floating-point standard is one of the more striking
> successes in getting computer hardware to be more useful for those who
> program. There are, of course, glitches, both in non-compliance and
> in holes in the standard, but if we can work within the standard as
> far as possible, while complaining about the glitches, we'll be better
> off, and C/C++ software produced will be more likely to port
> gracefully to other environments.
>
> From that view, using C routines that are part of the standard (while
> perhaps overriding them on machines that don't conform) has advantages
> inside your own C/C++ code, IF that code is not intrinsically R/S
> dependent. So isnan() would be better in that case, R_IsNaNorNA()
> better for code that is R-dependent.

Since John may not have read *all* the R code yet, that is precisely what
R_IsNaNorNA is: a wrapper for isnan on machines which have a working isnan,
and something else otherwise. Namely:

    #ifdef IEEE_754
    int R_IsNaNorNA(double x)
    {
    /* NOTE: some systems do not return 1 for TRUE. */
        return (isnan(x) != 0);
    }
    #else
    ....

where IEEE_754 is only set after testing (somewhat) functionality.

For finite, which is buggier, the 1997 draft revision to the ANSI C
standard promised isfinite and defined it tightly. It's just that
neither that revision nor isfinite seems to be making any progress.

One thing the R project keeps on teaching me is the importance of
pragmatism here: a large proportion of the bug-fixing time is actually
bug-avoidance over all of an increasing range of machines. Even so, I was
unprepared for the truth of Inf < 3 on Visual C++!

I would prefer to be pragmatic with correct answers than purist with wrong
ones.

-- 
Brian D. Ripley
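In the same pragmatic spirit, here is a small standalone C sketch of the
wrapper idea: normalise whatever non-zero value isnan returns to exactly 1,
and keep fallbacks that rely only on IEEE comparison rules instead of on
the library's isnan or finite. The names is_nan_or_na, is_nan_portable,
is_finite_portable and count_nan are hypothetical, the code assumes a C99
compiler, and the fallbacks assume IEEE semantics are honoured (options
such as -ffast-math can optimise them away).

```c
#include <assert.h>
#include <math.h>     /* isnan (C99 macro) */
#include <stddef.h>

/* Wrapper in the style quoted above: always returns exactly 0 or 1. */
int is_nan_or_na(double x)
{
    return isnan(x) != 0;
}

/* Fallback: under IEEE rules NaN is the only value unequal to itself.
 * volatile discourages the compiler from folding the comparison. */
int is_nan_portable(double x)
{
    volatile double y = x;
    return y != y;
}

/* Fallback for finiteness: x - x is 0 for finite x, and NaN for
 * +Inf, -Inf and NaN, so no ordering comparison with Inf is needed. */
int is_finite_portable(double x)
{
    volatile double d = x - x;
    return d == 0.0;
}

/* Count the NaN (and hence NA) entries in a vector of length n. */
int count_nan(const double *x, int n)
{
    assert(n >= 0 && (x != NULL || n == 0));
    int count = 0;
    for (int i = 0; i < n; i++)
        count += is_nan_or_na(x[i]);
    return count;
}
```

Because is_nan_or_na returns exactly 0 or 1, its results can be summed as
in count_nan, which would not be safe with an isnan that returns some
other non-zero value for true.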
Dear experts

Thank you all for this interesting information. I have learned a lot
and hopefully can use it in C++.

A personal note: I especially liked the document
http://www.cs.berkeley.edu/%7Ewkahan/ieee754status/754story.html
since it mentions SANE as one of the few environments supporting the
IEEE standard. The phonebook edition of "Inside Macintosh" was my only
source until now. Luckily, you have now given me a lot of information
on this issue.

Best regards
Christian Stratowa, Vienna