Full_Name: Bill Simpson
Version: 0.64.1
OS: linux
Submission from: (NULL) (193.62.250.209)

Here is the data file:

x y z
1 1 1
1 2 2
2 1 NaN
2 2 4

> data<-read.table("~/junk.dat",header=TRUE)
> data
  x y   z
1 1 1   1
2 1 2   2
3 2 1 NaN
4 2 2   4
> matrix(data$z,length(y),length(x))
     [,1] [,2]
[1,]    1    4
[2,]    2    3

This is not the correct matrix. It seems that NaNs screw up matrix().

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
>>>>> wsi writes:

> Full_Name: Bill Simpson
> Version: 0.64.1
> OS: linux
> Submission from: (NULL) (193.62.250.209)

> Here is the data file:
> x y z
> 1 1 1
> 1 2 2
> 2 1 NaN
> 2 2 4

>> data<-read.table("~/junk.dat",header=TRUE)
>> data
>   x y   z
> 1 1 1   1
> 2 1 2   2
> 3 2 1 NaN
> 4 2 2   4

>> matrix(data$z,length(y),length(x))
>      [,1] [,2]
> [1,]    1    4
> [2,]    2    3

> This is not the correct matrix. It seems that NaNs screw up matrix().

Actually, data$z is a factor with one level NaN.

R> data$z
[1] 1   2   NaN 4
Levels:  1 2 4 NaN

What you get is the codes of that, which I think is what you want.

-k
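The factor-code effect described in this reply can be sketched as follows. This is a minimal illustration, not the original session: the factor is built by hand (an assumption, since whether read.table() yields a factor here depends on the R version and platform), and the names codes, wrong, vals and right are made up for the example.

```r
# Build the column as a factor with the levels shown in the reply.
# Levels are given explicitly so the result does not depend on locale sorting.
z <- factor(c("1", "2", "NaN", "4"), levels = c("1", "2", "4", "NaN"))

# The integer codes are positions in the level set, not the original values.
codes <- as.integer(z)
wrong <- matrix(codes, 2, 2)   # reproduces the layout in the bug report

# Converting through the labels recovers the numbers, NaN included.
vals  <- as.numeric(as.character(z))
right <- matrix(vals, 2, 2)

print(wrong)
print(right)
```

Going through as.character() first is the usual way to turn a factor with numeric-looking labels back into numbers; as.integer() alone returns the codes.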
On Tue, 11 May 1999, Bill Simpson wrote:

[examples with read.table(...,na.strings="NaN") snipped]

> > "NaN" (Not a Number) as seen in debuggers, etc., is not the same as "NA"
> > as a "not available/applicable" value in R.
>
> Yes, I realize that.
> However, since there is an R function is.nan(), and through experiments at
> the command line like this
> > xx<-NaN
> > xx+1
> NaN
> I thought that NaN was recognized by the R system and obeyed the proper
> rules. Same for Inf and -Inf.
>
> Or are these things only sometimes recognized and sometimes not? Shouldn't
> they be employed consistently?
>
> Bill
>

I'm going to defer this to people who know more than I about consistency
and the differences between NA and NaN. I will note that the help for
is.na() says:

     The generic function `is.na' returns a logical vector of the same
     ``form'' as its argument `x', containing `TRUE' for those elements
     marked `NA' or `NaN' (!) and `FALSE' otherwise.

The (!) suggests that whoever wrote the help file recognized some
subtlety here ...
[similarly, is.finite() recognizes NA as a non-finite number]

This seems to fall in the general category of "how do we help people avoid
'gotchas' in R that, while logical, are not necessarily what they expect?
Can we do this without blurring logical distinctions that are
interesting/important/significant to more knowledgeable users?"

One *could* extend the default na.strings in read.table to c("NA","NaN"),
which would take care of the immediate problem but might further blur the
subtle distinction between NA and NaN.

I don't know how/whether one would want to extend read.table() to deal
with Inf/-Inf in data files ... (I think not).

Ben
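The distinctions raised in this message can be checked directly. The sketch below assumes a reasonably recent R (the text= argument of read.table() is not available in the 0.64.1 release discussed here); the names x and d are made up for the example, and the inline text stands in for the junk.dat file.

```r
x <- c(1, NA, NaN, Inf, -Inf)

na_flags     <- is.na(x)      # TRUE for both NA and NaN
nan_flags    <- is.nan(x)     # TRUE only for NaN
finite_flags <- is.finite(x)  # FALSE for NA, NaN, Inf and -Inf

# Read the example data with "NaN" declared as a missing-value string,
# which is the extension of na.strings suggested above.
d <- read.table(text = "x y z\n1 1 1\n1 2 2\n2 1 NaN\n2 2 4",
                header = TRUE, na.strings = c("NA", "NaN"))

print(d)
print(is.nan(d$z))  # the NaN has become an ordinary NA
```

With na.strings extended this way the column is numeric, but the entry is stored as NA rather than NaN, which is the blurring of the two that the message warns about.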
Bill argues that NaN should be treated as a number, even by read.table()
and I think he is right.
{ scan(.) does fine ! it *does* treat NaN as numbers ! }

At the moment, I don't even see where the problem is;
the read.table internal function type.convert()
does well with NaN's :

  > cc <- as.character(c(pi,NaN))
  > str(type.convert(cc))
   num [1:2] 3.14 NaN

Martin
> Date: Tue, 11 May 1999 16:26:43 +0200 (MET DST)
> From: maechler@stat.math.ethz.ch
> To: r-devel@stat.math.ethz.ch
> Subject: Re: matrix() can't handle NaN (PR#193)
>
> Bill [Simpson] argues that NaN should be treated as a number,
> even by read.table() and I think he is right.
> { scan(.) does fine ! it *does* treat NaN as numbers ! }
>
> At the moment, I don't even see where the problem is;
> the read.table internal function type.convert()
> does well with NaN's :
>
>   > cc <- as.character(c(pi,NaN))
>   > str(type.convert(cc))
>    num [1:2] 3.14 NaN

The field is regarded as a character field unless na.strings is
extended to include NaN.

I am not in favour of treating NaN as a number, which is linguistic
nonsense! (What does NaN stand for?)

First, in IEEE arithmetic, NaN is a class of numbers as I understand it
(and internally in R NA is one of those NaNs). So just which one do you
mean? And what are you going to achieve by having essentially a second
NA, except to blur the real purposes of NaNs?

Secondly, on non-IEEE implementations of R, I believe NaN = NA: in any
case we should not lightly be making IEEE features an essential part of
the language. (Or with the gradual demise of Vaxen is IEEE arithmetic
now universal: I still have a Sun running with a special-purpose
non-IEEE floating-point unit.)

--
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
Can we backtrack here. I have just tried the original example on
Solaris, and matrix() _can_ handle NaN with a mature ANSI C library.

toucan% cat > junk.dat
x y z
1 1 1
1 2 2
2 1 NaN
2 2 4
toucan% R

R : Copyright 1999, The R Development Core Team
Version 0.64.1  (May 8, 1999)
...
> data <- read.table("junk.dat",header=TRUE)
> x<-unique(data$x)
> y<-unique(data$y)
> matrix(data$z,length(y),length(x))
     [,1] [,2]
[1,]    1  NaN
[2,]    2    4

So on Solaris NaNs are read as such, and on Linux they are not.

> From: Peter Dalgaard BSA <p.dalgaard@biostat.ku.dk>
> Date: 12 May 1999 11:27:17 +0200

> We do allow them as constants in the language itself, so I suppose we
> could do it in read.table as well - this really means inside
> as.double, I suppose. The prototype does this:
>
> Splus> is.nan(as.double("NaN"))
> [1] T

and so does my R on Solaris. On Linux

> is.nan(as.double("NaN"))
Warning: NAs introduced by coercion
[1] FALSE

so I think there is a problem with Linux versions specifically. The
conversion is done by strtod. Solaris says (same man page)

     If str is NaN, then atof() returns NaN.

This sort of mess is my underlying objection: should we be assuming
IEEE arithmetic, let alone a particular implementation of it?

--
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
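The per-platform difference reported in this message can be probed with a few lines of R. This is only a sketch: the variable names are made up for the example, and which branch is taken depends on the R build and its C library, as the Solaris and Linux transcripts show.

```r
# suppressWarnings() because a build that cannot parse the string reports
# "NAs introduced by coercion", as in the Linux transcript.
v <- suppressWarnings(as.double("NaN"))

parsed_as_nan <- is.nan(v)  # TRUE where the string is converted to NaN
is_missing    <- is.na(v)   # TRUE in both cases, NaN or NA

if (parsed_as_nan) {
  cat("as.double(\"NaN\") gives NaN on this build\n")
} else {
  cat("as.double(\"NaN\") gives NA on this build\n")
}
```

Because is.na() is TRUE for both outcomes, only is.nan() distinguishes the two behaviours.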