Hi:

When I import a Stata (version 7) data set into R, the missing values for factor and numeric variables are represented as "NaN", but the missing values for date variables are represented as "NA". Can "NaN" for numeric variables be treated as the same as "NA"? Are there situations when these two representations are not equivalent?

The following are the details of my system:

platform i386-pc-mingw32
arch     x86
os       Win32
system   x86, Win32
status
major    1
minor    4.1
year     2002
month    01
day      30
language R

Thanks for your help,
Ravi Varadhan.

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On Wed, 24 Jul 2002, Ravi Varadhan wrote:

> Hi:
>
> When I import a Stata (version 7) data set into R, the missing values
> for factor and numeric variables are represented as "NaN", but the
> missing values for date variables are represented as "NA". Can "NaN"
> for numeric variables be treated as the same as "NA" ? Are there
> situations when these two representations are not equivalent?
>

They are almost always treated the same way. The few exceptions are things like the function is.nan(), which is FALSE for NA, and the fact that unique() on a vector containing NA and NaN will report them as different values.

However, I'm surprised that you are getting NaNs. I get NA for missing numeric data under both Windows and Linux (though the Windows version is 1.5.1), and the C source code looks as if it should give NA (I certainly intended it to).

-thomas
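
The exceptions described above can be checked in a short R session. This is only a minimal sketch: the vector `x` and the variable names are made up for illustration and do not come from the poster's Stata file.

```r
# A numeric vector holding both kinds of missing value (illustrative data).
x <- c(1, NA, NaN, 3)

# is.na() flags both NA and NaN; is.nan() flags only the NaN.
na_flags  <- is.na(x)
nan_flags <- is.nan(x)

# Summary functions with na.rm = TRUE drop NA and NaN alike.
m <- mean(x, na.rm = TRUE)

# unique() keeps NA and NaN as two distinct values.
u <- unique(c(NA, NaN, NA, NaN))

print(na_flags)
print(nan_flags)
print(m)
print(u)
```

So code that only relies on is.na() or na.rm = TRUE should not notice the difference, while code that calls is.nan() or unique() may.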
On Wed, 24 Jul 2002, Ravi Varadhan wrote:

> When I import a Stata (version 7) data set into R, the missing values
> for factor and numeric variables are represented as "NaN", but the
> missing values for date variables are represented as "NA". Can "NaN"
> for numeric variables be treated as the same as "NA" ? Are there
> situations when these two representations are not equivalent?

The following may help:

> is.na(c(NA, NaN))
[1] TRUE TRUE
> is.nan(c(NA, NaN))
[1] FALSE  TRUE

So for almost all purposes NaN is treated the same as missing (NA), but if you really need to, the two can be distinguished.

It looks to me as if this is something we should fix in the foreign package (unless it has already been fixed, given that your version of R is two versions old). Could you provide an example file where you get NaN and expected NA?

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
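
If the distinction does matter for later code, one possible workaround is to convert NaN to NA after the import. The sketch below assumes a hypothetical data frame `dat` standing in for the imported Stata file, and a helper `nan_to_na` that is not part of the foreign package; it only touches numeric columns.

```r
# Hypothetical data frame standing in for an imported Stata file.
dat <- data.frame(
  age    = c(23, NaN, 41),
  income = c(NaN, 52000, NaN),
  id     = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: replace NaN by NA in numeric columns only;
# columns of any other type are returned unchanged.
nan_to_na <- function(col) {
  if (is.numeric(col)) col[is.nan(col)] <- NA
  col
}

# Assigning into dat[] keeps the data frame structure and column names.
dat[] <- lapply(dat, nan_to_na)

print(dat)
```

After this step is.nan() is FALSE everywhere in the numeric columns, so unique() and similar functions see a single kind of missing value.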