David Kane
2002-Apr-30 14:47 UTC
[R] display of character NA's in a dataframe in 1.5.0
I understand that NA's in character vectors are displayed differently than NA's
in factor vectors.

> c("x", NA, "y")
[1] "x" NA  "y"
> as.factor(c("x", NA, "y"))
[1] x    <NA> y
Levels: x y

That seems sensible enough. But shouldn't I see the same behavior in a dataframe?

> test <- data.frame(a = c("x", NA, "y"))
> test
     a
1    x
2 <NA>
3    y
> is.factor(test$a)
[1] TRUE
> is.character(test$a)
[1] FALSE

This behavior is correct since R coerces `a' to be a factor as it constructs
the test dataframe. But consider what happens when I force `a' to be character:

> test$a <- as.character(test$a)
> is.factor(test$a)
[1] FALSE
> is.character(test$a)
[1] TRUE
> test
     a
1    x
2 <NA>
3    y

The display is the same. I would have expected it to be something like:

> test
   a
1  x
2 NA
3  y

If this is a bug, please let me know and I would be happy to submit it as
such. But, I suspect that it is more likely that there is something that I
don't fully understand about NA's and dataframes.

> R.version
         _
platform sparc-sun-solaris2.6
arch     sparc
os       solaris2.6
system   sparc, solaris2.6
status
major    1
minor    5.0
year     2002
month    04
day      29
language R

Thanks,

Dave Kane
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Prof Brian Ripley
2002-Apr-30 15:38 UTC
[R] display of character NA's in a dataframe in 1.5.0
This is not a bug, and it is described in the section on USER-VISIBLE CHANGES
in the NEWS file!

Hint: consider the setting of the quote argument when printing.

On Tue, 30 Apr 2002, David Kane wrote:

> I understand that NA's in character vectors are displayed differently than NA's
> in factor vectors.
>
> > c("x", NA, "y")
> [1] "x" NA  "y"
> > as.factor(c("x", NA, "y"))
> [1] x    <NA> y
> Levels: x y
>
> That seems sensible enough. But shouldn't I see the same behavior in a dataframe?
>
> > test <- data.frame(a = c("x", NA, "y"))
> > test
>      a
> 1    x
> 2 <NA>
> 3    y
> > is.factor(test$a)
> [1] TRUE
> > is.character(test$a)
> [1] FALSE
>
> This behavior is correct since R coerces `a' to be a factor as it constructs
> the test dataframe. But consider what happens when I force `a' to be character:
>
> > test$a <- as.character(test$a)
> > is.factor(test$a)
> [1] FALSE
> > is.character(test$a)
> [1] TRUE
> > test
>      a
> 1    x
> 2 <NA>
> 3    y
>
> The display is the same. I would have expected it to be something like:
>
> > test
>    a
> 1  x
> 2 NA
> 3  y

But then you would have thought NA meant Nabisco.

> If this is a bug, please let me know and I would be happy to submit it as
> such. But, I suspect that it is more likely that there is something that I
> don't fully understand about NA's and dataframes.
>
> > R.version
>          _
> platform sparc-sun-solaris2.6
> arch     sparc
> os       solaris2.6
> system   sparc, solaris2.6
> status
> major    1
> minor    5.0
> year     2002
> month    04
> day      29
> language R
>
> Thanks,
>
> Dave Kane

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
David Kane
2002-Apr-30 19:13 UTC
[R] display of character NA's in a dataframe in 1.5.0
Thanks to Douglas Bates, Brian Ripley and Don MacQueen for taking the time to
answer my question. I think that Doug Bates puts it best when he writes:

>The difference in the display of character NA's is according to
>whether the quote option is on. By default it is on for character
>vectors, so an NA is displayed as NA and the character string "NA" is
>displayed as "NA", and off for data frames, so an NA is displayed as
><NA> and the character string "NA" is displayed as NA.

So, in my example, we can see:

> c("x", NA, "y")
[1] "x" NA  "y"
> print(c("x", NA, "y"), quote = FALSE)
[1] x    <NA> y
> data.frame(a = c("x", NA, "y"))
     a
1    x
2 <NA>
3    y
> print(data.frame(a = c("x", NA, "y")), quote = TRUE)
    a
1 "x"
2  NA
3 "y"

As best I can tell, this is not an "option," in the sense of something that the
user can set with options(). That is, unless you want to override the default
print methods, the display will be different for vectors and for dataframes.

There was also information about this in the NEWS file for 1.5.0.

Thanks to all,

Dave Kane
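The distinction Doug Bates describes can be checked directly. The following is a minimal sketch, assuming a recent R release (column padding may differ from 1.5.0); the vector name `brand` and its contents are made up for illustration and do not come from the thread. It puts a true missing value next to the literal string "NA" and captures how each print setting renders them.

```r
# Hypothetical data: a true missing value next to the literal string "NA".
brand <- c("x", NA, "NA")

# Missingness belongs to the value, not to how it is displayed.
is.na(brand)
# [1] FALSE  TRUE FALSE

# Vector default (quote = TRUE): the quotes separate "NA" from NA.
quoted <- capture.output(print(brand))

# quote = FALSE: the missing value is shown as <NA>, the string as NA.
unquoted <- capture.output(print(brand, quote = FALSE))

# Data frames print without quotes by default, hence the <NA>.
# stringsAsFactors = FALSE keeps the column character in any R version.
df <- data.frame(a = brand, stringsAsFactors = FALSE)
df_lines <- capture.output(print(df))

# Show the three renderings one after another.
cat(quoted, unquoted, df_lines, sep = "\n")
```

Whichever way the value is displayed, is.na() remains the reliable test for whether an entry is actually missing.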