Dear R-Helpers,

Why does R show character missing values in vectors as NA and when
stored in a data frame as <NA>? I've searched but did not find an
explanation.

Thanks,
Bob

> gender <- c("f","f","f",NA,"m","m","m","m")
> gender
[1] "f" "f" "f" NA  "m" "m" "m" "m"    #here it lacks brackets.
>
> q1 <- c(1,2,2,3,4,5,5,4)
> q1
[1] 1 2 2 3 4 5 5 4
>
> myDF <- data.frame(q1,gender)
> myDF
  q1 gender
1  1      f
2  2      f
3  2      f
4  3   <NA>    #here it has brackets.
5  4      m
6  5      m
7  5      m
8  4      m

========================================================
Bob Muenchen (pronounced Min'-chen), Manager,
Statistical Consulting Center
U of TN Office of Information Technology
Stokely Management Center, Suite 200
916 Volunteer Blvd., Knoxville, TN 37996-0520
Voice: (865) 974-5230
FAX: (865) 974-4810
Email: muenchen at utk.edu
Web: http://oit.utk.edu/scc
Map: http://www.utk.edu/maps
News: http://listserv.utk.edu/archives/statnews.html
=========================================================

On Fri, Apr 4, 2008 at 11:00 AM, Muenchen, Robert A (Bob) <muenchen at utk.edu> wrote:
> Dear R-Helpers,
>
> Why does R show character missing values in vectors as NA and when
> stored in a data frame as <NA>? I've searched but did not find an
> explanation.

It's because that character vector is automatically coerced into a factor:

> factor(gender)
[1] f    f    f    <NA> m    m    m    m
Levels: f m

I'd imagine the display is different so you can tell the difference
between NA and level "NA". (This isn't a problem for strings because
they have quotes around them.)

Hadley

-- 
http://had.co.nz/

Muenchen, Robert A (Bob) wrote:
> Dear R-Helpers,
>
> Why does R show character missing values in vectors as NA and when
> stored in a data frame as <NA>? I've searched but did not find an
> explanation.
>
> Thanks,
> Bob
>
> > gender <- c("f","f","f",NA,"m","m","m","m")
> > gender
> [1] "f" "f" "f" NA  "m" "m" "m" "m"    #here it lacks brackets.
> >
> > q1 <- c(1,2,2,3,4,5,5,4)
> > q1
> [1] 1 2 2 3 4 5 5 4
> >
> > myDF <- data.frame(q1,gender)
> > myDF
>   q1 gender
> 1  1      f
> 2  2      f
> 3  2      f
> 4  3   <NA>    #here it has brackets.
> 5  4      m
> 6  5      m
> 7  5      m
> 8  4      m

It is actually a factor in the latter case

> data.frame(gender)$gender
[1] f    f    f    <NA> m    m    m    m
Levels: f m

However, you have the same effect with

> data.frame(gender,stringsAsFactors=FALSE)
  gender
1      f
2      f
3      f
4   <NA>
5      m
6      m
7      m
8      m

The thing to notice is that the printing is without the quote
character. We also have

> noquote(gender)
[1] f    f    f    <NA> m    m    m    m

And the point in either case is that we need some way to distinguish
between NA (missing) and "NA" (New Alliance, Noradrenalin, North
America, Neil Armstrong, etc.)

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907
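
The distinction both replies describe can be seen side by side in a small sketch. The vector `code` and the data frame names `df_chr` and `df_fac` are made up for illustration and do not appear in the thread. `stringsAsFactors` is passed explicitly because its default was TRUE when this thread was written and became FALSE in R 4.0.0.

```r
# Hypothetical data: one truly missing value and one literal string "NA".
code <- c("f", NA, "NA", "m")

# Printed with quotes, the string "NA" and the missing NA are easy to tell apart.
print(code)

# Printed without quotes, the missing value is marked as <NA> instead.
print(noquote(code))

# The same marking is used inside a data frame, whether the column
# is kept as character or converted to a factor.
df_chr <- data.frame(code, stringsAsFactors = FALSE)
df_fac <- data.frame(code, stringsAsFactors = TRUE)
print(df_chr)
print(df_fac)

# is.na() separates the two cases regardless of how they are printed.
print(is.na(df_chr$code))
print(is.na(df_fac$code))
```

In the factor column the string "NA" is kept as an ordinary level, while the missing value belongs to no level, which is the case where unquoted output would otherwise be ambiguous.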