Dear R-Users,

I have a csv file that contains NA tokens, and these tokens are perfectly good values that must not be converted to NA by read.table(). I tried to prevent the conversion by specifying the na.strings argument, but this seems only to add to the list of NA strings rather than replace it.

> system("cat foo")
1 foo
2 NA
> read.table("foo", na.strings="foo")
  V1 V2
1  1 NA
2  2 NA

This is R 1.6.0 on Linux.

What did I do wrong?

Thanks,
Vadim
Peter Dalgaard BSA
2002-Dec-19 23:17 UTC
[R] disabling NA token as na.string in read.table
Vadim Ogranovich <vograno at arbitrade.com> writes:

> Dear R-Users,
>
> I have a csv file that has NA tokens and these tokens are perfectly good
> values that need not to be converted to NA by read.table(). I tried to
> prevent the conversion by specifying the na.strings arg., but this seems to
> only add to the list of NA strings, not substitute.
>
> > system("cat foo")
> 1 foo
> 2 NA
> > read.table("foo", na.strings="foo")
>   V1 V2
> 1  1 NA
> 2  2 NA
>
> This is R1.6.0 on Linux.
>
> What did I do wrong?

Hmm, this looks like a bit of a bug. read.table() ends up calling type.convert() with its default "NA" na.string. Now, if "NA" was in the na.strings for read.table(), scan() would already have turned it into <NA> at that point, so I suspect you might have preferred na.strings=character(0), but that has the side effect of turning the real NA into a factor level:

> x <- c(NA,"NA","foo")
> type.convert(x)
[1] <NA> <NA> foo
Levels: foo
> type.convert(x,na.strings=character(0))
[1] <NA> NA   foo
Levels: NA foo NA
> dput(type.convert(x,na.strings=character(0)))
structure(c(3, 1, 2), .Label = c("NA", "foo", NA), class = "factor")

I.e. it looks like the internals of type.convert need some fixing up.

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
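
A possible workaround, sketched here for readers on a recent R release rather than R 1.6.0: combine the na.strings=character(0) idea from the reply above with explicit column types, so that type.convert() is never asked to guess the type of the column that holds the literal "NA" token. Using colClasses this way is an assumption on my part, not something proposed in the thread, and the text= argument is used only as a stand-in for the poster's file "foo"; it is not available in R 1.6.0.

```r
# Hypothetical stand-in for the contents of the poster's file "foo".
input <- "1 foo\n2 NA"

# na.strings = character(0): no token is declared to mean "missing".
# colClasses: fix the column types up front, so the second column is
# read as plain character data and is not passed through type guessing.
df <- read.table(text = input,
                 na.strings = character(0),
                 colClasses = c("integer", "character"))

print(df)
print(is.na(df$V2))  # shows whether any value became a real NA
```

With a real file, replace text = input with the file name. If only some columns may contain a literal "NA", colClasses can name just those columns as "character" and leave the others as NA to have their types guessed.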