Thanks Gunter

It seems that one has to know the structure of the data and adapt the read.table call accordingly. I am working on a framework that is meant to process data files with unknown structure, so I have to think a bit more about that...

________________________________
From: Bert Gunter <bgunter.4567 at gmail.com>
Sent: Thursday, October 24, 2019 00:08
To: Sebastien Bihorel <Sebastien.Bihorel at cognigencorp.com>
Cc: r-help at r-project.org <r-help at r-project.org>
Subject: Re: [R] read.table and NaN

Like this?

> con <- textConnection(object = 'A,B\n1,NaN\nNA,2')
> tmp <- read.table(con, header = TRUE, sep = ',', na.strings = '', stringsAsFactors = FALSE,
+                   colClasses = c("numeric", "character"))
> close.connection(con)
> tmp
   A   B
1  1 NaN
2 NA   2
> class(tmp[,1])
[1] "numeric"
> class(tmp[,2])
[1] "character"
> tmp[,2]
[1] "NaN" "2"

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Wed, Oct 23, 2019 at 6:31 PM Sebastien Bihorel via R-help <r-help at r-project.org> wrote:

Hi,

Is there a way to make read.table consider NaN as a string of characters rather than the internal NaN? Changing the na.strings argument does not seem to have any effect on how R interprets the NaN string (while it does for the NA string).

con <- textConnection(object = 'A,B\n1,NaN\nNA,2')
tmp <- read.table(con, header = TRUE, sep = ',', na.strings = '', stringsAsFactors = FALSE)
close.connection(con)
tmp
class(tmp[,1])
class(tmp[,2])

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
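The behaviour discussed in this exchange can be reproduced with a short sketch. It uses the `text =` argument of read.table instead of a textConnection, and the variable names (`csv_text`, `default_read`, `as_text`) are illustrative only. It assumes nothing beyond base R.

```r
# Same two-row input as in the original question.
csv_text <- "A,B\n1,NaN\nNA,2"

# Without colClasses, read.table guesses the type of each column.
# The string "NaN" is recognised as a numeric value, so column B is numeric
# and na.strings does not change that.
default_read <- read.table(text = csv_text, header = TRUE, sep = ",",
                           na.strings = "", stringsAsFactors = FALSE)
class(default_read$B)
is.nan(default_read$B[1])

# With colClasses = "character" (recycled to every column), no type guessing
# takes place and "NaN" stays a plain string.
as_text <- read.table(text = csv_text, header = TRUE, sep = ",",
                      na.strings = "", colClasses = "character")
as_text$B
```

The trade-off is that every column then arrives as character, so any numeric conversion has to happen afterwards.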
Not so. Read ?read.table carefully. You can use "NA" as a default. Moreover, you **specified** that you want NaN read as character, which means that any column containing NaN **must** be character. That's part of the specification for data frames (each column must be of a single data type). So either change your specification or change your data structure.

And, incidentally, my first name is "Bert".

Cheers,
Bert

Bert Gunter

On Thu, Oct 24, 2019 at 6:43 AM Sebastien Bihorel <Sebastien.Bihorel at cognigencorp.com> wrote:

> Thanks Gunter
>
> It seems that one has to know the structure of the data and adapt the
> read.table call accordingly. I am working on a framework that is meant to
> process data files with unknown structure, so I have to think a bit more
> about that...
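A minimal sketch of the point about single-type columns, assuming the same two-row input as in the thread. It relies on the documented option of giving colClasses as a named vector so that only column B is forced to character; the variable names are illustrative.

```r
csv_text <- "A,B\n1,NaN\nNA,2"

# Force only column B to character; unnamed columns are still type-converted.
tmp <- read.table(text = csv_text, header = TRUE, sep = ",",
                  colClasses = c(B = "character"))
class(tmp$B)
tmp$B

# A vector has exactly one type, so mixing a number with the text "NaN"
# coerces every element to character.
class(c(2, "NaN"))

# Converting the column back to numeric turns the string into the internal NaN.
b_num <- as.numeric(tmp$B)
is.nan(b_num)
```

In other words, keeping "NaN" as text and keeping the column numeric are mutually exclusive choices for any given column.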
Oh, and btw, I think you should omit the groups = argument. It's not needed since "groups" is already the conditioning variable, so only one group per panel, and using it seems to interact unfavorably with the way jittering is done.

Bert Gunter

On Thu, Oct 24, 2019 at 7:39 AM Bert Gunter <bgunter.4567 at gmail.com> wrote:

> Not so. Read ?read.table carefully. You can use "NA" as a default.
> Moreover, you **specified** that you want NaN read as character, which
> means that any column containing NaN **must** be character. That's part of
> the specification for data frames (all columns must be one data type). So
> either change your specification or change your data structure.
>
> And, incidentally, my first name is "Bert".
>
> Cheers,
> Bert
My bad, Bert.

My point is that my function/framework has very minimal expectations about the source data (mostly, that it is a rectangular table of data separated by some separator) and does not have any a priori knowledge about what the first, second, etc. columns in the data files must contain. So while it would be possible to pass down a class vector to be used as the colClasses argument to read.table, it is not necessarily reasonable in the context of the overall framework.

I guess I was surprised that read.table interprets NaN in an input file as the internal "Not a number" rather than as a string... there is nothing in ?read.table about that.

Anyway, as I said, I need to think more about this in the context of the framework where this function operates...

Thanks for the input

________________________________
From: Bert Gunter <bgunter.4567 at gmail.com>
Sent: Thursday, October 24, 2019 10:39
To: Sebastien Bihorel <Sebastien.Bihorel at cognigencorp.com>
Cc: r-help at r-project.org <r-help at r-project.org>
Subject: Re: [R] read.table and NaN

Not so. Read ?read.table carefully. You can use "NA" as a default. Moreover, you **specified** that you want NaN read as character, which means that any column containing NaN **must** be character. That's part of the specification for data frames (all columns must be one data type). So either change your specification or change your data structure.

And, incidentally, my first name is "Bert".

Cheers,
Bert
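For a reader that knows nothing about the columns in advance, one possible approach (a sketch only, not something proposed in the thread) is to read every column as character and then convert only the columns that do not contain the literal text "NaN". The helper name `read_preserving_nan` and its arguments are hypothetical; the conversion step uses base R's type.convert.

```r
# Hypothetical helper: read a delimited table without prior knowledge of its
# columns, keeping any column that contains the literal "NaN" as character.
read_preserving_nan <- function(csv_text, sep = ",") {
  # Step 1: read everything as character so nothing is interpreted yet.
  raw <- read.table(text = csv_text, header = TRUE, sep = sep,
                    colClasses = "character")
  # Step 2: convert column by column, skipping columns that hold "NaN".
  raw[] <- lapply(raw, function(col) {
    if ("NaN" %in% col) col else type.convert(col, as.is = TRUE)
  })
  raw
}

res <- read_preserving_nan("A,B\n1,NaN\nNA,2")
class(res$A)
res$B
```

A column that mixes "NaN" with numbers stays entirely character under this scheme, which is consistent with the single-type-per-column constraint discussed above; whether that is acceptable depends on what the framework does with such columns downstream.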