Hi,

I thought it would be convenient if the check.names argument to read.table, which currently can only be TRUE/FALSE, could take a function value as well. If a function is supplied, it should be used instead of the default make.names.

Here is an example where it can come in handy. I tend to keep my data in comma-separated files with a header line. The header line is prefixed with a comment sign '#' to make these lines easy to identify. Now when I read.table the files, the '#' is converted to '.', while I want it to be discarded.

Thanks,
Vadim

P.S. I don't know if r-help is the right place for feature requests. If it's not, please let me know where the right one is.
>>>>> "Vadim" == Vadim Ogranovich <vograno at evafunds.com>
>>>>>     on Wed, 3 Sep 2003 14:29:25 -0700 writes:

    Vadim> Hi, I thought it would be convenient if the
    Vadim> check.names argument to read.table, which currently
    Vadim> can only be TRUE/FALSE, could take a function value
    Vadim> as well. If the function is supplied it should be
    Vadim> used instead of the default make.names.

One could, but it's not necessary in your case (see below), and it's a potential pit to fall into. We want read.table() to return valid data frames.

    Vadim> Here is an example where it can come in handy. I tend
    Vadim> to keep my data in comma-separated files with a header
    Vadim> line. The header line is prefixed with a comment sign
    Vadim> '#' to simplify identification of these lines. Now
    Vadim> when I read.table the files the '#' is converted to
    Vadim> '.' while I want it to be discarded.

Hmm, are you using a very old version of R, or haven't you seen the `comment.char = "#"' argument of read.table()?

Reading "?read.table", also note the note about `blank.lines.skip', and then realize that the default for blank.lines.skip is `!fill' and that `fill = TRUE' for all the read.csv* and read.delim* incantations of read.table().

In sum, it's very easy to use current read.table() for your situation!

    Vadim> P.S. I don't know if r-help is the right place for
    Vadim> feature requests. If it's not please let me know
    Vadim> where the right one is.

Since your proposal can be interpreted as "How do I use read.table() when my file has comment lines?", r-help has been very appropriate.

Otherwise, and particularly if the proposal is more technical, R-devel would be better suited.

Regards,
Martin Maechler <maechler at stat.math.ethz.ch>  http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum LEO C16  Leonhardstr. 27
ETH (Federal Inst. Technology)  8092 Zurich  SWITZERLAND
phone: x-41-1-632-3408  fax: ...-1228  <><
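To make the comment.char behaviour concrete, here is a minimal sketch. The input lines are made up for illustration (they are not from the thread), and the `text` argument used to supply them inline is an assumption about the R version at hand; it exists in current R but may not have been available in 2003.

```r
# Illustrative input (hypothetical, not from the thread):
# one comment-only line, a header line, and two data rows.
lines <- c("# produced by some export script",
           "key,value",
           "foo,1.2",
           "boo,1.3")

# With comment.char = "#", read.table() ignores everything from '#' to the
# end of a line. blank.lines.skip is left at its read.table() default, so
# the comment-only first line is skipped and "key,value" becomes the header.
dat <- read.table(text = lines, header = TRUE, sep = ",", comment.char = "#")

print(names(dat))
print(nrow(dat))
```

Note that this uses read.table() directly. As the reply points out, the read.csv* and read.delim* wrappers set `fill = TRUE`, which changes the blank.lines.skip default, so comment-only lines may need more care there.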
I admit I should have been clearer in my original posting. Let me try again. (I do know that by default read.table discards everything after '#', which is why I use comment.char=""; my bad for not mentioning this.)

Here is a typical example of my data file:

#key value
foo 1.2
boo 1.3

As you see, the header line begins with '#' and then lists the column names. However, make.names will convert the raw names c("#key", "value") to c(".key", "value"), while I need c("key", "value"), i.e. no dot before key. So I am asking for a hook to specify the function that will handle this situation.

I am not sure I understand how having this hook can result in an invalid data frame. It can return invalid names, but so can check.names=FALSE.

Thanks,
Vadim

-----Original Message-----
From: Martin Maechler [mailto:maechler at stat.math.ethz.ch]
Sent: Thursday, September 04, 2003 1:28 AM
To: Vadim Ogranovich
Cc: R-Help (E-mail)
Subject: Re: [R] read.table: check.names arg - feature request

[...]
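The hook requested above can be approximated without changing read.table() itself. The following is only a sketch under stated assumptions: `read_table_with`, `fix_names`, and `strip_hash` are hypothetical names invented for this illustration, not part of R, and the `text` argument assumes a reasonably current R version.

```r
# Hypothetical wrapper (not part of R): read the table with name checking
# switched off, then clean the raw header names with a caller-supplied
# function instead of the built-in make.names() step.
read_table_with <- function(..., fix_names = make.names) {
  dat <- read.table(..., check.names = FALSE, comment.char = "")
  names(dat) <- fix_names(names(dat))
  dat
}

# The sample data file from the message above.
lines <- c("#key value", "foo 1.2", "boo 1.3")

# Drop a leading '#', then still pass the result through make.names() so
# the final names are syntactically valid and unique.
strip_hash <- function(x) make.names(sub("^#", "", x), unique = TRUE)

dat <- read_table_with(text = lines, header = TRUE, fix_names = strip_hash)
print(names(dat))
```

Running make.names() inside the cleaning function is a design choice that speaks to the concern about invalid data frames: the caller controls how '#' is handled, but the names that come out are still validated.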
Gabor Grothendieck
2003-Sep-05 00:36 UTC
[R] read.table: check.names arg - feature request
How about this:

    read.table(textConnection(sub("^#", "", readLines(myfile))), header = TRUE)

which simply strips off any # that comes first on a line and then passes the result to read.table.
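The readLines/sub/textConnection suggestion can be tried end to end with the sample data from the thread. This is a minimal sketch: the temporary file stands in for `myfile`, and the explicit close() is an addition for tidiness rather than part of the original one-liner.

```r
# Write the sample file from the thread to a temporary location
# (the temporary file is a stand-in for the real data file).
myfile <- tempfile(fileext = ".txt")
writeLines(c("#key value", "foo 1.2", "boo 1.3"), myfile)

# Strip a '#' that is the first character of any line, then let
# read.table() parse the cleaned lines through a text connection.
con <- textConnection(sub("^#", "", readLines(myfile)))
dat <- read.table(con, header = TRUE)
close(con)

print(names(dat))
print(dat)

# Remove the temporary file.
unlink(myfile)
```

Because sub() is applied to every line, a data row that happens to start with '#' would lose that character too; restricting the substitution to the first line would be a possible refinement if that matters for the data at hand.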