Hi,

I thought it would be convenient if the check.names argument to read.table, which currently can only be TRUE/FALSE, could take a function value as well. If a function is supplied, it would be used instead of the default make.names.

Here is an example where it can come in handy. I tend to keep my data in comma-separated files with a header line. The header line is prefixed with a comment sign '#' to make these lines easy to identify. Now when I read.table the files, the '#' is converted to '.', while I want it to be discarded.

Thanks,
Vadim

P.S. I don't know if r-help is the right place for feature requests. If it's not, please let me know where the right one is.
>>>>> "Vadim" == Vadim Ogranovich <vograno at evafunds.com>
>>>>>     on Wed, 3 Sep 2003 14:29:25 -0700 writes:

    Vadim> Hi, I thought it would be convenient if the
    Vadim> check.names argument to read.table, which currently
    Vadim> can only be TRUE/FALSE, could take a function value
    Vadim> as well. If the function is supplied it should be
    Vadim> used instead of the default make.names.

One could, but it's not necessary in your case (see below), and
it's a potential pit to fall into. We want read.table() to
return valid data frames.

    Vadim> Here is an example where it can come in handy. I tend
    Vadim> to keep my data in comma-separated files with a header
    Vadim> line. The header line is prefixed with a comment sign
    Vadim> '#' to simplify identification of these lines. Now
    Vadim> when I read.table the files the '#' is converted to
    Vadim> '.' while I want it to be discarded.

Hmm, are you using a very old version of R,
or haven't you seen the `comment.char = "#"' argument of
read.table()?

Reading "?read.table", also note the note about
`blank.lines.skip', and then realize that the default for
blank.lines.skip is `!fill' and that `fill = TRUE' for all
the read.csv* and read.delim* incantations of read.table().

In sum, it's very easy to use current read.table() for your
situation!

    Vadim> P.S. I don't know if r-help is the right place for
    Vadim> feature requests. If it's not please let me know
    Vadim> where the right one is.

Since your proposal can be interpreted as "How do I use
read.table() when my file has comment lines?",
r-help has been very appropriate.

Otherwise, and particularly if the proposal is more technical,
R-devel would be better suited.

Regards,
Martin Maechler <maechler at stat.math.ethz.ch>
http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum LEO C16 Leonhardstr. 27
ETH (Federal Inst. Technology) 8092 Zurich SWITZERLAND
phone: x-41-1-632-3408 fax: ...-1228 <><
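
A minimal sketch of the comment.char behaviour mentioned in the reply above. The file contents are made up for illustration: the '#' text is pure commentary, and the header line itself carries no '#'.

```r
# Hypothetical sample file: one full-line comment, a plain header,
# and one data line with a trailing comment.
tmp <- tempfile(fileext = ".txt")
writeLines(c("# produced by some export tool",
             "key value",
             "foo 1.2  # trailing note",
             "boo 1.3"),
           tmp)

# comment.char = "#" is read.table()'s default; it is spelled out for clarity.
df <- read.table(tmp, header = TRUE, comment.char = "#")
print(df)

unlink(tmp)
```

Note that with this setting a header line that itself begins with '#' is treated as a comment and skipped entirely.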
I admit I should have been clearer in my original posting. Let me try again
(and I do know that by default read.table discards everything after '#',
which is why I use comment.char=""; my bad for not mentioning this).
Here is a typical example of my data file:
#key value
foo 1.2
boo 1.3
As you see, the header line begins with '#' and then lists the column
names. However, make.names will convert the raw names c("#key",
"value") to c(".key", "value"), while I need
c("key", "value"), i.e. no dot before key. So I am asking for
a hook to specify the function that will handle this situation.
I am not sure I understand how having this hook could result in an invalid data
frame. It can return invalid names, but so can check.names=FALSE.
Thanks,
Vadim
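
A rough sketch of how the requested hook could be approximated with the existing arguments, under stated assumptions: read_table_with_names and strip_hash are hypothetical helpers written for illustration, not part of R. The idea is to read with check.names=FALSE and comment.char="", and then apply a user-supplied function to the raw column names.

```r
# Hypothetical helper, not part of R: read with check.names = FALSE,
# then apply a user-supplied function to the raw column names.
read_table_with_names <- function(file, fix_names = make.names, ...) {
  df <- read.table(file, header = TRUE, check.names = FALSE,
                   comment.char = "", ...)
  names(df) <- fix_names(names(df))
  df
}

# Drop a leading '#', then let make.names() guarantee valid, unique names.
strip_hash <- function(x) make.names(sub("^#", "", x), unique = TRUE)

# Sample data copied from the example file in this message.
tmp <- tempfile(fileext = ".txt")
writeLines(c("#key value", "foo 1.2", "boo 1.3"), tmp)

df <- read_table_with_names(tmp, fix_names = strip_hash)
print(names(df))

unlink(tmp)
```

Because strip_hash still ends in make.names(), the resulting names stay syntactically valid, which is the guarantee check.names=TRUE is meant to provide.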
-----Original Message-----
From: Martin Maechler [mailto:maechler at stat.math.ethz.ch]
Sent: Thursday, September 04, 2003 1:28 AM
To: Vadim Ogranovich
Cc: R-Help (E-mail)
Subject: Re: [R] read.table: check.names arg - feature request
Gabor Grothendieck
2003-Sep-05 00:36 UTC
[R] read.table: check.names arg - feature request
How about this:
  read.table(textConnection(sub("^#", "", readLines(myfile))), header = TRUE)
which simply strips off a '#' that comes first on any line and then
passes the result to read.table.
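
A variant sketch of the same idea, assuming the header is always the first line of the file (the sample data repeats the example file from this thread). It removes the '#' from that line only, so any data line that starts with '#' reaches read.table unchanged and is left to its normal comment handling.

```r
# Sample data copied from the example file in this thread.
tmp <- tempfile(fileext = ".txt")
writeLines(c("#key value", "foo 1.2", "boo 1.3"), tmp)

lines <- readLines(tmp)
lines[1] <- sub("^#", "", lines[1])  # touch the header line only

# Feed the cleaned lines to read.table through a text connection,
# and close the connection explicitly afterwards.
con <- textConnection(lines)
df <- read.table(con, header = TRUE)
close(con)
unlink(tmp)

print(names(df))
```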