I have a file that I thought would be fairly simple to read in using read.table
but I am having problems ( as usual ).
each line of the file is of the form ( just 20 lines or so )
financials XXX, YYY, ZZZ
automobiles RTR, ABC, TGH
so the first field in the line is the industry and the other fields
( separated by commas ) in the line are stock identifiers of stocks
in that industry. note that there is no comma between the industry
and the first stock identifier in the group which i guess might
complicate things ?
my goal is to make the row names the industries and the stock
identifiers the column data but i don't
have a header so , i am unclear ( reading the help on
read.table ) how to tell R that the first field in each line
should be used as the row name ? Thanks for any help
or for telling me that this is not possible. This will be my last bother of the
day to the help group.
I am using R-2.20 on windows xp and i've tried various settings
of read.table without success. thanks.
mark
On 11 June 2006 at 16:24, markleeds at verizon.net wrote:
| I have a file that I thought would be fairly simple to read in using read.table but I am having problems ( as usual ).
|
| each line of the file is of the form ( just 20 lines or so )
|
| financials XXX, YYY, ZZZ
| automobiles RTR, ABC, TGH
|
| so the first field in the line is the industry and the other fields
| ( separated by commas ) in the line are stock identifiers of stocks
| in that industry. note that there is no comma between the industry
| and the first stock identifier in the group which i guess might
| complicate things ?

Yup, because that makes it such that the comma is no longer a unique
separator between _all_ columns. But if the file really looks the way you
typed it here, you should be fine by postprocessing the data afterwards and
just removing the comma. See below for a hack-ish solution.

| my goal is to make the row names the industries and the stock
| identifiers the column data but i don't
| have a header so , i am unclear ( reading the help on
| read.table ) how to tell R that the first field in each line
| should be used as the row name ? Thanks for any help
| or for telling me that this is not possible. This will be my last bother of the day to the help group.

This is a little clumsy, using an apply to sweep a regexp transformation
[ hey, you get to use what we taught you earlier :) ] through.

> rawData <- read.table("/tmp/leeds.txt", row.names=1)
> data <- apply(rawData, 2, function(X)gsub(",$", "", X))
> rownames(data) <- rownames(rawData)
> data
            V2    V3    V4
financials  "XXX" "YYY" "ZZZ"
automobiles "RTR" "ABC" "TGH"
>

I'm sure someone named Gabor will soon post something doing the same in
half the lines ...

Dirk

--
Hell, there are no rules here - we're trying to accomplish something.
  -- Thomas A. Edison
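For anyone who wants to try the post-processing idea without the original file, here is a minimal self-contained sketch. It is a variant, not the exact session above: it assumes the file contains only the two example lines from the question, writes them to a temporary file, and uses lapply() with sub() instead of apply() with gsub(), so the result stays a data frame and keeps its row names. The names tmp and cleaned are made up for illustration.

```r
# Recreate the two example lines from the question in a temporary file
# (a stand-in for the real data file, which is not available here).
tmp <- tempfile(fileext = ".txt")
writeLines(c("financials XXX, YYY, ZZZ",
             "automobiles RTR, ABC, TGH"), tmp)

# Whitespace-separated read: column 1 becomes the row names, and the
# commas stay attached to the identifiers ("XXX," etc.).
rawData <- read.table(tmp, row.names = 1, as.is = TRUE)

# Strip a trailing comma from every column. Assigning into cleaned[]
# keeps the data frame structure, including the row names.
cleaned <- rawData
cleaned[] <- lapply(rawData, function(x) sub(",$", "", x))

print(cleaned)
```

Because the columns are read with as.is = TRUE, they are character vectors, so sub() works on them directly.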
Try this (which reads in the lines, replaces commas with spaces
and then rereads it as a data frame using column 1 as the row names):
read.table(textConnection(chartr(",", " ",
readLines("myfile.dat"))),
row.names = 1, as.is = TRUE)
I assume you want character data, not factors, but if you want factors
remove the as.is = TRUE.
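The same steps can be tried end to end with a small sketch like the one below. It assumes the file holds just the two example lines from the question and writes them to a temporary file in place of "myfile.dat"; the names tmp, cleaned_lines, con and stocks are only illustrative.

```r
# Recreate the two example lines from the question in a temporary file
# (a stand-in for the real "myfile.dat").
tmp <- tempfile(fileext = ".dat")
writeLines(c("financials XXX, YYY, ZZZ",
             "automobiles RTR, ABC, TGH"), tmp)

# Turn every comma into a space so whitespace is the only separator.
cleaned_lines <- chartr(",", " ", readLines(tmp))

# Re-read the cleaned lines; column 1 becomes the row names and
# as.is = TRUE keeps the identifiers as character data.
con <- textConnection(cleaned_lines)
stocks <- read.table(con, row.names = 1, as.is = TRUE)
close(con)

print(stocks)

# Look up the stocks for one industry by row name.
stocks["financials", ]
```

This sketch relies on every line having the same number of identifiers, as in the example; a file with uneven group sizes would probably need extra handling, for instance the fill = TRUE argument of read.table.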
On 6/11/06, markleeds at verizon.net <markleeds at verizon.net> wrote:
> I have a file that I thought would be fairly simple to read in using
> read.table but I am having problems ( as usual ).
>
> each line of the file is of the form ( just 20 lines or so )
>
> financials XXX, YYY, ZZZ
> automobiles RTR, ABC, TGH
>
> so the first field in the line is the industry and the other fields
> ( separated by commas ) in the line are stock identifiers of stocks
> in that industry. note that there is no comma between the industry
> and the first stock identifier in the group which i guess might
> complicate things ?
>
> my goal is to make the row names the industries and the stock
> identifiers the column data but i don't
> have a header so , i am unclear ( reading the help on
> read.table ) how to tell R that the first field in each line
> should be used as the row name ? Thanks for any help
> or for telling me that this is not possible. This will be my last bother of
> the day to the help group.
>
> I am using R-2.20 on windows xp and i've tried various settings
> of read.table without success. thanks.
>
> mark