I have a file that I thought would be fairly simple to read in using read.table, but I am having problems (as usual).

Each line of the file (just 20 lines or so) is of the form

  financials XXX, YYY, ZZZ
  automobiles RTR, ABC, TGH

so the first field in the line is the industry, and the other fields (separated by commas) are the identifiers of the stocks in that industry. Note that there is no comma between the industry and the first stock identifier in the group, which I guess might complicate things?

My goal is to make the industries the row names and the stock identifiers the column data, but I don't have a header, so I am unclear (from reading the help on read.table) how to tell R that the first field in each line should be used as the row name. Thanks for any help, or for telling me that this is not possible. This will be my last bother of the day to the help group.

I am using R-2.20 on Windows XP, and I've tried various settings of read.table without success. Thanks.

Mark
On 11 June 2006 at 16:24, markleeds at verizon.net wrote:
| I have a file that I thought would be fairly simple to read in using read.table but I am having problems ( as usual ).
|
| each line of the file is of the form ( just 20 lines or so )
|
| financials XXX, YYY, ZZZ
| automobiles RTR, ABC, TGH
|
| so the first field in the line is the industry and the other fields
| ( separated by commas ) in the line are stock identifiers of stocks
| in that industry. note that there is no comma between the industry
| and the first stock identifier in the group which i guess might
| complicate things ?

Yup, because that means the comma is no longer a unique separator between _all_ columns. But if the file really looks the way you typed it here, you should be fine postprocessing the data afterwards and just removing the commas. See below for a hack-ish solution.

| my goal is to make the row names the industries and the stock
| identifiers the column data but i don't
| have a header so , i am unclear ( reading the help on
| read.table ) how to tell R that the first field in each line
| should be used as the row name ? Thanks for any help
| or for telling me that this is not possible. This will be my last bother of the day to the help group.

This is a little clumsy, using an apply to sweep a regexp transformation [ hey, you get to use what we taught you earlier :) ] through the columns.

> rawData <- read.table("/tmp/leeds.txt", row.names=1)
> data <- apply(rawData, 2, function(X) gsub(",$", "", X))
> rownames(data) <- rownames(rawData)
> data
            V2    V3    V4
financials  "XXX" "YYY" "ZZZ"
automobiles "RTR" "ABC" "TGH"
>

I'm sure someone named Gabor will soon post something doing the same in half the lines ...

Dirk

--
Hell, there are no rules here - we're trying to accomplish something.
                                                  -- Thomas A. Edison
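For anyone who wants to try the read-then-strip approach without creating /tmp/leeds.txt by hand, here is a minimal self-contained sketch. The temporary file and the names `path` and `cleaned` are made up for illustration, and it assumes every industry lists the same number of identifiers, as in the two sample lines.

```r
# Hypothetical stand-in for the poster's file: two lines in the stated format.
path <- tempfile(fileext = ".txt")
writeLines(c("financials XXX, YYY, ZZZ",
             "automobiles RTR, ABC, TGH"), path)

# Split on whitespace (the default sep = ""); field 1 becomes the row names.
# as.is = TRUE keeps the identifiers as character strings rather than factors.
rawData <- read.table(path, row.names = 1, as.is = TRUE)

# At this point the cells still carry the commas ("XXX,", "YYY,", ...).
# Strip one trailing comma from every cell, column by column.
cleaned <- apply(rawData, 2, function(x) sub(",$", "", x))

# Set the row names explicitly so the result does not depend on
# whether apply() carried them through.
rownames(cleaned) <- rownames(rawData)

print(cleaned)
```

The result is a character matrix rather than a data frame, so a row is looked up as `cleaned["financials", ]`. If a data frame is preferred, wrapping the result in `as.data.frame()` should work.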
Try this, which reads in the lines, replaces the commas with spaces, and then rereads the result as a data frame using column 1 as the row names:

  read.table(textConnection(chartr(",", " ", readLines("myfile.dat"))), row.names = 1, as.is = TRUE)

I assume you want character data, not factors, but if you want factors remove the as.is = TRUE.

On 6/11/06, markleeds at verizon.net <markleeds at verizon.net> wrote:
> I have a file that I thought would be fairly simple to read in using read.table but I am having problems ( as usual ).
>
> each line of the file is of the form ( just 20 lines or so )
>
> financials XXX, YYY, ZZZ
> automobiles RTR, ABC, TGH
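The same one-liner, unrolled into steps so each stage can be inspected, might look like the sketch below. The temporary file stands in for "myfile.dat", and the variable names are made up for illustration.

```r
# Hypothetical stand-in for "myfile.dat".
path <- tempfile(fileext = ".dat")
writeLines(c("financials XXX, YYY, ZZZ",
             "automobiles RTR, ABC, TGH"), path)

# Step 1: read the raw lines as a character vector.
raw_lines <- readLines(path)

# Step 2: turn every comma into a space, so whitespace is the only separator.
spaced <- chartr(",", " ", raw_lines)

# Step 3: reread the modified text as a table; column 1 becomes the row names.
con <- textConnection(spaced)
stocks <- read.table(con, row.names = 1, as.is = TRUE)
close(con)  # a textConnection is opened on creation, so close it ourselves

print(stocks)
```

Two caveats, stated as far as I know them: from R 4.0.0 on, strings are read as character by default, so as.is = TRUE is only needed on older versions; and as.is does not stop read.table from converting a column whose entries all look logical (for example single-letter identifiers such as T or F), so passing colClasses = "character" is the more defensive choice if such identifiers can occur.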