Dear useR again,

How can I read a dataset whose lines do not all have the same number of elements (they have different lengths)? For example:

1 2, 4, 16, 1, 1, 3, 1, 1, 15, 5, 1, 1, 14, 1, 1
2 2, 13, 5, 1, 1, 3, 1, 1, 15, 5, 1, 1, 14, 1, 1
3 4, 5, 11, 1, 1, 6, 1, 1, 5, 14, 1, 1, 15, 1, 1
4 2, 5, 9, 1, 1, 14, 1, 1, 8, 16, 1, 1, 13, 1, 1
5 3, 7, 14, 1, 1, 14, 1, 1, 5, 21, 1, 1, 8, 1, 1
6 6, 3, 1, 12, 1, 1, 5, 8, 1, 1, 15, 1, 1
7 6, 3, 1, 11, 1, 1, 10, 7, 1, 1, 21, 1, 1
8 21, 20, 9, 1, 1, 6, 1, 1, 13, 10, 1, 1, 1
9 5, 7, 21, 1, 1, 13, 1, 1, 14, 2, 1, 1, 6, 1, 1
10 8, 14, 10, 1, 1, 5, 1, 1, 10, 5, 1, 1, 5, 1, 1
11 5, 20, 17, 1, 1, 19, 1, 1, 14, 7, 1, 1, 6, 1, 1
12 7, 4, 11, 1, 1, 2, 1, 1, 5, 13, 1, 1, 14, 1, 1
13 7, 14, 13, 1, 1, 6, 1, 1, 13, 16, 1, 1, 17, 1, 1
14 7, 14, 5, 1, 1, 5, 1, 1, 5, 17, 1, 1, 17, 1, 1
15 3, 9, 12, 1, 1, 18, 1, 1, 6, 1, 4, 1, 1
16 7, 10, 5, 1, 1, 12, 1, 1, 5, 17, 1, 1, 13, 1, 1
17 12, 8, 16, 1, 1, 5, 1, 1, 8, 10, 1, 1, 14, 1, 1
18 5, 11, 7, 1, 1, 5, 1, 1, 18, 13, 1, 1, 17, 1, 1
19 7, 13, 8, 1, 1, 14, 1, 1, 5, 17, 1, 1, 13, 1, 1
20 7, 18, 21, 1, 1, 16, 1, 1, 5, 17, 1, 1, 13, 1, 1

I know that the BioC package rmutil has a function (read.list) to handle sets of lines of different lengths, but it did not work.

> library(rmutil)
Error in library(rmutil) : 'rmutil' is not a valid package -- installed < 2.0.0?

Are there any other functions to handle this?

Best regards
Xiyan Lon

> version
         _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    2
minor    0.1
year     2004
month    11
day      15
language R

If the file is formatted as you've shown, you should be able to read it with read.fwf().

Andy

> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html

Xiyan Lon wrote:
> How can I read a dataset whose lines do not all have the same
> number of elements (they have different lengths)?

For data structured as above, read.fwf() should work.

> I know that the BioC package rmutil has a function (read.list) to
> handle sets of lines of different lengths, but it did not work.
>
> > library(rmutil)
> Error in library(rmutil) : 'rmutil' is not a valid package -- installed < 2.0.0?

You have to install a version that has been compiled for R-2.0.x.

Uwe Ligges
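
A minimal sketch of the read.fwf() suggestion follows. It assumes the real file is aligned in fixed character columns, which cannot be checked from the message as archived; the sample lines, the temporary file, and the widths (2, 4, 4, 4) are made up for illustration.

```r
# Made-up fixed-width sample: a 2-character label followed by
# three 4-character fields. The second line ends early.
sample_lines <- c(" 1   2   4  16",
                  " 2   6   3",
                  " 3  21  20   9")
path <- tempfile(fileext = ".txt")  # stand-in for the real file
writeLines(sample_lines, path)

# widths gives the number of characters in each field, left to right.
fw <- read.fwf(path, widths = c(2, 4, 4, 4))

# Fields that lie wholly beyond the end of a line come back as NA.
print(fw)
```

If the real columns are not aligned like this, read.fwf() will cut values in the wrong places, so it is worth checking the widths against a few raw lines first.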

Without some sort of formatting or prior knowledge to indicate which fields are present and which are missing, I don't see how such a file can be properly read. With such formatting present, there are several ways; e.g., see ?read.table, ?readLines, ?scan, ?connections, ...

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

"The business of the statistician is to catalyze the scientific learning process." - George E. P. Box

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Xiyan Lon
> Sent: Monday, March 21, 2005 9:41 AM
> To: R-help at stat.math.ethz.ch
> Subject: [R] Read a dataset with different lengths
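
One way to use readLines() from the help pages listed above is sketched here. It keeps every line as its own numeric vector, so it makes no guess about which fields are missing. The three sample lines are rows 1, 6 and 8 of the question with the leading row labels left out (an assumption that the labels are not part of the file), and the temporary file stands in for the real one.

```r
# Three lines from the question, without the leading row labels
# (assumption: the labels are not stored in the file itself).
txt <- c("2, 4, 16, 1, 1, 3, 1, 1, 15, 5, 1, 1, 14, 1, 1",
         "6, 3, 1, 12, 1, 1, 5, 8, 1, 1, 15, 1, 1",
         "21, 20, 9, 1, 1, 6, 1, 1, 13, 10, 1, 1, 1")
path <- tempfile(fileext = ".txt")  # stand-in for the real file
writeLines(txt, path)

# Read the raw lines, split each one at the commas, and convert the
# pieces to numbers (as.numeric ignores the leading spaces).
raw  <- readLines(path)
rows <- lapply(strsplit(raw, ","), as.numeric)

# Number of values found on each line.
n_values <- sapply(rows, length)
print(n_values)
```

The result is a list with one vector per line, which suits data where the lines really are of different lengths rather than a rectangular table with gaps.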

Xiyan Lon <xiyanlon <at> gmail.com> writes:

: I know that the BioC package rmutil has a function (read.list) to
: handle sets of lines of different lengths, but it did not work.
: > library(rmutil)
: Error in library(rmutil) : 'rmutil' is not a valid package -- installed < 2.0.0?

rmutil can be found here:
http://popgen.unimaas.nl/~jlindsey/rcode.html

: Are there any other functions to handle this?

  # number of comma-separated fields on each line
  nf <- count.fields(myfile, sep = ",")
  # give read.table as many column names as the longest line has fields
  z <- read.table(myfile, sep = ",", fill = TRUE,
                  col.names = paste0("V", seq_len(max(nf))),
                  colClasses = "numeric")

If the first line is the longest you can omit the col.names argument and the nf computation, since read.table sizes the data frame from the first five lines unless col.names is longer.

The above returns a data frame with one row per line of the file and NAs at the end of each row to fill it out as necessary. If you need a list of rows without the NAs:

  lapply(as.data.frame(t(data.matrix(z))), na.omit)
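
To try the count.fields()/read.table() idea without the original file, here is a small self-contained sketch. The six sample lines and the temporary file are made up; the longest line is deliberately placed sixth, because read.table() looks only at the first five lines to decide the number of columns unless col.names is longer.

```r
# Made-up sample: five short lines, then a longer sixth line.
txt <- c("1, 2", "3, 4", "5, 6", "7, 8", "9, 10", "11, 12, 13, 14")
path <- tempfile(fileext = ".txt")  # stand-in for the real file
writeLines(txt, path)

# Count the comma-separated fields on every line.
nf <- count.fields(path, sep = ",")

# Supplying max(nf) column names makes the data frame wide enough
# for the longest line; fill = TRUE pads shorter lines with NA.
z <- read.table(path, sep = ",", fill = TRUE,
                col.names = paste0("V", seq_len(max(nf))),
                colClasses = "numeric")

# Turn the padded data frame back into a list of rows without NAs.
m    <- as.matrix(z)
rows <- lapply(split(m, row(m)), function(x) x[!is.na(x)])
print(z)
```

Note that this treats every NA as padding, so a file that contains genuine missing values inside a line would need a different approach.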