mails
2012-Feb-08 12:09 UTC
[R] Problems reading tab-delim files using read.table and read.delim
Hello,

I used read.xlsx to read in Excel files, but for large files it turned out to be not very efficient. For that reason I use a programme which writes each sheet in an Excel file into tab-delim txt files. After that I tried using read.table and read.delim to read in those txt files. Unfortunately, the results are not as expected. To show you what I mean, I created a tiny Excel sheet with some rows and columns and read it in using read.xlsx. I also used my script to write that sheet to a tab-delim txt file and read that one in with read.table and read.delim. Here is the R output:

> (test <- read.table(Sheet1.txt, header=TRUE, sep="\t"))
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  line 1 did not have 5 elements

> (test <- read.delim(Sheet1.txt, header=TRUE, sep="\t"))
c1 c2 c3 X
123 213 NA NA NA
234 asd NA NA NA

> (test <- read.xlsx(file.path(data), "Sheet1"))
c1 c2 c3 NA. NA..1 NA..2
1 123 <NA> 213 <NA> <NA>
2 234 asd NA <NA>

The last output is how I would expect the file to be read in. Columns 4 to 6 do not have any header rows. In R1C4 I added some white spaces, as well as in R2C5 and R2C6, which are read in correctly by the read.xlsx function.

read.table and read.delim seem not to be able to handle such files. Is there any workaround for that?

Cheers

--
View this message in context: http://r.789695.n4.nabble.com/Problems-reading-tab-delim-files-using-read-table-and-read-delim-tp4369195p4369195.html
Sent from the R help mailing list archive at Nabble.com.
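A quick way to see where the field counts disagree, before calling read.table at all, is count.fields from base R. The sketch below is only an illustration: the real Sheet1.txt is not shown in the post, so the file contents here are a made-up stand-in, and `path`, `n_fields` and `ragged` are illustrative names.

```r
# Hypothetical stand-in for Sheet1.txt: the header has three fields,
# while the data rows have five and six.
path <- tempfile(fileext = ".txt")
writeLines(c("c1\tc2\tc3",
             "123\t\t213\t\tx",
             "234\tasd\t\t\t \ty"), path)

# Number of tab-separated fields on each line (header line first).
n_fields <- count.fields(path, sep = "\t", quote = "", comment.char = "")
print(n_fields)

# Lines whose field count differs from the header's field count.
ragged <- which(n_fields != n_fields[1])
print(ragged)
```

If every line reports the same count, the file is rectangular and the problem lies elsewhere; differing counts point to rows that are longer or shorter than the header.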
Jan van der Laan
2012-Feb-08 16:36 UTC
[R] Problems reading tab-delim files using read.table and read.delim
I don't know if this completely solves your problem, but here are some arguments to read.table/read.delim you might try:

row.names=NULL
fill=TRUE

The details section also suggests using the col.names argument, as the number of columns is determined from the first 5 rows, which may not be correct.

HTH,
Jan
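A minimal sketch of the fill=TRUE route suggested above, combined with explicit column names so that read.table does not have to guess the column count. It rests on assumptions: the file contents are a made-up stand-in for Sheet1.txt, the placeholder names V4, V5, V6 are invented for the unnamed columns, and treating blank or whitespace-only cells as NA (via strip.white and na.strings) is a choice, not something the original post asked for.

```r
# Hypothetical stand-in for Sheet1.txt: three header names, six data fields.
path <- tempfile(fileext = ".txt")
writeLines(c("c1\tc2\tc3",
             "123\t\t213\t \tx\ty",
             "234\tasd\t\t\t \tz"), path)

# Widest line in the file, counted in tab-separated fields.
n_fields <- count.fields(path, sep = "\t", quote = "", comment.char = "")
n_cols <- max(n_fields)

# Keep the header names that exist; use V4, V5, ... for the unnamed columns.
header <- strsplit(readLines(path, n = 1), "\t", fixed = TRUE)[[1]]
col_names <- paste0("V", seq_len(n_cols))
col_names[seq_along(header)] <- header

# Skip the short header line and supply the full set of names ourselves.
# fill = TRUE pads short rows; blank and whitespace-only cells become NA.
test <- read.table(path, header = FALSE, skip = 1, sep = "\t",
                   quote = "", comment.char = "", fill = TRUE,
                   col.names = col_names, strip.white = TRUE,
                   na.strings = c("", "NA"), stringsAsFactors = FALSE)
print(test)
```

Supplying col.names sidesteps the mismatch between a short header and longer data rows, at the cost of reading the header line separately.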
Gabor Grothendieck
2012-Feb-08 16:48 UTC
[R] Problems reading tab-delim files using read.table and read.delim
On Wed, Feb 8, 2012 at 7:09 AM, mails <mails00000 at gmail.com> wrote:
> Hello,
>
> I used read.xlsx to read in Excel files but for large files it turned out to
> be not very efficient.
> For that reason I use a programme which writes each sheet in an Excel file
> into tab-delim txt files.

Note that that is how read.xls in the gdata package works - it uses a Perl program to convert the spreadsheet to a text file and then reads in the text file.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com