I have several hundred Excel 2007 data files in a folder. I would like to read every file in a single given folder using a loop. I have searched the FAQ, the forum archives here, other or older R boards and the R Import / Export documentation, and have asked some very knowledgeable R users without learning of a solution. I hope someone here can help. I understand that the most common suggestion is to convert the files to csv format. However, there are so many files in my case (ultimately > 1000) I would rather avoid doing that. I have also found many solutions to this problem for txt files and files in additional formats other than Excel 2007. I can read three Excel 2007 files one at a time with the following example code using R 2.10.0 on a computer running Windows (XP, I think): library(RODBC) channel <- odbcDriverConnect("DRIVER=Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb); DBQ=U:\\test folder\\testA.xlsx; ReadOnly=False") sqlTables(channel) my.data.A <- sqlFetch(channel, "Sheet1") odbcClose(channel) channel <- odbcDriverConnect("DRIVER=Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb); DBQ=U:\\test folder\\testB.xlsx; ReadOnly=False") sqlTables(channel) my.data.B <- sqlFetch(channel, "Sheet1") odbcClose(channel) channel <- odbcDriverConnect("DRIVER=Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb); DBQ=U:\\test folder\\testC.xlsx; ReadOnly=False") sqlTables(channel) my.data.C <- sqlFetch(channel, "Sheet1") odbcClose(channel) # However, when I attempt to read the same three files with the loop below I receive an error: library(RODBC) setwd("U:/test folder") fname <- list.files(pattern=".\\.xlsx", full.names = FALSE, recursive TRUE, ignore.case = TRUE) z <- length(fname) print(z) for (sp in 1:z) { channel <- odbcDriverConnect("DRIVER=Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb); DBQ=U:\\test folder\\fname[sp]; ReadOnly=False") sqlTables(channel) my.data <- sqlFetch(channel, "Sheet1") print(my.data) odbcClose(channel) } # The error I receive states: Error in odbcTableExists(channel, sqtable) : ?Sheet1?: table not found on channel # Thank you sincerely in advance for any help with this problem. Mark Miller Gainesville, Florida -- View this message in context: http://old.nabble.com/Reading-multiple-Excel-2007-files-with-a-loop-tp26414828p26414828.html Sent from the R help mailing list archive at Nabble.com.
Have you looked at the read.xls() function from the gdata package? It automates the conversion to *.csv for you. It has worked seamlessly for me on the occasions on which I've needed to use it. cheers, Rolf Turner On 19/11/2009, at 9:09 AM, Mark W. Miller wrote:> > > I have several hundred Excel 2007 data files in a folder. I would > like to > read every file in a single given folder using a loop. > > I have searched the FAQ, the forum archives here, other or older R > boards > and the R Import / Export documentation, and have asked some very > knowledgeable R users without learning of a solution. I hope > someone here > can help. > > I understand that the most common suggestion is to convert the > files to csv > format. However, there are so many files in my case (ultimately > > 1000) I > would rather avoid doing that. > > I have also found many solutions to this problem for txt files and > files in > additional formats other than Excel 2007. > > I can read three Excel 2007 files one at a time with the following > example > code using R 2.10.0 on a computer running Windows (XP, I think): > > > > > library(RODBC) > > > channel <- odbcDriverConnect("DRIVER=Microsoft Excel Driver (*.xls, > *.xlsx, > *.xlsm, *.xlsb); > DBQ=U:\\test folder\\testA.xlsx; ReadOnly=False") > > sqlTables(channel) > > my.data.A <- sqlFetch(channel, "Sheet1") > > odbcClose(channel) > > > > channel <- odbcDriverConnect("DRIVER=Microsoft Excel Driver (*.xls, > *.xlsx, > *.xlsm, *.xlsb); > DBQ=U:\\test folder\\testB.xlsx; ReadOnly=False") > > sqlTables(channel) > > my.data.B <- sqlFetch(channel, "Sheet1") > > odbcClose(channel) > > > > channel <- odbcDriverConnect("DRIVER=Microsoft Excel Driver (*.xls, > *.xlsx, > *.xlsm, *.xlsb); > DBQ=U:\\test folder\\testC.xlsx; ReadOnly=False") > > sqlTables(channel) > > my.data.C <- sqlFetch(channel, "Sheet1") > > odbcClose(channel) > > > > > > # However, when I attempt to read the same three files with the > loop below I > receive an error: > > > > > library(RODBC) > > > setwd("U:/test folder") > > > fname <- list.files(pattern=".\\.xlsx", full.names = FALSE, > recursive > TRUE, ignore.case = TRUE) > > z <- length(fname) > > print(z) > > > for (sp in 1:z) { > > channel <- odbcDriverConnect("DRIVER=Microsoft Excel Driver (*.xls, > *.xlsx, > *.xlsm, *.xlsb); > > DBQ=U:\\test folder\\fname[sp]; ReadOnly=False") > > sqlTables(channel) > > my.data <- sqlFetch(channel, "Sheet1") > > print(my.data) > > odbcClose(channel) > } > > > > > # The error I receive states: > > Error in odbcTableExists(channel, sqtable) : > ?Sheet1?: table not found on channel > > > # Thank you sincerely in advance for any help with this problem. > > Mark Miller > > Gainesville, Florida###################################################################### Attention: This e-mail message is privileged and confidential. If you are not the intended recipient please delete the message and notify the sender. Any views or opinions presented are solely those of the author. This e-mail has been scanned and cleared by MailMarshal www.marshalsoftware.com ######################################################################
A small example. regards Christian> library(gdata) > fname <- list.files("C:/dm/test",pattern=".xls", full.names = TRUE, recursive =TRUE, ignore.case = TRUE) > > for (sp in 1:length(fname)) {+ print(fname[sp]) + data <- read.xls(fname[sp], sheet=1, verbose=FALSE,perl="perl") + print(data) + } [1] "C:/dm/test/xls1/file1.xls" Converting xls file to csv file... Done. Reading csv file... Done. A B 1 100 100 2 200 200 [1] "C:/dm/test/xls1/file2.xls" Converting xls file to csv file... Done. Reading csv file... Done. A B 1 100 100 2 200 300 [1] "C:/dm/test/xls2/file5.xls" Converting xls file to csv file... Done. Reading csv file... Done. A B 1 100 100 2 200 300 3 200 100 regards Christian> -----Urspr?ngliche Nachricht----- > Von: "Rolf Turner" <r.turner at auckland.ac.nz> > Gesendet: 18.11.09 21:38:12 > An: "Mark W.Miller" <mark_wayne_miller at yahoo.com> > CC: "r-help at r-project.org" <r-help at r-project.org> > Betreff: Re: [R] Re ading multiple Excel 2007 files with a loop> > Have you looked at the read.xls() function from the gdata package? > It automates the conversion to *.csv for you. It has worked seamlessly > for me on the occasions on which I've needed to use it. > > cheers, > > Rolf Turner > > On 19/11/2009, at 9:09 AM, Mark W. Miller wrote: > > > > > > > I have several hundred Excel 2007 data files in a folder. I would > > like to > > read every file in a single given folder using a loop. > > > > I have searched the FAQ, the forum archives here, other or older R > > boards > > and the R Import / Export documentation, and have asked some very > > knowledgeable R users without learning of a solution. I hope > > someone here > > can help. > > > > I understand that the most common suggestion is to convert the > > files to csv > > format. However, there are so many files in my case (ultimately > > > 1000) I > > would rather avoid doing that. > > > > I have also found many solutions to this problem for txt files and > > files in > > additional formats other than Excel 2007. > > > > I can read three Excel 2007 files one at a time with the following > > example > > code using R 2.10.0 on a computer running Windows (XP, I think): > > > > > > > > > > library(RODBC) > > > > > > channel <- odbcDriverConnect("DRIVER=Microsoft Excel Driver (*.xls, > > *.xlsx, > > *.xlsm, *.xlsb); > > DBQ=U:\\test folder\\testA.xlsx; ReadOnly=False") > > > > sqlTables(channel) > > > > my.data.A <- sqlFetch(channel, "Sheet1") > > > > odbcClose(channel) > > > > > > > > channel <- odbcDriverConnect("DRIVER=Microsoft Excel Driver (*.xls, > > *.xlsx, > > *.xlsm, *.xlsb); > > DBQ=U:\\test folder\\testB.xlsx; ReadOnly=False") > > > > sqlTables(channel) > > > > my.data.B <- sqlFetch(channel, "Sheet1") > > > > odbcClose(channel) > > > > > > > > channel <- odbcDriverConnect("DRIVER=Microsoft Excel Driver (*.xls, > > *.xlsx, > > *.xlsm, *.xlsb); > > DBQ=U:\\test folder\\testC.xlsx; ReadOnly=False") > > > > sqlTables(channel) > > > > my.data.C <- sqlFetch(channel, "Sheet1") > > > > odbcClose(channel) > > > > > > > > > > > > # However, when I attempt to read the same three files with the > > loop below I > > receive an error: > > > > > > > > > > library(RODBC) > > > > > > setwd("U:/test folder") > > > > > > fname <- list.files(pattern=".\\.xlsx", full.names = FALSE, > > recursive > > TRUE, ignore.case = TRUE) > > > > z <- length(fname) > > > > print(z) > > > > > > for (sp in 1:z) { > > > > channel <- odbcDriverConnect("DRIVER=Microsoft Excel Driver (*.xls, > > *.xlsx, > > *.xlsm, *.xlsb); > > > > DBQ=U:\\test folder\\fname[sp]; ReadOnly=False") > > > > sqlTables(channel) > > > > my.data <- sqlFetch(channel, "Sheet1") > > > > print(my.data) > > > > odbcClose(channel) > > } > > > > > > > > > > # The error I receive states: > > > > Error in odbcTableExists(channel, sqtable) : > > ?Sheet1?: table not found on channel > > > > > > # Thank you sincerely in advance for any help with this problem. > > > > Mark Miller > > > > Gainesville, Florida > ###################################################################### > Attention: > This e-mail message is privileged and confidential. If you are not the > intended recipient please delete the message and notify the sender. > Any views or opinions presented are solely those of the author. > > This e-mail has been scanned and cleared by MailMarshal > www.marshalsoftware.com > ###################################################################### > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >
Thank you for all of the responses. They were all very helpful. The best response came from a gentleman at Berkeley who suggested I change the channel statement to that used below: Mark Miller Gainesville, Florida library(RODBC) setwd("U:/test folder") fname <- list.files(pattern=".\\.xlsx", full.names = FALSE, recursive TRUE, ignore.case = TRUE) z <- length(fname) print(z) for (sp in 1:z) { channel <- odbcDriverConnect(paste("DRIVER=Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb); DBQ=U:\\test folder\\",fname[sp],"; ReadOnly=False",sep='')) sqlTables(channel) my.data <- sqlFetch(channel, "Sheet1") print(my.data) odbcClose(channel) } -- View this message in context: http://old.nabble.com/Reading-multiple-Excel-2007-files-with-a-loop-tp26414828p26415864.html Sent from the R help mailing list archive at Nabble.com.