I have several hundred Excel 2007 data files in a folder. I would like to
read every file in a single given folder using a loop.
I have searched the FAQ, the forum archives here, other or older R boards
and the R Import / Export documentation, and have asked some very
knowledgeable R users without learning of a solution. I hope someone here
can help.
I understand that the most common suggestion is to convert the files to csv
format. However, there are so many files in my case (ultimately > 1000) I
would rather avoid doing that.
I have also found many solutions to this problem for txt files and files in
additional formats other than Excel 2007.
I can read three Excel 2007 files one at a time with the following example
code using R 2.10.0 on a computer running Windows (XP, I think):
library(RODBC)
channel <- odbcDriverConnect("DRIVER=Microsoft Excel Driver (*.xls,
*.xlsx,
*.xlsm, *.xlsb);
DBQ=U:\\test folder\\testA.xlsx; ReadOnly=False")
sqlTables(channel)
my.data.A <- sqlFetch(channel, "Sheet1")
odbcClose(channel)
channel <- odbcDriverConnect("DRIVER=Microsoft Excel Driver (*.xls,
*.xlsx,
*.xlsm, *.xlsb);
DBQ=U:\\test folder\\testB.xlsx; ReadOnly=False")
sqlTables(channel)
my.data.B <- sqlFetch(channel, "Sheet1")
odbcClose(channel)
channel <- odbcDriverConnect("DRIVER=Microsoft Excel Driver (*.xls,
*.xlsx,
*.xlsm, *.xlsb);
DBQ=U:\\test folder\\testC.xlsx; ReadOnly=False")
sqlTables(channel)
my.data.C <- sqlFetch(channel, "Sheet1")
odbcClose(channel)
# However, when I attempt to read the same three files with the loop below I
receive an error:
library(RODBC)
setwd("U:/test folder")
fname <- list.files(pattern=".\\.xlsx", full.names = FALSE,
recursive TRUE, ignore.case = TRUE)
z <- length(fname)
print(z)
for (sp in 1:z) {
channel <- odbcDriverConnect("DRIVER=Microsoft Excel Driver (*.xls,
*.xlsx,
*.xlsm, *.xlsb);
DBQ=U:\\test folder\\fname[sp]; ReadOnly=False")
sqlTables(channel)
my.data <- sqlFetch(channel, "Sheet1")
print(my.data)
odbcClose(channel)
}
# The error I receive states:
Error in odbcTableExists(channel, sqtable) :
?Sheet1?: table not found on channel
# Thank you sincerely in advance for any help with this problem.
Mark Miller
Gainesville, Florida
--
View this message in context:
http://old.nabble.com/Reading-multiple-Excel-2007-files-with-a-loop-tp26414828p26414828.html
Sent from the R help mailing list archive at Nabble.com.
Have you looked at the read.xls() function from the gdata package? It automates the conversion to *.csv for you. It has worked seamlessly for me on the occasions on which I've needed to use it. cheers, Rolf Turner On 19/11/2009, at 9:09 AM, Mark W. Miller wrote:> > > I have several hundred Excel 2007 data files in a folder. I would > like to > read every file in a single given folder using a loop. > > I have searched the FAQ, the forum archives here, other or older R > boards > and the R Import / Export documentation, and have asked some very > knowledgeable R users without learning of a solution. I hope > someone here > can help. > > I understand that the most common suggestion is to convert the > files to csv > format. However, there are so many files in my case (ultimately > > 1000) I > would rather avoid doing that. > > I have also found many solutions to this problem for txt files and > files in > additional formats other than Excel 2007. > > I can read three Excel 2007 files one at a time with the following > example > code using R 2.10.0 on a computer running Windows (XP, I think): > > > > > library(RODBC) > > > channel <- odbcDriverConnect("DRIVER=Microsoft Excel Driver (*.xls, > *.xlsx, > *.xlsm, *.xlsb); > DBQ=U:\\test folder\\testA.xlsx; ReadOnly=False") > > sqlTables(channel) > > my.data.A <- sqlFetch(channel, "Sheet1") > > odbcClose(channel) > > > > channel <- odbcDriverConnect("DRIVER=Microsoft Excel Driver (*.xls, > *.xlsx, > *.xlsm, *.xlsb); > DBQ=U:\\test folder\\testB.xlsx; ReadOnly=False") > > sqlTables(channel) > > my.data.B <- sqlFetch(channel, "Sheet1") > > odbcClose(channel) > > > > channel <- odbcDriverConnect("DRIVER=Microsoft Excel Driver (*.xls, > *.xlsx, > *.xlsm, *.xlsb); > DBQ=U:\\test folder\\testC.xlsx; ReadOnly=False") > > sqlTables(channel) > > my.data.C <- sqlFetch(channel, "Sheet1") > > odbcClose(channel) > > > > > > # However, when I attempt to read the same three files with the > loop below I > receive an error: > > > > > library(RODBC) > > > setwd("U:/test folder") > > > fname <- list.files(pattern=".\\.xlsx", full.names = FALSE, > recursive > TRUE, ignore.case = TRUE) > > z <- length(fname) > > print(z) > > > for (sp in 1:z) { > > channel <- odbcDriverConnect("DRIVER=Microsoft Excel Driver (*.xls, > *.xlsx, > *.xlsm, *.xlsb); > > DBQ=U:\\test folder\\fname[sp]; ReadOnly=False") > > sqlTables(channel) > > my.data <- sqlFetch(channel, "Sheet1") > > print(my.data) > > odbcClose(channel) > } > > > > > # The error I receive states: > > Error in odbcTableExists(channel, sqtable) : > ?Sheet1?: table not found on channel > > > # Thank you sincerely in advance for any help with this problem. > > Mark Miller > > Gainesville, Florida###################################################################### Attention: This e-mail message is privileged and confidential. If you are not the intended recipient please delete the message and notify the sender. Any views or opinions presented are solely those of the author. This e-mail has been scanned and cleared by MailMarshal www.marshalsoftware.com ######################################################################
A small example. regards Christian> library(gdata) > fname <- list.files("C:/dm/test",pattern=".xls", full.names = TRUE, recursive =TRUE, ignore.case = TRUE) > > for (sp in 1:length(fname)) {+ print(fname[sp]) + data <- read.xls(fname[sp], sheet=1, verbose=FALSE,perl="perl") + print(data) + } [1] "C:/dm/test/xls1/file1.xls" Converting xls file to csv file... Done. Reading csv file... Done. A B 1 100 100 2 200 200 [1] "C:/dm/test/xls1/file2.xls" Converting xls file to csv file... Done. Reading csv file... Done. A B 1 100 100 2 200 300 [1] "C:/dm/test/xls2/file5.xls" Converting xls file to csv file... Done. Reading csv file... Done. A B 1 100 100 2 200 300 3 200 100 regards Christian> -----Urspr?ngliche Nachricht----- > Von: "Rolf Turner" <r.turner at auckland.ac.nz> > Gesendet: 18.11.09 21:38:12 > An: "Mark W.Miller" <mark_wayne_miller at yahoo.com> > CC: "r-help at r-project.org" <r-help at r-project.org> > Betreff: Re: [R] Re ading multiple Excel 2007 files with a loop> > Have you looked at the read.xls() function from the gdata package? > It automates the conversion to *.csv for you. It has worked seamlessly > for me on the occasions on which I've needed to use it. > > cheers, > > Rolf Turner > > On 19/11/2009, at 9:09 AM, Mark W. Miller wrote: > > > > > > > I have several hundred Excel 2007 data files in a folder. I would > > like to > > read every file in a single given folder using a loop. > > > > I have searched the FAQ, the forum archives here, other or older R > > boards > > and the R Import / Export documentation, and have asked some very > > knowledgeable R users without learning of a solution. I hope > > someone here > > can help. > > > > I understand that the most common suggestion is to convert the > > files to csv > > format. However, there are so many files in my case (ultimately > > > 1000) I > > would rather avoid doing that. > > > > I have also found many solutions to this problem for txt files and > > files in > > additional formats other than Excel 2007. > > > > I can read three Excel 2007 files one at a time with the following > > example > > code using R 2.10.0 on a computer running Windows (XP, I think): > > > > > > > > > > library(RODBC) > > > > > > channel <- odbcDriverConnect("DRIVER=Microsoft Excel Driver (*.xls, > > *.xlsx, > > *.xlsm, *.xlsb); > > DBQ=U:\\test folder\\testA.xlsx; ReadOnly=False") > > > > sqlTables(channel) > > > > my.data.A <- sqlFetch(channel, "Sheet1") > > > > odbcClose(channel) > > > > > > > > channel <- odbcDriverConnect("DRIVER=Microsoft Excel Driver (*.xls, > > *.xlsx, > > *.xlsm, *.xlsb); > > DBQ=U:\\test folder\\testB.xlsx; ReadOnly=False") > > > > sqlTables(channel) > > > > my.data.B <- sqlFetch(channel, "Sheet1") > > > > odbcClose(channel) > > > > > > > > channel <- odbcDriverConnect("DRIVER=Microsoft Excel Driver (*.xls, > > *.xlsx, > > *.xlsm, *.xlsb); > > DBQ=U:\\test folder\\testC.xlsx; ReadOnly=False") > > > > sqlTables(channel) > > > > my.data.C <- sqlFetch(channel, "Sheet1") > > > > odbcClose(channel) > > > > > > > > > > > > # However, when I attempt to read the same three files with the > > loop below I > > receive an error: > > > > > > > > > > library(RODBC) > > > > > > setwd("U:/test folder") > > > > > > fname <- list.files(pattern=".\\.xlsx", full.names = FALSE, > > recursive > > TRUE, ignore.case = TRUE) > > > > z <- length(fname) > > > > print(z) > > > > > > for (sp in 1:z) { > > > > channel <- odbcDriverConnect("DRIVER=Microsoft Excel Driver (*.xls, > > *.xlsx, > > *.xlsm, *.xlsb); > > > > DBQ=U:\\test folder\\fname[sp]; ReadOnly=False") > > > > sqlTables(channel) > > > > my.data <- sqlFetch(channel, "Sheet1") > > > > print(my.data) > > > > odbcClose(channel) > > } > > > > > > > > > > # The error I receive states: > > > > Error in odbcTableExists(channel, sqtable) : > > ?Sheet1?: table not found on channel > > > > > > # Thank you sincerely in advance for any help with this problem. > > > > Mark Miller > > > > Gainesville, Florida > ###################################################################### > Attention: > This e-mail message is privileged and confidential. If you are not the > intended recipient please delete the message and notify the sender. > Any views or opinions presented are solely those of the author. > > This e-mail has been scanned and cleared by MailMarshal > www.marshalsoftware.com > ###################################################################### > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >
Thank you for all of the responses. They were all very helpful. The best
response came from a gentleman at Berkeley who suggested I change the
channel statement to that used below:
Mark Miller
Gainesville, Florida
library(RODBC)
setwd("U:/test folder")
fname <- list.files(pattern=".\\.xlsx", full.names = FALSE,
recursive TRUE, ignore.case = TRUE)
z <- length(fname)
print(z)
for (sp in 1:z) {
channel <- odbcDriverConnect(paste("DRIVER=Microsoft Excel Driver
(*.xls,
*.xlsx, *.xlsm, *.xlsb); DBQ=U:\\test folder\\",fname[sp],";
ReadOnly=False",sep=''))
sqlTables(channel)
my.data <- sqlFetch(channel, "Sheet1")
print(my.data)
odbcClose(channel)
}
--
View this message in context:
http://old.nabble.com/Reading-multiple-Excel-2007-files-with-a-loop-tp26414828p26415864.html
Sent from the R help mailing list archive at Nabble.com.