Amir Kasaeian
2012-Aug-17 21:19 UTC
[R] Appending many different and separate Excel files using R
Dear all, Good day! I have a problem in reading Excel files in R and appending them to each other. Suppose we have several Excel files in a directory with headers and want to use R to append them in a single file with an additional variable in the final file indicating from which files the data come from. As I have many Excel files and their sizes are very big I should write a loop in R to do the work. I will be very happy if you guide and help me to write the codes. Thank you very much in advance. Kind regards, Amir Kasaeian Amir Kasaeian, PhD Student in Biostatistics, Dept. of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences (TUMS). P.B. : 14155-6446 Cell Phone: +98-912-2063511 E-mail: akasaeian@razi.tums.ac.ir amir_kasaeian@yahoo.com [[alternative HTML version deleted]]
Rui Barradas
2012-Aug-17 22:07 UTC
[R] Appending many different and separate Excel files using R
Hello, You need to provide us with a bit more information: Do all the files have the same structure, i.e., tables with the same columns? Are they xls, xlsx or csv files? Do their names share something in common such as a preffix or are they alone in the directory, or...? When you write "in the final file" does this mean that you want to output two files, one with the data and the other with the filenames? It would be nice if (a) the files were csv files, (b) if you answer to some or all of the rest, (c) anything that I might have forgotten and you find usefull. Rui Barradas Em 17-08-2012 22:19, Amir Kasaeian escreveu:> Dear all, > Good day! > I have a problem in reading Excel files in R and appending them to each other. Suppose we have several Excel files in a directory with headers and want to use R to append them in a single file with an additional variable in the final file indicating from which files the data come from. > As I have many Excel files and their sizes are very big I should write a loop in R to do the work. > I will be very happy if you guide and help me to write the codes. > Thank you very much in advance. > > Kind regards, > Amir Kasaeian > > > > > Amir Kasaeian, > PhD Student in Biostatistics, > Dept. of Epidemiology and Biostatistics, > School of Public Health, > Tehran University of Medical Sciences (TUMS). > P.B. : 14155-6446 > Cell Phone: +98-912-2063511 > E-mail: akasaeian@razi.tums.ac.ir > amir_kasaeian@yahoo.com > > [[alternative HTML version deleted]] > > > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.[[alternative HTML version deleted]]
R. Michael Weylandt <michael.weylandt@gmail.com>
2012-Aug-17 22:10 UTC
[R] Appending many different and separate Excel files using R
On Aug 17, 2012, at 4:19 PM, Amir Kasaeian <amir_kasaeian at yahoo.com> wrote:> Dear all, > Good day! > I have a problem in reading Excel files in R and appending them to each other. Suppose we have several Excel files in a directory with headers and want to use R to append them in a single file with an additional variable in the final file indicating from which files the data come from. > As I have many Excel files and their sizes are very big I should write a loop in R to do the work. > I will be very happy if you guide and help me to write the codes. > Thank you very much in advance. >Given the overhead of most R/Excel interfaces, I think you might want to look into doing this within Excel only using some basic VBA. That said, do.call(rbind, lapply(dir(pattern = "xls"), function(n) read.xls(n))) should get you started. You can find read.xls in either the gdata or or xlsReadWrite packages. For more advanced control, look at the XLConnect package, which is, unfortunately, not currently available on OS X. Cheers, Michael> Kind regards, > Amir Kasaeian > > > > > Amir Kasaeian, > PhD Student in Biostatistics, > Dept. of Epidemiology and Biostatistics, > School of Public Health, > Tehran University of Medical Sciences (TUMS). > P.B. : 14155-6446 > Cell Phone: +98-912-2063511 > E-mail: akasaeian at razi.tums.ac.ir > ??????????? amir_kasaeian at yahoo.com?? > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.
Rui Barradas
2012-Aug-18 17:03 UTC
[R] Appending many different and separate Excel files using R
Hello, Try the following. # This needs several other packages # install.packages("XLConnect") require(XLConnect) fpattern <- "Book.*.xls?" # pattern for filenames output.file <- "Test.xls" lfiles <- list.files(pattern = fpattern) # Read first worksheet from each file data.lst <-lapply(lfiles, readWorksheetFromFile, sheet = 1) # Get column names from data.frames data.names <- lapply(data.lst, names) # Create a result data.frame, we are going # to put everything here result <- data.frame(Book = 1, data.lst[[1]]) # Work variable, which column names do we # already have in 'result' current.names <- c("Book", data.names[[1]]) # for(i in seq_along(data.lst)[-1]){ new.names <- setdiff(data.names[[i]], current.names) current.names <- union(current.names, new.names) for(nm in new.names) result[[ nm ]] <- NA tmp <- as.data.frame(matrix(nrow = nrow(data.lst[[i]]), ncol = ncol(result))) names(tmp) <- current.names tmp[[ "Book" ]] <- i tmp[ , data.names[[i]] ] <- data.lst[[i]] result <- rbind(result, tmp) } writeWorksheetToFile(output.file, data = result, sheet = "AllBooks", startRow = 1, startCol = 1) Hope this helps, Rui Barradas Em 18-08-2012 00:04, Amir Kasaeian escreveu:> Dear Rui, > > Thank you very much for your message. > Suppose we have 3 Excel files: Book1, Book2 and Book3 in a single directory. Please note the headers and their order. I need an output EXCEL file like Amir which includes all the information of the 3 Booki files. I think with the attached file I explain my problem. > As a matter of fact, I need to creat a new file named Amir which has the information of the same and different variable from different excel file with a single spreadsheet below each others and it has the new variable named "Book" indicate which row come from which file. > Is my information enough ? > > Again, thank you very much. > > Best, > Amir > > > > Amir Kasaeian, > PhD Student in Biostatistics, > Dept. of Epidemiology and Biostatistics, > School of Public Health, > Tehran University of Medical Sciences (TUMS). > P.B. : 14155-6446 > Cell Phone: +98-912-2063511 > E-mail: akasaeian at razi.tums.ac.ir > amir_kasaeian at yahoo.com > > > --- On Sat, 18/8/12, Rui Barradas <ruipbarradas at sapo.pt> wrote: > > > From: Rui Barradas <ruipbarradas at sapo.pt> > Subject: Re: [R] Appending many different and separate Excel files using R > To: "Amir Kasaeian" <amir_kasaeian at yahoo.com> > Cc: "r-help" <r-help at r-project.org> > Date: Saturday, 18 August, 2012, 3:37 AM > > > > Hello, > > You need to provide us with a bit more information: > Do all the files have the same structure, i.e., tables with the same columns? No Are they xls, xlsx or csv files? they may be in each of the three types Do their names share something in common such as a preffix or are they alone in the directory, or...? No, they are with different name without sth in common When you write "in the final file" does this mean that you want to output two files, one with the data and the other with the filenames No, Just one file including all the variable from different files with new variable indicating which row of data come from which file > It would be nice if (a) the files were csv files, (b) if you answer to some or all of the rest, (c) anything that I might have forgotten and you find usefull. > I think nothing left to say. > Rui Barradas thank you very much Rui > > > Em 17-08-2012 22:19, Amir Kasaeian escreveu: > > Dear all, > Good day! > I have a problem in reading Excel files in R and appending them to each other. Suppose we have several Excel files in a directory with headers and want to use R to append them in a single file with an additional variable in the final file indicating from which files the data come from. > As I have many Excel files and their sizes are very big I should write a loop in R to do the work. > I will be very happy if you guide and help me to write the codes. > Thank you very much in advance. > > Kind regards, > Amir Kasaeian > > > > > Amir Kasaeian, > PhD Student in Biostatistics, > Dept. of Epidemiology and Biostatistics, > School of Public Health, > Tehran University of Medical Sciences (TUMS). > P.B. : 14155-6446 > Cell Phone: +98-912-2063511 > E-mail: akasaeian at razi.tums.ac.ir > amir_kasaeian at yahoo.com > > [[alternative HTML version deleted]] > > > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > >
Rui Barradas
2012-Aug-22 12:44 UTC
[R] Appending many different and separate Excel files using R
Hello, I'm glad it helped. As for books on R, there are so many nowadays that it's difficult to recommend one. I'd start by seeing the R home page http://www.r-project.org/ and the links on the lower left, under the title Documentation > Books or, for free downloads, Documentation > Other > contributed documentation The links also point to the R Wiki, http://rwiki.sciviews.org/doku.php. In fact, the large number of documentation, code examples, etc, make of Google a very good friend :) To make it better, and more complete, some of the web pages are very general, others dedicated to certain fields Examples of usefull links, one of each type, are, for instance, http://stackoverflow.com/questions/tagged/r http://addictedtor.free.fr/graphiques/ And many others. And, of course, the manuals that come with R. Rui Barradas Em 22-08-2012 06:20, Amir Kasaeian escreveu:> Dear Rui, > > Good day! > > Thank you very much. The great codes! > My other question is that which book or books do you introduce me If I ask you to recommend some R best books, especially in the field of programming? > Thanks all your attention. > > Kind regards, > Amir Kasaeian > > > > Amir Kasaeian, > PhD Student in Biostatistics, > Dept. of Epidemiology and Biostatistics, > School of Public Health, > Tehran University of Medical Sciences (TUMS). > P.B. : 14155-6446 > Cell Phone: +98-912-2063511 > E-mail: akasaeian at razi.tums.ac.ir > amir_kasaeian at yahoo.com > > > --- On Sat, 18/8/12, Rui Barradas <ruipbarradas at sapo.pt> wrote: > > > From: Rui Barradas <ruipbarradas at sapo.pt> > Subject: Re: [R] Appending many different and separate Excel files using R > To: "Amir Kasaeian" <amir_kasaeian at yahoo.com> > Cc: "r-help" <r-help at r-project.org> > Date: Saturday, 18 August, 2012, 10:33 PM > > > Hello, > > Try the following. > > > # This needs several other packages > # install.packages("XLConnect") > require(XLConnect) > > fpattern <- "Book.*.xls?" # pattern for filenames > output.file <- "Test.xls" > lfiles <- list.files(pattern = fpattern) > > # Read first worksheet from each file > data.lst <-lapply(lfiles, readWorksheetFromFile, sheet = 1) > > # Get column names from data.frames > data.names <- lapply(data.lst, names) > > # Create a result data.frame, we are going > # to put everything here > result <- data.frame(Book = 1, data.lst[[1]]) > > # Work variable, which column names do we > # already have in 'result' > current.names <- c("Book", data.names[[1]]) > # > for(i in seq_along(data.lst)[-1]){ > new.names <- setdiff(data.names[[i]], current.names) > current.names <- union(current.names, new.names) > for(nm in new.names) > result[[ nm ]] <- NA > tmp <- as.data.frame(matrix(nrow = nrow(data.lst[[i]]), ncol > ncol(result))) > names(tmp) <- current.names > tmp[[ "Book" ]] <- i > tmp[ , data.names[[i]] ] <- data.lst[[i]] > result <- rbind(result, tmp) > } > > writeWorksheetToFile(output.file, data = result, sheet = "AllBooks", > startRow = 1, startCol = 1) > > Hope this helps, > > Rui Barradas > Em 18-08-2012 00:04, Amir Kasaeian escreveu: >> Dear Rui, >> >> Thank you very much for your message. >> Suppose we have 3 Excel files: Book1, Book2 and Book3 in a single directory. Please note the headers and their order. I need an output EXCEL file like Amir which includes all the information of the 3 Booki files. I think with the attached file I explain my problem. >> As a matter of fact, I need to creat a new file named Amir which has the information of the same and different variable from different excel file with a single spreadsheet below each others and it has the new variable named "Book" indicate which row come from which file. >> Is my information enough ? >> >> Again, thank you very much. >> >> Best, >> Amir >> >> >> >> Amir Kasaeian, >> PhD Student in Biostatistics, >> Dept. of Epidemiology and Biostatistics, >> School of Public Health, >> Tehran University of Medical Sciences (TUMS). >> P.B. : 14155-6446 >> Cell Phone: +98-912-2063511 >> E-mail: akasaeian at razi.tums.ac.ir >> amir_kasaeian at yahoo.com >> >> >> --- On Sat, 18/8/12, Rui Barradas <ruipbarradas at sapo.pt> wrote: >> >> >> From: Rui Barradas <ruipbarradas at sapo.pt> >> Subject: Re: [R] Appending many different and separate Excel files using R >> To: "Amir Kasaeian" <amir_kasaeian at yahoo.com> >> Cc: "r-help" <r-help at r-project.org> >> Date: Saturday, 18 August, 2012, 3:37 AM >> >> >> >> Hello, >> >> You need to provide us with a bit more information: >> Do all the files have the same structure, i.e., tables with the same columns? No Are they xls, xlsx or csv files? they may be in each of the three types Do their names share something in common such as a preffix or are they alone in the directory, or...? No, they are with different name without sth in common When you write "in the final file" does this mean that you want to output two files, one with the data and the other with the filenames No, Just one file including all the variable from different files with new variable indicating which row of data come from which file >> It would be nice if (a) the files were csv files, (b) if you answer to some or all of the rest, (c) anything that I might have forgotten and you find usefull. >> I think nothing left to say. >> Rui Barradas thank you very much Rui >> >> >> Em 17-08-2012 22:19, Amir Kasaeian escreveu: >> >> Dear all, >> Good day! >> I have a problem in reading Excel files in R and appending them to each other. Suppose we have several Excel files in a directory with headers and want to use R to append them in a single file with an additional variable in the final file indicating from which files the data come from. >> As I have many Excel files and their sizes are very big I should write a loop in R to do the work. >> I will be very happy if you guide and help me to write the codes. >> Thank you very much in advance. >> >> Kind regards, >> Amir Kasaeian >> >> >> >> >> Amir Kasaeian, >> PhD Student in Biostatistics, >> Dept. of Epidemiology and Biostatistics, >> School of Public Health, >> Tehran University of Medical Sciences (TUMS). >> P.B. : 14155-6446 >> Cell Phone: +98-912-2063511 >> E-mail: akasaeian at razi.tums.ac.ir >> amir_kasaeian at yahoo.com >> >> [[alternative HTML version deleted]] >> >> >> >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> >> >