I apologize if this is a FAQ -- I kind of recall seeing something along
these lines before, but I couldn't find the message when I searched the
archives.

Problem:
1. I have hundreds of small files in a subdirectory ("c:\\temp") and I would
   like to combine the files into a single data frame.
2. Individually, it is easy to read each file:
   > DATA <- read.csv("c:\\temp\\file1a.csv", header=T)
3. It is also fairly easy to add new files to the data frame one at a time:
   > DATA <- rbind(DATA, read.csv("c:\\temp\\file1b.csv", header=T))

What is tedious about this solution is that the file name in step 3 has to
be changed every time.

Is there a way to have R identify all the files in a directory and create
one big data frame?

I'm working in Windows with R 1.6.2.

Thanks

Paul

MAJ Paul Bliese, Ph.D.
Walter Reed Army Institute of Research
Phone: (301) 319-9873
Fax: (301) 319-9484
paul.bliese at na.amedd.army.mil
Dear Dr. Bliese,

One way is to loop over the results of list.files(). Look it up in the
manual.

/Fredrik

On Wed, Apr 09, 2003 at 09:51:38AM -0400, Bliese, Paul D MAJ WRAIR-Wash DC wrote:
> Is there a way to have R identify all the files in a directory and create
> one big data frame?
>
> I'm working in Windows with R 1.6.2.
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
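A minimal sketch of the loop Fredrik describes. To keep it self-contained, a
temporary directory with two generated files stands in for c:\temp; the file
names and columns are illustrative, not from the original data:

```r
## Stand-in for c:\temp: a temporary directory with two small CSV files.
dir <- tempfile("csvs")
dir.create(dir)
write.csv(data.frame(x = 1:2, y = 3:4), file.path(dir, "file1a.csv"),
          row.names = FALSE)
write.csv(data.frame(x = 5:6, y = 7:8), file.path(dir, "file1b.csv"),
          row.names = FALSE)

## Loop over list.files(); full.names = TRUE returns usable paths, and
## pattern restricts the match to .csv files.
files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
DATA <- NULL
for (f in files) {
    DATA <- rbind(DATA, read.csv(f, header = TRUE))
}
```

Note that growing DATA with rbind() inside the loop re-copies the accumulated
frame on every iteration, so with hundreds of files the lapply()/do.call()
approach suggested later in the thread scales better.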
Try:

  tmpf <- list.files("c:/temp", full.names = TRUE)
  dat <- read.csv(tmpf[1])
  for (f in tmpf[-1]) {
      dat <- rbind(dat, read.csv(f))
  }

(full.names = TRUE is needed so that read.csv() gets complete paths rather
than bare file names.)

If the data are all numeric, reading them as matrices could be a lot more
efficient. Or you can specify the colClasses argument to speed up
read.csv().

Another way is to concatenate all the files into one (using something like
`cat') outside of R, and read it in at once.

HTH,
Andy

> -----Original Message-----
> From: Bliese, Paul D MAJ WRAIR-Wash DC
> [mailto:Paul.Bliese at na.amedd.army.mil]
> Sent: Wednesday, April 09, 2003 9:52 AM
> To: r-help at stat.math.ethz.ch
> Subject: [R] Reading in multiple files
>
> Is there a way to have R identify all the files in a
> directory and create one big data frame?
On Wed, 9 Apr 2003, Bliese, Paul D MAJ WRAIR-Wash DC wrote:

> Is there a way to have R identify all the files in a directory and create
> one big data frame?

You can get the file list with

  all.the.files <- list.files("C:/temp", full = TRUE)

where full = TRUE asks for absolute file paths, which will be useful if
this isn't your working directory. You could also add pattern = "\\.csv$"
to ensure that you only get .csv files.

Then you could read them all in

  all.the.data <- lapply(all.the.files, read.csv, header = TRUE)

and then rbind them into a data frame

  DATA <- do.call("rbind", all.the.data)

In one line this would be

  DATA <- do.call("rbind",
                  lapply(list.files("C:/temp", full = TRUE),
                         read.csv, header = TRUE))

It should be faster to use do.call("rbind", ...) rather than a loop, but I
don't know if it actually is.

        -thomas
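Thomas's closing question can be checked empirically. A rough sketch, again
using generated files in a temporary directory (the file count and contents
are arbitrary, and timings will vary by machine):

```r
## Generate 50 small CSV files in a temporary directory.
dir <- tempfile("bench")
dir.create(dir)
for (i in 1:50)
    write.csv(data.frame(x = rnorm(20), y = rnorm(20)),
              file.path(dir, sprintf("f%02d.csv", i)), row.names = FALSE)
files <- list.files(dir, full.names = TRUE)

## Incremental rbind() in a loop: re-copies the growing frame each time.
loop.time <- system.time({
    dat1 <- read.csv(files[1])
    for (f in files[-1]) dat1 <- rbind(dat1, read.csv(f))
})

## Single do.call("rbind", ...) over a list built with lapply().
docall.time <- system.time({
    dat2 <- do.call("rbind", lapply(files, read.csv))
})

all.equal(dat1, dat2)  # check that the two approaches agree
```

On files this small the difference is modest, but the loop's cost grows
faster with the number of files, because each rbind() copies everything read
so far, while do.call() assembles the result in a single pass.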