I apologize if this is a FAQ -- I kind of recall seeing something along
these lines before, but I couldn't find the message when I searched the
archives.
Problem:
1. I have hundreds of small files in a subdirectory ("c:\\temp") and I would
like to combine the files into a single data frame.
2. Individually, it is easy to read each file:
> DATA <- read.csv("c:\\temp\\file1a.csv", header = TRUE)
3. It is also fairly easy to add new files to the data frame one at a time:
> DATA <- rbind(DATA, read.csv("c:\\temp\\file1b.csv", header = TRUE))
What is tedious about this solution is that we have to change the file name
in step 3 every time.
Is there a way to have R identify all the files in a directory and create
one big data frame?
I'm working in Windows with R 1.6.2.
Thanks
Paul
MAJ Paul Bliese, Ph.D.
Walter Reed Army Institute of Research
Phone: (301) 319-9873
Fax: (301) 319-9484
paul.bliese at na.amedd.army.mil
On Wed, Apr 09, 2003 at 09:51:38AM -0400, Bliese, Paul D MAJ WRAIR-Wash DC wrote:
> Is there a way to have R identify all the files in a directory and create
> one big data frame?

Dear Dr. Bliese,

One way is to loop over the results of list.files(). Look it up in the manual.

/Fredrik
------------------------------------------------------------------------------
Try:
tmpf <- list.files("c:/temp", full.names = TRUE)  # full paths, so read.csv() can find the files
dat <- read.csv(tmpf[1])
for (f in tmpf[-1]) {
    dat <- rbind(dat, read.csv(f))
}
If the data are all numeric, reading them as matrices could be a lot more
efficient. Or you can specify the colClasses argument to speed up read.csv().
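A self-contained sketch of that colClasses idea (the file names, column names, and types below are invented for illustration; adjust them to match your data):

```r
# Write two small example CSV files to a scratch directory
tmpdir <- file.path(tempdir(), "csvdemo")
dir.create(tmpdir, showWarnings = FALSE)
write.csv(data.frame(x = 1:3, y = c(1.5, 2.5, 3.5)),
          file.path(tmpdir, "file1a.csv"), row.names = FALSE)
write.csv(data.frame(x = 4:5, y = c(4.5, 5.5)),
          file.path(tmpdir, "file1b.csv"), row.names = FALSE)

# Declaring the column types up front lets read.csv() skip type guessing,
# which adds up when reading hundreds of files
files <- list.files(tmpdir, pattern = "\\.csv$", full.names = TRUE)
pieces <- lapply(files, read.csv, colClasses = c("integer", "numeric"))
DATA <- do.call("rbind", pieces)
nrow(DATA)  # 5
```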
Another way is to concatenate all the files into one (using things like
`cat') outside of R, and read it in at once.
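On a Unix-like shell, that concatenation could look like the sketch below. It assumes every file carries the same single header row, which must be kept only once (a plain `cat *.csv`, or the `copy` command on Windows, would repeat the header inside the data):

```shell
# Keep the header from the first file, then append only the data rows
# of every CSV; write to a .txt name first so the *.csv glob in the loop
# does not pick up the output file itself
head -n 1 file1a.csv > combined.txt
for f in *.csv; do tail -n +2 "$f" >> combined.txt; done
mv combined.txt combined.csv
```

The resulting combined.csv can then be read into R with a single read.csv() call.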
HTH,
Andy
------------------------------------------------------------------------------
On Wed, 9 Apr 2003, Bliese, Paul D MAJ WRAIR-Wash DC wrote:
> Is there a way to have R identify all the files in a directory and create
> one big data frame?

You can get the file list with

    all.the.files <- list.files("C:/temp", full = TRUE)

where full=TRUE asks for absolute file paths, which will be useful if this
isn't your working directory. You could also add pattern="\\.csv$" to ensure
that you only get .csv files. Then you could read them all in

    all.the.data <- lapply(all.the.files, read.csv, header = TRUE)

and then rbind them into a data frame

    DATA <- do.call("rbind", all.the.data)

In one line this would be

    DATA <- do.call("rbind",
                    lapply(list.files("C:/temp", full = TRUE),
                           read.csv, header = TRUE))

It should be faster to use do.call("rbind", ...) rather than a loop, but I
don't know if it actually is.

    -thomas