Hi,

I have about 900 files that I need to run the same R script on. I looked over the R Data Import/Export Manual and couldn't come up with a way to read in a sequence of files.

The files all have unique names and are in the same directory. What I want to do is:
1) Create a list of the file names in the directory (this is really what I need help with)
2) For each item in the list...
   a) open the file with read.table
   b) perform some analysis
   c) append some results to an array or save them to another file
3) Next File

My initial instinct is to use Python to rename all the files with numbers 1:900 and then read them all, but the file names contain some information that I would like to keep intact, and having to keep a separate database of original names and numbers seems inefficient. Is there a way to have R read all the files in a directory one at a time?

- Chris
Have you thought about using one of the Python/R interface modules?

http://www.omegahat.org/RSPython/
http://rpy.sourceforge.net/

Admittedly, I have not had much success in getting these to work on my machine, but I know others who have.

Kyle H. Ambert
Graduate Student
Department of Medical Informatics & Clinical Epidemiology
Oregon Health & Science University
ambertk@gmail.com

On Fri, Dec 5, 2008 at 10:01 AM, Chris Poliquin <poliquin@sas.upenn.edu> wrote:
> Is there a way to have R read all the files in a directory one at a time?

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
This is almost a macro problem. I think it could be done in the SAS language using the WPS product (660 USD). It is a familiar problem and I would be quite interested in the result. Is there any concept of macros in R, or a package that does the same?

Regards,
Ajay
I can't believe the two 'solutions' already posted. It's easy:

?list.files

Barry
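A minimal sketch of what ?list.files leads to. Only list.files() itself comes from the reply; the scratch directory and the three sample file names are made up so that the snippet runs on its own.

```r
# Hypothetical setup: a scratch directory holding three made-up files.
data_dir <- file.path(tempdir(), "list_files_demo")
dir.create(data_dir, showWarnings = FALSE)
for (f in c("siteA_2008.dat", "siteB_2008.dat", "notes.txt")) {
  writeLines(c("x y", "1 2"), file.path(data_dir, f))
}

# Every file in the directory, names only.
all_files <- list.files(data_dir)

# Only the files ending in .dat, with the directory prepended so that
# read.table() can open them regardless of the working directory.
dat_files <- list.files(data_dir, pattern = "\\.dat$", full.names = TRUE)

print(all_files)
print(basename(dat_files))
```

The pattern argument is a regular expression, not a shell glob, which is why the dot is escaped and the match is anchored with $.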
R has quite a few functions to get and manipulate file names, which facilitate exactly what you want to do. See ?files, and especially the links at the end to the file name manipulation functions.

For example, dir("pathname") lists all file names in the directory "pathname". ?list.files gives details.

-- Bert Gunter
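Since the original question mentions that the file names carry information worth keeping, here is a small sketch of the file name manipulation functions Bert Gunter's reply points to. The naming scheme <site>_<year>.dat and the paths are assumptions for illustration, not something stated in the thread.

```r
# Hypothetical paths; the <site>_<year>.dat naming scheme is an assumption.
paths <- file.path("data", c("siteA_2008.dat", "siteB_2007.dat"))

file_names <- basename(paths)                    # drop the directory part
stems <- tools::file_path_sans_ext(file_names)   # drop the extension
parts <- strsplit(stems, "_", fixed = TRUE)      # split on the underscore

# One row per file, with the information recovered from its name.
info <- data.frame(
  file = file_names,
  site = sapply(parts, `[`, 1),
  year = as.integer(sapply(parts, `[`, 2)),
  stringsAsFactors = FALSE
)
print(info)
```

Keeping such a table alongside the results removes any need to rename the files or to maintain a separate lookup of names and numbers.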
Use dir to get the names and then lapply over them with a custom anonymous function:

    # assumes file names are those in
    # current directory that end in .dat
    filenames <- dir(pattern = "\\.dat$")
    L <- lapply(filenames, function(x) {
      DF <- read.table(x, ...whatever...)
      somefunction(DF)
    })

Now L is a list of the 900 returned values. Alternatively, you could use a loop.
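The loop alternative mentioned in the lapply reply could look roughly like this. It is a sketch under stated assumptions: the two sample files, the column names, the output file summary.csv, and the use of mean() as the "analysis" are all invented for illustration.

```r
# Hypothetical setup: two small made-up .dat files in a scratch directory.
data_dir <- file.path(tempdir(), "loop_demo")
dir.create(data_dir, showWarnings = FALSE)
write.table(data.frame(x = 1:3, y = c(2, 4, 6)),
            file.path(data_dir, "run_a.dat"), row.names = FALSE)
write.table(data.frame(x = 1:3, y = c(1, 1, 4)),
            file.path(data_dir, "run_b.dat"), row.names = FALSE)

filenames <- dir(data_dir, pattern = "\\.dat$", full.names = TRUE)

# Pre-allocate one slot per file and label each slot with its file name.
results <- numeric(length(filenames))
names(results) <- basename(filenames)

for (i in seq_along(filenames)) {
  DF <- read.table(filenames[i], header = TRUE)
  results[i] <- mean(DF$y)   # stand-in for the real analysis
}

# Save the per-file results, keeping the original file names as a column.
out <- data.frame(file = names(results), mean_y = unname(results))
write.csv(out, file.path(data_dir, "summary.csv"), row.names = FALSE)
print(out)
```

Pre-allocating the result vector avoids growing it on every iteration, and naming its elements keeps each result tied to the file it came from.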
Steve_Friedman at nps.gov
2008-Dec-05 20:12 UTC
[R] Running R Script on a Sequence of Files
It seems that you have 900 files with the same parameters in each file (I might be reading more between the lines here than you implied). However, if this is the case, why not import each of the files into a common database and then link to the database using ODBC connectivity options? If that is practical, you could then code a series of subsetting options to select the data you need for a specific analysis, write reports, and then iteratively select the next set of records.

I may be suggesting a very simple solution, so forgive me if this trivializes your problem too greatly.

Steve Friedman Ph. D.
Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034
Steve_Friedman at nps.gov
Office (305) 224 - 4282
Fax (305) 224 - 4147
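The common-database idea can be approximated without a database server. The sketch below is not the ODBC setup Steve Friedman describes: it swaps in a single combined data frame in base R as a stand-in, tagging every row with the file it came from. The sample files and the column names depth and temp are assumptions.

```r
# Hypothetical setup: two made-up files that share the same columns.
data_dir <- file.path(tempdir(), "combine_demo")
dir.create(data_dir, showWarnings = FALSE)
write.table(data.frame(depth = c(10, 12), temp = c(21.5, 22.0)),
            file.path(data_dir, "plot01.dat"), row.names = FALSE)
write.table(data.frame(depth = c(8, 15), temp = c(19.0, 23.5)),
            file.path(data_dir, "plot02.dat"), row.names = FALSE)

filenames <- dir(data_dir, pattern = "\\.dat$", full.names = TRUE)

# Read each file and record which file every row came from.
pieces <- lapply(filenames, function(f) {
  DF <- read.table(f, header = TRUE)
  DF$source_file <- basename(f)
  DF
})

# Stack everything into one table, the stand-in for the common database.
combined <- do.call(rbind, pieces)

# Subset across all files at once, or summarise per original file.
deep <- subset(combined, depth > 9)
mean_temp <- tapply(combined$temp, combined$source_file, mean)
print(deep)
print(mean_temp)
```

This only works as written if every file has the same columns, which is the same assumption the reply makes about the 900 files.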