Trying To learn again
2011-Jun-27 22:01 UTC
[R] Executing the same function on consecutive files
Hi all,
I have the following problem: I have a matrix of size 8,000,000 x 18. My personal
computer freezes with it, so I have split my original file into 100 separate files.
I have written a function that should be run on each of these files.
So imagine I need to read the data from files q1 to q100:
data<-read.table("q1.txt",sep="")
and each time I read a file, run my own function on it (I get some stats).
My final goal is to add up the partial stats.
My question is:
Is it possible to write something like this?
for (i in 1:100){
data[i]<-read.table("q[i].txt", sep="")
execute .....
}
Many thanks in advance
This looks something like what you want:
http://r.789695.n4.nabble.com/Reading-in-a-series-of-files-using-a-for-loop-td906101.html
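On the loop in the question itself: "q[i].txt" is a literal string, so R will not substitute i into it. A minimal sketch of the for-loop version follows, assuming the pieces are named q1.txt, q2.txt, and so on. The temporary directory, the three small demo files and my_stats() are placeholders invented for illustration; they stand in for your real files and your own function.

```r
# Hypothetical setup: write three small files q1.txt..q3.txt to a temp
# directory so the loop below runs as-is. In practice these are your pieces.
dir_q <- file.path(tempdir(), "q_loop_demo")
dir.create(dir_q, showWarnings = FALSE)
for (i in 1:3) {
  write.table(matrix(i, nrow = 4, ncol = 2),
              file.path(dir_q, paste0("q", i, ".txt")),
              row.names = FALSE, col.names = FALSE)
}

# Placeholder for your own function (here: column sums)
my_stats <- function(d) colSums(d)

n_files <- 3                        # would be 100 for q1.txt ... q100.txt
results <- vector("list", n_files)  # one slot per file
for (i in seq_len(n_files)) {
  # Build the file name from i with paste0()
  fname <- file.path(dir_q, paste0("q", i, ".txt"))
  dat <- read.table(fname, sep = "")
  # Keep only the statistics; dat is overwritten on the next pass,
  # so only one piece is held in memory at a time
  results[[i]] <- my_stats(dat)
}
```

Note that results is a list indexed with [[i]]; assigning a whole data frame to data[i], as in the question, would not do what is intended.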
Hi:
One approach:
(1) Put your files into a separate directory.
(2) Use list.files() to grab the individual file names.
(3) Write a function that takes a data frame as an argument and does
the necessary processing.
(4) Use lapply(), or ldply()/llply() from the plyr package, to run the
function on each file in the list in turn. lapply() and llply() return
lists; ldply() returns a data frame. If you intend to use ldply(), the
function in (3) should return a data frame (or something ldply() can
turn into one, such as a named vector).
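Steps (1) to (4) can be sketched roughly as follows with base R only. The directory name, the file-name pattern and the statistics computed in one_file() are assumptions made for illustration, and the three small files are generated just so the example runs on its own.

```r
# (1) Hypothetical setup: a directory holding only the pieces of the big file
data_dir <- file.path(tempdir(), "pieces_demo")
dir.create(data_dir, showWarnings = FALSE)
for (i in 1:3) {
  write.table(data.frame(a = 1:5 * i, b = rep(i, 5)),
              file.path(data_dir, paste0("q", i, ".txt")),
              row.names = FALSE, col.names = FALSE)
}

# (2) Collect the file names; full.names = TRUE keeps the directory in the path
files <- list.files(data_dir, pattern = "^q[0-9]+\\.txt$", full.names = TRUE)

# (3) A function that reads one file and returns a one-row data frame
#     (the statistics here are placeholders for your own)
one_file <- function(f) {
  d <- read.table(f, sep = "")
  data.frame(file = basename(f), n = nrow(d), mean_V1 = mean(d$V1))
}

# (4) Apply it to every file, then stack the one-row results
res_list <- lapply(files, one_file)
res <- do.call(rbind, res_list)
```

list.files() returns names in alphabetical order, so with 100 files q10.txt comes before q2.txt. That does not matter when the results are summed, but it does if the order of the pieces is meaningful.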
Here's a small demo. I have five data sets in my starting directory
with variables x1, x2, y. The function reads in the data and returns
the output of a regression model; when lapply() is run on it, the
output of the five models is returned as a list. One can then cherry
pick output from the list of models.
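# File names dat1.csv ... dat5.csv
files <- paste('dat', 1:5, '.csv', sep = '')
# Read one file and fit a regression of y on all other variables
myfun <- function(d) {
    df <- read.csv(d, header = TRUE)
    lm(y ~ ., data = df)
}
# List of five fitted models, one per file
lout <- lapply(files, myfun)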
library(plyr)
ldply(lout, function(x) coef(x)) # coefficients
ldply(lout, function(x) summary(x)$r.squared) # R^2
One could also use
do.call(rbind, lapply(lout, function(x) coef(x)))
do.call(rbind, lapply(lout, function(x) summary(x)$r.squared))
but ldply() has a somewhat simpler syntax.
Hopefully, you can adapt these steps to your problem.
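For the last part of the question, adding up the partial statistics, here is one possible sketch. It assumes each file's result is a list holding column sums and a row count; that structure and the numbers in it are made up for illustration and are not from the original post.

```r
# Hypothetical partial results, one per file: column sums plus the row count
partial <- list(
  list(sums = c(V1 = 10, V2 = 20), n = 4),
  list(sums = c(V1 = 30, V2 = 40), n = 6)
)

# Add the pieces element-wise across files
total_sums <- Reduce(`+`, lapply(partial, function(p) p$sums))
total_n <- sum(sapply(partial, function(p) p$n))

# Overall column means, recovered from the totals rather than by
# averaging the per-file means
overall_mean <- total_sums / total_n
```

Sums and counts combine safely across pieces of different sizes; statistics such as means or variances are better rebuilt from such totals than averaged file by file.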
Dennis