Hi,

I rbind data frames cumulatively in a loop, and the performance deteriorates very quickly.

My code looks like this:

    for( k in 1:N)
    {
        filename <- paste("/tmp/myData_",as.character(k),".txt",sep="")
        myDataTmp <- read.table(filename,header=TRUE,sep=",")
        if( k == 1) {
            myData <- myDataTmp
        }
        else{
            myData <- rbind(myData,myDataTmp)
        }
    }

Some more details:
- each stored text file has about 100,000 rows and 50 columns
- for k=1: rbind takes 0.0004 seconds
- for k=2: rbind takes 13 seconds
- for k=3: rbind takes 30 seconds
- for k=4: rbind takes 36 seconds
etc.

Any suggestions to improve speed?

Thanks,
Zava
Read the data into a list and then call

    do.call('rbind', myList)

at the end, so that you rbind only once. With the cumulative approach, memory has to be reallocated on each iteration, so it is no wonder it is slow.

--
Jim Holtman
Cincinnati, OH

On 7/17/07, Aydemir, Zava (FID) wrote:
> I rbind data frames in a loop in a cumulative way and the performance
> deteriorates very quickly.
> [...]
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
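As a rough sketch, Jim's suggestion applied to the loop in the question might look like the following. The helper name `read_all`, its `dir` argument, and the small temporary demo files are assumptions made for illustration; only the file-name pattern and the `read.table` arguments come from the original post.

```r
# Hypothetical helper: read N comma-separated files and combine them once.
# The file-name pattern mirrors the one in the question; `dir` is an
# assumed argument so the sketch can run outside /tmp.
read_all <- function(N, dir = "/tmp") {
  filenames <- file.path(dir, paste("myData_", seq_len(N), ".txt", sep = ""))
  # Read every file into a list of data frames (no rbind yet)
  myList <- lapply(filenames, read.table, header = TRUE, sep = ",")
  # Combine all pieces with a single rbind call
  do.call("rbind", myList)
}

# Small self-contained demonstration using made-up temporary files
demo_dir <- tempdir()
for (k in 1:3) {
  write.table(data.frame(id = k, x = c(1.5, 2.5)),
              file = file.path(demo_dir, paste("myData_", k, ".txt", sep = "")),
              sep = ",", row.names = FALSE, quote = FALSE)
}
myData <- read_all(3, dir = demo_dir)
```

Because the list only holds references to the individual data frames, the growing result is not copied on every pass; the combined data frame is built once at the end.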
As Jim points out, building up a data frame by rbinding in a loop can be a slow way to do things in R.

Here's an example of how you can easily read data frames into a list:

    > # Create 3 files
    > invisible(lapply(1:3, function(i) write.csv(file=paste("tmp",i,".csv",sep=""), data.frame(i=2*i+(1:2),c=letters[2*i+(1:2)]))))
    > # Read the files into a list of data frames
    > list.of.dfs <- lapply(paste("tmp",1:3,".csv",sep=""), read.csv, row.names=1)
    > # rbind the data frames
    > myData <- do.call("rbind", list.of.dfs)
    > myData
      i c
    1 3 c
    2 4 d
    3 5 e
    4 6 f
    5 7 g
    6 8 h
    >

(and of course, these last two expressions can be composed into a single expression if you want)

-- Tony Plate
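To see the difference between the two approaches for yourself, a small timing comparison along these lines might help. The function names `grow` and `once` and the data sizes are made up for illustration. Actual timings depend on the machine and the data, and with pieces this small the gap may be modest.

```r
# Build a list of small data frames in memory (sizes are arbitrary)
set.seed(1)
pieces <- lapply(1:20, function(k) data.frame(k = k, x = rnorm(1000)))

# Cumulative approach: the result is rebuilt on every iteration
grow <- function(pieces) {
  out <- pieces[[1]]
  for (p in pieces[-1]) out <- rbind(out, p)
  out
}

# Single-call approach: all pieces are combined in one rbind
once <- function(pieces) do.call("rbind", pieces)

t_grow <- system.time(a <- grow(pieces))
t_once <- system.time(b <- once(pieces))

cat("cumulative rbind, elapsed seconds:", t_grow[["elapsed"]], "\n")
cat("single do.call,   elapsed seconds:", t_once[["elapsed"]], "\n")

# Both approaches should produce the same data
stopifnot(nrow(a) == nrow(b), all(a$x == b$x))
```

Increasing the number of rows per piece (the `1000` above) toward the 100,000 mentioned in the question should make any difference easier to observe.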