Ken Termiso
2005-Mar-07 16:37 UTC
[R] Faster way of binding multiple rows of data than rbind?
Hi all,

I have a vector that contains the row numbers of data taken from several filtering operations performed on a large data frame (20,000 rows x 500 cols).

In order to output this subset of data, I've been looping through the vector containing the row numbers (keepRows).

    output <- data.frame(row.names = rownames(bigMatrix))

    for(i in keepRows)
    {
        output <- rbind(output, bigMatrix[i, ])
    }

As you may guess, doing all of these rbinds takes a LOT of time, so I'm wondering if there's a workaround where I can maybe use an intermediate matrix-like object to store the loop output, and then coerce it back to a data frame after the loop is complete?

Thanks in advance,
Ken
Douglas Bates
2005-Mar-07 17:28 UTC
[R] Faster way of binding multiple rows of data than rbind?
Ken Termiso wrote:
> Hi all,
>
> I have a vector that contains the row numbers of data taken from several
> filtering operations performed on a large data frame (20,000rows x
> 500cols).
>
> In order to output this subset of data, I've been looping through the
> vector containing the row numbers (keepRows).
>
> output <- data.frame(row.names = rownames(bigMatrix))
>
> for(i in keepRows)
> {
> output <- rbind(output, bigMatrix[i, ])
> }
>
>
> As you may guess, doing all of these rbinds takes a LOT of time, so I'm
> wondering if there's a workaround where I can maybe use an intermediate
> matrix-like object to store the loop output, and then coerce it back to
> a data frame after the loop is complete??

The indexing operations in R are very flexible. You can do this in a single operation as

    output <- bigMatrix[keepRows, ]
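
To make the suggestion above concrete, here is a minimal sketch on a small stand-in data set. The contents of bigMatrix and keepRows below are invented for illustration (the original data is not shown in the thread), and bigMatrix is assumed to be a data frame despite its name, as the question describes. The loop is started from a zero-row copy, bigMatrix[0, ], rather than from the exact initialisation used in the question, purely to keep the comparison simple.

```r
# Hypothetical stand-in for the poster's data (far smaller than 20,000 x 500).
set.seed(1)
bigMatrix <- as.data.frame(matrix(rnorm(200 * 5), nrow = 200, ncol = 5))
rownames(bigMatrix) <- paste0("row", seq_len(nrow(bigMatrix)))

# Hypothetical row numbers that survived the filtering steps.
keepRows <- c(3, 17, 42, 150)

# Single indexing operation, as suggested in the reply:
# select all wanted rows at once, keeping every column.
output <- bigMatrix[keepRows, ]

# Row-by-row rbind, shown only for comparison.
# Each iteration copies everything accumulated so far, which is why
# this approach slows down badly as the number of kept rows grows.
loopOutput <- bigMatrix[0, ]   # zero-row data frame with the same columns
for (i in keepRows) {
  loopOutput <- rbind(loopOutput, bigMatrix[i, ])
}

# Both approaches should give the same rows in the same order.
print(all.equal(output, loopOutput))
```

Because the result of the indexing operation is already a data frame, no intermediate matrix or coercion step should be needed. The same form also accepts a logical vector of length nrow(bigMatrix) in place of row numbers, which can be convenient when the filters are expressed as conditions.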