Adaikalavan Ramasamy
2008-Aug-24 22:35 UTC
[R] howto optimize operations between pairs of rows in a single matrix like cor and pairs
Hi,

I am calculating the output of a function applied to pairs of rows from a single matrix or data frame, similar to how cor() and pairs() work. This is the code that I have been using:

    pairwise.apply <- function(x, FUN, ...){

      n <- nrow(x)
      r <- rownames(x)
      output <- matrix(NA, nc=n, nr=n, dimnames=list(r, r))

      for(i in 1:n){
        for(j in 1:n){
          if(i >= j) next()
          output[i, j] <- FUN( x[i,], x[j,] )
        }
      }
      return(output)
    }

I realize that the output of the pairwise operation needs to be a scalar. Here is an example. The actual function and dataset I want to use are more complicated, so the function runs slowly on large datasets.

    m <- iris[ 1:5, 1:4 ]

    pairwise.apply(m, sum)
       1    2    3    4    5
    1 NA 19.7 19.6 19.6 20.4
    2 NA   NA 18.9 18.9 19.7
    3 NA   NA   NA 18.8 19.6
    4 NA   NA   NA   NA 19.6
    5 NA   NA   NA   NA   NA

Can I use apply() or any of its family to optimize the code? I have tried playing around with outer, kronecker and mapply without any success.

Any suggestions? Thank you.

Regards, Adai
jim holtman
2008-Aug-25 02:32 UTC
[R] howto optimize operations between pairs of rows in a single matrix like cor and pairs
Use Rprof to see where time is being spent. If it is in FUN, then there is probably no way to "optimize" outside of changing the way FUN works. So the first thing is to decide where time is being spent.

--
Jim Holtman
Cincinnati, OH

What is the problem that you are trying to solve?
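
The profiling step suggested in this reply could look roughly like the sketch below. It reuses pairwise.apply from the original post and profiles it with Rprof() and summaryRprof(). The function slow.fun, the 30-row subset of iris and the 0.01 s sampling interval are assumptions made only for illustration; they stand in for the poster's real function and data, which are not shown in the thread.

```r
# pairwise.apply as posted in the thread (same behaviour, tidied layout).
pairwise.apply <- function(x, FUN, ...) {
  n <- nrow(x)
  r <- rownames(x)
  output <- matrix(NA, ncol = n, nrow = n, dimnames = list(r, r))
  for (i in 1:n) {
    for (j in 1:n) {
      if (i >= j) next            # fill the upper triangle only
      output[i, j] <- FUN(x[i, ], x[j, ])
    }
  }
  output
}

# Hypothetical stand-in for an expensive pairwise function
# (NOT the poster's real FUN); it returns a single number.
slow.fun <- function(a, b) {
  d <- abs(unlist(a) - unlist(b))
  s <- 0
  for (k in 1:100) s <- sum(sort(d + k))   # artificial busy work
  s
}

m <- iris[1:30, 1:4]               # illustrative data, still a data frame

prof.file <- tempfile(fileext = ".out")
Rprof(prof.file, interval = 0.01)  # start sampling the call stack
res <- pairwise.apply(m, slow.fun)
Rprof(NULL)                        # stop profiling

prof <- summaryRprof(prof.file)
print(head(prof$by.self, 10))      # time spent inside each function itself
print(head(prof$by.total, 10))     # time including everything it calls
```

If most of the samples fall in slow.fun and the functions it calls, the double loop is not the bottleneck and only changing FUN will help. If a row-indexing function such as `[.data.frame` ranks high instead, then extracting rows from the data frame is a measurable cost, and passing as.matrix(m) is one thing worth trying.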