Hi there,

I would like to find a more efficient way of permuting the rows and columns of a symmetrical matrix that represents ecological or actual distances between objects in space. The permutation is of the type used in a Mantel test.

Specifically, the permutation has to accomplish something like this:

Original matrix addresses:

    a11 a12 a13
    a21 a22 a23
    a31 a32 a33

Example permutation:

    a22 a23 a21
    a32 a33 a31
    a12 a13 a11

That is, the relative positions of rows and columns are conserved in the permutation.

Basically, I have been doing this in a "for" loop by (1) permuting the raw data vector using "sample", (2) generating a lower triangular distance matrix from the permuted raw data using the "distance" function from "ecodist", and (3) calculating a bunch of statistics including the Mantel correlation and multiple regression statistics, which are then stored in blank matrices that were declared prior to beginning the loop. The whole procedure needs to repeat at least 999 times, but 1999 times would be better and 9999 times would be ideal.

The problem, as R users will know, is that using "for" loops like this is slow, and gets slower the further into the loop you get.

However, I am not a sophisticated programmer, and cannot think of a more efficient way to do this.

Thanks in advance,

Andy Park (University of Winnipeg).
On Nov 16, 2007, at 6:42 PM, Andrew Park wrote:

> I would like to find a more efficient way of permuting the rows and
> columns of a symmetrical matrix that represents ecological or
> actual distances between objects in space. The permutation is of
> the type used in a Mantel test.
> [...]
> However, I am not a sophisticated programmer, and cannot think of a
> more efficient way to do this.

Would this do what you want? (Main point: you can index a vector by an array.)

    n <- 3
    # Build the label matrix a11, a12, ... so the effect is visible
    x <- apply(expand.grid(1:n, 1:n), 1, paste, collapse = "")
    x <- paste("a", x, sep = "")
    dim(x) <- c(n, n)
    # One permutation, applied to both rows and columns
    prm <- sample(1:n)
    # Column-major linear index of each element (row p[1], column p[2])
    h <- apply(expand.grid(prm, prm), 1, function(p) p[1] + n * p[2] - n)
    matrix(x[h], n, n)

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
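As a side note, the linear-index construction above appears to be equivalent to subsetting the matrix with the same permutation on both dimensions, i.e. `x[prm, prm]`. The following is only a small check of that claim on the 3x3 label matrix; the seed and the names `via_index` and `via_subset` are illustrative assumptions, not part of the original message.

```r
n <- 3
# Label matrix with x[i, j] == "a<i><j>"
x <- outer(1:n, 1:n, function(i, j) paste0("a", i, j))

set.seed(1)        # arbitrary seed, only for reproducibility
prm <- sample(n)   # one permutation of 1..n

# Column-major linear index of element (prm[i], prm[j]);
# as.vector() keeps this a plain index vector rather than an index matrix
h <- as.vector(outer(prm, prm, function(i, j) i + n * (j - 1)))
via_index <- matrix(x[h], n, n)

# Same permutation applied to rows and columns directly
via_subset <- x[prm, prm]

identical(via_index, via_subset)
```

If the two agree, the shorter `x[prm, prm]` form is probably the easier one to read inside a permutation loop.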
Andy,

As you have noted, there are issues related to looping in R. There are a couple of possible solutions.

1) Code the permutation routine in FORTRAN or C and only call it once. If you don't know either of those languages then this won't help.

2) Avoid recalculating the raw distances and simply permute the existing matrix a suitable number of times. E.g.

    library(labdsv)
    data(bryceveg)                        # example vegetation data shipped with labdsv
    dis.bc <- dsvdis(bryceveg,'bray')     # bray/curtis dissimilarity matrix
    dis.mat <- as.matrix(dis.bc)
    size <- nrow(dis.mat)
    for (i in 1:999) {
        tmp <- dis.mat[sample(1:size,size,replace=FALSE),]   # permute rows
        tmp <- tmp[,sample(1:size,size,replace=FALSE)]       # permute columns
        # calculate mantel or whatever
    }

This still requires looping, but avoids the call to ecodist to continually recalculate distances that you already know. Since sample() is optimized R code, even in a loop it's pretty fast. By permuting rows first, and then columns in the same loop, you avoid nested loops, which are really slow. On my fairly old PC the above code took a few seconds, and dis.mat is 160x160.

Dave Roberts

--
David W. Roberts                     office 406-994-4548
Professor and Head                      FAX 406-994-3190
Department of Ecology         email droberts at montana.edu
Montana State University
Bozeman, MT 59717-3460
Andy,

Sorry, my first response was a little hasty. I see you're trying to preserve the distance of a sample to itself along the diagonal and the symmetry of the matrix. This is actually simpler.

    library(labdsv)
    data(bryceveg)                        # example vegetation data shipped with labdsv
    dis.bc <- dsvdis(bryceveg,'bray')     # bray/curtis dissimilarity matrix
    dis.mat <- as.matrix(dis.bc)
    size <- nrow(dis.mat)
    for (i in 1:999) {
        z <- sample(1:size,size,replace=FALSE)   # one permutation per iteration
        tmp <- dis.mat[z,]                       # apply it to the rows
        tmp <- tmp[,z]                           # and the same one to the columns
        # calculate mantel or other
    }

In this case we use the same permuted vector for both the rows and the columns, which preserves the symmetry.

Dave Roberts
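To make the "calculate mantel" placeholder concrete, here is a minimal base-R sketch of a Mantel-style permutation loop built on the same-permutation idea above. Everything in it is an assumption for illustration: the data are simulated, Euclidean distances stand in for whatever dissimilarity is actually used, and the statistic is a plain Pearson correlation of the lower triangles rather than the output of "ecodist" or "labdsv".

```r
set.seed(42)                          # arbitrary seed for reproducibility
n  <- 20                              # hypothetical number of sites
xy <- matrix(runif(2 * n), n, 2)      # made-up site coordinates
sp <- matrix(rpois(5 * n, 3), n, 5)   # made-up abundance data

d.geo <- as.matrix(dist(xy))          # geographic distances
d.eco <- as.matrix(dist(sp))          # ecological distances (Euclidean here)

lt    <- lower.tri(d.geo)             # logical mask for the lower triangle
r.obs <- cor(d.geo[lt], d.eco[lt])    # observed Mantel correlation

nperm  <- 999
r.perm <- numeric(nperm)              # preallocate the result vector
for (i in seq_len(nperm)) {
    z <- sample(n)                    # one permutation for rows AND columns
    r.perm[i] <- cor(d.geo[lt], d.eco[z, z][lt])
}

# One-sided permutation p-value, counting the observed value itself
p.value <- (sum(r.perm >= r.obs) + 1) / (nperm + 1)
```

The distances are computed once, outside the loop, and only one of the two matrices is permuted; permuting both would add work without changing the null distribution.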
On 16/11/2007 6:42 PM, Andrew Park wrote:

> The problem is, R-users will know, is that using "for" loops like this is slow, and gets slower the further into the loop you get.

I don't think for loops should slow down. What you may be doing is gradually growing a result vector; that does slow down over time. For example, this is slow:

    result <- c()
    for (i in 1:100000) result <- c(result, i)

but this is very quick:

    result <- numeric(100000)
    for (i in 1:100000) result[i] <- i

Duncan Murdoch
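The difference between the two loops above can be checked with system.time(). This is only a rough sketch: the function names `grow` and `prealloc` are made up for the example, the vector is shorter than in the message to keep the run quick, and the actual timings will depend on the machine and R version.

```r
n <- 20000   # smaller than in the message above, to keep the demo quick

grow <- function(n) {
    result <- c()
    for (i in seq_len(n)) result <- c(result, i)   # copies the vector on every pass
    result
}

prealloc <- function(n) {
    result <- numeric(n)                           # allocated once, up front
    for (i in seq_len(n)) result[i] <- i
    result
}

t.grow     <- system.time(r1 <- grow(n))
t.prealloc <- system.time(r2 <- prealloc(n))

all.equal(r1, r2)   # both loops build the same values
c(grow = t.grow[["elapsed"]], prealloc = t.prealloc[["elapsed"]])
```

The same reasoning applies to result matrices: declaring them at full size before the loop and filling row i on each pass avoids the repeated copying.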
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.