thr3ads.net - R help - [R] Permutation of a distance matrix [Nov 2007]

Andrew Park

2007-Nov-16 23:42 UTC

[R] Permutation of a distance matrix

Hi there,

I would like to find a more efficient way of permuting the rows and columns of a
symmetrical matrix that represents ecological or actual distances between
objects in space.  The permutation is of the type used in a Mantel test.

Specifically, the permutation has to accomplish something like this:


Original matrix addresses:

a11   a12   a13
a21   a22   a23
a31   a32   a33

Example permutation:

a22   a23   a21
a32   a33   a31
a12   a13   a11

That is, the relative positions of rows and columns are conserved in the permutation.

Basically, I have been doing this in a "for" loop by (1) permuting the
raw data vector using "sample", (2) generating a lower triangular
distance matrix from the permuted raw data using the "distance"
function from "ecodist", and (3) calculating a bunch of statistics,
including the Mantel correlation and multiple regression statistics, which are
then stored in blank matrices that were declared prior to beginning the loop.
The whole procedure needs to repeat at least 999 times, but 1999 times would be
better and 9999 times would be ideal.

The problem, as R users will know, is that using "for" loops like this
is slow, and gets slower the further into the loop you get.

However, I am not a sophisticated programmer, and cannot think of a more
efficient way to do this.

Thanks in advance,

Andy Park (University of Winnipeg).

Charilaos Skiadas

2007-Nov-17 03:20 UTC

[R] Permutation of a distance matrix

On Nov 16, 2007, at 6:42 PM, Andrew Park wrote:
> Hi there,
>
> I would like to find a more efficient way of permuting the rows and  
> columns of a symmetrical matrix that represents ecological or  
> actual distances between objects in space.  The permutation is of  
> the type used in a Mantel test.
>
> The problem is, R-users will know, is that using "for" loops like
> this is slow, and gets slower the further into the loop you get.
>
> However, I am not a sophisticated programmer, and cannot think of a  
> more efficient way to do this.
Would this do what you want? (Main point: you can index a vector
by an array):

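n <- 3
# Build an n x n matrix of labels so that x[i,j] is "a<i><j>"
x <- apply(expand.grid(1:n,1:n),1,paste,collapse="")
x <- paste("a",x,sep="")
dim(x) <- c(n,n)
# One random permutation, applied to both rows and columns
prm <- sample(1:n)
# Column-major linear index of element (i,j) is i + n*(j-1)
h <- apply(expand.grid(prm,prm),1,function(ij) ij[1]+n*(ij[2]-1))
matrix(x[h],n,n)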
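
As a cross-check on the indexing idea above, here is a minimal sketch comparing the linear-index construction with plain two-dimensional indexing, x[prm, prm]. The names by_linear_index and by_direct_index, the use of outer() to build the labels, and n = 4 are illustrative choices, not part of the original reply.

```r
n <- 4
# Label matrix: x[i, j] is "a<i><j>" (outer() is used here only for brevity)
x <- outer(1:n, 1:n, function(i, j) paste0("a", i, j))

set.seed(1)
prm <- sample(n)  # one permutation shared by rows and columns

# Linear-index version: element (i, j) sits at position i + n * (j - 1)
h <- apply(expand.grid(prm, prm), 1, function(ij) ij[1] + n * (ij[2] - 1))
by_linear_index <- matrix(x[h], n, n)

# Direct version: index rows and columns with the same permutation
by_direct_index <- x[prm, prm]

# Both routes should give the same permuted matrix
stopifnot(identical(by_linear_index, by_direct_index))
```

If the two agree, x[prm, prm] is probably the simpler form to use inside a permutation loop.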
Haris Skiadas
Department of Mathematics and Computer Science
Hanover College

Dave Roberts

2007-Nov-27 23:40 UTC

[R] Permutation of a distance matrix

Andy,

    As you have noted, there are issues related to looping in R.  There
are a couple of possible solutions.

1) Code the permutation routine in FORTRAN or C and call it only once.
If you don't know either of those languages then this won't help.

2) Avoid recalculating the raw distances and simply permute the existing
matrix a suitable number of times.  E.g.

 > library(labdsv)
 > dis.bc <- dsvdis(bryceveg,'bray') # Bray-Curtis dissimilarity matrix
 > dis.mat <- as.matrix(dis.bc)
 > size <- nrow(dis.mat)
 > for (i in 1:999) {
 >     tmp <- dis.mat[sample(1:size,size,replace=FALSE),] # permute rows
 >     tmp <- tmp[,sample(1:size,size,replace=FALSE)] # permute columns
 >     # calculate mantel or whatever
 >  }

This still requires looping, but avoids the call to ecodist to
continually recalculate distances that you already know.  Since sample()
is optimized, even in a loop it's pretty fast.  By permuting rows
first, and then columns in the same loop, you avoid nested loops, which are
really slow.  On my fairly old PC the above code took a few seconds, and
dis.mat is 160x160.
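
One way to check the idea of reusing the existing matrix is sketched below. It assumes Euclidean distances from base R's dist() as a stand-in for ecodist's distance(), and the data in raw are simulated purely for illustration. When a single permutation vector is used for both rows and columns, reindexing the stored matrix should reproduce the distances of the permuted raw data.

```r
set.seed(42)
raw <- matrix(rnorm(20 * 3), nrow = 20)  # hypothetical data: 20 sites, 3 variables

d_mat <- as.matrix(dist(raw))            # distances computed once
p <- sample(nrow(raw))                   # one permutation of the sites

recomputed <- as.matrix(dist(raw[p, ]))  # slow route: permute data, recompute
reindexed  <- d_mat[p, p]                # fast route: reindex the stored matrix

# Compare values only; the row/column labels differ between the two routes
all.equal(unname(recomputed), unname(reindexed))
```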

Dave Roberts


-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
David W. Roberts                                     office 406-994-4548
Professor and Head                                      FAX 406-994-3190
Department of Ecology                         email droberts at montana.edu
Montana State University
Bozeman, MT 59717-3460

Dave Roberts

2007-Nov-27 23:52 UTC

[R] Permutation of a distance matrix

Andy,

     Sorry, my first response was a little hasty.  I see you're trying 
to preserve the distance of a sample to itself along the diagonal and 
the symmetry of the matrix.  This is actually simpler.

 > library(labdsv)
 > dis.bc <- dsvdis(bryceveg,'bray') # Bray-Curtis dissimilarity matrix
 > dis.mat <- as.matrix(dis.bc)
 > size <- nrow(dis.mat)
 > for (i in 1:999) {
 >     z <- sample(1:size,size,replace=FALSE)
 >     tmp <- dis.mat[z,]
 >     tmp <- tmp[,z]
 >     # calculate mantel or other
 > }

In this case we use the same permuted index vector for both the rows and the
columns, which preserves the symmetry and keeps the self-distances on the diagonal.
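
To make the "calculate mantel" step concrete, here is a rough sketch of a permutation loop built on the same-index trick. It uses only base R: the data are simulated, mantel_r() is a hypothetical helper that correlates the lower triangles of two distance matrices, and the one-sided p-value is just one common convention. It is not the ecodist or labdsv implementation.

```r
set.seed(123)
n_sites <- 30
# Hypothetical data: site coordinates and one environmental variable
coords <- matrix(runif(n_sites * 2), ncol = 2)
env <- coords[, 1] + rnorm(n_sites, sd = 0.2)

d_space <- as.matrix(dist(coords))  # distances are computed once, up front
d_env   <- as.matrix(dist(env))

lower <- lower.tri(d_space)         # logical mask for the lower triangle
mantel_r <- function(a, b) cor(a[lower], b[lower])

observed <- mantel_r(d_space, d_env)

n_perm <- 999
perm_r <- numeric(n_perm)           # preallocated result vector
for (i in seq_len(n_perm)) {
  z <- sample(n_sites)              # same index for rows and columns
  perm_r[i] <- mantel_r(d_space[z, z], d_env)
}

# One-sided permutation p-value, counting the observed value itself
p_value <- (sum(perm_r >= observed) + 1) / (n_perm + 1)
```

Only one of the two matrices needs to be permuted, and nothing inside the loop recomputes a distance.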

Dave Roberts


Duncan Murdoch

2007-Nov-28 01:32 UTC

[R] Permutation of a distance matrix

On 16/11/2007 6:42 PM, Andrew Park wrote:
> The problem is, R-users will know, is that using "for" loops like
> this is slow, and gets slower the further into the loop you get.

I don't think for loops should slow down.  What you may be doing is 
gradually growing a result vector; that does slow down over time.

For example, this is slow:

result <- c()
for (i in 1:100000) result <- c(result, i)

but this is very quick:

result <- numeric(100000)
for (i in 1:100000) result[i] <- i
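
The same point carries over to the case in the original question, where several statistics are stored per permutation. The sketch below is only an illustration under assumed inputs: draws is simulated stand-in data, and the mean and standard deviation stand in for whatever statistics are actually computed. It fills a preallocated matrix row by row and checks the result against a matrix grown with rbind().

```r
n_perm <- 2000
set.seed(1)
draws <- matrix(rnorm(n_perm * 10), nrow = n_perm)  # stand-in for per-permutation data

# Growing: rbind() copies the whole result on every iteration
grown <- NULL
for (i in seq_len(n_perm)) {
  grown <- rbind(grown, c(mean(draws[i, ]), sd(draws[i, ])))
}

# Preallocated: declare the full matrix once, then fill one row per iteration
filled <- matrix(NA_real_, nrow = n_perm, ncol = 2)
for (i in seq_len(n_perm)) {
  filled[i, ] <- c(mean(draws[i, ]), sd(draws[i, ]))
}

# Both approaches give the same numbers; only the cost differs
stopifnot(isTRUE(all.equal(grown, filled, check.attributes = FALSE)))
```

Wrapping each loop in system.time() is one way to see the difference on your own machine.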

Duncan Murdoch
