There was a BioConductor thread today where the poster wanted to find the
pairwise differences between the columns of a matrix. I suggested the slow
solution below, hoping that someone might suggest a faster and/or more
elegant solution, but there was no other response.

I tried unsuccessfully with the apply() family. Searching the mailing
list was not very fruitful either. The closest I got was a cryptic
chunk of code in pairwise.table().

Since I do use something similar myself occasionally, I am hoping
someone from the R-help list can suggest alternatives or past threads.
Thank you.

### Code ###
pairwise.difference <- function(m){
  npairs  <- choose( ncol(m), 2 )
  results <- matrix( NA, ncol=npairs, nrow=nrow(m) )
  cnames  <- rep(NA, npairs)
  if(is.null(colnames(m))) colnames(m) <- paste("col", 1:ncol(m), sep="")

  k <- 1
  for(i in 1:ncol(m)){
    for(j in 1:ncol(m)){
      if(j <= i) next   # visit each pair of columns once, with i < j
      results[ ,k] <- m[ ,i] - m[ ,j]
      cnames[k]    <- paste(colnames(m)[ c(i, j) ], collapse=".vs.")
      k <- k + 1
    }
  }

  colnames(results) <- cnames
  rownames(results) <- rownames(m)
  return(results)
}

### Example using a matrix with 5 genes (rows) and 4 columns ###
mat <- matrix( sample(1:20), ncol=4 )
colnames(mat) <- LETTERS[1:4]; rownames(mat) <- paste( "g", 1:5, sep="")
mat
    A  B  C  D
g1 10 16  3 15
g2 18  5 12 19
g3  7  4  8 13
g4 14  2  6 11
g5 17  1 20  9

pairwise.difference(mat)
   A.vs.B A.vs.C A.vs.D B.vs.C B.vs.D C.vs.D
g1     -6      7     -5     13      1    -12
g2     13      6     -1     -7    -14     -7
g3      3     -1     -6     -4     -9     -5
g4     12      8      3     -4     -9     -5
g5     16     -3      8    -19     -8     11

Regards,
-- 
Adaikalavan Ramasamy                    ramasamy at cancer.org.uk
Centre for Statistics in Medicine       http://www.ihs.ox.ac.uk/csm/
Cancer Research UK                      Tel : 01865 226 677
Old Road Campus, Headington, Oxford     Fax : 01865 226 962

On Fri, 2004-07-30 at 18:30, Adaikalavan Ramasamy wrote:
> There was a BioConductor thread today where the poster wanted to find
> pairwise difference between columns of a matrix. I suggested the slow
> solution below, hoping that someone might suggest a faster and/or more
> elegant solution, but no other response.
> [...]

How about this:

I am taking advantage of the combinations() function in the 'gregmisc'
package to define the pairwise column combinations based upon the input
matrix colnames. Given that, perhaps Greg might want to add this
function to the package if it holds up to scrutiny. Additional error
checking would be required as I note below.

library(gregmisc)   # provides combinations()

pairwise.diffs <- function(x)
{
  if(is.null(colnames(x)))
    colnames(x) <- 1:ncol(x)

  # one row per pair of column names
  col.diffs <- combinations(ncol(x), 2, colnames(x))

  # first column of each pair minus the second
  result <- x[, col.diffs[, 1]] - x[, col.diffs[, 2]]

  colnames(result) <- paste(col.diffs[, 1], ".vs.",
                            col.diffs[, 2], sep = "")
  result
}

What I am essentially doing is creating the matrix 'col.diffs' to hold
the combinations of the colnames in matrix 'x'. If 'x' does not have
colnames, I set them to the column indices.

The 'result <- ...' line then does the pairwise subtractions, and the
line after it simply sets the colnames of the result to the
combinations.

Note that the subtractions, as you have above, are the first column
minus the second column in the pairwise combinations.

You would also want to check for an input matrix of <3 columns, since
the 'result' in that case would be a vector, rather than a matrix. In
that case, you could add code to coerce 'result' to a matrix, or simply
not allow matrices with <3 columns.

So, using your example matrix above (different seed value):

> mat <- matrix(sample(1:20), nc=4)
> colnames(mat) <- LETTERS[1:4]
> rownames(mat) <- paste( "g", 1:5, sep="")
> mat
    A  B  C  D
g1  1 17 13 10
g2 12  5  7 16
g3  2 19  6 14
g4 20  4 11  8
g5  3 15 18  9

> pairwise.diffs(mat)
   A.vs.B A.vs.C A.vs.D B.vs.C B.vs.D C.vs.D
g1    -16    -12     -9      4      7      3
g2      7      5     -4     -2    -11     -9
g3    -17     -4    -12     13      5     -8
g4     16      9     12     -7     -4      3
g5    -12    -15     -6     -3      6      9

HTH,

Marc Schwartz

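The 2-column case mentioned above can be handled by indexing with
drop = FALSE, so that the result stays a matrix. What follows is only a
minimal sketch under two assumptions that are not from the thread: it
uses combn() from the utils package (part of a standard R installation)
in place of combinations(), and the function name pairwise.diffs2 is
hypothetical.

```r
# Sketch only: pairwise column differences via utils::combn(),
# keeping the result a matrix even for a 2-column input.
# 'pairwise.diffs2' is a hypothetical name, not from the thread.
pairwise.diffs2 <- function(x) {
  x <- as.matrix(x)
  if (ncol(x) < 2) stop("need at least 2 columns")
  if (is.null(colnames(x))) colnames(x) <- seq_len(ncol(x))

  # 2 x choose(ncol(x), 2) matrix; each column is one pair of column indices
  idx <- combn(ncol(x), 2)

  # drop = FALSE keeps both operands as matrices when there is only one pair
  result <- x[, idx[1, ], drop = FALSE] - x[, idx[2, ], drop = FALSE]
  colnames(result) <- paste(colnames(x)[idx[1, ]],
                            colnames(x)[idx[2, ]], sep = ".vs.")
  result
}

# The example matrix from the original post, entered column by column
mat <- matrix(c(10, 18,  7, 14, 17,
                16,  5,  4,  2,  1,
                 3, 12,  8,  6, 20,
                15, 19, 13, 11,  9), ncol = 4,
              dimnames = list(paste("g", 1:5, sep = ""), LETTERS[1:4]))

pairwise.diffs2(mat)          # 5 x 6 matrix, columns A.vs.B ... C.vs.D
pairwise.diffs2(mat[, 1:2])   # still a matrix: 5 x 1, column A.vs.B
```

Working with column indices rather than column names also sidesteps
any question of duplicated or re-sorted colnames.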
Adaikalavan Ramasamy <ramasamy <at> cancer.org.uk> writes:

: There was a BioConductor thread today where the poster wanted to find
: pairwise difference between columns of a matrix. I suggested the slow
: solution below, hoping that someone might suggest a faster and/or more
: elegant solution, but no other response.
: [...]

1. Note that mat[,j] - mat subtracts each column of mat from the j-th
column, so for the 4-column example we just cbind 3 matrices:

   cbind( mat[,1]-mat[,2:4], mat[,2]-mat[,3:4], mat[,3]-mat[,4,drop=FALSE] )

2. For a general matrix a single sapply can do it like this:

   f <- function(i, mat) mat[, i-1] - mat[, i:ncol(mat), drop = FALSE]
   # simplify = FALSE keeps the result a list even when mat has only
   # 2 columns; do.call() needs a list
   do.call("cbind", sapply(2:ncol(mat), f, mat, simplify = FALSE))

3. To add nice column names just enhance f. The statements "z <- ..."
and "do.call ..." are the same as before:

   f <- function(i, mat) {
      z <- mat[, i-1] - mat[, i:ncol(mat), drop = FALSE]
      colnames(z) <- paste(colnames(mat)[i-1], colnames(z), sep = "-")
      z
   }
   do.call("cbind", sapply(2:ncol(mat), f, mat, simplify = FALSE))
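
The recycling rule behind point 1 and the column naming in point 3 can
be checked on the matrix from the original post. This is only a sketch:
it uses lapply() rather than sapply() so that the pieces are always
returned as a list, which is an assumption of this example and not part
of the reply above.

```r
# The example matrix from the original post, entered column by column
mat <- matrix(c(10, 18,  7, 14, 17,
                16,  5,  4,  2,  1,
                 3, 12,  8,  6, 20,
                15, 19, 13, 11,  9), ncol = 4,
              dimnames = list(paste("g", 1:5, sep = ""), LETTERS[1:4]))

# A vector of length nrow(mat) is recycled down every column of the
# matrix, so "column minus matrix" equals column-by-column subtraction.
d1 <- mat[, 1] - mat[, 2:4]
d2 <- cbind(mat[, 1] - mat[, 2], mat[, 1] - mat[, 3], mat[, 1] - mat[, 4])
stopifnot(all(d1 == d2))

# Column i-1 minus every later column, labelled "<first>-<second>"
f <- function(i, mat) {
  z <- mat[, i - 1] - mat[, i:ncol(mat), drop = FALSE]
  colnames(z) <- paste(colnames(mat)[i - 1], colnames(z), sep = "-")
  z
}

# lapply() always returns a list, which is what do.call() expects
out <- do.call("cbind", lapply(2:ncol(mat), f, mat))
out   # 5 x 6 matrix with columns A-B, A-C, A-D, B-C, B-D, C-D
```

The same call also works for a 2-column input such as mat[, 1:2],
where it returns a 5 x 1 matrix.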