Gundala Viswanath
2009-Jan-11 15:38 UTC
[R] Converting Numerical Matrix to List of Strings
Hi all,

Given a matrix:

> mat
     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    3    3    3
[3,]    1    1    1
[4,]    2    1    1

How can I convert it to a list of strings:

> desired_output
[1] "aaa" "ttt" "ccc" "gcc"

In principle:

1. Number of columns in the matrix = length of each string (= 3).
2. Number of rows in the matrix = length of the vector (= 4).
3. The character "a" is encoded as 0, "c" as 1, "g" as 2, and "t" as 3.

The strings are assumed to have a uniform length within the vector, and that length can be greater than 3 (up to 40 characters).

- Gundala Viswanath
Jakarta - Indonesia
try this:

> mapping <- c('0'='a', '1'='c', '2'='g', '3'='t')
> x <- matrix(sample(0:3, 30, TRUE), ncol=3)
> x
      [,1] [,2] [,3]
 [1,]    3    1    1
 [2,]    1    3    2
 [3,]    1    1    1
 [4,]    1    1    1
 [5,]    2    1    3
 [6,]    1    3    0
 [7,]    1    3    2
 [8,]    3    1    0
 [9,]    0    3    0
[10,]    3    3    0
> apply(x, 1, function(z){
+     paste(mapping[as.character(z)], collapse='')
+ })
 [1] "tcc" "ctg" "ccc" "ccc" "gct" "cta" "ctg" "tca" "ata" "tta"

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
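As a small check of the lookup approach above, the same named-vector mapping can be applied to the matrix from the original question instead of a random one. This is only a sketch: the names `mat` and `result` are chosen here for illustration, and the matrix is typed in by hand from the question.

```r
# Lookup table: names are the numeric codes (as strings), values are the letters.
mapping <- c('0' = 'a', '1' = 'c', '2' = 'g', '3' = 't')

# The matrix from the original question, filled column by column.
mat <- matrix(c(0, 3, 1, 2,
                0, 3, 1, 1,
                0, 3, 1, 1), ncol = 3)

# For each row: convert the codes to strings, look up the letters by name,
# and collapse them into a single string.
result <- apply(mat, 1, function(z) {
    paste(mapping[as.character(z)], collapse = '')
})

print(result)
```

Because the lookup is by name, this works for any number of columns, which matters for the longer strings (up to 40 characters) mentioned in the question.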
On Sun, Jan 11, 2009 at 9:38 AM, Gundala Viswanath <gundalav at gmail.com> wrote:
> How can I convert it to a list of strings:
>
>> desired_output
> [1] "aaa" "ttt" "ccc" "gcc"

Are you looking for a general solution or do you want something specific for these 64 potential codon-like patterns?

If you just want the patterns corresponding to all possible triplets of A, C, G, T, then

colSums(4^(0:2) * t(mat)) + 1

gives you a set of indices between 1 and 64. Then you need to create the 64 possible patterns. Here is one way:

> bases <- factor(c("A","C","G","T"))
> head(patterns <- do.call(paste, expand.grid(bases, bases, bases)))
[1] "A A A" "C A A" "G A A" "T A A" "A C A" "C C A"
> (mat <- matrix(c(0,3,1,2,0,3,1,1,0,3,1,1), ncol = 3))
     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    3    3    3
[3,]    1    1    1
[4,]    2    1    1
> colSums(4^(0:2) * t(mat)) + 1
[1]  1 64 22 23
> patterns[colSums(4^(0:2) * t(mat)) + 1]
[1] "A A A" "T T T" "C C C" "G C C"

We will leave the elimination of the blanks in the patterns as an exercise for the reader.
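One possible way to do the exercise above, sketched under the assumption that the blanks come only from the default separator of `paste`: pass `sep = ""` along with the columns of the grid. The names `grid`, `idx` and `codons` are introduced here for illustration and are not from the reply.

```r
bases <- c("A", "C", "G", "T")

# All 64 triplets; the first column varies fastest, matching the 4^(0:2) weights.
grid <- expand.grid(bases, bases, bases, stringsAsFactors = FALSE)

# Appending sep = "" to the argument list makes paste() join without blanks.
patterns <- do.call(paste, c(grid, sep = ""))

mat <- matrix(c(0, 3, 1, 2, 0, 3, 1, 1, 0, 3, 1, 1), ncol = 3)

# Treat each row as a base-4 number (least significant digit first), 1-based.
idx <- colSums(4^(0:2) * t(mat)) + 1
codons <- patterns[idx]

print(codons)
print(tolower(codons))  # lower case, as in the requested output
```

Note that a table of all patterns has 4^n entries for strings of length n, so this is practical for triplets but not for the 40-character case in the question (4^40 is about 1.2e24).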
Dimitris Rizopoulos
2009-Jan-11 16:31 UTC
[R] Converting Numerical Matrix to List of Strings
One way is the following:

mat <- matrix(sample(0:3, 12, TRUE), 4, 3)
strg <- c("a", "c", "g", "t")
out <- strg[mat + 1]
dim(out) <- dim(mat)
apply(out, 1, paste, collapse = "")

I hope it helps.

Best,
Dimitris

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
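The positional indexing `strg[mat + 1]` above can also be combined with a column-wise `paste0`, which avoids the row-by-row `apply`. This is a variant sketched here, not part of the reply; it assumes every entry of the matrix is exactly 0, 1, 2 or 3, and the names `letters_mat` and `strings` are illustrative.

```r
strg <- c("a", "c", "g", "t")

# The matrix from the original question, filled column by column.
mat <- matrix(c(0, 3, 1, 2, 0, 3, 1, 1, 0, 3, 1, 1), ncol = 3)

# Codes 0..3 become positions 1..4 in strg; rebuild the matrix shape.
letters_mat <- matrix(strg[mat + 1], nrow = nrow(mat))

# Each column of the data frame becomes one argument to paste0(),
# so the letters are joined element-wise across columns.
strings <- do.call(paste0, as.data.frame(letters_mat, stringsAsFactors = FALSE))

print(strings)
```

The number of arguments to `paste0` equals the number of columns, so the same code applies unchanged to wider matrices such as the 40-column case.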