Gundala Viswanath
2009-Jan-11 15:38 UTC
[R] Converting Numerical Matrix to List of Strings
Hi all,

Given a matrix:

> mat
     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    3    3    3
[3,]    1    1    1
[4,]    2    1    1

How can I convert it to a vector of strings:

> desired_output
[1] "aaa" "ttt" "ccc" "gcc"

In principle:

1. Number of columns in the matrix = length of each string (= 3)
2. Number of rows in the matrix = length of the vector (= 4)
3. The character "a" is encoded as 0,
   "c" -> 1,
   "g" -> 2,
   "t" -> 3

The strings are assumed to have uniform length within the vector,
and that length can be greater than 3 (up to 40 characters).

- Gundala Viswanath
Jakarta - Indonesia
Try this:

> mapping <- c('0'='a', '1'='c', '2'='g', '3'='t')
> x <- matrix(sample(0:3, 30, TRUE), ncol=3)
> x
      [,1] [,2] [,3]
 [1,]    3    1    1
 [2,]    1    3    2
 [3,]    1    1    1
 [4,]    1    1    1
 [5,]    2    1    3
 [6,]    1    3    0
 [7,]    1    3    2
 [8,]    3    1    0
 [9,]    0    3    0
[10,]    3    3    0
> apply(x, 1, function(z){
+     paste(mapping[as.character(z)], collapse='')
+ })
 [1] "tcc" "ctg" "ccc" "ccc" "gct" "cta" "ctg" "tca" "ata" "tta"

On Sun, Jan 11, 2009 at 10:38 AM, Gundala Viswanath <gundalav at gmail.com> wrote:
> [...]
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
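The reply above demonstrates the lookup on random data. As a minimal sketch, the same named-vector lookup can be applied to the matrix from the original question to confirm that it gives the requested output. The helper name `decode_rows` is hypothetical and does not appear in the thread.

```r
# Matrix from the original question, filled row by row.
mat <- matrix(c(0, 0, 0,
                3, 3, 3,
                1, 1, 1,
                2, 1, 1), ncol = 3, byrow = TRUE)

# Named lookup vector: the names are the numeric codes written as strings.
mapping <- c("0" = "a", "1" = "c", "2" = "g", "3" = "t")

# Hypothetical helper (not from the thread): decode each row into one string.
decode_rows <- function(m, map) {
  apply(m, 1, function(z) paste(map[as.character(z)], collapse = ""))
}

result <- decode_rows(mat, mapping)
print(result)
```

Because the lookup works one row at a time, the number of columns does not matter, so the same call should also cover the longer strings (up to 40 characters) mentioned in the question.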
On Sun, Jan 11, 2009 at 9:38 AM, Gundala Viswanath <gundalav at gmail.com> wrote:
> How can I convert it to a list of strings:
>
>> desired_output
> [1] "aaa" "ttt" "ccc" "gcc"

Are you looking for a general solution or do you want something specific for these 64 potential codon-like patterns?

If you just want the patterns corresponding to all possible triplets of A, C, G, T then

colSums(4^(0:2) * t(mat)) + 1

gives you a set of indices between 1 and 64. Then you need to create the 64 possible patterns. Here is one way:

> bases <- factor(c("A","C","G","T"))
> head(patterns <- do.call(paste, expand.grid(bases, bases, bases)))
[1] "A A A" "C A A" "G A A" "T A A" "A C A" "C C A"
> (mat <- matrix(c(0,3,1,2,0,3,1,1,0,3,1,1), ncol = 3))
     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    3    3    3
[3,]    1    1    1
[4,]    2    1    1
> colSums(4^(0:2) * t(mat)) + 1
[1]  1 64 22 23
> patterns[colSums(4^(0:2) * t(mat)) + 1]
[1] "A A A" "T T T" "C C C" "G C C"

We will leave the elimination of the blanks in the patterns as an exercise for the reader.
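One possible answer to the exercise, as a sketch rather than the poster's own solution: passing sep = "" to paste builds the patterns without blanks. The lowercase bases and the variable names `k`, `grid`, `idx` and `decoded` are assumptions chosen to match the output requested in the question.

```r
# Matrix from the question (filled column by column, as in the reply above).
mat <- matrix(c(0, 3, 1, 2, 0, 3, 1, 1, 0, 3, 1, 1), ncol = 3)
k <- ncol(mat)

# Lowercase bases in code order 0, 1, 2, 3 (an assumption matching the question).
bases <- c("a", "c", "g", "t")

# All 4^k combinations; the first column varies fastest.
grid <- expand.grid(rep(list(bases), k), stringsAsFactors = FALSE)

# sep = "" joins the columns without blanks.
patterns <- do.call(paste, c(grid, sep = ""))

# Base-4 index of each row, shifted to R's 1-based indexing.
idx <- colSums(4^(seq_len(k) - 1) * t(mat)) + 1

decoded <- patterns[idx]
print(decoded)
```

The lookup table has 4^k entries, so this is only practical for short strings. For the 40-character case mentioned in the question, the per-row lookups given in the other replies avoid building the table at all.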
Dimitris Rizopoulos
2009-Jan-11 16:31 UTC
[R] Converting Numerical Matrix to List of Strings
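One way is the following:
# random example matrix with codes 0..3
mat <- matrix(sample(0:3, 12, TRUE), 4, 3)
# letters in code order: 0 -> "a", 1 -> "c", 2 -> "g", 3 -> "t"
strg <- c("a", "c", "g", "t")
# shift the codes to 1-based indices; the result is a plain vector
out <- strg[mat + 1]
# restore the matrix shape lost by the indexing step
dim(out) <- dim(mat)
# paste each row into a single string
apply(out, 1, paste, collapse = "")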
I hope it helps.
Best,
Dimitris
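As a quick check, here is a sketch of the same indexing approach applied to the matrix from the original question instead of random data. The `stopifnot` guard is an addition, not part of the reply: it assumes every code lies in 0..3, since any other value would index outside `strg`.

```r
# Matrix from the original question, filled column by column.
mat <- matrix(c(0, 3, 1, 2, 0, 3, 1, 1, 0, 3, 1, 1), ncol = 3)
strg <- c("a", "c", "g", "t")

# Added guard (an assumption, not in the reply): only codes 0..3 are valid.
stopifnot(all(mat %in% 0:3))

# Indexing a plain vector with a matrix returns a plain vector,
# in column-major order, so the dimensions have to be restored.
out <- strg[mat + 1]
dim(out) <- dim(mat)

# Paste each row into a single string.
result <- apply(out, 1, paste, collapse = "")
print(result)
```

The translation step here is a single vectorised index over the whole matrix, and only the final pasting is done row by row.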
--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014