Gundala Viswanath
2009-Jan-06 01:26 UTC
[R] Encoding Vector of Strings into Numerical Matrix
Dear all,

Given a vector of strings such as

tags <- c("aaa", "ttt", "ccc", "gcc", "atn")

how can I obtain the corresponding matrix?

     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    3    3    3
[3,]    1    1    1
[4,]    2    1    1
[5,]    0    3    0

In principle:

1. Number of columns in the matrix = length of each string (= 3)
2. Number of rows in the matrix = length of the vector (= 5)
3. Character "a" is encoded as "0", "c" -> "1", "g" -> "2", "t" -> "3", "n" -> "0"

The strings are assumed to have uniform length within the vector, and that length can be greater than 3 (up to 40 characters).

- Gundala Viswanath
Jakarta - Indonesia
Try this:

> tags <- c("aaa", "ttt", "ccc", "gcc", "atn")
> key <- c(a=0, c=1, g=2, t=3, n=0)
> x <- t(sapply(strsplit(tags, ''), function(z) key[z]))
> x
     a a a
[1,] 0 0 0
[2,] 3 3 3
[3,] 1 1 1
[4,] 2 1 1
[5,] 0 3 0

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
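
The lookup-by-name idea in the reply can also be written as a small function, which may be convenient when the tags are longer (the question mentions up to 40 characters). The following is only a sketch under stated assumptions: the function name encode_tags is hypothetical and not from the thread, the key is the one given in the question, and the two checks (uniform string length, no characters missing from the key) are additions for safety rather than something the original reply does. It fills a matrix directly from the flattened character vector instead of using sapply and t, and it drops the "a a a" column names that appear in the output above.

```r
# Hypothetical helper (name not from the thread); key taken from the question.
encode_tags <- function(tags, key = c(a = 0, c = 1, g = 2, t = 3, n = 0)) {
  widths <- nchar(tags)
  # The question assumes all strings share one length; fail loudly otherwise.
  stopifnot(length(unique(widths)) == 1)

  # Split every tag into single characters and flatten into one vector.
  chars <- unlist(strsplit(tags, ""))

  # A character missing from the key would silently become NA, so check first.
  unknown <- setdiff(chars, names(key))
  if (length(unknown) > 0) {
    stop("characters not in key: ", paste(unknown, collapse = ", "))
  }

  # Look up each character by name, drop the names, and fill row by row:
  # one row per tag, one column per character position.
  matrix(unname(key[chars]), nrow = length(tags), byrow = TRUE)
}

tags <- c("aaa", "ttt", "ccc", "gcc", "atn")
x <- encode_tags(tags)
print(x)
```

Note that the lookup is case-sensitive, so an upper-case "A" would be reported as an unknown character unless the input is first passed through tolower() or the key is extended.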