Gundala Viswanath
2009-Jan-06 01:26 UTC
[R] Encoding Vector of Strings into Numerical Matrix
Dear all,
Given a vector of strings such as:
tags <- c("aaa", "ttt", "ccc", "gcc", "atn")
How can I obtain the following matrix from it?
     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    3    3    3
[3,]    1    1    1
[4,]    2    1    1
[5,]    0    3    0
In principle:
1. Number of columns in the matrix = length of each string (= 3).
2. Number of rows in the matrix = length of the vector (= 5).
3. Each character is encoded as follows:
   "a" -> "0",
   "c" -> "1",
   "g" -> "2",
   "t" -> "3",
   "n" -> "0"
All strings in the vector are assumed to have the same length,
and that length can be greater than 3 (up to 40 characters).
- Gundala Viswanath
Jakarta - Indonesia
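
The requirements above can be written down as checks in R before any encoding is attempted. This is only a minimal sketch: the names `key`, `width`, `chars` and `expected_dim` are illustrative, and the 40-character limit is taken from the question as an assumption about the input rather than a hard rule.

```r
# Input from the question
tags <- c("aaa", "ttt", "ccc", "gcc", "atn")

# Encoding table from rule 3 (note that "a" and "n" both map to 0)
key <- c(a = 0, c = 1, g = 2, t = 3, n = 0)

# Assumption stated in the question: all strings share one length (at most 40)
width <- unique(nchar(tags))
stopifnot(length(width) == 1, width <= 40)

# Every character that occurs must have an entry in the encoding table
chars <- unique(unlist(strsplit(tags, "")))
stopifnot(all(chars %in% names(key)))

# Expected shape of the result: one row per string, one column per character
expected_dim <- c(length(tags), width)
expected_dim
```

For the example vector this gives 5 rows and 3 columns, which matches the matrix shown in the question.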
try this:

> tags <- c("aaa", "ttt", "ccc", "gcc", "atn")
> key <- c(a=0, c=1, g=2, t=3, n=0)
> x <- t(sapply(strsplit(tags, ''), function(z) key[z]))
> x
     a a a
[1,] 0 0 0
[2,] 3 3 3
[3,] 1 1 1
[4,] 2 1 1
[5,] 0 3 0

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
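
The matrix in the reply appears to carry column names ("a a a") inherited from the names of `key` for the first string; `unname(x)` would drop them. As an alternative sketch, not part of the original reply, the same encoding can be done with base R's `chartr()`, which translates characters one for one, so the result has no names to begin with. The variable `digits` is just an illustrative name.

```r
tags <- c("aaa", "ttt", "ccc", "gcc", "atn")

# Translate each letter to its digit in one vectorised pass:
# a -> 0, c -> 1, g -> 2, t -> 3, n -> 0
digits <- chartr("acgtn", "01230", tags)

# Split into single characters, convert to integers, and fill row by row
x <- matrix(as.integer(unlist(strsplit(digits, ""))),
            nrow = length(tags), byrow = TRUE)
x
```

This relies on the strings all having the same length, as stated in the question; any character outside "acgtn" would pass through untranslated and become NA (with a warning) in `as.integer()`.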