I have three character strings, shown below as seq1, seq2, and seq3. Each string has its own reference character: U for seq1, S for seq2 (the 3rd S from the left, where A is the leftmost character), and Y for seq3.

seq1 = PQRTUWXY
seq2 = AQSDSSDHRS
seq3 = EEZYJKFFBHO

I wish to generate a 3 by 26 matrix where the 3 rows represent seq1, seq2, and seq3, and the 26 columns the letters of the alphabet in order. Each matrix entry should be the number of characters from the reference character to the letter in question. Characters to the left of the reference character get a negative value and characters to the right a positive value. In addition, if a character appears more than once, we take the lowest of the counts. The output for seq1, seq2, and seq3 is shown below, where 99 indicates a missing character.

       A   B   C   D   E   F   G   H   I   J   K   L   M   N   O   P   Q   R   S   T   U   V   W   X   Y   Z
seq1  99  99  99  99  99  99  99  99  99  99  99  99  99  99  99  -4  -3  -2  99   0  99  99   1   2   3  99
seq2  -5  99  99  -2  99  99  99   1  99  99  99  99  99  99  99  -4  99   2   0  99  99  99  99  99  99  99
seq3  99   5  99  99   2   3  99   6  99   1   2  99  99  99   7  99  99  99  99  99  99  99  99  99   0  -1

Could someone help me with code to implement this?

Thank you in advance for your help.

JN
Hello,

Try

    seq1 <- 'PQRTUWXY'
    seq2 <- 'AQSDSSDHRS'
    seq3 <- 'EEZYJKFFBHO'

    ref1 <- 'U'
    ref2 <- 'S'
    ref3 <- 'Y'

    # For each letter A-Z, return its offset from the reference character,
    # or 99 if the letter does not occur in seq.
    # Note: regexpr() reports the first match only, so both the letter and
    # the reference character are located at their first occurrence.
    fun <- function(seq, chr){
        f <- function(x, seq, chr){
            pos <- regexpr(x, seq)
            if(pos < 0) 99 else as.integer(pos - regexpr(chr, seq))
        }
        sapply(LETTERS, f, seq, chr)
    }

    rbind(
        fun(seq1, ref1),
        fun(seq2, ref2),
        fun(seq3, ref3)
    )

Hope this helps,

Rui Barradas
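Because regexpr() returns only the first match, the code above measures seq2 from its first S rather than the third, and it looks at only the first occurrence of a repeated letter. A minimal sketch of one way to cover both cases follows. The helper name offsets() and its arguments n and missing are made up for illustration, and "lowest of the counts" is read here as the offset closest to the reference character, which is an assumption about the intended rule.

```r
# Hypothetical helper (not from the original thread): offset of each letter
# A-Z from the n-th occurrence of a reference character.
offsets <- function(seq, ref, n = 1, missing = 99L) {
  chars <- strsplit(seq, "", fixed = TRUE)[[1]]
  ref_pos <- which(chars == ref)[n]      # position of the n-th occurrence
  if (is.na(ref_pos)) stop("reference character occurs fewer than n times")
  vapply(LETTERS, function(x) {
    d <- which(chars == x) - ref_pos     # signed offsets of every occurrence
    # Assumed rule: keep the offset with the smallest absolute value;
    # which.min() keeps the leftmost occurrence when two are equally close.
    if (length(d) == 0L) missing else d[which.min(abs(d))]
  }, integer(1))
}

m <- rbind(
  seq1 = offsets("PQRTUWXY", "U"),
  seq2 = offsets("AQSDSSDHRS", "S", n = 3),
  seq3 = offsets("EEZYJKFFBHO", "Y")
)
print(m)
```

Under this reading, D in seq2 comes out as 1 (the D just right of the third S) rather than -2. If the numerically smallest offset is wanted instead, replacing d[which.min(abs(d))] with min(d) would give that behaviour.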