thr3ads.net - R help - [R] Sequence analysis [Apr 2013]


ben1983

2013-Apr-19 11:21 UTC

[R] Sequence analysis

Hiya,
I am trying to look at the similarities between a number of sequences.
For example, I am trying to see how similar "ababbbassdaa" is to
"addffggssbbsbbs". I was wondering if there is some way for me to see how
similar they are in terms of, for example, the number of a's, the number
of b's, how often a and ab are consecutive, how often abab occurs
together, etc.
Any advice would be really useful; any kind of shove in the right
direction would be amazing! I've tried doing basic alignments, but I think
this is losing quite a lot of information.
Many thanks,
Ben



--
View this message in context:
http://r.789695.n4.nabble.com/Sequence-analysis-tp4664693.html
Sent from the R help mailing list archive at Nabble.com.

arun

2013-Apr-19 13:59 UTC

[R] Sequence analysis

Hi,

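Maybe the Biostrings package from Bioconductor will help:

# Installer as of 2013; current Bioconductor releases use
# BiocManager::install("Biostrings") instead.
source("http://bioconductor.org/biocLite.R")
biocLite("Biostrings")
library(Biostrings)
?matchPattern
?letterFrequency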
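vec1 <- "ababbbassdaa"
# DNAString() accepts vec1 only because a, b, s and d happen to be IUPAC
# nucleotide codes; it would reject the "f" in vec2 (BString() takes any letters).
alphabetFrequency(DNAString(vec1))
#A C G T M R W S Y K V H D B N - + 
#5 0 0 0 0 0 0 2 0 0 0 0 1 4 0 0 0 

# Counts of selected letters only
letterFrequency(DNAStringSet(vec1),letters="AC",OR=0)
#     A C
#[1,] 5 0

vec2 <- "addffggssbbsbbs"

# Longest run of "b" in each sequence
longestConsecutive(c(vec1,vec2),"b")
#[1] 3 2

# Positions where "AB" occurs in vec1
matchPattern(DNAString("AB"),DNAString(vec1))
#  Views on a 12-letter DNAString subject
#subject: ABABBBASSDAA
#views:
#    start end width
#[1]     1   2     2 [AB]
#[2]     3   4     2 [AB]
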
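
If the sequences use letters outside the DNA alphabet, the same kind of counts can be sketched in base R. This is only a rough illustration, not part of Biostrings: count_overlapping() is a made-up helper name, and it counts overlapping occurrences of a pattern by comparing every window of the right width.

```r
# Hypothetical helper (not from any package): count overlapping
# occurrences of `pattern` in the single string `x`.
count_overlapping <- function(x, pattern) {
  n <- nchar(x)
  k <- nchar(pattern)
  if (k == 0 || k > n) return(0L)
  starts <- seq_len(n - k + 1)
  # Extract every window of width k and compare it with the pattern
  sum(substring(x, starts, starts + k - 1) == pattern)
}

vec1 <- "ababbbassdaa"
vec2 <- "addffggssbbsbbs"

# Single-letter counts, with no restriction on the alphabet
table(strsplit(vec1, "")[[1]])
table(strsplit(vec2, "")[[1]])

count_overlapping(vec1, "ab")    # 2
count_overlapping(vec1, "abab")  # 1
count_overlapping(vec2, "bb")    # 2
```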

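Also, with the seqinr package:

library(seqinr)
# Tabulate words of length 1, 2, ... in vec2; splitseq() cuts the sequence into
# non-overlapping words, so overlapping occurrences are not counted.
lapply(seq(s2c(vec2)),function(i) table(splitseq(s2c(vec2),word=i)))
#[[1]]
#
#a b d f g s 
#1 4 2 2 2 4 
#
#[[2]]
#
#ad bb bs df fg gs sb 
# 1  1  1  1  1  1  1 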
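
To turn such word counts into a single similarity number for two sequences, one possible approach (an assumption on my part, not something built into either package) is to compare the overlapping word-count profiles with a cosine similarity. The function names kmers() and kmer_similarity() below are made up for illustration.

```r
# Hypothetical helper: all overlapping words of length k in the string x.
kmers <- function(x, k) {
  stopifnot(k >= 1, k <= nchar(x))
  starts <- seq_len(nchar(x) - k + 1)
  substring(x, starts, starts + k - 1)
}

# Hypothetical helper: cosine similarity between the word-count profiles of
# x and y (1 = same relative word frequencies, 0 = no word in common).
kmer_similarity <- function(x, y, k) {
  kx <- kmers(x, k)
  ky <- kmers(y, k)
  words <- union(kx, ky)
  # Count each word over a shared set of levels so the vectors line up
  vx <- as.numeric(table(factor(kx, levels = words)))
  vy <- as.numeric(table(factor(ky, levels = words)))
  sum(vx * vy) / sqrt(sum(vx^2) * sum(vy^2))
}

vec1 <- "ababbbassdaa"
vec2 <- "addffggssbbsbbs"

# Similarity for single letters, pairs and triples
sapply(1:3, function(k) kmer_similarity(vec1, vec2, k))
```

Larger k captures more of the ordering information that plain letter counts lose, at the price of sparser profiles for short sequences.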
---------------------------------------
A.K.




______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
