Any better solution than this?

sum(strsplit("TCGGGGGACAATCGGTAACCCGTCT", "")[[1]] == "G")
Ken Knoblauch
2008-Jul-15  15:40 UTC
[R] counting number of "G" in "TCGGGGGACAATCGGTAACCCGTCT"
Daren Tan <daren76 at hotmail.com> writes:

> Any better solution than this?
> sum(strsplit("TCGGGGGACAATCGGTAACCCGTCT", "")[[1]] == "G")

Try

table(strsplit("TCGGGGGACAATCGGTAACCCGTCT", ""))

A C G T
5 7 8 5

and get all 4 at once.

HTH

--
Ken Knoblauch
Inserm U846
Institut Cellule Souche et Cerveau
Département Neurosciences Intégratives
18 avenue du Doyen Lépine
69500 Bron
France
tel: +33 (0)4 72 91 34 77
fax: +33 (0)4 72 91 34 61
portable: +33 (0)6 84 10 64 10
http://www.sbri.fr
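One caveat with the table() approach: a base that never occurs in the sequence is absent from the result, so looking it up by name does not give 0. A minimal sketch, assuming the alphabet is fixed to the four DNA bases (the variable names are illustrative and not from the thread), converts the characters to a factor with explicit levels so that every base gets a count:

```r
# Illustrative names; only base R functions are used.
dna <- "TCGGGGGACAATCGGTAACCCGTCT"
bases <- strsplit(dna, "")[[1]]

# Fixing the levels keeps a slot for every base, even one with zero hits.
counts <- table(factor(bases, levels = c("A", "C", "G", "T")))
g_count <- as.integer(counts["G"])

# A sequence without any G still reports G = 0 instead of dropping it.
no_g <- table(factor(strsplit("ACTTA", "")[[1]], levels = c("A", "C", "G", "T")))
no_g_count <- as.integer(no_g["G"])
```

Any character outside the four levels becomes NA in the factor and is left out of the counts, so this is only suitable when the input is known to be plain A/C/G/T.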
Henrik Bengtsson
2008-Jul-15  15:43 UTC
[R] counting number of "G" in "TCGGGGGACAATCGGTAACCCGTCT"
Seems like you can do:
library("matchprobes")   # on Bioconductor
countbases("TCGGGGGACAATCGGTAACCCGTCT")[,"G"]
The catch is that it counts only A, C, G, and T, and no other symbols.
/Henrik
On Tue, Jul 15, 2008 at 8:27 AM, Daren Tan <daren76 at hotmail.com> wrote:
> Any better solution than this?
>
> sum(strsplit("TCGGGGGACAATCGGTAACCCGTCT", "")[[1]] == "G")
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Wolfgang Huber
2008-Jul-15  15:59 UTC
[R] counting number of "G" in "TCGGGGGACAATCGGTAACCCGTCT"
Hi,

And the Bioconductor package "Biostrings" is the place to go for any serious work with sequences.

--
Best wishes
Wolfgang

------------------------------------------------------------------
Wolfgang Huber EBI/EMBL Cambridge UK http://www.ebi.ac.uk/huber

15/07/2008 16:43 Henrik Bengtsson scripsit
> Seems like you can do:
>
> library("matchprobes") # on Bioconductor
> countbases("TCGGGGGACAATCGGTAACCCGTCT")[,"G"]
>
> The catch is that it counts only A, C, G, and T, and no other symbols.
>
> /Henrik
>
> On Tue, Jul 15, 2008 at 8:27 AM, Daren Tan <daren76 at hotmail.com> wrote:
>> Any better solution than this?
>>
>> sum(strsplit("TCGGGGGACAATCGGTAACCCGTCT", "")[[1]] == "G")
Patrick Aboyoun
2008-Jul-15  16:29 UTC
[R] counting number of "G" in "TCGGGGGACAATCGGTAACCCGTCT"
Henrik,
As Wolfgang mentioned, the Biostrings package in Bioconductor has a 
number of sequence manipulation functions. The alphabetFrequency 
function would get you what you need.
> library(Biostrings)
> alphabetFrequency(DNAString("TCGGGGGACAATCGGTAACCCGTCT"))
A C G T M R W S Y K V H D B N - +
5 7 8 5 0 0 0 0 0 0 0 0 0 0 0 0 0
> alphabetFrequency(DNAString("TCGGGGGACAATCGGTAACCCGTCT"), baseOnly = TRUE)
    A     C     G     T other
    5     7     8     5     0
Patrick
Wolfgang Huber wrote:
> Hi,
>
> And the Bioconductor package "Biostrings" is the place to go for any
> serious work with sequences.
Riley, Steve
2008-Jul-15  17:28 UTC
[R] counting number of "G" in "TCGGGGGACAATCGGTAACCCGTCT"
Daren,
Not sure if it is any easier, but another solution is:
code <- unlist(strsplit("TCGGGGGACAATCGGTAACCCGTCT",""))
length(grep("[G]",code)) 
Steve 
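
Both the original and the grep variant split the string into single characters first. A sketch of an alternative that skips the split, assuming the target is a single literal character (count_char is a hypothetical helper name, not a base R function): delete every occurrence of the character and compare string lengths.

```r
# count_char() is a hypothetical helper, not part of base R.
# It assumes ch is exactly one literal character.
count_char <- function(x, ch) {
  # fixed = TRUE treats ch as plain text rather than a regular expression.
  stripped <- gsub(ch, "", x, fixed = TRUE)
  # Each removed occurrence shortens the string by one character.
  nchar(x) - nchar(stripped)
}

g_total <- count_char("TCGGGGGACAATCGGTAACCCGTCT", "G")

# The same call works on a vector of sequences, one count per element.
per_sequence <- count_char(c("GG", "AG", "AT"), "G")
```

Because gsub() and nchar() are vectorized, this form may be more convenient than strsplit() when counting across many sequences at once.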
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Daren Tan
Sent: Tuesday, July 15, 2008 11:28 AM
To: r-help at stat.math.ethz.ch
Subject: [R] counting number of "G" in "TCGGGGGACAATCGGTAACCCGTCT"

Any better solution than this?

sum(strsplit("TCGGGGGACAATCGGTAACCCGTCT", "")[[1]] == "G")