Hi,

I have a string "GGGGGGCCCAATCGCAATTCCAATT".

I want to compute the percentage of each letter in the string. Which string functions can I use to count the number of times each letter appears?

For example, the letter "A" appears 6 times and the letter "T" appears 5 times. How can I use a string function to get these numbers?

Thanks,
karena

--
View this message in context: http://r.789695.n4.nabble.com/R-string-functions-tp3600484p3600484.html
Sent from the R help mailing list archive at Nabble.com.
    x <- 'GTTACTGGTACC'
    table(strsplit(x, ''))

hth,
Daniel
Tena koe Karena
Try:
table(strsplit("GGGGGGCCCAATCGCAATTCCAATT", ''))
HTH ...
Peter Alspach
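The table() calls above return counts, while the original question also asks for percentages. One way to get there in base R is prop.table(). This is only a minimal sketch; the variable names (dna, chars, counts, pct) are illustrative and not from the thread.

```r
# Split the string into single characters. strsplit() returns a list,
# so take its first element to get a plain character vector.
dna <- "GGGGGGCCCAATCGCAATTCCAATT"
chars <- strsplit(dna, "")[[1]]

# Count each letter, then convert the counts to percentages.
counts <- table(chars)
pct <- 100 * prop.table(counts)

print(counts)         # A = 6, C = 7, G = 7, T = 5
print(round(pct, 1))  # A = 24, C = 28, G = 28, T = 20
```

The string has 25 characters, so for example "A" accounts for 6/25 = 24% of it.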
Hi,

On Wed, Jun 15, 2011 at 4:37 PM, karena <dr.jzhou at gmail.com> wrote:
> [...]
> For example, the letter "A" appeared 6 times, letter "T" appeared 5 times,
> how can I use a string function to get the these number?

The replies you've received are already helpful. In addition to them, though, I'd suggest you check out the "Biostrings" package from Bioconductor, since it looks like you are working with DNA:

http://www.bioconductor.org/packages/release/bioc/html/Biostrings.html

There are many (many^2) things already implemented in that package that you will likely want to do with genomic sequences, and they are done in a memory- and performance-efficient manner.

For this particular example:

    R> library(Biostrings)
    R> x <- DNAString("GGGGGGCCCAATCGCAATTCCAATT")
    R> oligonucleotideFrequency(x, 1)
    A C G T
    6 7 7 5

    ## And just for fun:
    R> oligonucleotideFrequency(x, 2)
    AA AC AG AT CA CC CG CT GA GC GG GT TA TC TG TT
     3  0  0  3  3  3  1  0  0  2  5  0  0  2  0  2

Depending on how much genomic/sequence work you are planning to do, it could be worth your while to invest some time looking into the functionality that the Biostrings (and IRanges) packages provide.

Hope that helps,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
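For readers without Biostrings installed, the dinucleotide counts above can be reproduced in base R by sliding a window of width 2 along the string. This is a rough sketch for illustration only, not how Biostrings implements oligonucleotideFrequency(); the names dna, dinucs, all_pairs and di_counts are made up for this example.

```r
dna <- "GGGGGGCCCAATCGCAATTCCAATT"

# Extract every overlapping 2-letter window: positions 1-2, 2-3, ...
starts <- seq_len(nchar(dna) - 1)
dinucs <- substring(dna, starts, starts + 1)

# Build all 16 possible pairs so that unobserved ones are reported as 0.
bases <- c("A", "C", "G", "T")
all_pairs <- sort(as.vector(outer(bases, bases, paste0)))

# Tabulate against the full set of levels.
di_counts <- table(factor(dinucs, levels = all_pairs))
print(di_counts)
```

Without the factor() step, table() would list only the pairs that actually occur in the string and omit the zero counts.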