Hi,

I have a string "GGGGGGCCCAATCGCAATTCCAATT".

I want to compute the percentage of each letter in the string. Which string functions can I use to count how many times each letter appears?

For example, the letter "A" appears 6 times and the letter "T" appears 5 times. How can I use a string function to get these numbers?

Thanks,
karena

--
View this message in context: http://r.789695.n4.nabble.com/R-string-functions-tp3600484p3600484.html
Sent from the R help mailing list archive at Nabble.com.
    x <- 'GTTACTGGTACC'
    table(strsplit(x, ''))

hth,
Daniel
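The original question asked for percentages as well as counts. A minimal sketch of one way to get both in base R, assuming the sequence from the question; the variable names `letters_vec`, `counts` and `pct` are only illustrative:

```r
# Example sequence from the original question
x <- "GGGGGGCCCAATCGCAATTCCAATT"

# strsplit() returns a list with one element per input string;
# [[1]] extracts the character vector of single letters
letters_vec <- strsplit(x, "")[[1]]

# Number of occurrences of each letter
counts <- table(letters_vec)

# Percentage of each letter: proportions scaled to 100
pct <- 100 * prop.table(counts)

print(counts)
print(round(pct, 1))
```

For this sequence the counts should agree with the numbers quoted in the question (6 for "A", 5 for "T"), and the percentages sum to 100.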
Tena koe Karena

Try:

    table(strsplit("GGGGGGCCCAATCGCAATTCCAATT", ''))

HTH ...

Peter Alspach

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of karena
> Sent: Thursday, 16 June 2011 8:37 a.m.
> To: r-help at r-project.org
> Subject: [R] R string functions
Hi,

On Wed, Jun 15, 2011 at 4:37 PM, karena <dr.jzhou at gmail.com> wrote:
> I have a string "GGGGGGCCCAATCGCAATTCCAATT"
>
> What I want to do is to count the percentage of each letter in the string,
> what string functions can I use to count the number of each letter appearing
> in the string?

The replies you've received so far are already helpful. In addition to them, though, I'd suggest you check out the "Biostrings" package from Bioconductor, since it looks like you are working with DNA:

http://www.bioconductor.org/packages/release/bioc/html/Biostrings.html

Many (many^2) of the things you will likely want to do with genomic sequences are already implemented in that package, in a memory- and performance-efficient manner.

For this particular example:

    R> library(Biostrings)
    R> x <- DNAString("GGGGGGCCCAATCGCAATTCCAATT")
    R> oligonucleotideFrequency(x, 1)
    A C G T
    6 7 7 5

    ## And just for fun:
    R> oligonucleotideFrequency(x, 2)
    AA AC AG AT CA CC CG CT GA GC GG GT TA TC TG TT
     3  0  0  3  3  3  1  0  0  2  5  0  0  2  0  2

Depending on how much genomic/sequence work you are planning to do, it could be worth your while to invest some time looking into the functionality that the Biostrings (and IRanges) packages provide.

Hope that helps,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
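To make clear what the width-2 call above is counting, here is a rough base R sketch that tallies overlapping dinucleotides without Biostrings. It is not how the package implements it, only an illustration; the names `dimers`, `all_dimers` and `dimer_counts` are made up for this example:

```r
# Example sequence from the original question
x <- "GGGGGGCCCAATCGCAATTCCAATT"
n <- nchar(x)

# All overlapping substrings of width 2: positions 1-2, 2-3, ..., (n-1)-n
dimers <- substring(x, 1:(n - 1), 2:n)

# Enumerate all 16 possible dimers in the order AA, AC, AG, AT, CA, ...
# so that dimers with zero occurrences are still reported
bases <- c("A", "C", "G", "T")
all_dimers <- as.vector(t(outer(bases, bases, paste0)))

# Tabulate against the fixed set of levels
dimer_counts <- table(factor(dimers, levels = all_dimers))
print(dimer_counts)
```

A sequence of length 25 has 24 overlapping dimers, so the counts should sum to 24 and match the `oligonucleotideFrequency(x, 2)` output shown above. For long sequences the Biostrings function is the better choice.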