Daren Tan
2008-Jul-31 11:11 UTC
[R] Identifying common prefixes from a vector of words, and delete those prefixes
For example, c("dog.is.an.animal", "cat.is.an.animal", "rat.is.an.animal"). How can I identify that the common prefix is ".is.an.animal" and delete it to give c("dog", "cat", "rat")?

Thanks
Richard.Cotton at hsl.gov.uk
2008-Jul-31 13:21 UTC
[R] Identifying common prefixes from a vector of words, and delete those prefixes
> For example, c("dog.is.an.animal", "cat.is.an.animal", "rat.is.an.animal").
> How can I identify the common prefix is ".is.an.animal"
> and delete it to give c("dog", "cat", "rat") ?

foo <- c("dog.is.an.animal", "cat.is.an.animal", "rat.is.an.animal")
sub(".is.an.animal", "", foo)

Being pedantic, ".is.an.animal" is a suffix, not a prefix, since it comes afterwards.

Regards,
Richie.

Mathematical Sciences Unit
HSL
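A note on the sub() call above: by default sub() treats its pattern as a regular expression, so each "." matches any character, not only a literal dot. The sketch below shows two common ways to make the match literal. The variable names (literal, anchored, loose) and the "this_is_an_animal" test string are made up for illustration and are not from the thread.

```r
foo <- c("dog.is.an.animal", "cat.is.an.animal", "rat.is.an.animal")

# fixed = TRUE treats the pattern as a literal string, so "." is just a dot.
literal <- sub(".is.an.animal", "", foo, fixed = TRUE)

# Escaping the dots and anchoring with "$" restricts the match to the end
# of each string.
anchored <- sub("\\.is\\.an\\.animal$", "", foo)

# Without either precaution, "." matches any character, so a string that
# contains no dots at all is still altered.
loose <- sub(".is.an.animal", "", "this_is_an_animal")

print(literal)
print(anchored)
print(loose)
```

For the data in the question all three forms give the same answer; the difference only shows up on inputs where another character sits in the place of a dot.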
John Kane
2008-Jul-31 16:48 UTC
[R] Identifying common prefixes from a vector of words, and delete those prefixes
There MUST be a better way, but this will work.

x <- c("dog.is.an.animal", "cat.is.an.animal", "rat.is.an.animal")
bb <- strsplit(x, "\\.")
myfun <- function(m) m[1]
animals <- unlist(lapply(bb, myfun))
animals
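The same idea (keep whatever comes before the first dot) can probably be written more compactly. The following is only a sketch of two base R alternatives; like the code above, it keeps the first token and does not actually detect which part is shared. The names animals2 and animals3 are illustrative.

```r
x <- c("dog.is.an.animal", "cat.is.an.animal", "rat.is.an.animal")

# Split on a literal dot and take the first piece of each element.
# vapply() checks that every result is a single character string.
animals2 <- vapply(strsplit(x, ".", fixed = TRUE), `[`, character(1), 1)

# Or delete everything from the first dot onwards with one regular expression.
animals3 <- sub("\\..*$", "", x)

print(animals2)
print(animals3)
```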
Christos Hatzis
2008-Jul-31 17:16 UTC
[R] Identifying common prefixes from a vector of words, and delete those prefixes
A more general solution:

strip.fun <- function(x, split=".") {
    xx <- strsplit(x, split, fixed=TRUE)
    txx <- table(unlist(xx))
    nxx <- names(txx)[txx > 1]
    setdiff(unlist(xx), nxx)
}

> x <- c("dog.is.an.animal", "cat.is.an.animal", "rat.is.an.animal")
> strip.fun(x)
[1] "dog" "cat" "rat"
> y <- c("my_cat_pet", "my_dog_pet", "my_rat_pet")
> strip.fun(y, "_")
[1] "cat" "dog" "rat"

-Christos
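One thing to be aware of with strip.fun: it drops every token that occurs more than once anywhere in the vector, regardless of position, so an input such as "dog.is.a.dog" loses its leading "dog" as well. If only a shared trailing run of tokens should go, a position-aware variant might look like the sketch below. The function name strip_common_tail and the second test vector are hypothetical and not part of the original post.

```r
strip_common_tail <- function(x, split = ".") {
  toks <- strsplit(x, split, fixed = TRUE)
  # Reverse each token vector so that trailing tokens line up at position 1.
  rtoks <- lapply(toks, rev)
  n <- min(lengths(toks))
  k <- 0
  # Count how many trailing tokens are identical across all elements.
  while (k < n &&
         length(unique(vapply(rtoks, `[`, character(1), k + 1))) == 1) {
    k <- k + 1
  }
  # Keep the leading tokens of each element and glue them back together.
  vapply(toks,
         function(t) paste(t[seq_len(length(t) - k)], collapse = split),
         character(1))
}

x <- c("dog.is.an.animal", "cat.is.an.animal", "rat.is.an.animal")
res_x <- strip_common_tail(x)

# A case where a token is repeated in a non-shared position.
z <- c("dog.is.a.dog", "cat.is.a.dog")
res_z <- strip_common_tail(z)

print(res_x)
print(res_z)
```

This assumes the shared part is made of whole tokens at the end of each string; a shared leading run could be handled the same way without the reversal.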
Martin Morgan
2008-Aug-02 02:42 UTC
[R] Identifying common prefixes from a vector of words, and delete those prefixes
Daren Tan <daren76 at hotmail.com> writes:

> For example, c("dog.is.an.animal", "cat.is.an.animal",
> "rat.is.an.animal"). How can I identify the common prefix is
> ".is.an.animal" and delete it to give c("dog", "cat", "rat") ?

The 'Rlibstree' package from omegahat is quite fun for this sort of thing:

> install.packages('Rlibstree', repos="http://www.omegahat.org/R")
[snip]
> library(Rlibstree)
> rstrings <- function(string) {
+     lapply(lapply(strsplit(string, ""), rev),
+            paste, collapse="")
+ }
> pets <- c("dog.is.an.animal", "cat.is.an.animal", "rat.is.an.animal")
> commonSfx <- rstrings(getCommonPrefix(rstrings(pets)))
> commonSfx
[[1]]
[1] ".is.an.animal"

> sub(commonSfx, "", pets)
[1] "dog" "cat" "rat"

Martin

--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
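If installing Rlibstree is not an option, the same reverse-and-compare trick can be approximated in base R. The sketch below is not the Rlibstree API; common_suffix is a hypothetical helper that walks the reversed strings character by character, and the suffix is then removed by position with substr() so that no regular expression is involved.

```r
common_suffix <- function(x) {
  # Reverse each string's characters so a shared suffix becomes a shared prefix.
  chars <- lapply(strsplit(x, "", fixed = TRUE), rev)
  n <- min(lengths(chars))
  k <- 0
  # Advance while every string has the same character at position k + 1.
  while (k < n &&
         length(unique(vapply(chars, `[`, character(1), k + 1))) == 1) {
    k <- k + 1
  }
  # The suffix is the last k characters of any element; use the first one.
  first <- x[1]
  substr(first, nchar(first) - k + 1, nchar(first))
}

pets <- c("dog.is.an.animal", "cat.is.an.animal", "rat.is.an.animal")
sfx <- common_suffix(pets)

# Remove the suffix by position rather than by pattern matching.
stripped <- substr(pets, 1, nchar(pets) - nchar(sfx))

print(sfx)
print(stripped)
```

This is a plain character-by-character scan, which should be fine for short vectors; a suffix tree as provided by Rlibstree is the more scalable tool for large inputs.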