Hi R gurus
I have a matching problem that I can't solve. I have tried multiple solutions
and searched various help sites, but I can't get it to work.
This is the problem:
myexstrings = c("*AAA.AA","BBB BB","*.CCC.","**dd- d")
What I want to do is remove any non-word characters at the beginning, and
everything from the first non-word character after the first run of word
characters onward, so that the vector becomes:
c("AAA","BBB","CCC","dd")
I can figure out the start: sub("^\\W*","", myexstrings, perl=T) will remove
the unwanted beginnings, but then there is the rest.
And please, no links to any help pages; I have been looking at most of them
for the last hour without any success.
Thanks
Regards
Tom
--
View this message in context:
http://www.nabble.com/matching-problem-tp18152158p18152158.html
Sent from the R help mailing list archive at Nabble.com.
On 27 Jun 2008, at 12:23, Tom.O wrote:
> I can figure out the start: sub("^\\W*","", myexstrings, perl=T) will remove
> the unwanted beginnings, but then there is the rest.

Try gsub("\\W*","", myexstrings,perl=T)

Cheers,
--Hans
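For comparison with the requested output, here is a quick check of that call. This is a sketch added for illustration, not part of the original reply. Without the ^ anchor, the pattern should match every run of non-word characters, so it strips the separators but keeps the characters that follow them:

```r
myexstrings <- c("*AAA.AA", "BBB BB", "*.CCC.", "**dd- d")

# Without the ^ anchor, gsub() deletes every run of non-word characters,
# so the word characters after the first separator are kept as well.
stripped <- gsub("\\W*", "", myexstrings, perl = TRUE)
stripped
```

The result is expected to be "AAAAA" "BBBBB" "CCC" "ddd", which differs from the target c("AAA","BBB","CCC","dd") for every element except the third.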
This should do what you want:
> myexstrings = c("*AAA.AA","BBB BB","*.CCC.","**dd- d")
> a = gsub("^\\W*", "", myexstrings, perl=TRUE)
> b = gsub("\\W.*", "", a, perl=TRUE)
> b
[1] "AAA" "BBB" "CCC" "dd"
The first call removes any non-word characters from the beginning (as you
already figured out).
The second call removes the first remaining non-word character AND everything
following it.
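The two steps above can also be folded into a single call. This is a minimal sketch, not part of the original reply: it assumes a capture group around the first run of word characters, and the name first_word is made up for illustration.

```r
myexstrings <- c("*AAA.AA", "BBB BB", "*.CCC.", "**dd- d")

# Skip leading non-word characters, capture the first run of word
# characters, and consume the rest of the string; keep only the capture.
first_word <- sub("^\\W*(\\w+).*$", "\\1", myexstrings)
first_word
```

One caveat with this variant: a string that contains no word characters at all does not match the pattern, so sub() returns it unchanged rather than as an empty string.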
Here is a solution using strapply from the gsubfn package:

library(gsubfn)
strapply(myexstrings, "(\\w+).*", backref = -1, simplify = c)

It matches the first run of word characters followed by anything else, and then
returns the first backreference in each match, i.e. the portion within
parentheses, simplifying it all into a character vector (rather than a list).
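If installing gsubfn is not an option, the same "extract the first run of word characters" idea can be sketched in base R with regexpr() and regmatches(). This is an alternative added for illustration, not what the reply above uses, and regmatches() only exists in R versions newer than this thread.

```r
myexstrings <- c("*AAA.AA", "BBB BB", "*.CCC.", "**dd- d")

# regexpr() locates the first run of word characters in each string;
# regmatches() then extracts the matched text.
m <- regexpr("\\w+", myexstrings)
first_word <- regmatches(myexstrings, m)
first_word
```

Note that regmatches() drops elements with no match, so the result can be shorter than the input when some strings contain no word characters.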