Hi,

I don't know what I am doing wrong, but toupper does not seem to work inside sub with a regex. The following returns 's', not the upper-case 'S' I expect:

    sub("q_([a-z])[a-zA-Z]*",toupper('\\1'),"q_sviRaw")

Can someone tell me what I did wrong?

Thanks,
Richard
Thanks, Martin. I did not realize that. I have never used Perl-compatible regexes before, but it seems I should now!

Richard

-----Original Message-----
From: Martin Morgan [mailto:mtmorgan at fhcrc.org]
Sent: Monday, April 13, 2009 12:08 PM
To: Tan, Richard
Subject: Re: [R] toupper does not work in sub + regex

"Tan, Richard" <RTan at panagora.com> writes:

> Hi, I don't know what I am doing wrong to the toupper does not seem
> working in sub + regex. The following returns 's' not the upper class
> 'S' as I expect:
>
> sub("q_([a-z])[a-zA-Z]*",toupper('\\1'),"q_sviRaw")

You're expecting toupper to be evaluated after substitution, but it is evaluated before: toupper('\\1') ==> '\\1'. Try

    sub("q_([a-z])[a-zA-Z]*",'\\U\\1',"q_sviRaw", perl=TRUE)

> Can someone tell me where I did wrong?
>
> Thanks,
> Richard

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
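To make the evaluation order concrete, here is a minimal sketch in base R. The variable names (replacement, res_plain, res_perl) are only for illustration and are not from the thread.

```r
# R evaluates the arguments before sub() does any matching, so toupper()
# only ever sees the two-character string backslash-1.
replacement <- toupper('\\1')
identical(replacement, '\\1')   # TRUE: there are no letters to upper-case

# The original call is therefore a plain back-reference substitution.
res_plain <- sub("q_([a-z])[a-zA-Z]*", replacement, "q_sviRaw")
res_plain                       # "s"

# With perl=TRUE the replacement string itself can ask for case conversion.
res_perl <- sub("q_([a-z])[a-zA-Z]*", '\\U\\1', "q_sviRaw", perl = TRUE)
res_perl                        # "S"
```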
sub only handles replacement strings, not replacement functions. Your code is the same as:

    sub("q_([a-z])[a-zA-Z]*", '\\1', "q_sviRaw")

since toupper('\\1') contains no alphabetic characters, so it is just the literal '\\1', and that is what sub uses. The gsubfn function in the gsubfn package can deal with replacement functions:

> library(gsubfn)
> gsubfn("q_([a-z])[a-zA-Z]*", toupper, "q_sviRaw")
[1] "S"

See the home page (http://gsubfn.googlecode.com), the vignette and the help page.
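If installing gsubfn is not an option, something similar can be approximated in base R with regexec, regexpr and regmatches. This is only a sketch for a single string and a single match, not how gsubfn works internally; the names x, pattern, parts and result are made up for the example.

```r
x <- "q_sviRaw"
pattern <- "q_([a-z])[a-zA-Z]*"

# regexec() reports the full match and each parenthesized group;
# regmatches() extracts them as a character vector.
parts <- regmatches(x, regexec(pattern, x))[[1]]   # full match, then group 1

# Apply the function to the captured group, then splice the result back
# in place of the full match.
result <- x
regmatches(result, regexpr(pattern, result)) <- toupper(parts[2])
result
```

For vectors with many matches per element this gets fiddly quickly, which is presumably where a package such as gsubfn earns its keep.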
You could also use \\U and \\L in the replacement with perl=TRUE. \\U "converts the rest of the replacement to upper case" and \\L converts to lowercase. (By "replacement" it means the parts of the replacement that arise from parenthesized subpatterns in the pattern argument, not the replacement argument itself.) E.g.,

> sub("q_([a-z])[a-zA-Z]*", "\\U\\1\\L", "q_sviRaw", perl=TRUE)
[1] "S"
> sub("q_([a-z])([a-zA-Z]*)", "\\U\\1 then \\L\\2", "q_sviRaw", perl=TRUE)
[1] "S then viraw"
> sub("q_([a-z])([a-zA-Z]*)", "\\U\\1 then \\2", "q_sviRaw", perl=TRUE)
[1] "S then VIRAW"

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com

----------------------------------------------------------------------
[R] toupper does not work in sub + regex
Gabor Grothendieck ggrothendieck at gmail.com
Mon Apr 13 18:26:12 CEST 2009

> sub only handles replacement strings, not replacement functions.
> [...]
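One more variation, as a sketch: the ?sub help page also documents \\E, which ends case conversion, so the rest of the name can keep its original case. The input vector cols and its second element are invented for the example.

```r
cols <- c("q_sviRaw", "q_abcDef")

# \\U upper-cases group 1; \\E switches conversion off again, so
# group 2 is copied through unchanged.
res <- sub("q_([a-z])([a-zA-Z]*)", "\\U\\1\\E\\2", cols, perl = TRUE)
res   # "SviRaw" "AbcDef"
```

sub is vectorized over its third argument, so the same call handles a whole vector of column names at once.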