Here is my interaction with R:

> sub(x='>|t|',pattern = '|t',replacement='zz')
[1] "zz>|t|"

So I say to myself "Clearly the | signs need to be escaped, so let's try this"

> sub(x='>|t|',pattern = '\|t',replacement='zz')
[1] "zz>|t|"
Warning messages:
1: '\|' is an unrecognized escape in a character string
2: unrecognized escape removed from "\|t"

How can \| be an unrecognized escape? This flatly contradicts help('regex'), or am I misunderstanding the help?

The first pattern above works if one uses extended=F.

What do R experts think?

David
--
View this message in context: http://n4.nabble.com/perhaps-regular-expression-bug-with-sign-tp1819872p1819872.html
Sent from the R help mailing list archive at Nabble.com.
You need to escape it (twice):

> sub(x='>|t|',pattern = '\\|t',replacement='zz')
[1] ">zz|"

On Fri, Apr 9, 2010 at 4:35 PM, David.Epstein <David.Epstein@warwick.ac.uk> wrote:
> So I say to myself "Clearly the | signs need to be escaped, so let's try
> this"
> > sub(x='>|t|',pattern = '\|t',replacement='zz')
> [1] "zz>|t|"
> Warning messages:
> 1: '\|' is an unrecognized escape in a character string
> 2: unrecognized escape removed from "\|t"

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
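For reference, here is a minimal sketch of three ways to match a literal | in this example. The first call is the one given in the reply above; the character class and fixed = TRUE variants are standard alternatives that the thread itself does not mention, and the variable names are purely illustrative.

```r
x <- ">|t|"

# Doubled backslash: the string literal "\\|t" holds the three characters \|t,
# which the regex engine reads as "a literal | followed by t".
r1 <- sub(pattern = "\\|t", replacement = "zz", x = x)

# Inside a character class, | is an ordinary character, so no backslash is needed.
r2 <- sub(pattern = "[|]t", replacement = "zz", x = x)

# fixed = TRUE turns off regex interpretation and matches the text as-is.
r3 <- sub(pattern = "|t", replacement = "zz", x = x, fixed = TRUE)

print(c(r1, r2, r3))   # all three are ">zz|"
```

The unescaped pattern '|t' is an alternation between the empty string and t. The empty alternative can match at the very start of the input, which would explain the "zz>|t|" result in the original post.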
Henrique Dallazuanna
2010-Apr-09 20:45 UTC
[R] perhaps regular expression bug with | sign ??
Try this:

sub(x='>|t|',pattern = '\\|t',replacement='zz')

On Fri, Apr 9, 2010 at 5:35 PM, David.Epstein <David.Epstein at warwick.ac.uk> wrote:
> > sub(x='>|t|',pattern = '\|t',replacement='zz')
> [1] "zz>|t|"
> Warning messages:
> 1: '\|' is an unrecognized escape in a character string
> 2: unrecognized escape removed from "\|t"

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
David -
Here's the last paragraph of the "Details" section
of the regex help page:
Patterns are described here as they would be printed by 'cat':
(_do remember that backslashes need to be doubled when entering R
character strings_, e.g. from the keyboard).
You can get around this restriction using readline:
> pat = readline()
\t
> pat
[1] "\\t"
> cat(pat,'\n')
\t
It also should be remembered that R will add an extra backslash
when it prints a single backslash -- as can be seen, this is
avoided when you use cat().
- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spector at stat.berkeley.edu
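The difference between what is typed, what is stored, and what is printed can be checked directly. The following is a small sketch along the lines of the explanation above, using the pattern from this thread; the variable name pat is reused from the readline example, and nchar() and writeLines() are additions not mentioned in the message.

```r
# Typed with two backslashes, as required inside an R string literal.
pat <- "\\|t"

# The stored string has only three characters: backslash, bar, t.
n <- nchar(pat)
print(n)

# print() re-escapes the backslash, so it is displayed doubled.
print(pat)

# cat() and writeLines() show the characters actually stored,
# which is what the regex engine receives.
cat(pat, "\n")
writeLines(pat)
```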
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of David.Epstein
> Sent: Friday, April 09, 2010 1:36 PM
> To: r-help at r-project.org
> Subject: [R] perhaps regular expression bug with | sign ??
>
> > sub(x='>|t|',pattern = '\|t',replacement='zz')
> [1] "zz>|t|"
> Warning messages:
> 1: '\|' is an unrecognized escape in a character string
> 2: unrecognized escape removed from "\|t"
> How can \| be an unrecognized escape? This flatly contradicts
> help('regex'),

It would be a bit clearer if the warnings indicated that they were from the R parser (the function that converts your text input to R expressions, which are later evaluated). The parser is trying to say that it is treating "\|" as "|". The backslash has special meaning in things like "\n" (newline), "\t" (tab), and "\123" (character number in octal), but not before a vertical bar. Because the parser removed the backslash, sub() never saw it.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> or am I misunderstanding the help?
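One way to see that the complaint comes from the parser rather than from sub() is to parse the text without ever evaluating it. This is only a sketch: the names src, msg and ok are made up for illustration, and the handler catches both warnings and errors because the thread shows a warning, while current versions of R reject the unrecognized escape with an error.

```r
# Source text of the literal '\|t', quotes included. The backslash is doubled
# here only so that this script itself gets past the parser.
src <- "'\\|t'"

# Parse the text without evaluating it, and capture whatever the parser signals.
msg <- tryCatch(
  {
    parse(text = src)
    "parsed without complaint"
  },
  warning = function(w) conditionMessage(w),
  error   = function(e) conditionMessage(e)
)
print(msg)

# With the backslash doubled in the source text, the parser keeps it, and the
# resulting string holds the three characters backslash, bar, t.
ok <- eval(parse(text = "'\\\\|t'"))
print(nchar(ok))
```

No regular expression function is called anywhere in this snippet, yet the "unrecognized escape" message still appears. In R 4.0.0 and later, a raw string such as r"(\|t)" is another way to write the pattern without doubling the backslash.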