Here is my interaction with R:

> sub(x='>|t|',pattern = '|t',replacement='zz')
[1] "zz>|t|"

So I say to myself "Clearly the | signs need to be escaped, so let's try this"

> sub(x='>|t|',pattern = '\|t',replacement='zz')
[1] "zz>|t|"
Warning messages:
1: '\|' is an unrecognized escape in a character string
2: unrecognized escape removed from "\|t"

How can \| be an unrecognized escape? This flatly contradicts help('regex'), or am I misunderstanding the help?

The first pattern above works if one uses extended=F.

What do R experts think?

David
--
View this message in context: http://n4.nabble.com/perhaps-regular-expression-bug-with-sign-tp1819872p1819872.html
Sent from the R help mailing list archive at Nabble.com.
You need to escape it (twice):

> sub(x='>|t|',pattern = '\\|t',replacement='zz')
[1] ">zz|"

On Fri, Apr 9, 2010 at 4:35 PM, David.Epstein <David.Epstein@warwick.ac.uk> wrote:
> How can \| be an unrecognized escape? This flatly contradicts
> help('regex'), or am I misunderstanding the help?

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
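The doubled backslash can be checked directly. The sketch below compares the unescaped and escaped patterns, and adds two alternatives that are not mentioned in the reply and are offered only as suggestions: the bracket expression [|], and fixed = TRUE, which switches off regular-expression matching altogether. The variable names are illustrative.

```r
x <- ">|t|"

# Unescaped, | is the alternation operator, so "|t" means "empty string OR t".
# The empty alternative already matches at the very start of x, which is why
# the replacement lands in front of the string.
unescaped <- sub(pattern = "|t", replacement = "zz", x = x)

# "\\|t" in R source is the three-character regex \|t: a literal bar, then t.
escaped <- sub(pattern = "\\|t", replacement = "zz", x = x)

# Alternatives (assumptions of this sketch, not from the thread):
# a bracket expression needs no backslash at all ...
bracket <- sub(pattern = "[|]t", replacement = "zz", x = x)
# ... and fixed = TRUE treats the whole pattern as a plain string.
literal <- sub(pattern = "|t", replacement = "zz", x = x, fixed = TRUE)

print(c(unescaped = unescaped, escaped = escaped,
        bracket = bracket, literal = literal))
```

Either alternative sidesteps the question of how many backslashes to type, which may be easier to read when the pattern contains several special characters.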
Henrique Dallazuanna
2010-Apr-09 20:45 UTC
[R] perhaps regular expression bug with | sign ??
Try this:

sub(x='>|t|',pattern = '\\|t',replacement='zz')

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
David -

Here's the last paragraph of the "Details" section of the regex help page:

    Patterns are described here as they would be printed by 'cat':
    (_do remember that backslashes need to be doubled when entering
    R character strings_, e.g. from the keyboard).

You can get around this restriction using readline:

> pat = readline()
\t
> pat
[1] "\\t"
> cat(pat,'\n')
\t

It also should be remembered that R will add an extra backslash when it prints a single backslash -- as can be seen, this is avoided when you use cat().

- Phil Spector
  Statistical Computing Facility
  Department of Statistics
  UC Berkeley
  spector at stat.berkeley.edu
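The difference between what is typed, what is stored, and what is printed can be seen without readline. This is a minimal sketch; the raw-string line at the end is an addition that assumes R 4.0.0 or later (on older versions that line will not parse and should be left out), and the variable names are illustrative.

```r
# Typed with two backslashes, stored as three characters: \ | t
pat <- "\\|t"

n_chars <- nchar(pat)   # counts the stored characters, not the typed ones

print(pat)        # print() re-escapes the backslash, so two are shown
cat(pat, "\n")    # cat() shows the string as the regex engine will see it

# Assumption: R >= 4.0.0. A raw string takes backslashes literally,
# so the pattern can be written exactly as the help page shows it.
raw_pat <- r"(\|t)"
same <- identical(pat, raw_pat)
```

When in doubt about what a pattern really contains, passing it through cat() or nchar() before handing it to sub() is a quick check.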
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of David.Epstein
> Sent: Friday, April 09, 2010 1:36 PM
> To: r-help at r-project.org
> Subject: [R] perhaps regular expression bug with | sign ??
>
> > sub(x='>|t|',pattern = '\|t',replacement='zz')
> [1] "zz>|t|"
> Warning messages:
> 1: '\|' is an unrecognized escape in a character string
> 2: unrecognized escape removed from "\|t"
> How can \| be an unrecognized escape? This flatly contradicts
> help('regex'),

It would be a bit clearer if the warnings indicated that they were from the R parser (the function that converts your text input to R expressions which are later evaluated). The parser is trying to say that it is treating "\|" as "|". The backslash has special meaning in things like "\n" (newline), "\t" (tab), and "\123" (character number in octal), but not before a vertical bar. Because the parser removed the backslash, sub() never saw it.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
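The two layers described here, the parser first and the regex engine second, can be looked at separately. The sketch below is only an illustration: it hands the parser the source text '\|t' through parse(text = ...) and records whatever it reports. Newer R versions may raise an error where the version in this thread gave a warning, so the sketch catches both. The names src and msg are illustrative.

```r
# Escapes the parser recognises collapse to a single character ...
n_newline <- nchar("\n")    # one character: a newline
# ... while a doubled backslash leaves a real backslash in the string.
n_escaped <- nchar("\\n")   # two characters: \ and n

# The source text a user would type: quote, backslash, bar, t, quote.
src <- "'\\|t'"

# Ask the parser alone to read it. No regex function is involved here,
# so any complaint comes from the parser, not from sub().
msg <- tryCatch(
  {
    parse(text = src)
    "parsed without complaint"
  },
  warning = function(w) conditionMessage(w),
  error   = function(e) conditionMessage(e)
)

cat(msg, "\n")
```

Since sub() is never called in this sketch, whatever message appears cannot be a statement about regular-expression syntax, which is the distinction the original warnings left unclear.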