Hello,

I have several strings where I am trying to eliminate the period and everything after the period, using a regular expression. However, I am having trouble getting this to work.

> x = "wa.w"
> gsub(x, "\..*", "", perl=TRUE)
[1] ""
Warning messages:
1: '\.' is an unrecognized escape in a character string
2: unrecognized escape removed from "\..*"

In perl, you can match a single period with \.
Is this not so even with perl=TRUE? I would like for x to be equal to

> x = "wa"

What am I missing here?

-stephen
=========================================
Stephen J. Barr
University of Washington
WEB: www.econsteve.com
Henrique Dallazuanna
2009-May-13 23:47 UTC
[R] matching period with perl regular expression
Try this:

gsub("^(\\w*).*$", "\\1", x)

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
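A small sketch of how this suggestion behaves, for comparison with removing everything from the first period onward. The test vector and the variable names (`by_word`, `by_dot`) are made up for illustration and are not from the thread. The pattern captures the leading run of word characters and keeps only that capture, so the two approaches should agree on strings like "wa.w" but can differ when a non-word character such as a hyphen comes before the period.

```r
# Hypothetical test strings, chosen only to illustrate the difference.
x <- c("wa.w", "wa", "a-b.c")

# Suggested pattern: capture the leading word characters (\w*) as group 1,
# match the rest of the string, and replace the whole match with group 1.
by_word <- gsub("^(\\w*).*$", "\\1", x)

# Alternative: delete the first period and everything after it.
by_dot <- sub("[.].*", "", x)

print(by_word)
print(by_dot)
```

If the strings may contain characters other than letters, digits and underscores before the period, the second form is probably closer to what was asked.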
R interprets a backslash in a string literal as giving special meaning to the next character, i.e. it strips off the backslash and sends the following character to gsub, possibly reinterpreting it specially (for example, \n is newline). Thus a backslash will never get to gsub unless you use a double backslash, so we can use "\\." to represent \.

Also note that the regular expression "[.]" represents a literal dot and does not require a backslash in the first place.

You don't need perl = TRUE for simple regular expressions like this.
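The two points above can be checked with a short sketch. The variable names (`n_chars`, `with_escape`, `with_class`) are only for illustration. It shows that the R literal "\\." is a two-character string (backslash, dot) by the time the regex engine sees it, and that the escaped form and the bracket form give the same result here.

```r
# "\\." in R source code is two characters: a backslash and a dot.
n_chars <- nchar("\\.")
writeLines("\\..*")   # shows the regex as the engine receives it

x <- "wa.w"

# Escaped dot: the regex engine receives \..*
with_escape <- sub("\\..*", "", x)

# Character class: no backslash needed for a literal dot.
with_class <- sub("[.].*", "", x)

print(n_chars)
print(with_escape)
print(with_class)
```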
Bill.Venables at csiro.au
2009-May-13 23:59 UTC
[R] matching period with perl regular expression
You have the arguments out of order and you need two backslashes:

> x <- "wa.w"
> gsub("\\..*", "", x)
[1] "wa"

Bill Venables
http://www.cmis.csiro.au/bill.venables/
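A sketch of why the argument order matters. The signature is gsub(pattern, replacement, x, ...), so in the original call the string "wa.w" was taken as the pattern, the regex as the replacement, and "" as the text to search, which would explain the empty result. The variable names below are illustrative only; naming the arguments is one way to avoid the mix-up.

```r
x <- "wa.w"

# Original positional order: pattern = x, replacement = the regex, x = "".
# Nothing can match in an empty string, so the empty string comes back.
wrong_order <- gsub(x, "\\..*", "")

# Corrected call with named arguments.
right_order <- gsub(pattern = "\\..*", replacement = "", x = x)

# perl = TRUE is accepted but makes no difference for this pattern.
with_perl <- gsub("\\..*", "", x, perl = TRUE)

print(wrong_order)
print(right_order)
print(with_perl)
```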
Gabor Grothendieck <ggrothendieck <at> gmail.com> writes:

> R interprets backslash to give special meaning to the next character, i.e.
> it strips off the backslash and send the following character to gsub
> possibly reinterpreting it specially (for example \n is newline). Thus
> a backslash will never get to gsub unless you use a double backslash.

To quote Peter Dalgaard:

"The generic rule for backslashes is that you need twice as many as you thought"

Only make sure that you have the right stopping rule for recursive evaluation of that.

Dieter
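One case where the doubling rule applies twice is matching a literal backslash: the regex needs \\ and each of those backslashes must itself be doubled in the R string, giving four. A minimal sketch, with a made-up path and illustrative variable names:

```r
# Hypothetical example string; as written it contains two single backslashes.
path <- "C:\\temp\\file.txt"
writeLines(path)

# Four backslashes in the source: R reduces them to two, and the regex
# engine reads those two as one literal backslash.
via_regex <- gsub("\\\\", "/", path)

# With fixed = TRUE there is no regex layer, so two in the source suffice.
via_fixed <- gsub("\\", "/", path, fixed = TRUE)

print(via_regex)
print(via_fixed)
```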