Mark Heckmann
2009-Jun-08 14:15 UTC
[R] using regular expressions to retrieve a digit-digit-dot structure from a string
Hi,

I need to recognize itemization structures in strings that follow the format "digit-digit-dot", e.g.

1.
2.
19.
211.

Given the string " This happened in the 21. century." (in German, the dot after 21 is used where English writes "21st"), I want to know where the dots are, but I do not want the dot in "21." to be returned as well.

I am not good at regular expressions. How can I retrieve or recognize dots while excluding the digit-digit-dot structure?

TIA, Mark

-------------------------------
Mark Heckmann
+ 49 (0) 421 - 1614618
www.markheckmann.de
R-Blog: http://ryouready.wordpress.com
Henrique Dallazuanna
2009-Jun-08 17:32 UTC
[R] using regular expressions to retrieve a digit-digit-dot structure from a string
Try this:

x <- "This happened in the 21. century."
gregexpr("[[:digit:]]\\.", x)

This returns the position of each digit-dot match in the string, that is, the position of the digit immediately before the dot.

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
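As a follow-up to the gregexpr() suggestion, here is a minimal sketch of how the match object could be unpacked to get the position of the dot itself and the matched text. The variable names are made up for illustration, and the pattern uses "+" so that the whole number is captured rather than only its last digit; regmatches() is assumed to be available (it is part of base R in current versions).

```r
x <- "This happened in the 21. century."

# Match one or more digits followed by a literal dot.
m <- gregexpr("[[:digit:]]+\\.", x)

starts  <- as.integer(m[[1]])            # start of each number-dot token
lens    <- attr(m[[1]], "match.length")  # length of each token
dot_pos <- starts + lens - 1L            # position of the dot that ends each token

tokens <- regmatches(x, m)[[1]]          # the matched substrings

print(dot_pos)  # 24
print(tokens)   # "21."
```

If the string contains no such token, gregexpr() returns -1 as the only position, so a real script would check for that before computing dot_pos.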
Marc Schwartz
2009-Jun-08 17:34 UTC
[R] using regular expressions to retrieve a digit-digit-dot structure from a string
> vec <- c("1.", "2.", "19.", "211.", "This happened in the 21. century")
> grep("^[0-9]+\\.", vec, value = TRUE)
[1] "1."   "2."   "19."  "211."

The regex "^[0-9]+\\." is interpreted as "match one or more digits followed by a period, only at the beginning of the line". The caret '^' anchors the match to the beginning of the line, so a sequence of digits followed by a period in the middle of the line will not match.

HTH,

Marc Schwartz
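The anchored pattern can also be used to flag which lines are list items and to pull out the item number. The following is only a sketch: the example lines and variable names are invented, and allowing leading whitespace before the number is an assumption about how indented items should be treated, not something stated in the thread.

```r
txt <- c("1. First item",
         "19. Another item",
         "This happened in the 21. century.",
         "   3. Indented item")

# Strict version: the number must start in the first column.
is_item <- grepl("^[0-9]+\\.", txt)

# Lenient version (assumption): allow leading whitespace before the number.
is_item_ws <- grepl("^[[:space:]]*[0-9]+\\.", txt)

# Extract the item number from the lines that qualify.
labels <- sub("^[[:space:]]*([0-9]+)\\..*$", "\\1", txt[is_item_ws])

print(is_item)     # TRUE TRUE FALSE FALSE
print(is_item_ws)  # TRUE TRUE FALSE TRUE
print(labels)      # "1" "19" "3"
```

In both versions the "21." in the middle of the third line is ignored, because the caret only permits a match at the start of the line.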
Gabor Grothendieck
2009-Jun-08 17:36 UTC
[R] using regular expressions to retrieve a digit-digit-dot structure from a string
Try this. See ?regex for more.

> x <- 'This happened in the 21. century." (the dot behind 21 is'
> regexpr("(?<![0-9])[.]", x, perl = TRUE)
[1] 33
attr(,"match.length")
[1] 1

The negative lookbehind (?<![0-9]) lets a dot match only when the character before it is not a digit, so the dot in "21." at position 24 is skipped and the sentence-ending dot at position 33 is returned.
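To list every dot that is not part of a number-dot token, rather than only the first one, one option is gregexpr() with a negative lookbehind. This is a minimal sketch: the example string and variable names are made up for illustration, and it assumes that any dot directly preceded by a digit should be skipped, which would also exclude decimal points such as the one in 3.14.

```r
x <- "In the 21. century. It ended."

# Every dot in the string, for comparison.
all_dots <- as.integer(gregexpr("[.]", x)[[1]])

# Only dots whose preceding character is not a digit.
# The lookbehind (?<![0-9]) needs perl = TRUE.
sentence_dots <- as.integer(gregexpr("(?<![0-9])[.]", x, perl = TRUE)[[1]])

print(all_dots)       # 10 19 29
print(sentence_dots)  # 19 29
```

A lookbehind is used rather than a lookahead because the condition concerns the character before the dot; a lookahead placed in front of "[.]" would only inspect the dot itself.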