Dear R users,

my problem today deals with my ignorance of regular expressions, a matter I recently discovered.

Consider the following

foo <-
c("V_7_101110_V", "V_7_101110_V", "V_9_101110_V", "V_9_101110_V",
"V_9_s101110_V", "V_9_101110_V", "V_9_101110_V", "V_11_101110_V",
"V_11_101110_V", "V_11_101110_V", "V_11_101110_V", "V_11_101110_V",
"V_17_101110_V", "V_17_101110_V")

What I'm trying to obtain is to add a zero in front of numbers below 10, as in

c("V_07_101110_V", "V_07_101110_V", "V_09_101110_V", "V_09_101110_V",
"V_09_101110_V", "V_09_101110_V", "V_09_101110_V", "V_11_101110_V",
"V_11_101110_V", "V_11_101110_V", "V_11_101110_V", "V_11_101110_V",
"V_17_101110_V", "V_17_101110_V")

I'm able to do this in an Emacs buffer through query-replace-regexp

C-M-%
search for
V_\(.\)_
and substitute with
V_0\1_

but I have no idea how to do it with gsub within R, and the help is quite complicated to understand (at least to me, at this moment in time).

I can search the vector through
grep("V_._", foo)

but I always get errors on
gsub('V_\(.\)_', 'V_0\1_', foo)

or I do not get what I'm looking for with
gsub('V_._', 'V_0._', foo)
gsub('V_._', 'V_0\1_', foo)

Thanks in advance

-- 
Ottorino-Luca Pantani, Università di Firenze
Dip. Scienza del Suolo e Nutrizione della Pianta
P.zle Cascine 28 50144 Firenze Italia
Ubuntu 8.04.3 LTS -- GNU Emacs 23.0.60.1 (x86_64-pc-linux-gnu, GTK+ Version 2.12.9)
ESS version 5.5 -- R 2.10.0

On 11/16/2009 8:21 AM, Ottorino-Luca Pantani wrote:
> Dear R users,
> my problem today deals with my ignorance on regular expressions.
> a matter I recently discovered.

You were close. First, gsub by default doesn't need escapes before the parens. (There are lots of different conventions for regular expressions, unfortunately.) So the Emacs regular expression

V_\(.\)_

is entered as "V_(.)_" in the default version of gsub(). Second, to enter a backslash into a string, you need to escape it. So the replacement pattern

V_0\1_

is entered as "V_0\\1_". So

gsub("V_(.)_", "V_0\\1_", foo)

should give you what you want.

Duncan Murdoch

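To make the two escaping levels concrete, here is a minimal sketch. It uses a shortened three-element version of `foo` (an assumption, for brevity) and an illustrative variable name `repl`. The replacement is typed with two backslashes in R source, but the string that gsub() receives contains only one. By contrast, a bare `'\1'` in an R string literal is read as an octal character escape, which is why the single-backslash attempt never produces a back-reference, and `'\('` is not a valid string escape at all.

```r
# Shortened copy of the vector from the question (assumption: three elements suffice)
foo <- c("V_7_101110_V", "V_11_101110_V", "V_9_s101110_V")

# Typed with two backslashes in source; the stored string has one
repl <- "V_0\\1_"
nchar(repl)        # 6 characters: V _ 0 \ 1 _
writeLines(repl)   # prints V_0\1_

# Default (extended) regex syntax: plain parentheses form the capture group
res <- gsub("V_(.)_", repl, foo)
res
# [1] "V_07_101110_V"  "V_11_101110_V"  "V_09_s101110_V"
```

Only elements with exactly one character between the first two underscores match, so "V_11_..." is left alone.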
On Nov 16, 2009, at 8:21 AM, Ottorino-Luca Pantani wrote:
> Dear R users,
> my problem today deals with my ignorance on regular expressions.
> a matter I recently discovered.
>
> Consider the following
>
> foo <-
> c("V_7_101110_V", "V_7_101110_V", "V_9_101110_V", "V_9_101110_V",
> "V_9_s101110_V", "V_9_101110_V", "V_9_101110_V", "V_11_101110_V",
> "V_11_101110_V", "V_11_101110_V", "V_11_101110_V", "V_11_101110_V",
> "V_17_101110_V", "V_17_101110_V")
>
> what I'm trying to obtain is to add a zero in front of numbers below 10,
> as in
>
> c("V_07_101110_V", "V_07_101110_V", "V_09_101110_V", "V_09_101110_V",
> "V_09_101110_V", "V_09_101110_V", "V_09_101110_V", "V_11_101110_V",
> "V_11_101110_V", "V_11_101110_V", "V_11_101110_V", "V_11_101110_V",
> "V_17_101110_V", "V_17_101110_V")

Any of these (the need to double the backslash, "\\", for the back-reference seems to be the main issue):

> gsub("_([[:digit:]])_.", "_0\\1_", foo)
 [1] "V_07_01110_V"  "V_07_01110_V"  "V_09_01110_V"  "V_09_01110_V"  "V_09_101110_V"
 [6] "V_09_01110_V"  "V_09_01110_V"  "V_11_101110_V" "V_11_101110_V" "V_11_101110_V"
[11] "V_11_101110_V" "V_11_101110_V" "V_17_101110_V" "V_17_101110_V"

> gsub("_(\\d)_.", "_0\\1_", foo)
 [1] "V_07_01110_V"  "V_07_01110_V"  "V_09_01110_V"  "V_09_01110_V"  "V_09_101110_V"
 [6] "V_09_01110_V"  "V_09_01110_V"  "V_11_101110_V" "V_11_101110_V" "V_11_101110_V"
[11] "V_11_101110_V" "V_11_101110_V" "V_17_101110_V" "V_17_101110_V"

> gsub("V_(.)_", "V_0\\1_", foo)
 [1] "V_07_101110_V"  "V_07_101110_V"  "V_09_101110_V"  "V_09_101110_V"  "V_09_s101110_V"
 [6] "V_09_101110_V"  "V_09_101110_V"  "V_11_101110_V"  "V_11_101110_V"  "V_11_101110_V"
[11] "V_11_101110_V"  "V_11_101110_V"  "V_17_101110_V"  "V_17_101110_V"

> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
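
One detail in the output above is worth a second look: the first two patterns end in ".", which also matches the character after the second underscore, so that character is dropped by the replacement ("101110" becomes "01110", and the stray "s" disappears). The sketch below is one possible variant, not something proposed in the thread. It uses a shortened `foo` (an assumption, for brevity) and illustrative names `with_dot` and `padded`, and restricts the match to a single digit without consuming anything after it.

```r
# Shortened copy of the vector from the question (assumption: four elements suffice)
foo <- c("V_7_101110_V", "V_9_s101110_V", "V_11_101110_V", "V_17_101110_V")

# Pattern as posted: the trailing "." is part of the match and is not put back
with_dot <- gsub("_([[:digit:]])_.", "_0\\1_", foo)
with_dot
# [1] "V_07_01110_V"  "V_09_101110_V" "V_11_101110_V" "V_17_101110_V"

# Variant without the trailing ".": only "_<digit>_" is matched and rewritten
padded <- gsub("_([[:digit:]])_", "_0\\1_", foo)
padded
# [1] "V_07_101110_V"  "V_09_s101110_V" "V_11_101110_V"  "V_17_101110_V"
```

Whether the "s" in "V_9_s101110_V" should be removed is a separate question about the data; the variant leaves it in place.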