I have a file containing "words" like

a
a/b
a/b/c

where there may be multiple words on a line (separated by spaces). The a, b, and c strings can contain non-space, non-slash characters. I'd like to use gsub() to extract the c strings (which should be empty if there are none).

A real example is

"f 147/1315/587 2820/1320/587 3624/1321/587 1852/1322/587"

which I'd like to transform to

" 587 587 587 587"

Another real example is

"f 1067 28680 24462"

which should transform to "   ".

I've tried a few different regexprs, but am unable to find a way to say "transform words by deleting everything up to and including the 2nd slash" when there might be zero, one or two slashes. Any suggestions?

Duncan Murdoch

Hi Duncan,

Why not split on "/" and take the correct elements? It is not as elegant as a regex, but it could do the trick.

Best,
Ulrik

On Mon, 9 Oct 2017 at 17:03 Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
> I've tried a few different regexprs, but am unable to find a way to say
> "transform words by deleting everything up to and including the 2nd
> slash" when there might be zero, one or two slashes. Any suggestions?
>
> Duncan Murdoch
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
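For illustration, the split-based idea could be sketched roughly as follows. The helper name `third_field` is hypothetical (it does not appear in the thread), and the sketch assumes words are separated by single spaces:

```r
# Hypothetical helper, not from the thread: keep the third "/"-separated
# field of every space-separated word, or "" when a word has fewer than three.
third_field <- function(line) {
  words  <- strsplit(line, " ", fixed = TRUE)[[1]]
  fields <- strsplit(words, "/", fixed = TRUE)
  third  <- vapply(fields,
                   function(f) if (length(f) >= 3) f[3] else "",
                   character(1))
  paste(third, collapse = " ")
}

third_field("f 147/1315/587 2820/1320/587 3624/1321/587 1852/1322/587")
# [1] " 587 587 587 587"
third_field("f 1067 28680 24462")
# [1] "   "
```

Because no regular expression is involved, the a, b, and c parts may contain any non-space, non-slash characters.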

I think you might need something like this:

# Split the line into words
s <- "f 147/1315/587 2820/1320/587 3624/1321/587 1852/1322/587"
l <- strsplit(s, " ")[[1]]
# Matches a/b/c and captures c
pat <- "[[:alnum:]]*/[[:alnum:]]*/([[:alnum:]]*)"
# Words without two slashes do not match and become ""
paste(ifelse(grepl(pat, l), gsub(pat, "\\1", l), ""), collapse = " ")

-pd

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
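The question allows a, b, and c to contain any non-space, non-slash characters, whereas `[[:alnum:]]` covers only letters and digits. A minimal sketch of the same split-and-match approach with a broader character class might look like this; the names `s2`, `w`, `pat2`, and `out`, the `^`/`$` anchors, and the sample word `x-1/y.2/z_3` are illustrative assumptions, not part of the original reply:

```r
# Illustrative input (assumed, not from the thread): one word contains
# punctuation, one has a single slash, one has none.
s2 <- "f x-1/y.2/z_3 147/1315/587 12/34"
w  <- strsplit(s2, " ", fixed = TRUE)[[1]]

# Each field is any run of characters other than "/" and space;
# the anchors make the pattern describe a whole word.
pat2 <- "^[^/ ]*/[^/ ]*/([^/ ]*)$"

# Words that are not of the form a/b/c become "".
out <- paste(ifelse(grepl(pat2, w), sub(pat2, "\\1", w), ""), collapse = " ")
out
# [1] " z_3 587 "
```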

> x <- "f 147/1315/587 2820/1320/587 3624/1321/587 1852/1322/587"
> gsub("(^| *)([^/ ]*/?){0,2}", "\\1", x)
[1] " 587 587 587 587"
> y <- "aa aa/ aa/bb aa/bb/ aa/bb/cc aa/bb/cc/ aa/bb/cc/dd aa/bb/cc/dd/"
> gsub("(^| *)([^/ ]*/?){0,2}", "\\1", y)
[1] " cc cc/ cc/dd cc/dd/"

Bill Dunlap
TIBCO Software
wdunlap tibco.com
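A related sketch applies the same "drop at most two leading fields" idea to each word separately, with the pattern anchored at the start of the word so that at most one match per word is possible. The helper name `drop_two_fields`, the use of `strsplit()`, `perl = TRUE`, and the non-capturing group are choices made here for illustration and are not taken from the post above:

```r
# Hypothetical helper, not from the thread.
drop_two_fields <- function(line) {
  words <- strsplit(line, " ", fixed = TRUE)[[1]]
  # Remove up to two leading "field/" pieces from each word; a word with
  # fewer than two slashes is consumed entirely and becomes "".
  paste(sub("^(?:[^/]*/?){0,2}", "", words, perl = TRUE), collapse = " ")
}

drop_two_fields("f 147/1315/587 2820/1320/587 3624/1321/587 1852/1322/587")
# [1] " 587 587 587 587"
drop_two_fields("f 1067 28680 24462")
# [1] "   "
```

As in the transcript above, anything after the second slash is kept, so a word such as `aa/bb/cc/dd` would come out as `cc/dd`.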

How about this (I'm showing it as a pipe because it's easier to read that way):

library(magrittr)
"f 147/1315/587 2820/1320/587 3624/1321/587 1852/1322/587" %>%
  strsplit(' ') %>%
  unlist %>%
  sub('^[^/]*/*','',.) %>%
  sub('^[^/]*/*','',.) %>%
  paste(collapse = ' ')

Georges Monette

-- 
Georges Monette, PhD P.Stat.(SSC) | Associate Professor.
Faculty of Science, Department of Mathematics & Statistics | North 626 Ross Building | York University | 4700 Keele Street, Toronto, ON M3J 1P3
Telephone: 416-736-5250 | Fax: 416-736-5757 | E-Mail: georges at yorku.ca