Gianluca Rossi
2014-Feb-16 12:50 UTC
[R] Extracting everything between two symbols in a string
Hello,

I have a vector containing some names. I want to extract the title from every row, that is, everything between the ", " (white space included) and the ".".

> head(combi$Name)
[1] "Braund, Mr. Owen Harris"
[2] "Cumings, Mrs. John Bradley (Florence Briggs Thayer)"
[3] "Heikkinen, Miss. Laina"
[4] "Futrelle, Mrs. Jacques Heath (Lily May Peel)"
[5] "Allen, Mr. William Henry"
[6] "Moran, Mr. James"

I suppose grep with the argument `value = TRUE` might be useful, but I am having difficulty finding the right regular expression for this.

combi$Title <- grep("", combi$Name, value = TRUE)

Many thanks,
Gianluca
Rui Barradas
2014-Feb-16 19:46 UTC
[R] Extracting everything between two symbols in a string
Hello,

Try the following.

x <- "Braund, Mr. Owen Harris"
sub("^.*, (M[[:alpha:]]*)\\..*$", "\\1", x)

Hope this helps,

Rui Barradas
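As a rough sketch of how the reply's pattern could be applied to the whole vector, the example below runs it on the six names from the question. It also shows why grep() with `value = TRUE` is not enough on its own (it returns whole matching elements, not the matched part), and tries a slightly more general pattern. The vector `name`, the variables `title_m`, `title_any` and `whole`, the generalised pattern, and the "Smith, Dr. John" test string are assumptions made for illustration; they do not come from the original posts.

```r
# Sample names copied from the question (stand-in for combi$Name).
name <- c("Braund, Mr. Owen Harris",
          "Cumings, Mrs. John Bradley (Florence Briggs Thayer)",
          "Heikkinen, Miss. Laina",
          "Futrelle, Mrs. Jacques Heath (Lily May Peel)",
          "Allen, Mr. William Henry",
          "Moran, Mr. James")

# grep(value = TRUE) selects whole elements that match; it does not extract
# the matched substring, so it cannot return the title by itself.
whole <- grep(", M", name, value = TRUE)

# sub() is vectorised, so the pattern from the reply works on every element.
# It only recognises titles that start with "M".
title_m <- sub("^.*, (M[[:alpha:]]*)\\..*$", "\\1", name)

# Assumed generalisation: capture whatever lies between the first ", "
# and the next ".", regardless of the first letter.
title_any <- sub("^[^,]*, ([^.]*)\\..*$", "\\1", name)

# Hypothetical name (not in the original data) with a title not starting
# with "M": the reply's pattern does not match, so sub() returns the input
# unchanged, while the generalised pattern still extracts the title.
other <- "Smith, Dr. John"
other_m   <- sub("^.*, (M[[:alpha:]]*)\\..*$", "\\1", other)
other_any <- sub("^[^,]*, ([^.]*)\\..*$", "\\1", other)

print(title_any)
```

With a data frame like the one in the question, the result could then be assigned with something like `combi$Title <- sub("^[^,]*, ([^.]*)\\..*$", "\\1", combi$Name)`. Which pattern is preferable depends on whether the data contain titles other than Mr, Mrs and Miss.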