I have done an embarrassingly bad job using a mixture of gsub and strsplit to solve a problem. Below is sample code showing what I have to start with (the vector xx), and I want to end up with two vectors x and y that contain only the digits found in xx.

Any regex users with advice most welcome.

Harold

xx <- c("S24:57", "S24:86", "S24:119", "S24:129", "S24:138", "S24:163")
yy <- gsub("S","\\1", xx)
a1 <- gsub(":"," ", yy)
a2 <- sapply(a1, function(x) strsplit(x, ' '))
x <- as.numeric(sapply(a2, function(x) x[1]))
y <- as.numeric(sapply(a2, function(x) x[2]))

On Fri, Aug 1, 2014 at 10:46 AM, Doran, Harold <HDoran at air.org> wrote:
> Below is sample code showing what I have to start with (the vector xx) and I want to end up with two vectors x and y that contain only the digits found in xx.

> library(gsubfn)
> strapply(xx, "\\d+", as.numeric, simplify = TRUE)
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]   24   24   24   24   24   24
[2,]   57   86  119  129  138  163

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
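For readers without gsubfn installed, roughly the same matrix can be built in base R. This is only a sketch of an alternative, not the method used in the reply above; the names digits and m are made up for illustration.

```r
xx <- c("S24:57", "S24:86", "S24:119", "S24:129", "S24:138", "S24:163")

# gregexpr() locates every run of digits in each string;
# regmatches() pulls those runs out as a list of character vectors.
digits <- regmatches(xx, gregexpr("[0-9]+", xx))

# Convert each pair to numeric; sapply() binds the pairs into a
# matrix with one column per element of xx.
m <- sapply(digits, as.numeric)
m
```

Note that sapply() only simplifies to a matrix when every string yields the same number of digit runs, as is the case for this sample data.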

You could try:

library(stringr)
simplify2array(str_extract_all(xx, perl('(?<=[A-Z]|\\:)\\d+')))
     [,1] [,2] [,3]  [,4]  [,5]  [,6]
[1,] "24" "24" "24"  "24"  "24"  "24"
[2,] "57" "86" "119" "129" "138" "163"

A.K.

On Friday, August 1, 2014 10:49 AM, "Doran, Harold" <HDoran at air.org> wrote:
> Below is sample code showing what I have to start with (the vector xx) and I want to end up with two vectors x and y that contain only the digits found in xx.

On Aug 1, 2014, at 9:46 AM, Doran, Harold <HDoran at air.org> wrote:
> Below is sample code showing what I have to start with (the vector xx) and I want to end up with two vectors x and y that contain only the digits found in xx.

If a matrix is a satisfactory result, rather than two separate vectors:

> sapply(strsplit(gsub("S", "", xx), split = ":"), as.numeric)
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]   24   24   24   24   24   24
[2,]   57   86  119  129  138  163

Regards,

Marc Schwartz
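If the two separate vectors are still wanted, they can be read off the rows of that matrix. A minimal sketch, assuming the result is stored under the hypothetical name m:

```r
xx <- c("S24:57", "S24:86", "S24:119", "S24:129", "S24:138", "S24:163")

# Drop the leading "S", split each string on the literal ":", and convert
# the pieces to numeric. sapply() returns a 2 x 6 matrix here.
m <- sapply(strsplit(sub("^S", "", xx), split = ":", fixed = TRUE), as.numeric)

x <- m[1, ]  # digits before the colon
y <- m[2, ]  # digits after the colon
```

The sub("^S", "", xx) call differs slightly from the gsub() in the reply: it removes only an "S" at the start of the string, which is enough for the sample data.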

How about:

x <- as.numeric(sub("^S([0-9]+):([0-9]+)$", "\\1", xx))
y <- as.numeric(sub("^S([0-9]+):([0-9]+)$", "\\2", xx))

2014-08-01 16:46 GMT+02:00 Doran, Harold <HDoran@air.org>:
> Below is sample code showing what I have to start with (the vector xx) and I want to end up with two vectors x and y that contain only the digits found in xx.
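The same capture-group pattern can also be matched once and both groups pulled from the result, using base R's regexec() and regmatches(). This is a sketch of a variation on the suggestion above, not part of the original reply; the names pattern and parts are illustrative.

```r
xx <- c("S24:57", "S24:86", "S24:119", "S24:129", "S24:138", "S24:163")

pattern <- "^S([0-9]+):([0-9]+)$"

# regexec() records the positions of the full match and of each capture group;
# regmatches() turns them into a list of character vectors of the form
# c(full match, group 1, group 2).
parts <- regmatches(xx, regexec(pattern, xx))

x <- as.numeric(sapply(parts, `[`, 2))  # first capture group
y <- as.numeric(sapply(parts, `[`, 3))  # second capture group
```

One practical difference: for a string that does not match the pattern, sub() returns the input unchanged, whereas here the corresponding element of parts is empty and the extracted value is NA.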