I have strings that contain a postcode and letters. Some are separated by a blank, some by a comma, and some are not separated at all, e.g. "2324gz", "2567 HK", "3741,BF".

I want to separate the number and the letters into two new variables.

I know this should be quite a basic question, but I searched on regex syntax and it seems a bit scary to me. Can anyone give me a quick solution for this particular question?

Thanks,
Sun
Gabor Grothendieck
2008-Mar-05 15:07 UTC
[R] regex solution for separating number and string
Try this:

> library(gsubfn)
> x <- c("2324gz", "2567 HK", "3741,BF")
> strapply(x, "[[:digit:]]+|[[:alpha:]]+")
[[1]]
[1] "2324" "gz"

[[2]]
[1] "2567" "HK"

[[3]]
[1] "3741" "BF"
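For readers who would rather not install a package, the same pattern can probably be applied with base R alone. This is a minimal sketch, not part of the original reply: it uses `regmatches()` with `gregexpr()`, and the names `parts`, `number` and `letters_part` are made up for illustration. It assumes every string contains exactly one run of digits followed by one run of letters.

```r
# Base R sketch: extract each run of digits or letters, no extra package needed.
x <- c("2324gz", "2567 HK", "3741,BF")
parts <- regmatches(x, gregexpr("[[:digit:]]+|[[:alpha:]]+", x))

# Assumption: each element of 'parts' holds the digits first, then the letters.
number       <- vapply(parts, function(p) p[1], character(1))
letters_part <- vapply(parts, function(p) p[2], character(1))

print(number)
print(letters_part)
```

Here `parts` is a list with one character vector per input string, which is the same shape that `strapply` returns in the reply above.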
vincenzo.2.di-iorio at gsk.com
2008-Mar-05 15:08 UTC
[R] regex solution for separating number and string
Hi Sun,

vec <- c("2324gz","2567 HK","3741,BF")
vec1 <- gsub('[^[:digit:]]','',vec)
vec2 <- gsub('[^[:alpha:]]','',vec)

> vec1
[1] "2324" "2567" "3741"
> vec2
[1] "gz" "HK" "BF"

Cheers
Vincenzo

Vincenzo Luca Di Iorio
Consultant PME User support - GSK R&D Limited
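Since the question asks for two new variables, the two `gsub` calls can be collected into a data frame. The following is only a sketch under that assumption; the data frame name `postcodes` and its column names are hypothetical and do not come from the thread.

```r
vec <- c("2324gz", "2567 HK", "3741,BF")

# Delete everything that is not a digit (or not a letter) to keep each part.
postcodes <- data.frame(
  original = vec,
  number   = gsub("[^[:digit:]]", "", vec),
  letters  = gsub("[^[:alpha:]]", "", vec),
  stringsAsFactors = FALSE
)

print(postcodes)
```

One caveat of this deletion-based approach: it keeps every digit in the string, so an input such as "12ab34" would give "1234" as the number. It fits the examples in the question, where the digits always form a single run.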
This should do it for you:

> x <- c("2564gc", "2367,GH", "2134 JHG")
> x.sep <- gsub("([[:digit:]]+)[ ,]*([[:alpha:]]+)", "\\1 \\2", x)
> # now create separate values
> strsplit(x.sep, " ")
[[1]]
[1] "2564" "gc"

[[2]]
[1] "2367" "GH"

[[3]]
[1] "2134" "JHG"

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
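The pattern in this reply uses two capture groups, `\\1` for the digits and `\\2` for the letters. As a hedged variation, not taken from the thread, each group can be pulled out directly with `sub()`, which gives two vectors without the intermediate `strsplit` step. The anchors `^` and `$`, the `grepl()` check, and the names `pattern`, `ok`, `number` and `letters_part` are additions made for this sketch.

```r
x <- c("2564gc", "2367,GH", "2134 JHG")

# Digits, then optional blanks or commas, then letters, anchored to the whole string.
pattern <- "^([[:digit:]]+)[ ,]*([[:alpha:]]+)$"

# Strings that do not fit the pattern become NA instead of passing through unchanged.
ok <- grepl(pattern, x)
number       <- ifelse(ok, sub(pattern, "\\1", x), NA_character_)
letters_part <- ifelse(ok, sub(pattern, "\\2", x), NA_character_)

print(number)
print(letters_part)
```

The check matters because `sub()` returns its input untouched when the pattern does not match, so a malformed string would otherwise end up in both variables.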