Krishna Tateneni
2010-May-20 00:15 UTC
[R] regex help: splitting strings with no separator
Greetings,

I have a vector of values that are a word followed by a number, e.g., x = c("Apple12","HP42","Dell91"). The goal is to split this vector into two vectors such that the first vector contains just the words and the second contains just the numbers. I cannot use strsplit (or at least I do not know how) as there is no obvious separator.

I can use sub to create a separator, e.g., y = sub("([[:digit:]])"," \\1",x), and then use strsplit, but I thought more experienced R users may have a better solution. I've spent some time with Google, but have not turned up anything so far.

Many thanks,
--Krishna
Jorge Ivan Velez
2010-May-20 00:23 UTC
[R] regex help: splitting strings with no separator
Hi Krishna,

Here is a suggestion:

> x <- c("Apple12","HP42","Dell91")
> foo <- function(x) data.frame(brand = gsub("[0-9]", "", x), number = gsub("[^0-9]", "", x))
> foo(x)
#   brand number
# 1 Apple     12
# 2    HP     42
# 3  Dell     91

HTH,
Jorge
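Both columns in the suggestion above come back as text (and as factors under the data.frame defaults of R versions before 4.0.0). If a numeric column is wanted, a minimal sketch along the same lines might look like this; the function name split_brand is made up for illustration and is not part of the original reply.

```r
x <- c("Apple12", "HP42", "Dell91")

# split_brand is a hypothetical helper name, not from the thread.
split_brand <- function(x) {
  data.frame(
    brand  = gsub("[0-9]", "", x),               # drop every digit
    number = as.numeric(gsub("[^0-9]", "", x)),  # drop every non-digit, then convert
    stringsAsFactors = FALSE                     # keep brand as character in any R version
  )
}

res <- split_brand(x)
str(res)
```

Note that this approach removes digits wherever they occur, so it assumes the "word followed by a number" layout from the question. An input such as "Win7Pro10" would give brand "WinPro" and number 710.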
Henrique Dallazuanna
2010-May-20 00:40 UTC
[R] regex help: splitting strings with no separator
Try this:

    read.table(textConnection(gsub("([[:alpha:]])(\\d.*)", "\\1;\\2", x)), sep = ";")

or

    do.call(rbind, strsplit(gsub("([[:alpha:]])(\\d.*)", "\\1;\\2", x), ";"))

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
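The second form above returns a two-column character matrix. Since the question asked for two separate vectors, here is a small sketch of how the pieces could be pulled apart. The variable names (marked, m, words, numbers) are illustrative only, and the code assumes every element has letters immediately followed by digits.

```r
x <- c("Apple12", "HP42", "Dell91")

# Insert ";" between the last letter and the first digit that follows it.
marked <- gsub("([[:alpha:]])(\\d.*)", "\\1;\\2", x)

# Split on the literal ";" and stack the pieces into a matrix, one row per input.
m <- do.call(rbind, strsplit(marked, ";", fixed = TRUE))

words   <- m[, 1]
numbers <- as.integer(m[, 2])
```

Using fixed = TRUE treats ";" as a plain character rather than a regular expression, which is all that is needed once the separator has been inserted.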
Gabor Grothendieck
2010-May-20 01:21 UTC
[R] regex help: splitting strings with no separator
One way is to use strapply in the gsubfn package. It is like apply in that the first argument is the object (in both cases), the second is the modifier (the margin in the case of apply and the regular expression in the case of strapply) and the third is a function (in both cases). The parenthesized expressions in the regular expression are captured and passed to the function. Here \\D+ is a string of non-digits and \\d+ is a string of digits. See the home page at http://gsubfn.googlecode.com, the vignette and the help for more info.

> library(gsubfn)
> strapply(x, "(\\D+)(\\d+)", c, simplify = rbind)
     [,1]    [,2]
[1,] "Apple" "12"
[2,] "HP"    "42"
[3,] "Dell"  "91"
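For readers without gsubfn installed, the same capture-group idea can be sketched in base R with regexec and regmatches. This is an alternative technique, not what the reply above uses, and it relies on functions added to base R after this thread was written. The anchors ^ and $ are an added assumption that each string consists only of non-digits followed by digits.

```r
x <- c("Apple12", "HP42", "Dell91")

# regexec records the position of the full match and of each parenthesized group;
# regmatches then extracts them as a list of character vectors:
# element 1 is the full match, elements 2 and 3 are the two captured groups.
m <- regmatches(x, regexec("^(\\D+)(\\d+)$", x))

parts   <- do.call(rbind, m)
words   <- parts[, 2]
numbers <- as.integer(parts[, 3])
```

A string that does not match the pattern yields an empty character vector and so would not contribute a row to parts. It is worth checking that nrow(parts) equals length(x) before relying on the result.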