Tan, Richard
2009-Jun-08 21:40 UTC
[R] Regex question to find a string that contains 5-9 alpha-numeric characters, at least one of which is a number
Hi,

This is not exactly an R question, but I am trying to use gsub to replace a string that contains 5-9 alpha-numeric characters, at least one of which is a number. Is there a good way to write it in a one-line regex?

Thanks,
Richard
Barry Rowlingson
2009-Jun-08 22:27 UTC
[R] Regex question to find a string that contains 5-9 alpha-numeric characters, at least one of which is a number
On Mon, Jun 8, 2009 at 10:40 PM, Tan, Richard <RTan at panagora.com> wrote:
> Hi,
>
> This is not exactly an R question but I am trying to use gsub to replace
> a string that contains 5-9 alpha-numeric characters, at least one of
> which is a number. Is there a good way to write it in a one line regex?

The only way I can think of is to spell out all the possible expressions, something like:

  [0-9][a-z0-9]{4} | [a-z0-9][0-9][a-z0-9]{3} | [a-z0-9]{2}[0-9][a-z0-9]{2} ....

and so on. That is, have a regex component for every possible 5, 6, 7, 8, and 9 character expression with [0-9] in each place. I'm not sure this qualifies as 'good', though.

Better to do it in two stages, one to check for 5-9 alphanumerics, and then another to check for a number. Here's something on a test vector 's':

  > cbind(s,grepl("^[A-Z0-9]{5,9}$",s),grepl("[0-9]",s))
       s
  [1,] "SHRT"        "FALSE" "FALSE"
  [2,] "5HRT"        "FALSE" "TRUE"
  [3,] "M1TCH"       "TRUE"  "TRUE"
  [4,] "M1TCH5"      "TRUE"  "TRUE"
  [5,] "LONG3RS"     "TRUE"  "TRUE"
  [6,] "NONUMBER"    "TRUE"  "FALSE"
  [7,] "TOOLOOOONGG" "FALSE" "FALSE"

The ones you want give two TRUE values. Extending to lower-case is left as an exercise...

Barry
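
The two-stage check in the reply above can be combined into a single logical vector and extended to lower case. The following is a minimal sketch, not a definitive answer: the vector `s` reuses the test strings from the reply plus one assumed lower-case entry, and the variable names are made up for illustration. The last pattern is a different technique that does not appear in the thread, a lookahead assertion, which in R should only work with `perl = TRUE`.

```r
# Hypothetical test vector: the strings from the reply above,
# plus one lower-case entry added for illustration.
s <- c("SHRT", "5HRT", "M1TCH", "M1TCH5", "LONG3RS", "NONUMBER",
       "TOOLOOOONGG", "m1tch")

# Stage 1: the whole string is 5-9 alphanumeric characters
# (upper or lower case).
right_length <- grepl("^[A-Za-z0-9]{5,9}$", s)

# Stage 2: the string contains at least one digit.
has_digit <- grepl("[0-9]", s)

# A string qualifies only when both checks are TRUE.
two_stage <- right_length & has_digit

# Alternative (not from the thread): one pattern with a lookahead.
# (?=.*[0-9]) asserts that a digit occurs somewhere ahead without
# consuming characters; lookaheads need the PCRE engine (perl = TRUE).
one_line <- grepl("^(?=.*[0-9])[A-Za-z0-9]{5,9}$", s, perl = TRUE)

# The strings that pass both stages.
s[two_stage]
```

Here `two_stage` and `one_line` should agree on every element of `s`. To replace whole matching elements, something like `ifelse(two_stage, "REPLACED", s)` would work on this vector, where "REPLACED" is just a placeholder value.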