Lapointe, Pierre
2006-Nov-06 22:23 UTC
[R] grep searching for sequence of 3 consecutive upper case letters
Hello,

I need to identify all elements which have a sequence of 3 consecutive upper case letters, anywhere in the string.

I tested my grep expression on this site: http://regexlib.com/RETester.aspx

But when I try it in R, it does not filter anything.

str <- c("AGH", "this WOUld be good", "Not Good at All")
str[grep('[A-Z]{3}',str)] #looking for a sequence of 3 consecutive upper case letters

[1] "AGH" "this WOUld be good" "Not Good at All"

Any idea?

Pierre
David Barron
2006-Nov-06 22:37 UTC
[R] grep searching for sequence of 3 consecutive upper case letters
Try

str[grep('[[:upper:]]{3}',str)]

--
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP
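
For readers who want to try this suggestion directly, here is a minimal sketch using the vector from the original question. The name hits_posix is only illustrative, and value = TRUE is used so that grep() returns the matching strings instead of their positions.

```r
str <- c("AGH", "this WOUld be good", "Not Good at All")

# [[:upper:]] is a POSIX character class, so it does not depend on how
# the current locale orders the letters between "A" and "Z".
hits_posix <- grep("[[:upper:]]{3}", str, value = TRUE)
print(hits_posix)
```

With this input, only the first two elements contain three consecutive upper case letters.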
Peter Dalgaard
2006-Nov-06 22:51 UTC
[R] grep searching for sequence of 3 consecutive upper case letters
There are multiple versions of RE's, and fine details resolve in different ways. Don't expect the RETester to hold the Final Truth; it seems to relate to a particular programming environment, which is not R.

> grep('[A-Z]{3}', str, perl=TRUE)
[1] 1 2

Not only that, but

> grep('[ABCDEFGHIJKLMNOPQRSTUVWXYZ]{3}', str)
[1] 1 2

Hint: What is your collating sequence?

> Sys.setlocale("LC_COLLATE", "C")
[1] "C"
> grep('[A-Z]{3}', str)
[1] 1 2

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907
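
The hint about the collating sequence can be sketched as a short script. It assumes the same str vector as in the question; the names old_collate, idx_c and idx_perl are illustrative only. The script saves the current LC_COLLATE setting, switches to "C" for the range match, and then restores the previous setting. Whether the range misbehaves in the first place depends on the locale and on the regex engine behind the R build in use, so this is one possible workaround, not a definitive fix.

```r
str <- c("AGH", "this WOUld be good", "Not Good at All")

# Remember the current collation setting so it can be restored afterwards.
old_collate <- Sys.getlocale("LC_COLLATE")

# In the "C" locale the range A-Z follows plain byte order,
# so it covers only the upper case ASCII letters.
Sys.setlocale("LC_COLLATE", "C")
idx_c <- grep("[A-Z]{3}", str)

# Put the original collation order back.
Sys.setlocale("LC_COLLATE", old_collate)

# Alternative that leaves the locale alone: the Perl-compatible engine.
idx_perl <- grep("[A-Z]{3}", str, perl = TRUE)

print(idx_c)
print(idx_perl)
```

Both calls should pick out elements 1 and 2, matching the output shown in the session above.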