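That is not a very selective regex: "^[FN]|[AD]{2}" matches any string
that starts with F or N, or that contains two consecutive characters
drawn from A and D anywhere in it.
Actually, a long "or" probably is best, but you don't have to type
it in directly.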
prefixes <- c( "AD", "FN" )
pat <- paste0( "^(", paste( prefixes, collapse="|" ),
")[0-9]{4}$" )
grepl( pat, Identifier )
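
A rough sketch of how that logical vector could feed the filtering step
asked about. The data frame df and its sample values are made up for
illustration; only the column name Identifier and the two prefixes come
from the thread. The dplyr call is guarded in case the package is not
installed.

```r
# Hypothetical sample data; only the column name comes from the question.
df <- data.frame(
  Identifier = c("AD0123", "FN9999", "AL0001", "FNX000", "XAD0123", "070001"),
  stringsAsFactors = FALSE
)

# Build the alternation from a vector of prefixes instead of typing it out.
prefixes <- c("AD", "FN")
pat <- paste0("^(", paste(prefixes, collapse = "|"), ")[0-9]{4}$")

# TRUE where the identifier is one of the prefixes followed by four digits.
keep <- grepl(pat, df$Identifier)

# Base R: keep only the matching rows.
matched <- df[keep, , drop = FALSE]

# dplyr equivalent, run only if dplyr is available.
if (requireNamespace("dplyr", quietly = TRUE)) {
  matched_dplyr <- dplyr::filter(df, grepl(pat, Identifier))
}
```

If "filter out" means dropping those rows rather than keeping them,
negate the test: df[!keep, , drop = FALSE], or !grepl(...) inside
filter(). Extending prefixes to all 20 identifiers needs no other change.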
--
Sent from my phone. Please excuse my brevity.
On November 29, 2016 10:37:29 AM PST, Glenn Schultz <glennmschultz at me.com> wrote:
>Hello All,
>
>I have a dataframe of about 1.5 million rows, and from this dataframe I need
>to filter out identifiers. An example would be 070000-07099,
>AD0000-AD0999, and AL0000-AL9999, FN0000-FN9999. I am using grepl to
>identify those of interest as follows:
>
>grepl("^[FN]|[AD]{2}", Identifier)
>
>The above seems to work in the case of FN and AD. However, there are
>20 such identifiers and there must be a better way to do this than a
>long "or" statement. Ultimately, I would like to filter these out
>using dplyr; I think the first step is to create a vector of
>TRUE/FALSE and then filter on TRUE.
>
>Any Ideas are appreciated,
>Glenn