On 11/03/2015 10:31, Bob O'Hara wrote:
> Hi!
>
> I'm trying to persuade R's regular expressions to do what I want. I
This is not "R's regular expressions", but the world's
regular expressions.
> have a vector of strings which are names of variables, some of which
> are elements of strings. I want to reformat all of the variables into
> a list, so (for example) beta[1] and beta[2] would be a vector in the
> list. Where I'm struggling is how to pick out the correct variables.
>
> The problem is that if I have a sub-string, str, then I want to find
> the strings that are either the same as the sub-string, or are the
> substring followed by a '['. I feel I should be able to do this within
> a character class if I could give it an end of string character, i.e.
> '[$\\[]' where $ is not a literal $, but the end of the string (i.e.
> how it's interpreted outside a character class)
$ inside a character class is a character, not a metacharacter.
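
A minimal sketch of the difference, assuming the default regex engine used
by grepl(). The vector x and its "alpha$" element are made up for this
illustration and are not part of the original data:

```r
# Hypothetical test strings; "alpha$" contains a literal dollar sign.
x <- c("alpha", "alpha$", "alpha[1]")

# Inside [...] the $ is an ordinary character, so this pattern needs a
# literal "$" or "[" after "alpha" and cannot match the bare "alpha".
in_class <- grepl("^alpha[$\\[]", x)

# Outside a character class, $ anchors at the end of the string, so the
# alternation accepts either end-of-string or a literal "[".
anchored <- grepl("^alpha($|\\[)", x)

print(in_class)
print(anchored)
```

Only the second pattern treats "alpha" itself as a match.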
>
> Here's an example, using $ where I want the end of string:
>
>> VarNames <- c("alpha", "beta[1]", "beta[2]", "m", "mu.k", "mu.r")
>> TryNames <- unique(gsub('[]\\[1-9]',"",VarNames))
>>
>> VarNames[grep(paste('^',TryNames[1], '[$\\[]', sep=""), VarNames)] # want "alpha"
> character(0)
>> VarNames[grep(paste('^',TryNames[2], '[$\\[]', sep=""), VarNames)] # Gives what I want
> [1] "beta[1]" "beta[2]"
>> VarNames[grep(paste('^',TryNames[3], '[$\\[]', sep=""), VarNames)] # want "m"
> character(0)
>> VarNames[grep(paste('^',TryNames[3], sep=""), VarNames)] # gives more than "m"
> [1] "m" "mu.k" "mu.r"
>
> Is it possible to do this, or will I have to resort to using '|'
> (which works but is ugly & will only get uglier in the future)?
Why is

  grep(paste0('^', TryNames[1], '($|\\[)'), VarNames, value = TRUE)

not a lot less ugly than the code you present?
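
Applied to every base name, that pattern might be used along these lines to
build the list you describe. This is a sketch only: the names VarList and
ByBase are invented for the illustration, and the split() variant is an
alternative that does not appear in the exchange above.

```r
VarNames <- c("alpha", "beta[1]", "beta[2]", "m", "mu.k", "mu.r")
TryNames <- unique(gsub('[]\\[1-9]', "", VarNames))

# One list element per base name: either an exact match, or the name
# immediately followed by "[".
VarList <- lapply(TryNames, function(nm)
  grep(paste0('^', nm, '($|\\[)'), VarNames, value = TRUE))
names(VarList) <- TryNames

# Alternative without building patterns from the names: strip everything
# from the first "[" onwards and group the strings by what remains.
ByBase <- split(VarNames, sub("\\[.*$", "", VarNames))

print(VarList[["m"]])
print(ByBase[["beta"]])
```

Note that a name such as "mu.k" is itself interpreted as a pattern when it is
pasted into a regular expression, so its "." matches any character. The
split() variant compares the stripped names as plain strings and so avoids
that.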
>
> Bob
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford
1 South Parks Road, Oxford OX1 3TG, UK