On Sun, 21 Mar 2004, Fred J. wrote:
> I could use some help here with trying to use Perl
> style regex to extract the first group of letters
> before a ( . )
> so if I have a string AACEE.adiid and want AACEE
> i <- "AACEE.adiid"
> grep(".+\..?+",i,perl=T)
> I must be doing something wrong but don't know what it
> is?
First, see ?regexp, which says
     Patterns are described here as they would be printed by 'cat': do
     remember that backslashes need to be doubled in entering R
     character strings from the keyboard.
so you need to double the backslash: write \\. rather than \. in an R string.
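A small sketch of what the doubling means in practice (the variable name pat is only for illustration): the string typed as "\\." contains two characters, a backslash and a dot, and that is what the regex engine sees.

```r
i <- "AACEE.adiid"

# Typed with a doubled backslash; the stored string is backslash + dot.
pat <- "\\."
nchar(pat)               # 2
cat(pat, "\n", sep = "") # shows the pattern as the regex engine receives it

# An escaped dot matches only a literal '.', not any character.
grepl(pat, i)            # TRUE
grepl(pat, "AACEE")      # FALSE
```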
Second, your pattern is wrong. You want to match up to the first '.', and
a plain .+ is greedy (it runs to the last '.'), so use
".+?\\..*"
in perl style, or just "[^.]+\\..+" in any style.
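To see the difference, here is a rough sketch on a made-up string with two dots (j is a hypothetical input, not from the question). It uses regexpr and regmatches from base R, which assumes a reasonably recent R, to show what each prefix pattern actually matches.

```r
# Hypothetical input with two dots, to make the difference visible.
j <- "AACEE.adiid.xyz"

# Greedy: .+ takes as much as it can, so the match ends at the LAST dot.
greedy <- regmatches(j, regexpr(".+\\.", j))

# Lazy (Perl style): .+? takes as little as it can, so the match ends
# at the FIRST dot.
lazy <- regmatches(j, regexpr(".+?\\.", j, perl = TRUE))

# Negated class: [^.]+ cannot cross a dot, so it also stops at the first one.
negated <- regmatches(j, regexpr("[^.]+\\.", j))

print(c(greedy = greedy, lazy = lazy, negated = negated))
```

With only one dot in the string, as in the original example, all three would match the same text.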
Third, grep only tells you which elements match the pattern; it does not
return the matched text. If you want to extract it, you need to use sub
and sub-expressions, as in
sub("(.+?)(\\..+)", "\\1", i, perl=TRUE)
sub("([^.]+)(\\..+)", "\\1", i)
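As a rough illustration of how the second call behaves on a vector (x is a made-up input): sub is vectorised, and an element that does not match the pattern is returned unchanged. The strsplit line is an alternative that is not from the advice above, shown only for comparison.

```r
# Hypothetical inputs: one dot, no dot, several dots.
x <- c("AACEE.adiid", "nodot", "a.b.c")

# Keep only the first sub-expression: the text before the first dot.
# Elements without a dot do not match and come back unchanged.
before_dot <- sub("([^.]+)(\\..+)", "\\1", x)
print(before_dot)

# Alternative without regular expressions: split on a literal dot
# (fixed = TRUE) and take the first piece of each element.
first_piece <- sapply(strsplit(x, ".", fixed = TRUE), `[`, 1)
print(first_piece)
```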
Please do read the help pages before posting: they have the information
and relevant examples.
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595