I believe the backslash is not treated as an escape character inside a
character class by the extended RE library that R uses when perl=FALSE, so it
is taken as a literal. The first ] after it therefore closes the character
class, which leaves the last ] outside the class as the atom that the *
applies to.
gsub("^([[:alnum:]\\[\\]]*).*", "\\1", "a]]]rray[n] <- 10", perl=FALSE)
yields
"a]]]"
(Using F in place of FALSE is bad form.)
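
If that reading is right, one possible workaround is to drop the backslashes and list the ] first inside the class, since POSIX bracket expressions take a leading ] literally. The following is only a sketch under that assumption; the variable names are mine, not from the original post.

```r
# Assumption: in a POSIX bracket expression, a "]" listed first is a literal.
x <- "array[n] <- 10"

# Pattern from the thread: with perl = FALSE the class ends at the first "]".
original <- "^([[:alnum:]\\[\\]]*).*"

# Rewritten class: "]" first, then "[", then [:alnum:]; no backslashes needed.
portable <- "^([][[:alnum:]]*).*"

res_original_tre  <- gsub(original, "\\1", x, perl = FALSE)
res_portable_tre  <- gsub(portable, "\\1", x, perl = FALSE)
res_portable_pcre <- gsub(portable, "\\1", x, perl = TRUE)

print(c(res_original_tre, res_portable_tre, res_portable_pcre))
```

The first result should be the "a" reported in this thread, and I would expect both rewritten calls to return "array[n]", so the same pattern can be used with either engine.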
---------------------------------------------------------------------------
Jeff Newmiller
DCN:<jdnewmil at dcn.davis.ca.us>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.
On October 15, 2014 4:42:00 AM PDT, ALBERTO VIEIRA FERREIRA MONTEIRO <albmont at centroin.com.br> wrote:
>I just found a curious behaviour of regexp and I'd like to share with
>y'all.
>
>gsub("^([[:alnum:]\\[\\]]*).*", "\\1", "array[n] <- 10", perl=T)
># works as expected ("array[n]")
>
>gsub("^([[:alnum:]\\[\\]]*).*", "\\1", "array[n] <- 10", perl=F)
># doesn't work ("a")
>
>I didn't find anything in the documentation explaining what's going on,
>and why the second gsub doesn't work.
>
>Alberto Monteiro