Hi everybody,
I am developing some functions that use regular expressions
and grepl to check whether certain strings match a given pattern.
In the regular expressions I use some predefined character classes such as [:alnum:].
In the documentation for regular expressions (?regex) there is one passage that
is not entirely clear to me:
<< The only portable way to specify all ASCII letters is to list them all
as the character class
[ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz].
(The current implementation uses numerical order of the encoding.)
Certain named classes of characters are predefined. Their interpretation
depends on the locale (see locales); the interpretation below is that of
the POSIX locale.
[:alnum:]
Alphanumeric characters: [:alpha:] and [:digit:]. >>
Does this mean that I can use [:alnum:] safely to check for letters and
numbers?
Or is there a risk that the code won't work on a computer using a
different locale?
If so, can I tell grepl to use the POSIX locale when interpreting
alphanumeric characters?
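To make the question concrete, here is a minimal sketch of the two alternatives I am comparing. The input vector x and the name ascii_alnum are made up for illustration; the second pattern simply spells out the ASCII letters and digits, as the documentation suggests.

```r
# Hypothetical inputs, for illustration only
x <- c("abc123", "abc_123", "ABC", "")

# Option 1: named class; its interpretation depends on the locale
named_result <- grepl("^[[:alnum:]]+$", x)

# Option 2: explicit class listing every ASCII letter and digit
ascii_alnum <- paste0("^[", paste(c(LETTERS, letters, 0:9), collapse = ""), "]+$")
ascii_result <- grepl(ascii_alnum, x)

# A string with a non-ASCII letter: option 2 rejects it in any locale,
# while the result of option 1 is what I am unsure about
accented <- "caf\u00e9"
grepl(ascii_alnum, accented)
grepl("^[[:alnum:]]+$", accented)
```

For plain ASCII input I would expect both options to agree; my doubt is only about strings like the last one.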
Thanks a lot in advance for the help!
Best,
Luca