thr3ads.net - R help - [R] Unicode normalization? [Jun 2009]

Allan Engelhardt

2009-Jun-17 15:35 UTC

[R] Unicode normalization?

Does R support Unicode normalization?  For my application, I'd quite
like to test for canonical equivalence (e.g. "n\u0303" is equivalent to
"\u00F1", which is ñ) and ideally convert strings to NFD form.
("\u0303" is the "combining tilde" character.)  Is there a package for this?

The Unicode Normalization FAQ [1] states that "Programs should always 
compare canonical-equivalent Unicode strings as equal" so is it even a 
bug that "n\u0303" != "\u00F1" in my version of R?
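
To make the comparison concrete, here is a minimal sketch of the test described above. It assumes the stringi package is installed (an assumption; no package is named in this post) and uses its stri_trans_nfd() function for NFD conversion. The helper name canon_equal is hypothetical and only for illustration.

```r
# Assumption: the 'stringi' package is installed; it is not part of base R.
library(stringi)

decomposed  <- "n\u0303"  # "n" followed by U+0303 COMBINING TILDE
precomposed <- "\u00F1"   # U+00F1 LATIN SMALL LETTER N WITH TILDE

# Plain comparison of the two strings as written, without normalization.
decomposed == precomposed

# Convert both sides to NFD first, then compare.
stri_trans_nfd(decomposed) == stri_trans_nfd(precomposed)

# Hypothetical helper: treat two strings as equal when their NFD forms match.
canon_equal <- function(x, y) {
  stri_trans_nfd(x) == stri_trans_nfd(y)
}
canon_equal(decomposed, precomposed)

# Inspect the code points of the NFD form of the precomposed character.
utf8ToInt(stri_trans_nfd(precomposed))
```

Normalizing both operands before comparing is one way to get a canonical-equivalence test; it does not change how the base "==" operator itself behaves.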

Allan

[1] see http://www.unicode.org/unicode/faq/normalization.html
