2020 Apr 04
Possible Bug In Validation of UTF-8 Sequences
As per `?intToUtf8`, and in the comments to `valid_utf8`[1], R intends to prevent illegal UTF-8 such as UTF-8 encoded UTF-16 surrogate pairs. `R_nchar`, invoked via `base::nchar`, explicitly validates UTF-8 strings[2], but allows the surrogate:

    > Encoding('\ud800')
    [1] "UTF-8"
    > nchar('\ud800')  # should be an error
    [1] 1

The problem manifests on systems where `char` is signed. The logic used to...
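To illustrate why the signedness of `char` can matter for a check like this, here is a minimal sketch in C. It is not R's `valid_utf8`; the function names `is_surrogate_signed` and `is_surrogate_unsigned` are hypothetical, and the check is an assumed simplification: a UTF-8 encoded surrogate (U+D800..U+DFFF) starts with the byte 0xED followed by a byte in 0xA0..0xBF. `'\ud800'` is encoded as the bytes ED A0 80.

```c
#include <stddef.h>

/* Hypothetical illustration, NOT R's actual validation code.
   Both functions ask: do the bytes at s begin a UTF-8 encoded
   UTF-16 surrogate, i.e. 0xED followed by a byte >= 0xA0? */

/* Reads bytes through signed char, as plain `char` does on platforms
   where char is signed. 0xED becomes -19 and 0xA0 becomes -96 after
   promotion to int, so neither comparison can succeed. */
int is_surrogate_signed(const signed char *s)
{
    int c = s[0];   /* -19 for byte 0xED */
    int d = s[1];   /* -96 for byte 0xA0 */
    return c == 0xED && d >= 0xA0;
}

/* Reads bytes through unsigned char, so values stay in 0..255 and
   the comparisons behave as intended. */
int is_surrogate_unsigned(const unsigned char *s)
{
    int c = s[0];   /* 237 for byte 0xED */
    int d = s[1];   /* 160 for byte 0xA0 */
    return c == 0xED && d >= 0xA0;
}
```

Under these assumptions, the signed variant lets ED A0 80 through while the unsigned variant flags it, which is the general shape of failure one would expect when byte comparisons against constants above 0x7F are made through a signed `char`.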