2020 Jun 08
Potential issue with perl-based pattern matching with Unicode characters on Windows R 4.0 and above
...e noticed new behavior in `regexpr(..., perl = TRUE)` on Windows with R4.0 and above with Unicode characters. Here's a minimal example where I'd expect to see a start value of `5` (as R 3.6.2 and below gives), but R 4.0.0 (and R 4.0.1) now returns:

```
> regexpr("b", "foo\U0001F937bar", perl = TRUE)
#> [1] 6
#> attr(,"match.length")
#> [1] 1
```

Perhaps this change in behavior could be explained by R4.0's migration to PCRE2? Here is some relevant output from my R4.0 session:

```
> pcre_config()
#> UTF-8 Unicode properties JIT stack
#> ...
```
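
For readers wondering where a 5 versus a 6 could come from: U+1F937 lies outside the Basic Multilingual Plane, so it is one code point but two UTF-16 code units (and four UTF-8 bytes). The sketch below only does that arithmetic in Python; it is not R code, the helper name `offsets` is made up for illustration, and it does not show what R or PCRE2 actually do internally.

```python
# Illustration only: 1-based position of a substring under three
# different ways of counting what comes before it.
text = "foo\U0001F937bar"


def offsets(s: str, needle: str) -> dict:
    """Return the 1-based start of `needle` in `s`, counted three ways."""
    i = s.index(needle)  # 0-based index in code points
    prefix = s[:i]       # everything before the match
    return {
        "code_points": len(prefix) + 1,
        "utf16_units": len(prefix.encode("utf-16-le")) // 2 + 1,
        "utf8_bytes": len(prefix.encode("utf-8")) + 1,
    }


print(offsets(text, "b"))
# {'code_points': 5, 'utf16_units': 6, 'utf8_bytes': 8}
```

Counting by code points gives the expected `5`, and counting by UTF-16 code units gives `6`, which matches the value reported above. That is consistent with an offset being computed in UTF-16 units somewhere, but the report itself does not confirm this.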