2020 Jun 08
Potential issue with perl-based pattern matching with Unicode characters on Windows R 4.0 and above
...e noticed new behavior in `regexpr(..., perl = TRUE)` on Windows with
R 4.0 and above with Unicode characters. Here's a minimal example where I'd
expect a start value of `5` (as R 3.6.2 and below give), but R 4.0.0 and
R 4.0.1 now return `6`:
```
> regexpr("b", "foo\U0001F937bar", perl = TRUE)
#> [1] 6
#> attr(,"match.length")
#> [1] 1
```
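One way to narrow this down is to run the same match through each of the matching engines that `regexpr()` offers and compare the reported start positions. The following is only a sketch of such a check, not output from the original report; the variable names `x`, `n_chars`, `n_bytes`, and `pos` are illustrative.

```r
# The test string from the report: "foo", then U+1F937 (a code point outside
# the Basic Multilingual Plane), then "bar".
x <- "foo\U0001F937bar"

# Size of the string in characters and in bytes, for reference.
n_chars <- nchar(x, type = "chars")
n_bytes <- nchar(x, type = "bytes")

# Start position of "b" as reported by each engine:
# PCRE (perl = TRUE), the default engine, and fixed matching.
pos <- c(
  perl  = as.integer(regexpr("b", x, perl = TRUE)),
  tre   = as.integer(regexpr("b", x)),
  fixed = as.integer(regexpr("b", x, fixed = TRUE))
)
print(pos)
```

Per the report, the `perl` entry is the one that comes back as `6` on the affected Windows builds. If the other two entries differ from it there, that would point at the PCRE code path rather than at the string itself.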
Perhaps this change in behavior could be explained by R 4.0's migration to
PCRE2? Here is some relevant output from my R 4.0 session:
```
> pcre_config()
#> UTF-8 Unicode properties JIT stack
#>...
```