Hi,

I am failing make check in r72721 at the end of reg-tests-1d.R. The
relevant block of code is

    ## path.expand shouldn't translate to local encoding PR#17120
    filename <- "\U9b3c.R"
    print(Encoding(filename))
    x1 <- path.expand(paste0("~/", filename))
    print(Encoding(x1))
    x2 <- paste0(path.expand("~/"), filename)
    print(Encoding(x2))
    stopifnot(identical(
        path.expand(paste0("~/", filename)),
        paste0(path.expand("~/"), filename)))
    ## Chinese character was changed to hex code

Encoding(x1) is "unknown" while Encoding(x2) is "UTF-8". If I run
this code with R --vanilla, both are UTF-8 and the assertion
passes. What is make check doing differently? Or is there something
wrong with my setting/environment? Thanks,

h.
--
+---
| Hiroyuki Kawakatsu
| Business School, Dublin City University
| Dublin 9, Ireland. Tel +353 (0)1 700 7496

On 24/05/2017 5:47 AM, Hiroyuki Kawakatsu wrote:
> Hi,
>
> I am failing make check in r72721 at the end of reg-tests-1d.R. The
> relevant block of code is
>
> ## path.expand shouldn't translate to local encoding PR#17120
> filename <- "\U9b3c.R"
> print(Encoding(filename))
> x1 <- path.expand(paste0("~/", filename))
> print(Encoding(x1))
> x2 <- paste0(path.expand("~/"), filename)
> print(Encoding(x2))
> stopifnot(identical(
>     path.expand(paste0("~/", filename)),
>     paste0(path.expand("~/"), filename)))
> ## Chinese character was changed to hex code
>
> Encoding(x1) is "unknown" while Encoding(x2) is "UTF-8". If I run
> this code with R --vanilla, both are UTF-8 and the assertion
> passes. What is make check doing differently? Or is there something
> wrong with my setting/environment? Thanks,
>

I think the test is wrong because in the first case you are working in a
locale where that character is representable. In my locale it is not, so x1
is converted to UTF-8, and everything compares equal.

An explicit conversion of x1 to UTF-8 should fix this, i.e. replace

    x1 <- path.expand(paste0("~/", filename))

with

    x1 <- enc2utf8(path.expand(paste0("~/", filename)))

Could you try this and see if it helps?

Duncan Murdoch

On 2017-05-24, Duncan Murdoch wrote:
>
> I think the test is wrong because in the first case you are working in a
> locale where that character is representable. In my locale it is not, so x1
> is converted to UTF-8, and everything compares equal.
>
> An explicit conversion of x1 to UTF-8 should fix this, i.e. replace
>
> x1 <- path.expand(paste0("~/", filename))
>
> with
>
> x1 <- enc2utf8(path.expand(paste0("~/", filename)))
>
> Could you try this and see if it helps?

Nope:

> ## path.expand shouldn't translate to local encoding PR#17120
> filename <- "\U9b3c.R"
>
> x11 <- path.expand(paste0("~/", filename))
> print(Encoding(x11))
[1] "unknown"
> x12 <- enc2utf8( path.expand(paste0("~/", filename)) )
> print(Encoding(x12))
[1] "unknown"
> x2 <- paste0(path.expand("~/"), filename)
> print(Encoding(x2))
[1] "UTF-8"
>
> #stopifnot(identical(path.expand(paste0("~/", filename)),
> stopifnot(identical(enc2utf8( path.expand(paste0("~/", filename)) ),
+                     paste0(path.expand("~/"), filename)))
Error: identical(enc2utf8(path.expand(paste0("~/", filename))), paste0(path.expand("~/"),  .... is not TRUE
Execution halted

I forgot to report:

> sessionInfo()
R Under development (unstable) (2017-05-23 r72721)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 9 (stretch)

Matrix products: default
BLAS: /usr/local/share/R-devel/lib/libRblas.so
LAPACK: /usr/local/share/R-devel/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_3.5.0

h.
--
+---
| Hiroyuki Kawakatsu
| Business School, Dublin City University
| Dublin 9, Ireland. Tel +353 (0)1 700 7496
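
Since the encoding marks alone do not show why identical() fails, one way to narrow this down might be to compare the two paths byte by byte and record the locale the code actually runs under. The following is only a diagnostic sketch, not part of reg-tests-1d.R; describe() is a hypothetical helper introduced here for illustration, and the sketch makes no claim about what make check sets up.

```r
# Diagnostic sketch (not from the thread): inspect both paths at the byte level.
filename <- "\U9b3c.R"                      # same test string as in the thread

x1 <- path.expand(paste0("~/", filename))  # paste first, then expand
x2 <- paste0(path.expand("~/"), filename)  # expand first, then paste

# describe() is a hypothetical helper: it returns the declared encoding
# mark together with the underlying bytes of a single string.
describe <- function(x) {
  list(encoding = Encoding(x),
       bytes    = charToRaw(x))
}

# Record the locale the code actually runs under.
cat("LC_CTYPE:", Sys.getlocale("LC_CTYPE"), "\n")
cat("UTF-8 locale:", l10n_info()[["UTF-8"]], "\n")

str(describe(x1))
str(describe(x2))

# If the bytes differ, path.expand() changed the string itself;
# if they agree, only the encoding marks differ.
cat("same bytes:", identical(charToRaw(x1), charToRaw(x2)), "\n")
```

Running this once with R --vanilla and once from within the check run (for example by temporarily adding it next to the failing block) should show whether the two runs differ in locale, in the bytes of x1, or only in the encoding mark.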