On RedHat Enterprise Linux 6, the test below fails (this is using the stock GCC 4.4.7) from R-devel r72707. LC_CTYPE is unset when I run it, but LANG=en_US.UTF-8.

It also failed "yesterday", when, as far as I recall, the test code looked a bit different.

Best,
Kasper

> ## Results differed by platform, but some gave incorrect results on string 10.
>
>
> ## str() on large strings (in multibyte locales; changing locale may not work everywhere)
> oloc <- Sys.getlocale("LC_CTYPE")
> mbyte.lc <- {
+     if(.Platform$OS.type == "windows")
+         "English_United States.28605"
+     else if(grepl("[.]UTF-8$", oloc, ignore.case=TRUE)) # typically nowadays
+         oloc
+     else
+         "C.UTF-8" # or rather "en_US.UTF-8" (? from system("locale -a| fgrep .UTF-8") )
+ }
> stopifnot(identical(Sys.setlocale("LC_CTYPE", mbyte.lc), mbyte.lc))
Error: identical(Sys.setlocale("LC_CTYPE", mbyte.lc), mbyte.lc) is not TRUE
In addition: Warning message:
In Sys.setlocale("LC_CTYPE", mbyte.lc) :
  OS reports request to set locale to "C.UTF-8" cannot be honored
Execution halted
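For readers following the thread: the error comes from how Sys.setlocale() reports failure. When the OS cannot honor a request, it warns and returns an empty string rather than the requested name, so the identical() comparison is FALSE and stopifnot() halts. A minimal sketch of that behavior follows; the locale name "xx_XX.no-such-charset" is made up for illustration and is only assumed not to exist (some C libraries accept arbitrary names, in which case no failure is seen).

```r
# "xx_XX.no-such-charset" is a made-up name, assumed not to exist on the system.
bogus <- "xx_XX.no-such-charset"
old <- Sys.getlocale("LC_CTYPE")

# Capture the warning text instead of printing it.
warn_msg <- NULL
res <- withCallingHandlers(
  Sys.setlocale("LC_CTYPE", bogus),
  warning = function(w) {
    warn_msg <<- conditionMessage(w)
    invokeRestart("muffleWarning")
  }
)

res                     # "" when the OS refuses the request
identical(res, bogus)   # FALSE in that case, so stopifnot() would halt
warn_msg                # the "cannot be honored" warning, if one was raised

# Restore the original setting in case the platform accepted the name.
invisible(Sys.setlocale("LC_CTYPE", old))
```

Running `locale -a` in a shell lists the names the system does accept, which is what the comment in the test code hints at.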
I rebuilt R with
  export LC_CTYPE=en_US.UTF-8
and the test still fails. Surprisingly, when I run R from the bin directory and execute the test code, it runs without error:

> oloc <- Sys.getlocale("LC_CTYPE")
> mbyte.lc <- {
+     if(.Platform$OS.type == "windows")
+         "English_United States.28605"
+     else if(grepl("[.]UTF-8$", oloc, ignore.case=TRUE)) # typically nowadays
+         oloc
+     else
+         "C.UTF-8" # or rather "en_US.UTF-8" (? from system("locale -a| fgrep .UTF-8") )
+ }
> stopifnot(identical(Sys.setlocale("LC_CTYPE", mbyte.lc), mbyte.lc))
> oloc
[1] "en_US.UTF-8"
> mbyte.lc
[1] "en_US.UTF-8"
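One possible explanation for the difference between the interactive run and the test run, offered as an assumption rather than something the thread confirms: if the test harness starts R with LC_CTYPE set to "C" (for instance through LC_ALL=C), the grepl() branch no longer matches and the code falls through to "C.UTF-8". The sketch below wraps the selection logic from the test in a hypothetical helper, pick_mbyte_locale(), so both cases can be compared without touching the session locale.

```r
# Same selection logic as in the test, wrapped in a hypothetical helper
# so that the starting locale can be passed in as an argument.
pick_mbyte_locale <- function(oloc = Sys.getlocale("LC_CTYPE")) {
  if (.Platform$OS.type == "windows")
    "English_United States.28605"
  else if (grepl("[.]UTF-8$", oloc, ignore.case = TRUE))
    oloc          # already a UTF-8 locale: keep it
  else
    "C.UTF-8"     # fallback used by the test code at the time
}

pick_mbyte_locale("en_US.UTF-8")  # the interactive session from the report
pick_mbyte_locale("C")            # assumed harness setting (not confirmed)
```

On a non-Windows system the first call keeps the existing locale, while the second requests "C.UTF-8", which RHEL 6 evidently does not provide.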
>>>>> Kasper Daniel Hansen <kasperdanielhansen at gmail.com>
>>>>>     on Fri, 19 May 2017 20:09:24 -0400 writes:

> I rebuilt R with
>   export LC_CTYPE=en_US.UTF-8
> and the test still fails. Surprisingly, when I run R from the bin directory
> and execute the test code, it runs without error:

>> oloc <- Sys.getlocale("LC_CTYPE")
>> mbyte.lc <- {
> +     if(.Platform$OS.type == "windows")
> +         "English_United States.28605"
> +     else if(grepl("[.]UTF-8$", oloc, ignore.case=TRUE)) # typically nowadays
> +         oloc
> +     else
> +         "C.UTF-8" # or rather "en_US.UTF-8" (? from system("locale -a| fgrep .UTF-8") )
> + }
>> stopifnot(identical(Sys.setlocale("LC_CTYPE", mbyte.lc), mbyte.lc))
>> oloc
> [1] "en_US.UTF-8"
>> mbyte.lc
> [1] "en_US.UTF-8"

I had been making these changes in R-devel after offline discussions with Linux users for whom the original check (using "en_UK.UTF-8") failed.

What I read below suggests that "C.UTF-8" is not okay as a fallback either. It seems we should use "en_US.UTF-8" as the fallback instead (though I assume that won't work in North Korea).

I've committed a version that does that _and_ no longer stops when that identical() call does not return TRUE.

Martin

> On Fri, May 19, 2017 at 7:29 PM, Kasper Daniel Hansen <kasperdanielhansen at gmail.com> wrote:

>> On RedHat Enterprise Linux 6, the test below fails (this is using the
>> stock GCC 4.4.7) from R-devel r72707. LC_CTYPE is unset when I run it, but
>> LANG=en_US.UTF-8
>>
>> It also failed "yesterday" where as far as I recall the test code looked a
>> bit different.
>>
>> Best,
>> Kasper
>>
>> > ## Results differed by platform, but some gave incorrect results on string 10.
>> >
>> >
>> > ## str() on large strings (in multibyte locales; changing locale may not work everywhere)
>> > oloc <- Sys.getlocale("LC_CTYPE")
>> > mbyte.lc <- {
>> +     if(.Platform$OS.type == "windows")
>> +         "English_United States.28605"
>> +     else if(grepl("[.]UTF-8$", oloc, ignore.case=TRUE)) # typically nowadays
>> +         oloc
>> +     else
>> +         "C.UTF-8" # or rather "en_US.UTF-8" (? from system("locale -a| fgrep .UTF-8") )
>> + }
>> > stopifnot(identical(Sys.setlocale("LC_CTYPE", mbyte.lc), mbyte.lc))
>> Error: identical(Sys.setlocale("LC_CTYPE", mbyte.lc), mbyte.lc) is not TRUE
>> In addition: Warning message:
>> In Sys.setlocale("LC_CTYPE", mbyte.lc) :
>>   OS reports request to set locale to "C.UTF-8" cannot be honored
>> Execution halted
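The change Martin describes (fall back to "en_US.UTF-8" and do not halt when the locale cannot be set) might look roughly like the sketch below. This is not the committed code, which the thread does not show; the variable names got and locale_ok and the skip message are illustrative assumptions.

```r
oloc <- Sys.getlocale("LC_CTYPE")
mbyte.lc <- if (.Platform$OS.type == "windows") {
  "English_United States.28605"
} else if (grepl("[.]UTF-8$", oloc, ignore.case = TRUE)) {
  oloc
} else {
  "en_US.UTF-8"   # fallback proposed in the thread, replacing "C.UTF-8"
}

# Sys.setlocale() returns "" (with a warning) when the request is refused;
# record the outcome instead of calling stopifnot() on it.
got <- suppressWarnings(Sys.setlocale("LC_CTYPE", mbyte.lc))
locale_ok <- identical(got, mbyte.lc)

if (locale_ok) {
  # the locale-dependent str() checks would run here
  message("LC_CTYPE set to ", dQuote(mbyte.lc))
} else {
  message("skipping multibyte checks: cannot set LC_CTYPE to ", dQuote(mbyte.lc))
}

# Put the original locale back so later tests are unaffected.
invisible(Sys.setlocale("LC_CTYPE", oloc))
```

The point of the design is that an unavailable locale is a property of the machine, not a defect in R, so the check degrades to a skip rather than aborting the whole test script.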