Skye Bender-deMoll
2014-Apr-14 21:36 UTC
[Rd] best way to write tests when sort() evaluates differently in R CMD check due to LC_COLLATE locale setting?
Dear R devel,

What is the correct way to write package tests that could possibly fail due to locale collation behavior? Is it safe/proper for me to call Sys.setlocale("LC_COLLATE", "en_US.UTF-8") in each test file? Or should I explicitly force collation to C before writing tests? Or do I need to always call sort() on my comparison objects to ensure they are sorted in the same locale-specific way?

I'd had a strange situation where a package test I'm writing fails R CMD check, but runs fine in the R terminal. I eventually got to the point where I can see that in R CMD check, the vector I'm comparing to evaluate the test result did not seem to be sorted as requested. Further digging revealed that the locale's LC_COLLATE value is set to 'C' in R CMD check while it is "en_US.UTF-8" in my R terminal.

Now that I know what to look for in the documentation, I realize that this is a feature. p.36 of "Writing R Extensions" states:

"All these tests are run with collation set to the C locale, and for the examples and tests with environment variable LANGUAGE=en: this is to minimize differences between platforms."

It appears that this impacts the sort order of capital letters:

> Sys.setlocale("LC_COLLATE", "C")
[1] "C"
> sort(c("a",'A','b','c'))
[1] "A" "a" "b" "c"
> Sys.setlocale("LC_COLLATE", "en_US.UTF-8")
[1] "en_US.UTF-8"
> sort(c("a",'A','b','c'))
[1] "a" "A" "b" "c"

best,
-skye
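
For reference, the three options raised in the question could be sketched roughly as below. This is only an illustration, not an endorsed answer from the list: the helper name with_c_collation is hypothetical, and method = "radix" assumes R >= 3.3.0, where radix sorting of character vectors uses C-locale collation. The sketch pins collation to "C" rather than "en_US.UTF-8" because a named locale may not be installed on every check machine, in which case Sys.setlocale() returns "" with a warning.

```r
x <- c("a", "A", "b", "c")

# Option 1: ask for a sort that does not depend on the session locale.
# Assumption: R >= 3.3.0, where method = "radix" collates character
# vectors as in the C locale.
sorted_radix <- sort(x, method = "radix")

# Option 2: a hypothetical helper that evaluates an expression under
# C collation and restores the previous setting afterwards.
with_c_collation <- function(expr) {
  old <- Sys.getlocale("LC_COLLATE")
  on.exit(Sys.setlocale("LC_COLLATE", old), add = TRUE)
  Sys.setlocale("LC_COLLATE", "C")
  force(expr)  # expr is evaluated lazily, i.e. after the locale switch
}
sorted_c <- with_c_collation(sort(x))

# Option 3: make the comparison itself order-insensitive, or sort both
# sides with the same call so they share whatever collation is active.
result   <- c("b", "A", "c", "a")
expected <- c("a", "A", "b", "c")
same_elements    <- setequal(result, expected)
same_when_sorted <- identical(sort(result), sort(expected))
```

Options 1 and 2 should give "A" "a" "b" "c" both interactively and under R CMD check. Option 3 avoids hard-coding any expected order, at the cost of no longer testing the order itself.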