jeffreys at rand.org
2009-Dec-21 19:40 UTC
[Rd] sort yields different results on OS X (PR#14163)
Full_Name: Jeffrey Sullivan
Version: 2.10
OS: Mac
Submission from: (NULL) (130.154.0.250)

Sort produces different results when sorting strings with non-alphanumeric characters, depending on the operating system:

RHEL 5.2, R 2.10.0
-------------
> v <- c("1","<0",">3","2")
> Sys.setlocale("LC_COLLATE","en_US.UTF-8")
[1] "en_US.UTF-8"
> sort(v)
[1] "<0" "1" "2" ">3"

Mac OS 10.5.8, R 2.10.1
-------------------
> v <- c("1","<0",">3","2")
> Sys.setlocale("LC_COLLATE","en_US.UTF-8")
[1] "en_US.UTF-8"
> sort(v)
[1] "<0" ">3" "1" "2"
Prof Brian Ripley
2009-Dec-22 12:18 UTC
[Rd] sort yields different results on OS X (PR#14163)
As the help says

  The sort order for character vectors will depend on the collating sequence of the locale in use: see 'Comparison'.

and that ref says

  Collation of non-letters (spaces, punctuation signs, hyphens, fractions and so on) is even more problematic.

That different OSes use the same name for a locale does not make them the same locale.

Note that R can be compiled to use ICU, which provides a well-considered collation suite. R on Mac OS X uses ICU, as does a Linux build if it is available -- so I would say that it is RHEL that is out of line here (it makes little sense to have < and > far apart in the collation sequence).

Why did you report a documented difference as a bug?

On Mon, 21 Dec 2009, jeffreys at rand.org wrote:

> Full_Name: Jeffrey Sullivan
> Version: 2.10
> OS: Mac
> Submission from: (NULL) (130.154.0.250)
>
> Sort produces different results when sorting strings with non-alphanumeric
> characters, depending on the operating system:
>
> RHEL 5.2, R 2.10.0
> -------------
>> v <- c("1","<0",">3","2")
>> Sys.setlocale("LC_COLLATE","en_US.UTF-8")
> [1] "en_US.UTF-8"
>> sort(v)
> [1] "<0" "1" "2" ">3"
>
> Mac OS 10.5.8, R 2.10.1
> -------------------
>> v <- c("1","<0",">3","2")
>> Sys.setlocale("LC_COLLATE","en_US.UTF-8")
> [1] "en_US.UTF-8"
>> sort(v)
> [1] "<0" ">3" "1" "2"

--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
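
For readers who need the same order on every platform, the following is a minimal sketch, not part of the original exchange. It assumes a current R version: the `method = "radix"` argument to `sort()` was added well after the R 2.10 discussed in this thread, and `capabilities("ICU")` is likewise a later addition. Both options below sidestep the named locale entirely and compare strings as in the C locale, so the result should not depend on how a given OS defines `en_US.UTF-8`.

```r
# Sketch: getting a sort order that does not depend on the OS locale.
v <- c("1", "<0", ">3", "2")

# Option 1: radix sort compares character vectors as in the C locale,
# whatever LC_COLLATE is currently set to (assumes a recent R version).
by_radix <- sort(v, method = "radix")

# Option 2: switch LC_COLLATE to "C" for the duration of the sort,
# then restore the previous setting.
old_collate <- Sys.getlocale("LC_COLLATE")
Sys.setlocale("LC_COLLATE", "C")
by_c_locale <- sort(v)
Sys.setlocale("LC_COLLATE", old_collate)

print(by_radix)
print(by_c_locale)

# Reports whether this R build was compiled with ICU support,
# which is the build difference the reply above points to.
print(capabilities("ICU"))
```

In the C locale the digits sort before `<` and `>`, so this order matches neither of the two outputs in the report; the point is only that it is reproducible across systems.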