Hello:

sort(c('A', 'b', 'C')) seems to produce different answers in R interactive than in "R CMD check", at least under both Fedora 13 and Windows 7 with Windows 7 sessionInfo() copied below:

In interactive, the result is c('A', 'b', 'C'); with R CMD check, it is c('A', 'C', 'b'). This produced the infelicity of a bug in "R CMD check" that I could not replicate with interactive R because a *.Rd file contained the equivalent example of stopifnot(all.equal(sort(c('A', 'b', 'C')), c('A', 'b', 'C'))): It worked just fine interactively but failed R CMD check.

Once I understood this problem, it was easy to fix. However, it was not easy to find, especially since I got the same problem under Fedora 13 Linux and Windows 7.

This seems to be a sufficiently obscure anomaly that I thought someone might like to see it reported here.

Best Wishes,
Spencer Graves

> sessionInfo()
R version 2.12.2 (2011-02-25)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] SIM_1.4-6      fda_2.2.6      zoo_1.6-5      RCurl_1.5-0.1  bitops_1.0-4.1
[6] R2HTML_2.2     oce_0.3-1

loaded via a namespace (and not attached):
[1] grid_2.12.2     lattice_0.19-30 tools_2.12.2
Collation is locale-specific. To prevent problems such as the one you encountered, 'R CMD check' uses LC_COLLATE=C where it matters (and documents this). Otherwise package checks would be system-dependent, and maybe even user-dependent.

It really isn't a good idea to misuse examples in help files for regression tests, not least when the result is not system-independent.

I'm afraid the only obscurity here is why you didn't understand basic facts about collation (facts which are linked to from the help page of sort()). E.g.

    The collating sequence of locales such as 'en_US' is normally
    different from 'C' (which should use ASCII) and can be surprising.
    Beware of making _any_ assumptions about the collation order

Please do also take note of what the posting guide asked you to do about obsolete versions of R.

On Sat, 9 Jul 2011, Spencer Graves wrote:

> [snip]

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
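The discrepancy reported above can be reproduced in an interactive session by switching the collation locale by hand, which is roughly what 'R CMD check' does. The following is a minimal sketch using only base R; the variable names are illustrative, and the result under the session's own locale is deliberately not shown because it depends on the system.

```r
x <- c("A", "b", "C")

# Remember the collation locale the session started with.
old_collate <- Sys.getlocale("LC_COLLATE")

# Result under the session locale (system-dependent, so no expected value here).
sorted_session <- sort(x)

# Switch to the C locale, as 'R CMD check' does, and sort again.
invisible(Sys.setlocale("LC_COLLATE", "C"))
sorted_c <- sort(x)

# Restore the original collation locale.
invisible(Sys.setlocale("LC_COLLATE", old_collate))

print(sorted_c)  # "A" "C" "b": upper case precedes lower case in ASCII order
```

Comparing sorted_session with sorted_c shows whether the session locale collates differently from C.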
On Jul 10, 2011, at 05:44, Spencer Graves wrote:

> Hello:
>
> sort(c('A', 'b', 'C')) seems to produce different answers in R interactive than in "R CMD check", at least under both Fedora 13 and Windows 7 with Windows 7 sessionInfo() copied below:
>
> [snip]

Well, the problem is here:

[snip]
> locale:
> [1] LC_COLLATE=English_United States.1252
=========================================
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252

All checks in R (unless we overlooked some) run with LC_COLLATE=C, because otherwise they give different results in different locales. One notorious example is that people expect that a file or an object called "zzz" comes out last in a sort, but Estonian sorts "z" between "s" and "t"...

Notice that your .Rd example would, for the same reason, break for people with different locale settings.

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
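One way to keep such an example from breaking under other locale settings is to pin the collation order for the duration of the sort. The sketch below assumes a hypothetical helper, sort_c(), which is not part of base R and was not proposed in this thread; it sets LC_COLLATE to C temporarily and restores the previous value on exit. In R 3.3.0 and later (much newer than the 2.12.2 discussed here), sort(x, method = "radix") is documented to collate in the C locale as well, so the second check is guarded by a version test.

```r
# Hypothetical helper: sort a character vector using C-locale collation,
# restoring the caller's LC_COLLATE setting afterwards.
sort_c <- function(x) {
  old_collate <- Sys.getlocale("LC_COLLATE")
  on.exit(Sys.setlocale("LC_COLLATE", old_collate), add = TRUE)
  Sys.setlocale("LC_COLLATE", "C")
  sort(x)
}

x <- c("A", "b", "C")

# This check does not depend on the locale of the machine running it.
stopifnot(identical(sort_c(x), c("A", "C", "b")))

# Alternative for newer R versions only: radix sort collates in the C locale.
if (getRversion() >= "3.3.0") {
  stopifnot(identical(sort(x, method = "radix"), c("A", "C", "b")))
}
```

Either form gives the same answer interactively and under 'R CMD check', which is the property the original stopifnot() example lacked.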