Don MacQueen
2010-Jan-16 00:04 UTC
[Rd] order() fails on a chr object of class "AsIs" with "\265" in it
Here's an example (session info at the end).

> tmpv <- c('\265g/L','Bq/L')
> order(tmpv)
[1] 2 1
> tmpv <- I(tmpv)
> order(tmpv)
Error in if (xi > xj) 1L else -1L : missing value where TRUE/FALSE needed
> foov <- gsub('\265','',tmpv)
> order(foov)
[1] 2 1
> str(tmpv)
Class 'AsIs'  chr [1:2] "\265g/L" "Bq/L"
> str(foov)
Class 'AsIs'  chr [1:2] "g/L" "Bq/L"

I can easily work around this in my scripts, but shouldn't order() succeed with such an object?

(I suppose this could be Mac-specific, but I'm assuming it's not...)

For context:
The character "\265" causes the Greek letter mu to be displayed in various output devices. For example, the character vector eventually gets written to an html file, which when displayed in Firefox (Mac) is displayed as Greek mu. Also in Excel 2004 (Mac).

I first wrote these scripts 6 years ago, when "\265" was a way I could find to display the Greek mu in output text files of various kinds. They worked as recently as 3 months ago. Maybe there's a better way now to display a mu in text-based contexts?

> sessionInfo()
R version 2.10.1 (2009-12-14)
i386-apple-darwin8.11.1

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

Thanks
-Don
--
--------------------------------------
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062
Prof Brian Ripley
2010-Jan-16 07:17 UTC
[Rd] order() fails on a chr object of class "AsIs" with "\265" in it
On Fri, 15 Jan 2010, Don MacQueen wrote:

> Here's an example (session info at the end).
>
>> tmpv <- c('\265g/L','Bq/L')
>> order(tmpv)
> [1] 2 1
>> tmpv <- I(tmpv)
>> order(tmpv)
> Error in if (xi > xj) 1L else -1L : missing value where TRUE/FALSE needed
>> foov <- gsub('\265','',tmpv)
>> order(foov)
> [1] 2 1
>> str(tmpv)
> Class 'AsIs'  chr [1:2] "\265g/L" "Bq/L"
>> str(foov)
> Class 'AsIs'  chr [1:2] "g/L" "Bq/L"
>
> I can easily work around this in my scripts, but shouldn't order() succeed
> with such an object?

Not in the C locale. There is no pre-defined ordering for non-ASCII characters in that locale and the string is invalid in a strict C locale.

> (I suppose this could be Mac-specific, but I'm assuming it's not...)

No, but the handling of invalid strings in C is OS-specific.

> For context:
> The character "\265" causes the Greek letter mu to be displayed in various
> output devices. For example, the character vector eventually gets written to
> an html file, which when displayed in Firefox (Mac) is displayed as Greek mu.
> Also in Excel 2004 (Mac).
>
> I first wrote these scripts 6 years ago, when "\265" was a way I could find
> to display the Greek mu in output text files of various kinds. They worked as
> recently as 3 months ago. Maybe there's a better way now to display a mu in
> text-based contexts?

Use UTF-8 and Unicode \u03BC (http://www.alanwood.net/unicode/greek.html).

The issue is that you need an xtfrm method for 'AsIs': it falls back to comparisons via .gt and those (correctly) fail.

xtfrm.AsIs <- function(x) xtfrm(unclass(x))

would keep you going until you fix the scripts.

>> sessionInfo()
> R version 2.10.1 (2009-12-14)
> i386-apple-darwin8.11.1
>
> locale:
> [1] C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base

--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
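
The two suggestions in the reply above can be tried together in a short session. This is a minimal sketch, not a tested fix for the original scripts: the one-line xtfrm.AsIs definition is quoted from the reply, while the variable names (plain, wrapped, unit_labels) are made up for illustration. More recent versions of R may already provide an xtfrm method for 'AsIs', in which case the definition is unnecessary but harmless.

```r
# Workaround quoted from the reply: give 'AsIs' objects an xtfrm method
# that delegates to the method for the underlying (unclassed) vector.
xtfrm.AsIs <- function(x) xtfrm(unclass(x))

# ASCII-only check, so the result does not depend on the locale.
# 'plain' and 'wrapped' are illustrative names, not from the thread.
plain <- c("b", "a", "c")
wrapped <- I(plain)
order(plain)
order(wrapped)   # should agree with order(plain)

# Suggested replacement for "\265": the Unicode escape for Greek mu.
# Where a non-ASCII string sorts is locale-dependent, so no expected
# ordering is shown for this call.
unit_labels <- I(c("\u03BCg/L", "Bq/L"))
order(unit_labels)
```

Delegating to unclass(x) means the 'AsIs' wrapper no longer affects ordering, so order() behaves exactly as it would on the bare character vector.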