Dear all,

I don't know if you consider this a bug or a feature, but it breaks reasonable code: 'unlist' and 'sapply' convert 'ordered' to 'factor' even if all levels are equal. Here is a simple example:

o <- ordered(letters)
o[[1]]
lapply(o, min)[[1]]          # ordered factor
unlist(lapply(o, min))[[1]]  # no longer ordered
sapply(o, min)[[1]]          # no longer ordered

Jens Oehlschlägel

P.S.: The above examples are silly, for simple reproduction. The current behavior broke my use case, which had a structure like this:

# have some data
x <- 1:20
# apply some function to each element
somefunc <- function(x){
  # do something and return an ordinal level
  sample(o, 1)
}
x <- sapply(x, somefunc)
# get minimum result
min(x)
# Error in Summary.factor(c(2L, 26L), na.rm = FALSE) :
#   ‘min’ not meaningful for factors

> version
               _
platform       x86_64-pc-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status
major          3
minor          4.0
year           2017
month          04
day            21
svn rev        72570
language       R
version.string R version 3.4.0 (2017-04-21)
nickname       You Stupid Darkness
Hi,

It's been my experience that when you combine or aggregate vectors of factors using a function, you should be prepared for surprises, as it's not obvious what the "right" way to combine factors is (ordered or not), especially if two vectors of factors have different levels or (if ordered) are ordered in a different way.

For instance, what would you expect to get from unlist() if each element of the list had different levels, or if both were ordered but in a different way, or if some elements of the list were factors and others were ordered factors?

> unlist(list(ordered(c("a","b")), ordered(c("b","a"))))

Honestly, my biggest surprise from your question was that unlist() even returned a factor at all. For example, the c() function just converts factors to integers.

> c(ordered(c("a","b")), ordered(c("a","b")))
[1] 1 2 1 2

And here's one that's especially weird. When you rbind() data frames with an ordered factor, you still get an ordered factor back, but the order may be different from either of the original orders:

> x1 <- data.frame(a=ordered(c("b","c")))
> x2 <- data.frame(a=ordered(c("a","b","c")))
> str(rbind(x1,x2)) # Note b < a
'data.frame': 5 obs. of 1 variable:
 $ a: Ord.factor w/ 3 levels "b"<"c"<"a": 1 2 3 1 2

Should rbind() just have returned an integer like c(), or returned a factor like unlist(), or should it have kept the result as an ordered factor but ordered the levels in a different way? I have no idea.

So in short, IMO, there are definitely inconsistencies in how ordered factors and factors are handled across functions, but I think it would be hard to point to any single function and say it is wrong or needs to be changed.

My best advice is to just be careful when combining or aggregating factors.
--Robert

-----Original Message-----
From: R-devel [mailto:r-devel-bounces at r-project.org] On Behalf Of "Jens Oehlschlägel"
Sent: Friday, June 16, 2017 9:04 AM
To: r-devel at r-project.org
Cc: jens.oehlschlaegel at truecluster.com
Subject: [Rd] 'ordered' destroyed to 'factor'

______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
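The combining behaviours discussed in this message can be probed on a given installation with a short script. This is only a sketch: the variable names (`f1`, `f2`, `probe`) are illustrative, and no expected output is shown because the results may differ between R versions.

```r
# Two ordered factors whose values appear in a different sequence
f1 <- ordered(c("a", "b"))
f2 <- ordered(c("b", "a"))

# Two data frames whose ordered columns have different level sets
x1 <- data.frame(a = ordered(c("b", "c")))
x2 <- data.frame(a = ordered(c("a", "b", "c")))

# Record the class each combining function returns on this R version
probe <- list(
  unlist = class(unlist(list(f1, f2))),
  c      = class(c(f1, f2)),
  rbind  = class(rbind(x1, x2)$a)
)
str(probe)

# Level order chosen by rbind() for the combined column
levels(rbind(x1, x2)$a)
```

Running this before relying on any of the three functions shows directly whether the ordering survives in the R version at hand.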
This can be traced back to the following line in unlist():

structure(res, levels = lv, names = nm, class = "factor")

The Details section of ?unlist states specifically how it treats factors, so this is documented and expected behaviour. It is also the appropriate behaviour. In your case one could argue that unlist() should maintain the order, as there's only a single factor. However, the moment you have two ordered factors, there's no guarantee that the levels are the same, or even in the same order, so it is impossible to determine what the correct order should be. For this reason, the only logical object to return for a list of factors is an unordered factor.

In your use case (a list of factors with identical ordered levels) the solution is one extra step:

x <- list(
  factor(c("a","b"), levels = c("a","b","c"), ordered = TRUE),
  factor(c("b","c"), levels = c("a","b","c"), ordered = TRUE)
)

res <- sapply(x, min)
res <- ordered(res, levels = levels(res))
min(res)

I hope this explains it.

Cheers
Joris

-- 
Joris Meys
Statistical consultant
Ghent University
Faculty of Bioscience Engineering
Department of Mathematical Modelling, Statistics and Bio-Informatics
tel : +32 (0)9 264 61 79
Joris.Meys at Ugent.be
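The "one extra step" for lists of ordered factors with identical levels could be wrapped in a small helper. The following is a minimal sketch under that assumption; `unlist_ordered` is a hypothetical name, not a base R function, and it deliberately refuses to guess when the levels differ.

```r
# Hypothetical helper (not part of base R): combine a list of ordered
# factors, keeping the ordering only if every element has identical levels.
unlist_ordered <- function(lst) {
  stopifnot(length(lst) > 0L,
            all(vapply(lst, is.ordered, logical(1))))
  lv <- levels(lst[[1]])
  same <- all(vapply(lst, function(f) identical(levels(f), lv), logical(1)))
  if (!same) {
    stop("levels differ; no unique ordering can be inferred")
  }
  # Rebuild from the character values so the result does not depend on
  # how integer codes happen to be combined.
  ordered(unlist(lapply(lst, as.character), use.names = FALSE), levels = lv)
}

# Usage with the structure from the original report
o <- ordered(letters)
x <- unlist_ordered(lapply(1:20, function(i) sample(o, 1)))
is.ordered(x)
min(x)
```

Using lapply() instead of sapply() keeps each result as an ordered factor until the helper has checked the levels, so min() is again meaningful on the combined vector.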
> On 16 Jun 2017, at 15:59 , Robert McGehee <rmcgehee at walleyetrading.net> wrote:
>
> For instance, what would you expect to get from unlist() if each element of the list had different levels, or were both ordered, but in a different way, or if some elements of the list were factors and others were ordered factors?
> > unlist(list(ordered(c("a","b")), ordered(c("b","a"))))

Those actually have the same levels in the same order: a < b

Possibly, this brings the point home more clearly:

unlist(list(ordered(c("a","c")), ordered(c("b","d"))))

(Notice that alphabetical order is largely irrelevant, so all of these level orderings are equally possible:

a < c < b < d
a < b < c < d
a < b < d < c
b < a < c < d
b < a < d < c
b < d < a < c
).

-pd

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
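The six orderings listed above are exactly the ways to interleave a < c with b < d while preserving the order within each pair. As an illustration, they can be enumerated with a short recursive function; `interleavings` is a hypothetical helper written for this sketch, not an existing R function.

```r
# Hypothetical helper: enumerate every way to merge two ordered level
# sets while preserving the relative order within each set.
interleavings <- function(a, b) {
  if (length(a) == 0L) return(list(b))
  if (length(b) == 0L) return(list(a))
  c(
    # Either the first level of 'a' comes first ...
    lapply(interleavings(a[-1], b), function(rest) c(a[1], rest)),
    # ... or the first level of 'b' does.
    lapply(interleavings(a, b[-1]), function(rest) c(b[1], rest))
  )
}

res <- interleavings(c("a", "c"), c("b", "d"))
length(res)  # 6
vapply(res, paste, character(1), collapse = " < ")
```

With two levels on each side there are choose(4, 2) = 6 compatible orderings, and nothing in the inputs singles out one of them, which is why unlist() cannot pick one.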