Sebastian Martin Krantz
2020-May-08 21:01 UTC
[Rd] base::order making available retGrp and sortStr options for radix method?
Hi all,

A bit more than a month ago I released the 'collapse' package for advanced and fast data transformation in R, with an array of fast grouped and weighted functions and facilities for efficient grouped programming. While preparing the next update of this package I came across the following.

For grouping, 'collapse' uses the function 'GRP', an efficient wrapper around data.table:::forderv for fast radix-sort-based grouping. To build it, the source code of forderv was copied and deparallelized. I have now realized that an earlier deparallelized version of forderv is already fully available in base R:

https://github.com/wch/r-source/blob/5a156a0865362bb8381dcd69ac335f5174a4f60c/src/main/radixsort.c

This function is called by base::order(..., method = "radix"). I was vaguely aware that data.table's ordering had made it into base R, but I initially thought the grouping feature of forder had been removed. In fact it is there, but disabled. Lines 31-35 of base::order read:

    if (method == "radix") {
        decreasing <- rep_len(as.logical(decreasing), length(z))
        return(.Internal(radixsort(na.last, decreasing, FALSE, TRUE, ...)))
    }

which is essentially

    return(.Internal(radixsort(na.last, decreasing, retGrp, sortStr, ...)))

with the retGrp argument, which returns the group starts and the maximum group size, disabled. sortStr = FALSE can be used to do unordered groupings.

My request is whether these features could be made available to the user. This would give all developers access to extremely fast ordered grouping facilities and would spare people like myself from having to copy this source code. In R it could be exposed through a simple function like:

    radixorder <- function(..., na.last = TRUE, decreasing = FALSE,
                           retGrp = FALSE, sortStr = TRUE) {
        z <- list(...)
        decreasing <- rep_len(as.logical(decreasing), length(z))
        return(.Internal(radixsort(na.last, decreasing, retGrp, sortStr, ...)))
    }

Alternatively, a C API entry point like R_orderVector, i.e.
R_orderVectorRadix would be great.

Best regards,
Sebastian
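To illustrate what kind of information is meant by "group starts" and "maximum group size", here is a minimal sketch that reconstructs both from the ordering returned by base::order(..., method = "radix"), using only documented base R calls. The helper name radix_groups and the list element names starts and maxgrpn are hypothetical and chosen for illustration only; this is not the internal implementation, it does an extra pass over the data that the internal routine would not need, and it assumes a single vector without NAs.

```r
# Hypothetical helper (not part of base R): emulate the grouping information
# that retGrp is described as returning, for one vector without NAs.
radix_groups <- function(x) {
  o  <- order(x, method = "radix")   # ordering permutation from the radix sort
  xs <- x[o]                         # values in sorted order
  n  <- length(xs)
  # A group starts at position 1 and wherever the sorted value changes.
  starts <- if (n == 0L) integer(0) else which(c(TRUE, xs[-1L] != xs[-n]))
  # Group sizes are the gaps between consecutive starts (the last runs to n).
  sizes <- diff(c(starts, n + 1L))
  list(order   = o,
       starts  = starts,
       maxgrpn = if (n == 0L) 0L else max(sizes))
}

g <- radix_groups(c("b", "a", "b", "c", "a", "b"))
g$order    # 2 5 1 3 6 4
g$starts   # 1 3 6
g$maxgrpn  # 3
```

Here positions 1, 3 and 6 of the ordering mark the first element of the groups "a", "b" and "c", and the largest group ("b") has 3 elements. Exposing retGrp would make this information available directly from the sort, without the second pass.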