Harvey Smith
2019-Jan-05 04:37 UTC
[Rd] unsorted - suggestion for performance improvement and ALTREP support for POSIXct
I believe the performance of isUnsorted() in sort.c could be improved by calling REAL() once (outside of the for loop), rather than calling it twice inside the loop. As an aside, it is implemented in the faster way in doSort() (sort.c line 401). The example below shows the performance improvement for a vectors of double of moving REAL() outside the for loop. # example as implemented in isUnsorted body = " R_xlen_t n, i; n = XLENGTH(x); for(i = 0; i+1 < n ; i++) if(REAL(x)[i] > REAL(x)[i+1]) return ScalarLogical(TRUE); return ScalarLogical(FALSE);"; f1 = inline::cfunction(sig = signature(x='numeric'), body=body) # example updated with only one call to REAL() body = " R_xlen_t n, i; n = XLENGTH(x); double* real_x = REAL(x); for(i = 0; i+1 < n ; i++) if(real_x[i] > real_x[i+1]) return ScalarLogical(TRUE); return ScalarLogical(FALSE);"; f2 = inline::cfunction(sig = signature(x='numeric'), body=body) # unsorted x.double = as.double(1:1e7) + 0 x.posixct = Sys.time() + x.double microbenchmark::microbenchmark( f1(x.double), f2(x.double), # faster due to one REAL() f1(x.posixct), f2(x.posixct), # faster due to one REAL() unit='ms', times=10) Unit: milliseconds expr min lq mean median uq max neval f1(x.double) 35.737629 37.991785 43.004432 38.575525 39.198533 80.85625 10 f2(x.double) 6.053373 6.064323 7.238750 6.092453 8.438550 10.69384 10 f1(x.posixct) 36.315705 36.542253 42.349745 38.355395 39.378262 81.59857 10 f2(x.posixct) 6.063946 6.070741 7.579176 6.138518 7.063024 13.94141 10 I would also like to suggest ALTREP support for POSIXct vectors, which are interpreted as type REAL in the c code, but do not gain the performance benefits of real vectors. Sorted vectors of timestamps are important for joining time series and in calls to findInterval(). # unsorted vectors x.double = as.double(1:1e7) + 0 x.posixct = Sys.time() + x.double # sort for altrep benefit x.double.sort <- sort(x.double) x.posixct.sort <- sort(x.posixct) microbenchmark::microbenchmark( is.unsorted(x.double), is.unsorted(x.double.sort), # faster due to altrep is.unsorted(x.posixct), is.unsorted(x.posixct.sort), # no altrep benefit unit='ms', times=10) Unit: milliseconds expr min lq mean median uq max neval is.unsorted(x.double) 16.987730 17.010008 17.1577173 17.0862785 17.308674 17.474432 10 is.unsorted(x.double.sort) 0.000378 0.000756 0.0065327 0.0075525 0.010195 0.011706 10 is.unsorted(x.posixct) 36.925876 37.084837 43.4125593 37.4695915 41.858589 78.742174 10 is.unsorted(x.posixct.sort) 36.966654 37.031975 51.1228686 37.1235380 37.777319 153.270170 10 Since there do not appear to be any tests for is.unsorted() these are some tests to be added for some types. # integer sequence x <- -10L:10L stopifnot(!is.unsorted(x, na.rm = F, strictly = T)) stopifnot(!is.unsorted(x, na.rm = F, strictly = F)) # integer not strictly x <- -10L:10L x[2] <- x[3] stopifnot( is.unsorted(x, na.rm = F, strictly = T)) stopifnot(!is.unsorted(x, na.rm = F, strictly = F)) # integer with NA x <- -10L:10L x[2] <- NA stopifnot(!is.unsorted(x, na.rm = T, strictly = F)) stopifnot(is.na(is.unsorted(x, na.rm = F, strictly = F))) # double x <- seq(from = -10, to = 10, by=0.01) stopifnot(!is.unsorted(x, na.rm = F, strictly = T)) stopifnot(!is.unsorted(x, na.rm = F, strictly = F)) # double not strictly x <- seq(from = -10, to = 10, by=0.01) x[2] <- x[3] stopifnot( is.unsorted(x, na.rm = F, strictly = T)) stopifnot(!is.unsorted(x, na.rm = F, strictly = F)) # double with NA x <- seq(from = -10, to = 10, by=0.01) x[length(x)] <- NA stopifnot(!is.unsorted(x, na.rm = T, strictly = F)) stopifnot(is.na(is.unsorted(x, na.rm = F, strictly = F))) # logical stopifnot(!is.unsorted( c(F, T, T), strictly = F)) stopifnot( is.unsorted( c(F, T, T), strictly = T)) stopifnot( is.unsorted( c(T, T, F), strictly = F)) stopifnot( is.unsorted( c(T, T, F), strictly = T)) # POSIXct x <- seq(from=as.POSIXct('2018-1-1'), to=as.POSIXct('2019-1-1'), by='day') stopifnot(!is.unsorted(x, na.rm = T, strictly = F)) stopifnot(!is.unsorted(x, na.rm = F, strictly = F)) # POSIXct not strictly x <- seq(from=as.POSIXct('2018-1-1'), to=as.POSIXct('2019-1-1'), by='day') x[2] <- x[3] stopifnot( is.unsorted(x, na.rm = F, strictly = T)) stopifnot(!is.unsorted(x, na.rm = F, strictly = F)) # POSIXct with NA x <- seq(from=as.POSIXct('2018-1-1'), to=as.POSIXct('2019-1-1'), by='day') x[length(x)] <- NA stopifnot(!is.unsorted(x, na.rm = T, strictly = F)) stopifnot(is.na(is.unsorted(x, na.rm = F, strictly = F))) [[alternative HTML version deleted]]
Gabriel Becker
2019-Jan-08 01:17 UTC
[Rd] unsorted - suggestion for performance improvement and ALTREP support for POSIXct
Hi Harvey, Its exciting to see people thinking about and looking at ALTREP speedups "in the wild" :). You're absolutely right that pulling out the REAL call will give you a significant speedup, but ALTREP does add a little wrinkle (and a solution to it!). Detailed responses and comments inline: On Mon, Jan 7, 2019 at 11:58 AM Harvey Smith <harvey13131 at gmail.com> wrote:> I believe the performance of isUnsorted() in sort.c could be improved by > calling REAL() once (outside of the for loop), rather than calling it twice > inside the loop. As an aside, it is implemented in the faster way in > doSort() (sort.c line 401). The example below shows the performance > improvement for a vectors of double of moving REAL() outside the for loop. > > <snip> > >In light of ALTREP's inclusion in the R internals its best to avoid asking things for their full data vector when you don't need to. Instead, you can use the ITERATE_BY_REGION macro R (courtesy of Luke, I believe?) provides in <includedir>/R_ext/Itermacros.h. This is particularly true of R's internals, which also preferably won't "explode"/invalidate an ALTREP (which asking for a writable pointer does) when they don't need to. Most internal functions haven't been converted to this yet, as you see with is.unsorted (and its not a high priority to do the conversion until it becomes an issue for any given case), but this is what, e.g., R's own sum function now does. ITERATE_BY_REGION is based on *_GET_REGION, which was added to the C API as part ALTREP, but works on ALTREP and normal vectors, and won't explode in corner cases where materializing a full ALTREP vector would be problematic. The core concept for ITERATE_BY_REGION is to grab regions (a quick glance tells me its 512 elements at a time) of a vector, copying them into a buffer, and using the same trick your code does by avoiding pointer lookup inside the inner tight loop. Do note that as of now I had to compile my function with language="C", rather than the default "C++" to avoid an error about initializing a const double * with a const void * value. On my machine, at least, you actually *nearly* the same speedup with all the added safety. Eyeballing it I'm not convinced the difference is statistically signfiicant, to be honest, but even if it is, you get most of the benefit... body = " R_xlen_t n, i; n = XLENGTH(x); for(i = 0; i+1 < n ; i++) if(REAL(x)[i] > REAL(x)[i+1]) return ScalarLogical(TRUE); return ScalarLogical(FALSE);"; f1 = inline::cfunction(sig = signature(x='numeric'), body=body) body = " R_xlen_t n, i; n = XLENGTH(x); double* real_x = REAL(x); for(i = 0; i+1 < n ; i++) if(real_x[i] > real_x[i+1]) return ScalarLogical(TRUE); return ScalarLogical(FALSE);"; f2 = inline::cfunction(sig = signature(x='numeric'), body=body) body = " double tmp = -DBL_MAX; // minimum possible double value ITERATE_BY_REGION(x, xptr, i, nbatch, double, REAL, { if(xptr[0] < tmp) //deal with batch barriers, tmp is end of last batch return ScalarLogical(TRUE); for(R_xlen_t k = 0; k < nbatch - 1; k++) { if(xptr[k] > xptr[k+1]) return ScalarLogical(TRUE); } tmp = xptr[nbatch - 1]; }); return ScalarLogical(FALSE);"; f3 = inline::cfunction(sig = signature(x='numeric'), body=body, includes '#include "R_ext/Itermacros.h"', language = "C") x.double = as.double(1:1e7) + 0 x.posixct = Sys.time() + x.double microbenchmark::microbenchmark( f1(x.double), f2(x.double), # one REAL call f3(x.double), # ITERATE_BY_REGION f1(x.posixct), f2(x.posixct), # one REAL call f3(x.posixct), # ITERATE_BY_REGION unit='ms', times=100) Unit: milliseconds expr min lq mean median uq max f1(x.double) 26.377432 27.234192 28.156993 27.774590 28.602643 32.213378 f2(x.double) 4.722712 4.854300 5.011549 4.991388 5.127996 5.523156 f3(x.double) 4.759537 4.788137 5.408925 5.373667 5.713877 6.694330 f1(x.posixct) 77.975030 78.853724 85.867995 82.530822 83.557849 123.546206 f2(x.posixct) 4.637912 4.660033 4.872892 4.750513 4.880569 5.907149 f3(x.posixct) 4.643806 4.665936 5.094212 5.085454 5.384414 5.778274 neval 10 10 10 10 10 10 To be extra careful we can check that we're getting all the edges right just incase, since the code is admittedly harder to follow and a bit more arcane:> x.double2 = x.double> x.double2[512] = x.double[1] #unsorted at end of first batch> stopifnot(f3(x.double2))>> x.double2a = x.double> x.double2a[513] = x.double[1] #unsorted at beginning of 2nd batch> stopifnot(f3(x.double2a))>>> ##check edges> x.double3 = x.double> x.double3[length(x.double3)] = x.double3[1] #unsorted at last element> stopifnot(f3(x.double3))>> x.double4 = x.double> x.double4[1] = x.double[5] #unsorted at first element> stopifnot(f3(x.double4)) >If R-core is interested I'm happy to develop a patch for the isUnsorted workhorse c-function, at least for integers and reals.> I would also like to suggest ALTREP support for POSIXct vectors, which are > interpreted as type REAL in the c code, but do not gain the performance > benefits of real vectors. Sorted vectors of timestamps are important for > joining time series and in calls to findInterval(). >So looking at this, it is because is.object(x.posixct) returns true, which means sort.default does x[order(x, <bla bla bla>)], which ALTREP is not currently (and may not ever be?) smart enough to catch on its own and know is sorted. Its true we could add something after that to wrap it in what is called a wrapper altrep which would know it's sorted, but we don't do that currently now and I'm not sure we actually should in the general case. I'm not convinced its safe to assume an object class' defined ordering will match the ordering of an underlying double/int representation. I believe we ran into something similar with deferred sting conversions from integers (I think, possibly doubles) where the int had sortedness information but that wasn't correct for the *character vector *the ALTREP ultimately represented. Best, ~G PS above timings were in a mildly (~1 month) old version of R-devel:> sessionInfo()R Under development (unstable) (2018-12-11 r75837) Platform: x86_64-apple-darwin17.7.0 (64-bit) Running under: macOS High Sierra 10.13.6> # unsorted vectors > x.double = as.double(1:1e7) + 0 > x.posixct = Sys.time() + x.double > # sort for altrep benefit > x.double.sort <- sort(x.double) > x.posixct.sort <- sort(x.posixct) > microbenchmark::microbenchmark( > is.unsorted(x.double), > is.unsorted(x.double.sort), # faster due to altrep > is.unsorted(x.posixct), > is.unsorted(x.posixct.sort), # no altrep benefit > unit='ms', times=10) > Unit: milliseconds > expr min lq mean median > uq max neval > is.unsorted(x.double) 16.987730 17.010008 17.1577173 17.0862785 > 17.308674 17.474432 10 > is.unsorted(x.double.sort) 0.000378 0.000756 0.0065327 0.0075525 > 0.010195 0.011706 10 > is.unsorted(x.posixct) 36.925876 37.084837 43.4125593 37.4695915 > 41.858589 78.742174 10 > is.unsorted(x.posixct.sort) 36.966654 37.031975 51.1228686 37.1235380 > 37.777319 153.270170 10 > > > Since there do not appear to be any tests for is.unsorted() these are some > tests to be added for some types. > > # integer sequence > x <- -10L:10L > stopifnot(!is.unsorted(x, na.rm = F, strictly = T)) > stopifnot(!is.unsorted(x, na.rm = F, strictly = F)) > # integer not strictly > x <- -10L:10L > x[2] <- x[3] > stopifnot( is.unsorted(x, na.rm = F, strictly = T)) > stopifnot(!is.unsorted(x, na.rm = F, strictly = F)) > # integer with NA > x <- -10L:10L > x[2] <- NA > stopifnot(!is.unsorted(x, na.rm = T, strictly = F)) > stopifnot(is.na(is.unsorted(x, na.rm = F, strictly = F))) > # double > x <- seq(from = -10, to = 10, by=0.01) > stopifnot(!is.unsorted(x, na.rm = F, strictly = T)) > stopifnot(!is.unsorted(x, na.rm = F, strictly = F)) > # double not strictly > x <- seq(from = -10, to = 10, by=0.01) > x[2] <- x[3] > stopifnot( is.unsorted(x, na.rm = F, strictly = T)) > stopifnot(!is.unsorted(x, na.rm = F, strictly = F)) > # double with NA > x <- seq(from = -10, to = 10, by=0.01) > x[length(x)] <- NA > stopifnot(!is.unsorted(x, na.rm = T, strictly = F)) > stopifnot(is.na(is.unsorted(x, na.rm = F, strictly = F))) > # logical > stopifnot(!is.unsorted( c(F, T, T), strictly = F)) > stopifnot( is.unsorted( c(F, T, T), strictly = T)) > stopifnot( is.unsorted( c(T, T, F), strictly = F)) > stopifnot( is.unsorted( c(T, T, F), strictly = T)) > # POSIXct > x <- seq(from=as.POSIXct('2018-1-1'), to=as.POSIXct('2019-1-1'), by='day') > stopifnot(!is.unsorted(x, na.rm = T, strictly = F)) > stopifnot(!is.unsorted(x, na.rm = F, strictly = F)) > # POSIXct not strictly > x <- seq(from=as.POSIXct('2018-1-1'), to=as.POSIXct('2019-1-1'), by='day') > x[2] <- x[3] > stopifnot( is.unsorted(x, na.rm = F, strictly = T)) > stopifnot(!is.unsorted(x, na.rm = F, strictly = F)) > # POSIXct with NA > x <- seq(from=as.POSIXct('2018-1-1'), to=as.POSIXct('2019-1-1'), by='day') > x[length(x)] <- NA > stopifnot(!is.unsorted(x, na.rm = T, strictly = F)) > stopifnot(is.na(is.unsorted(x, na.rm = F, strictly = F))) > > [[alternative HTML version deleted]] > > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel >[[alternative HTML version deleted]]