Bert Gunter
2024-Oct-06 23:53 UTC
[R] Coda: On the efficiency of unsplit() for Rolf Turner's recent post
(only of interest -- maybe! -- to those who followed this thread of a couple of weeks ago)

Just for the heckuva it, I compared the timing of Deepayan's unsplit(x,f) solution to my as.vector(do.call(rbind, x)) approach to the query for a list of 3 vectors each of length 1000 (the original toy example was for a list of 3 vectors of length 5). Unsurprisingly, I think, because the unsplit() approach works for the general case whereas the do.call(rbind) only works for the balanced structure of the toy example, do.call(rbind) took about 1/10th the time of unsplit:

> microbenchmark(unsplit(x,f),times = 1000L)
Unit: microseconds
          expr    min     lq     mean median    uq      max neval
 unsplit(x, f) 63.058 64.042 70.44419 65.682 67.24 3893.155  1000

--------------

> microbenchmark(as.vector(do.call(rbind,x)),times = 1000L)
Unit: microseconds
                         expr   min    lq     mean median    uq    max neval
 as.vector(do.call(rbind, x)) 5.617 6.396 7.082299  6.765 7.216 79.335  1000

**Maybe** this suggests that adding a "regular" (or better-named) option to unsplit() that would allow a simpler faster algorithm to be used for the special but perhaps not uncommon case of Rolf's structured toy example might be useful.

Please do not reply to this, as I am too ignorant to judge whether this is foolish or not. I leave it to those more qualified to either dismiss or act on this. I just wanted to present some limited but suggestive data.

Cheers to all,
Bert
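
For anyone who wants to rerun the comparison, here is a minimal sketch. The post does not show how x and f were built, so the construction below (three random vectors of length 1000 and a factor cycling through 1, 2, 3) is an assumption based on the description of the toy example, and y and g are hypothetical names for a small unbalanced case. The timing step runs only if the microbenchmark package happens to be installed.

```r
# ASSUMED setup (not shown in the post): three equal-length vectors and a
# grouping factor that cycles 1, 2, 3, so the groups are interleaved.
n <- 1000L
set.seed(1)
x <- list(`1` = rnorm(n), `2` = rnorm(n), `3` = rnorm(n))
f <- factor(rep(1:3, times = n))

# General approach: works for any grouping factor.
via_unsplit <- unsplit(x, f)

# Balanced-only shortcut: stack the vectors as rows of a 3 x n matrix,
# then read the matrix out column by column.
via_rbind <- as.vector(do.call(rbind, x))

# In the balanced, interleaved case the two results agree.
stopifnot(identical(via_unsplit, via_rbind))

# Hypothetical unbalanced case: unsplit() still puts every value back in
# its place, whereas rbind() would recycle the shorter vector.
y <- list(`1` = c(1, 2, 3), `2` = c(10, 20))
g <- factor(c(1, 2, 1, 2, 1))
unbalanced <- unsplit(y, g)

# Optional timing, skipped when microbenchmark is not available.
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  print(microbenchmark::microbenchmark(
    unsplit(x, f),
    as.vector(do.call(rbind, x)),
    times = 1000L
  ))
}
```

The shortcut depends on two things at once: every group has the same length, and f cycles through the groups in order. If either fails, only unsplit() reconstructs the original vector.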