Henrik Bengtsson
2002-Jun-27 15:12 UTC
[R] Fastest way to find the last index k such that x[k] < y in a sorted vector x?
Hi, I am trying to find the fastest way to "find the last index k such that x[k] < y in a *sorted* vector x" These are my two alternatives: x <- sort(rnorm(1e4)) y <- 0.2 # Alt 1 k <- max(1, sum(x < y)) # Alt 2 "divide and conquer" lastIndexLessThan <- function(x, y) { k0 <- 1; k1 <- length(x) while ((dk <- (k1 - k0)) > 1) { k <- k0 + dk %/% 2 if (x[k] < y) k0 <- k else k1 <- k } k0 } k <- lastIndexLessThan(x, y) Simple benchmarking shows that alternative 1 is faster for short vectors and alternative 2 is faster for long vectors. I believe this is because alt 1 is implemented internally. Is there an internal function of alternative 2 that I don't know about? It would be great because then it would probably be the fastest one on both short and long vectors. Best Henrik Bengtsson Dept. of Mathematical Statistics @ Centre for Mathematical Sciences Lund Institute of Technology/Lund University, Sweden (+2h UTC) Office: P316, +46 46 222 9611 (phone), +46 46 222 4623 (fax) h b @ m a t h s . l t h . s e, http://www.maths.lth.se/bioinformatics/ -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Göran Broström
2002-Jun-27 16:41 UTC
[R] Fastest way to find the last index k such that x[k] < y in a sorted vector x?
On Thu, 27 Jun 2002, Henrik Bengtsson wrote:> Hi, I am trying to find the fastest way to > > "find the last index k such that x[k] < y in a *sorted* vector x"How about> k <- max(which(x < y))Haven't done any benchmarking, though. G?ran> > These are my two alternatives: > > x <- sort(rnorm(1e4)) > y <- 0.2 > > # Alt 1 > k <- max(1, sum(x < y)) > > # Alt 2 "divide and conquer" > lastIndexLessThan <- function(x, y) { > k0 <- 1; k1 <- length(x) > while ((dk <- (k1 - k0)) > 1) { > k <- k0 + dk %/% 2 > if (x[k] < y) k0 <- k else k1 <- k > } > k0 > } > k <- lastIndexLessThan(x, y) > > Simple benchmarking shows that alternative 1 is faster for short vectors and > alternative 2 is faster for long vectors. I believe this is because alt 1 is > implemented internally. Is there an internal function of alternative 2 that > I don't know about? It would be great because then it would probably be the > fastest one on both short and long vectors. > > Best > > Henrik Bengtsson > > Dept. of Mathematical Statistics @ Centre for Mathematical Sciences > Lund Institute of Technology/Lund University, Sweden (+2h UTC) > Office: P316, +46 46 222 9611 (phone), +46 46 222 4623 (fax) > h b @ m a t h s . l t h . s e, http://www.maths.lth.se/bioinformatics/ > > > -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- > r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html > Send "info", "help", or "[un]subscribe" > (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch > _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._ >-- G?ran Brostr?m tel: +46 90 786 5223 professor fax: +46 90 786 6614 Department of Statistics http://www.stat.umu.se/egna/gb/ Ume? University SE-90187 Ume?, Sweden e-mail: gb at stat.umu.se -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Thomas Lumley
2002-Jun-27 16:47 UTC
[R] Fastest way to find the last index k such that x[k] < y in a sorted vector x?
On Thu, 27 Jun 2002, Henrik Bengtsson wrote:> Hi, I am trying to find the fastest way to > > "find the last index k such that x[k] < y in a *sorted* vector x" > > These are my two alternatives: > > x <- sort(rnorm(1e4)) > y <- 0.2 > > # Alt 1 > k <- max(1, sum(x < y)) > > # Alt 2 "divide and conquer" > lastIndexLessThan <- function(x, y) { > k0 <- 1; k1 <- length(x) > while ((dk <- (k1 - k0)) > 1) { > k <- k0 + dk %/% 2 > if (x[k] < y) k0 <- k else k1 <- k > } > k0 > } > k <- lastIndexLessThan(x, y) > > Simple benchmarking shows that alternative 1 is faster for short vectors and > alternative 2 is faster for long vectors. I believe this is because alt 1 is > implemented internally.Yes.> Is there an internal function of alternative 2 that > I don't know about? It would be great because then it would probably be the > fastest one on both short and long vectors.You can use uniroot, eg. uniroot(function(i) (x[i]>y)-i/N,c(1,length(x)))$root-1 where N>length(x) On the other hand, I had to go to vectors of length 10^5 to get any reproducible difference between this and max(which(x<y)) -thomas -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._