Henrik Bengtsson
2002-Jun-27 15:12 UTC
[R] Fastest way to find the last index k such that x[k] < y in a sorted vector x?
Hi, I am trying to find the fastest way to
"find the last index k such that x[k] < y in a *sorted* vector x"
These are my two alternatives:
x <- sort(rnorm(1e4))
y <- 0.2
# Alt 1
k <- max(1, sum(x < y))
# Alt 2 "divide and conquer"
lastIndexLessThan <- function(x, y) {
k0 <- 1; k1 <- length(x)
while ((dk <- (k1 - k0)) > 1) {
k <- k0 + dk %/% 2
if (x[k] < y) k0 <- k else k1 <- k
}
k0
}
k <- lastIndexLessThan(x, y)
Simple benchmarking shows that alternative 1 is faster for short vectors and
alternative 2 is faster for long vectors. I believe this is because alt 1 is
implemented internally. Is there an internal function of alternative 2 that
I don't know about? It would be great because then it would probably be the
fastest one on both short and long vectors.
Best
Henrik Bengtsson
Dept. of Mathematical Statistics @ Centre for Mathematical Sciences
Lund Institute of Technology/Lund University, Sweden (+2h UTC)
Office: P316, +46 46 222 9611 (phone), +46 46 222 4623 (fax)
h b @ m a t h s . l t h . s e, http://www.maths.lth.se/bioinformatics/
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Göran Broström
2002-Jun-27 16:41 UTC
[R] Fastest way to find the last index k such that x[k] < y in a sorted vector x?
On Thu, 27 Jun 2002, Henrik Bengtsson wrote:> Hi, I am trying to find the fastest way to > > "find the last index k such that x[k] < y in a *sorted* vector x"How about> k <- max(which(x < y))Haven't done any benchmarking, though. G?ran> > These are my two alternatives: > > x <- sort(rnorm(1e4)) > y <- 0.2 > > # Alt 1 > k <- max(1, sum(x < y)) > > # Alt 2 "divide and conquer" > lastIndexLessThan <- function(x, y) { > k0 <- 1; k1 <- length(x) > while ((dk <- (k1 - k0)) > 1) { > k <- k0 + dk %/% 2 > if (x[k] < y) k0 <- k else k1 <- k > } > k0 > } > k <- lastIndexLessThan(x, y) > > Simple benchmarking shows that alternative 1 is faster for short vectors and > alternative 2 is faster for long vectors. I believe this is because alt 1 is > implemented internally. Is there an internal function of alternative 2 that > I don't know about? It would be great because then it would probably be the > fastest one on both short and long vectors. > > Best > > Henrik Bengtsson > > Dept. of Mathematical Statistics @ Centre for Mathematical Sciences > Lund Institute of Technology/Lund University, Sweden (+2h UTC) > Office: P316, +46 46 222 9611 (phone), +46 46 222 4623 (fax) > h b @ m a t h s . l t h . s e, http://www.maths.lth.se/bioinformatics/ > > > -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- > r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html > Send "info", "help", or "[un]subscribe" > (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch > _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._ >-- G?ran Brostr?m tel: +46 90 786 5223 professor fax: +46 90 786 6614 Department of Statistics http://www.stat.umu.se/egna/gb/ Ume? University SE-90187 Ume?, Sweden e-mail: gb at stat.umu.se -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Thomas Lumley
2002-Jun-27 16:47 UTC
[R] Fastest way to find the last index k such that x[k] < y in a sorted vector x?
On Thu, 27 Jun 2002, Henrik Bengtsson wrote:> Hi, I am trying to find the fastest way to > > "find the last index k such that x[k] < y in a *sorted* vector x" > > These are my two alternatives: > > x <- sort(rnorm(1e4)) > y <- 0.2 > > # Alt 1 > k <- max(1, sum(x < y)) > > # Alt 2 "divide and conquer" > lastIndexLessThan <- function(x, y) { > k0 <- 1; k1 <- length(x) > while ((dk <- (k1 - k0)) > 1) { > k <- k0 + dk %/% 2 > if (x[k] < y) k0 <- k else k1 <- k > } > k0 > } > k <- lastIndexLessThan(x, y) > > Simple benchmarking shows that alternative 1 is faster for short vectors and > alternative 2 is faster for long vectors. I believe this is because alt 1 is > implemented internally.Yes.> Is there an internal function of alternative 2 that > I don't know about? It would be great because then it would probably be the > fastest one on both short and long vectors.You can use uniroot, eg. uniroot(function(i) (x[i]>y)-i/N,c(1,length(x)))$root-1 where N>length(x) On the other hand, I had to go to vectors of length 10^5 to get any reproducible difference between this and max(which(x<y)) -thomas -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._