Gopi Goteti
2012-Aug-24 04:49 UTC
[R] updating elements of a vector sequentially - is there a faster way?
I would like to know whether there is a faster way to do the operation below (updating vec1).

My objective is to update the elements of a vector (vec1), where a particular element i depends on the previous one. I need to do this on vectors of length 1 million or more, and need to repeat the process several hundred times. The for loop works but is slow. If there is a faster way, please let me know.

probs <- c(.1, .3, .2, .4, .7, .9, .3, .4, .5, .6)
p10 <- 0.6
p00 <- 0.4
vec1 <- rep(0, 10)
for (i in 2:10) {
  vec1[i] <- ifelse(vec1[i-1] == 0,
                    ifelse(probs[i]<p10, 0, 1),
                    ifelse(probs[i]<p00, 0, 1))
}

Thanks
GG
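For checking any faster variant against the loop above, it may help to wrap the recurrence in a small reference function. This is only a sketch: the name update_ref is made up here and does not appear in the post, and it uses a scalar if/else where the post uses nested ifelse, which should give the same values for scalar inputs.

```r
# Hypothetical helper (not from the original post): the recurrence as a function.
update_ref <- function(probs, p10, p00) {
  n <- length(probs)
  vec <- numeric(n)                # vec[1] stays 0, as in the post
  if (n < 2) return(vec)
  for (i in 2:n) {
    # previous state 0: compare against p10; previous state 1: compare against p00
    threshold <- if (vec[i - 1] == 0) p10 else p00
    vec[i] <- if (probs[i] < threshold) 0 else 1
  }
  vec
}

probs <- c(.1, .3, .2, .4, .7, .9, .3, .4, .5, .6)
update_ref(probs, p10 = 0.6, p00 = 0.4)
# 0 0 0 0 1 1 0 0 0 1
```

The state only changes when a value crosses the threshold that belongs to the current state, which is the property the replies in this thread rely on.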
Berend Hasselman
2012-Aug-24 08:05 UTC
[R] updating elements of a vector sequentially - is there a faster way?
On 24-08-2012, at 06:49, Gopi Goteti wrote:

> I would like to know whether there is a faster way to do the below
> operation (updating vec1).
>
> My objective is to update the elements of a vector (vec1), where a
> particular element i is dependent on the previous one. I need to do this on
> vectors that are 1 million or longer and need to repeat that process
> several hundred times. The for loop works but is slow. If there is a faster
> way, please let me know.
>
> probs <- c(.1, .3, .2, .4, .7, .9, .3, .4, .5, .6)
> p10 <- 0.6
> p00 <- 0.4
> vec1 <- rep(0, 10)
> for (i in 2:10) {
> vec1[i] <- ifelse(vec1[i-1] == 0,
> ifelse(probs[i]<p10, 0, 1),
> ifelse(probs[i]<p00, 0, 1))
> }

ifelse works on vectors. You should use if() ... else ... here.
You can also precompute ifelse(probs[i]<p10, 0, 1) and ifelse(probs[i]<p00, 0, 1) since these expressions do not depend on vec1.

Here is some testing code where your code is in function f1 and an alternative is in function f2, using precomputed values and no ifelse. I also use the package compiler to get as much speedup as possible.

The code:

library(compiler) # provides cmpfun()

N <- 100000 # must be a multiple of 10

probs <- rep(c(.1, .3, .2, .4, .7, .9, .3, .4, .5, .6), N/10)
p10 <- 0.6
p00 <- 0.4
vec1 <- rep(0, N)

# threshold comparisons do not depend on vec1, so compute them once
val.p10 <- ifelse(probs<p10, 0, 1)
val.p00 <- ifelse(probs<p00, 0, 1)

# original code
f1 <- function(vec1) {
  N <- length(vec1)
  for (i in 2:N) {
    vec1[i] <- ifelse(vec1[i-1] == 0,
                      ifelse(probs[i]<p10, 0, 1),
                      ifelse(probs[i]<p00, 0, 1))
  }
  vec1
}

# precomputed values and scalar if/else
f2 <- function(vec1) {
  N <- length(vec1)
  for (i in 2:N) {
    vec1[i] <- if(vec1[i-1] == 0) val.p10[i] else val.p00[i]
  }
  vec1
}

# byte-compiled versions
f1.c <- cmpfun(f1)
f2.c <- cmpfun(f2)

vec1 <- f1(vec1)
vec2 <- f2(vec1)
vec3 <- f1.c(vec1)
vec4 <- f2.c(vec1)
identical(vec1,vec2)
identical(vec1,vec3)
identical(vec1,vec4)

system.time(vec1 <- f1(vec1))[3]
system.time(vec2 <- f2(vec1))[3]
system.time(vec3 <- f1.c(vec1))[3]
system.time(vec4 <- f2.c(vec1))[3]

Output is:

> identical(vec1,vec2)
[1] TRUE
> identical(vec1,vec3)
[1] TRUE
> identical(vec1,vec4)
[1] TRUE
> system.time(vec1 <- f1(vec1))[3]
elapsed
  2.922
> system.time(vec2 <- f2(vec1))[3]
elapsed
  0.403
> system.time(vec3 <- f1.c(vec1))[3]
elapsed
    2.4
> system.time(vec4 <- f2.c(vec1))[3]
elapsed
  0.084

A simple loop using precomputed values achieves a significant speedup compared to your original code.
Using the compiler package to compile f2 gains even more speedup.

Berend
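The difference between ifelse and if/else mentioned above can be seen on a toy vector. The vector x and the index i below are made up for illustration and are not part of the thread's data.

```r
x <- c(0.2, 0.8, 0.5)

# ifelse() is vectorised: it returns one result per element of the test
ifelse(x < 0.5, 0, 1)
# 0 1 1

# if () ... else ... takes a single TRUE/FALSE and returns one value,
# which is all that is needed inside a loop over i
i <- 2
if (x[i] < 0.5) 0 else 1
# 1
```

Inside a loop that handles one element at a time, the vectorised form does extra work on every iteration for no benefit, which is presumably where much of the f1 versus f2 difference comes from.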
Petr Savicky
2012-Aug-24 08:34 UTC
[R] updating elements of a vector sequentially - is there a faster way?
On Thu, Aug 23, 2012 at 09:49:33PM -0700, Gopi Goteti wrote:
> I would like to know whether there is a faster way to do the below
> operation (updating vec1).
>
> My objective is to update the elements of a vector (vec1), where a
> particular element i is dependent on the previous one. I need to do this on
> vectors that are 1 million or longer and need to repeat that process
> several hundred times. The for loop works but is slow. If there is a faster
> way, please let me know.
>
> probs <- c(.1, .3, .2, .4, .7, .9, .3, .4, .5, .6)
> p10 <- 0.6
> p00 <- 0.4
> vec1 <- rep(0, 10)
> for (i in 2:10) {
> vec1[i] <- ifelse(vec1[i-1] == 0,
> ifelse(probs[i]<p10, 0, 1),
> ifelse(probs[i]<p00, 0, 1))
> }

Hi.

If p10 is always more than p00, then try the following.

probs <- c(.1, .3, .2, .4, .7, .9, .3, .4, .5, .6)
p10 <- 0.6
p00 <- 0.4

# original code
vec1 <- rep(0, 10)
for (i in 2:10) {
  vec1[i] <- ifelse(vec1[i-1] == 0,
                    ifelse(probs[i]<p10, 0, 1),
                    ifelse(probs[i]<p00, 0, 1))
}

# modification
a10 <- ifelse(probs<p10, 0, 1)
a00 <- ifelse(probs<p00, 0, 1)
vec2 <- ifelse(a10 == a00, a10, NA)
vec2[1] <- 0
n <- length(vec2)
while (any(is.na(vec2))) {
  shift <- c(NA, vec2[-n])
  vec2 <- ifelse(is.na(vec2), shift, vec2)
}

all(vec1 == vec2)

[1] TRUE

If p10 > p00, then a10 <= a00. In this situation, the recurrence satisfies the following. If a10[i] == a00[i], then vec1[i] is the common value and does not depend on vec1[i-1]. If a10[i] < a00[i], then vec1[i] is equal to vec1[i-1].

The suggested code creates an initial vec2, which contains NA at the positions that depend on the previous value. Then it iterates, copying each value to the next position if the next is NA. The efficiency depends on the length of the sequences of consecutive NA in the initial vec2. If there are many, but only short, sequences of consecutive NA, the code can be more efficient than a loop over all elements.

Hope this helps.

Petr Savicky.
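The three properties stated above can be checked numerically. The following is a sketch on random data, assuming p10 > p00; the seed and the vector length are arbitrary choices, and the reference loop uses scalar if/else in place of the nested ifelse.

```r
set.seed(123)                      # arbitrary seed, for reproducibility only
probs <- runif(1000)
p10 <- 0.6
p00 <- 0.4                         # assumes p10 > p00

# reference result from the plain loop
vec1 <- numeric(length(probs))
for (i in 2:length(probs)) {
  vec1[i] <- if (vec1[i - 1] == 0) {
    if (probs[i] < p10) 0 else 1
  } else {
    if (probs[i] < p00) 0 else 1
  }
}

a10 <- ifelse(probs < p10, 0, 1)
a00 <- ifelse(probs < p00, 0, 1)

# property 1: a10 never exceeds a00
all(a10 <= a00)

# property 2: where a10 and a00 agree, the result is their common value
same <- which(a10 == a00)
same <- same[same > 1]             # position 1 is fixed at 0 by construction
all(vec1[same] == a10[same])

# property 3: where they differ, the result copies the previous element
differ <- which(a10 < a00)
differ <- differ[differ > 1]
all(vec1[differ] == vec1[differ - 1])
```

All three expressions should evaluate to TRUE whenever p10 > p00 holds. With p10 < p00 the first property fails, so the NA-filling approach would not apply as written.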
PIKAL Petr
2012-Aug-24 08:43 UTC
[R] updating elements of a vector sequentially - is there a faster way?
Hi

Well, I am not sure if this is what you want but same result can be achieved by

vec1 <- (probs>=p00)*(probs>=p10)

Petr
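It may be worth checking this one-liner on more than the posted data. The sketch below suggests that it matches the loop for the ten values in the original post but not in general, because it ignores the previous element. The function names loop_ref and one_liner and the vector probs2 are made up for this check.

```r
# reference loop, written as a function for reuse (name is illustrative)
loop_ref <- function(probs, p10, p00) {
  vec <- numeric(length(probs))
  for (i in 2:length(probs)) {
    vec[i] <- if (vec[i - 1] == 0) {
      if (probs[i] < p10) 0 else 1
    } else {
      if (probs[i] < p00) 0 else 1
    }
  }
  vec
}

# the one-liner from the message above
one_liner <- function(probs, p10, p00) (probs >= p00) * (probs >= p10)

p10 <- 0.6
p00 <- 0.4

# the data from the original post: both agree
probs <- c(.1, .3, .2, .4, .7, .9, .3, .4, .5, .6)
all(loop_ref(probs, p10, p00) == one_liner(probs, p10, p00))
# TRUE

# a value between p00 and p10 that follows a 1: the loop keeps the 1
probs2 <- c(.1, .7, .5)
loop_ref(probs2, p10, p00)
# 0 1 1
one_liner(probs2, p10, p00)
# 0 1 0
```

The posted data never has a value in [p00, p10) directly after a 1, which would explain why the two results coincide there.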
Petr Savicky
2012-Aug-24 08:48 UTC
[R] updating elements of a vector sequentially - is there a faster way?
On Fri, Aug 24, 2012 at 10:34:14AM +0200, Petr Savicky wrote:
> On Thu, Aug 23, 2012 at 09:49:33PM -0700, Gopi Goteti wrote:
> > I would like to know whether there is a faster way to do the below
> > operation (updating vec1).
> >
> > My objective is to update the elements of a vector (vec1), where a
> > particular element i is dependent on the previous one. I need to do this on
> > vectors that are 1 million or longer and need to repeat that process
> > several hundred times. The for loop works but is slow. If there is a faster
> > way, please let me know.
> >
> > probs <- c(.1, .3, .2, .4, .7, .9, .3, .4, .5, .6)
> > p10 <- 0.6
> > p00 <- 0.4
> > vec1 <- rep(0, 10)
> > for (i in 2:10) {
> > vec1[i] <- ifelse(vec1[i-1] == 0,
> > ifelse(probs[i]<p10, 0, 1),
> > ifelse(probs[i]<p00, 0, 1))
> > }
>
> Hi.
>
> If p10 is always more than p00, then try the following.
[...]
> # modification
> a10 <- ifelse(probs<p10, 0, 1)
> a00 <- ifelse(probs<p00, 0, 1)
> vec2 <- ifelse(a10 == a00, a10, NA)
> vec2[1] <- 0
> n <- length(vec2)
> while (any(is.na(vec2))) {
> shift <- c(NA, vec2[-n])
> vec2 <- ifelse(is.na(vec2), shift, vec2)
> }

Hi.

Let me suggest a variant of this, which can be more efficient in some cases.

probs <- c(.1, .3, .2, .4, .7, .9, .3, .4, .5, .6)
p10 <- 0.6
p00 <- 0.4

# original code
vec1 <- rep(0, 10)
for (i in 2:10) {
  vec1[i] <- ifelse(vec1[i-1] == 0,
                    ifelse(probs[i]<p10, 0, 1),
                    ifelse(probs[i]<p00, 0, 1))
}

# modification
a10 <- ifelse(probs<p10, 0, 1)
a00 <- ifelse(probs<p00, 0, 1)
vec2 <- ifelse(a10 == a00, a10, NA)
vec2[1] <- 0
while (1) {
  i <- which(is.na(vec2))
  if (length(i) == 0) break
  vec2[i] <- vec2[i-1]
}

all(vec1 == vec2)

[1] TRUE

Hope this helps.

Petr Savicky.
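How many passes the while loop needs can be estimated before running it. The sketch below assumes that each pass fills only the first NA of every run, so the pass count should equal the length of the longest run of consecutive NA. The variables longest and passes are introduced here for illustration, and the random data and seed are arbitrary.

```r
set.seed(42)                       # arbitrary seed
probs <- runif(1000)
p10 <- 0.6
p00 <- 0.4

a10 <- ifelse(probs < p10, 0, 1)
a00 <- ifelse(probs < p00, 0, 1)
vec2 <- ifelse(a10 == a00, a10, NA)
vec2[1] <- 0

# length of the longest run of consecutive NA (0 if there is none)
r <- rle(is.na(vec2))
longest <- max(c(0, r$lengths[r$values]))

# the which()-based loop from above, with a pass counter added
passes <- 0
while (TRUE) {
  i <- which(is.na(vec2))
  if (length(i) == 0) break
  vec2[i] <- vec2[i - 1]           # right-hand side uses the values from before this pass
  passes <- passes + 1
}

passes == longest
```

This is why the approach does well on data with many short runs and poorly when most values fall between p00 and p10.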
Noia Raindrops
2012-Aug-24 09:37 UTC
[R] updating elements of a vector sequentially - is there a faster way?
Hello,

Within each block of consecutive probs values that lie between p00 and p10, vec1 keeps the last value from before the block.

Example:

probs: .1 .1 .5 .5 .5 .9 .9
vec1 :  0  0  0  0  0  1  1

probs: .9 .9 .5 .5 .5 .1 .1
vec1 :  1  1  1  1  1  0  0

So you can eliminate a loop.

# modification
f5 <- function () {
  vec1 <- as.numeric(as.character(cut(probs, breaks = c(0, p00, p10, 1), labels = c(0, 0.5, 1), include.lowest = TRUE, right = FALSE)))
  # = ifelse(probs < p10, ifelse(probs < p00, 0, 0.5), 1)
  vec1 <- replace(vec1, 1, 0)
  vec1 <- rle(vec1)
  # each run of 0.5 takes the value of the run just before it
  vec1 <- within(unclass(vec1), values[values == 0.5] <- values[which(values == 0.5) - 1])
  vec1 <- inverse.rle(vec1)
  vec1
}

# original
f1 <- function() {
  vec1 <- rep(0, length(probs))
  for (i in 2:length(probs)) {
    vec1[i] <- ifelse(vec1[i-1] == 0,
                      ifelse(probs[i] < p10, 0, 1),
                      ifelse(probs[i] < p00, 0, 1))
  }
  vec1
}

f2 <- function() {
  val.p10 <- ifelse(probs < p10, 0, 1)
  val.p00 <- ifelse(probs < p00, 0, 1)
  vec1 <- rep(0, length(probs))
  for (i in 2:length(probs)) {
    vec1[i] <- if(vec1[i-1] == 0) val.p10[i] else val.p00[i]
  }
  vec1
}

f3 <- function () {
  a10 <- ifelse(probs < p10, 0, 1)
  a00 <- ifelse(probs < p00, 0, 1)
  vec1 <- ifelse(a10 == a00, a10, NA)
  vec1[1] <- 0
  n <- length(vec1)
  while (any(is.na(vec1))) {
    shift <- c(NA, vec1[-n])
    vec1 <- ifelse(is.na(vec1), shift, vec1)
  }
  vec1
}

f4 <- function () {
  a10 <- ifelse(probs < p10, 0, 1)
  a00 <- ifelse(probs < p00, 0, 1)
  vec1 <- ifelse(a10 == a00, a10, NA)
  vec1[1] <- 0
  n <- length(vec1)
  while (1) {
    i <- which(is.na(vec1))
    if (length(i) == 0) break
    vec1[i] <- vec1[i-1]
  }
  vec1
}

set.seed(1)
probs <- runif(10000)
p10 <- 0.6
p00 <- 0.4

identical(f1(), f2())
## [1] TRUE
identical(f1(), f3())
## [1] TRUE
identical(f1(), f4())
## [1] TRUE
identical(f1(), f5())
## [1] TRUE

# with random probs
rbenchmark::benchmark(f1(), f2(), f3(), f4(), f5(), columns = c("test", "replications", "elapsed", "relative"), replications = 100)
##   test replications elapsed  relative
## 1 f1()          100  31.456 42.279570
## 2 f2()          100   4.879  6.557796
## 3 f3()          100   2.503  3.364247
## 4 f4()          100   0.939  1.262097
## 5 f5()          100   0.744  1.000000

# with biased probs
probs <- rep(0.5, 1000)
rbenchmark::benchmark(f1(), f2(), f3(), f4(), f5(), columns = c("test", "replications", "elapsed", "relative"), replications = 100)
##   test replications elapsed   relative
## 1 f1()          100   2.917  30.385417
## 2 f2()          100   0.448   4.666667
## 3 f3()          100  32.439 337.906250
## 4 f4()          100   3.635  37.864583
## 5 f5()          100   0.096   1.000000

--
Noia Raindrops
noia.raindrops at gmail.com
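The three-level coding at the start of f5 can be looked at in isolation. The sketch below compares the cut() call with the nested ifelse given in the comment of f5, on random data; the names coded_cut and coded_ifelse and the seed are introduced here for illustration.

```r
set.seed(1)                        # arbitrary seed
probs <- runif(1000)
p10 <- 0.6
p00 <- 0.4

# 0   : below p00, result is 0 whatever the previous element was
# 0.5 : in [p00, p10), result copies the previous element
# 1   : at or above p10, result is 1 whatever the previous element was
coded_cut <- as.numeric(as.character(
  cut(probs, breaks = c(0, p00, p10, 1), labels = c(0, 0.5, 1),
      include.lowest = TRUE, right = FALSE)))

coded_ifelse <- ifelse(probs < p10, ifelse(probs < p00, 0, 0.5), 1)

all(coded_cut == coded_ifelse)
```

The conversion through as.character is needed because cut() returns a factor, and as.numeric on a factor would give the level codes 1, 2, 3 instead of the labels.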
Petr Savicky
2012-Aug-24 09:53 UTC
[R] updating elements of a vector sequentially - is there a faster way?
On Thu, Aug 23, 2012 at 09:49:33PM -0700, Gopi Goteti wrote:
> I would like to know whether there is a faster way to do the below
> operation (updating vec1).
>
> My objective is to update the elements of a vector (vec1), where a
> particular element i is dependent on the previous one. I need to do this on
> vectors that are 1 million or longer and need to repeat that process
> several hundred times. The for loop works but is slow. If there is a faster
> way, please let me know.
>
> probs <- c(.1, .3, .2, .4, .7, .9, .3, .4, .5, .6)
> p10 <- 0.6
> p00 <- 0.4
> vec1 <- rep(0, 10)
> for (i in 2:10) {
> vec1[i] <- ifelse(vec1[i-1] == 0,
> ifelse(probs[i]<p10, 0, 1),
> ifelse(probs[i]<p00, 0, 1))
> }

Hi.

There are already several solutions that use the fact that each output value either does not depend on the previous value or is a copy of it. The following implements this using the rle() function.

probs <- c(.1, .3, .2, .4, .7, .9, .3, .4, .5, .6)
p10 <- 0.6
p00 <- 0.4

# original code
vec1 <- rep(0, 10)
for (i in 2:10) {
  vec1[i] <- ifelse(vec1[i-1] == 0,
                    ifelse(probs[i]<p10, 0, 1),
                    ifelse(probs[i]<p00, 0, 1))
}

# modification
a10 <- ifelse(probs<p10, 0, 1)
a00 <- ifelse(probs<p00, 0, 1)
vec2 <- ifelse(a10 == a00, a10, -1)   # -1 marks positions that copy the previous value
vec2[1] <- 0
r <- rle(vec2)
rlen <- r$lengths
rval <- r$values
i <- which(rval == -1)
rval[i] <- rval[i-1]                  # each run of -1 takes the value of the run before it
vec2 <- rep(rval, times=rlen)

all(vec1 == vec2)

[1] TRUE

Hope this helps.

Petr Savicky.
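The same "copy the previous value" idea can also be written as a fill-forward with cummax(), without rle() or a while loop. This is a different technique from the ones posted in the thread, offered as a sketch under the same assumption that p10 > p00. The variable src and the random test data are introduced here for illustration.

```r
set.seed(7)                        # arbitrary seed
probs <- runif(1000)
p10 <- 0.6
p00 <- 0.4                         # assumes p10 > p00, as in the replies above

a10 <- ifelse(probs < p10, 0, 1)
a00 <- ifelse(probs < p00, 0, 1)
vec2 <- ifelse(a10 == a00, a10, NA)
vec2[1] <- 0                       # first element is known, so a source always exists

# for each position, the index of the most recent non-NA position at or before it
src <- cummax(ifelse(is.na(vec2), 0L, seq_along(vec2)))
vec2 <- vec2[src]

# compare with the plain loop
vec1 <- numeric(length(probs))
for (i in 2:length(probs)) {
  vec1[i] <- if (vec1[i - 1] == 0) {
    if (probs[i] < p10) 0 else 1
  } else {
    if (probs[i] < p00) 0 else 1
  }
}

all(vec1 == vec2)
```

Unlike the repeated-pass variants, the amount of work here does not depend on how long the runs of NA are. Whether it is faster than the rle() version on vectors of a million elements would need to be timed.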