Dear All,

I wonder if you have thoughts on the following. Let us say we have:

df <- data.frame(a=c(1,2,3,4,5,1,2,3,4,5,6,7,8), b=c(0,1,2,3,4,0,1,2,3,4,5,6,7))

I would like to rewrite the values in column "a" based on the values in column "b", so that a certain value in column "b" prompts the next value in column "a". In other words, I would like to have this as a result:

df <- data.frame(a=c(1,1,1,1,1,2,2,2,2,2,2,2,2), b=c(0,1,2,3,4,0,1,2,3,4,5,6,7))

where, at the value of 0 in column "b", the number in column "a" changes from 1 to 2. From the first zero in column "b" until the next zero in column "b", the numbers in "a" would not change, i.e. they are all 1 in my example. Then it would change from 2 to 3 when "b" has a zero again, and so on.

I would be grateful for a solution that lets me set the value (from "b") that determines how the values get established in "a" (i.e. let's say that instead of 0 I would want 3 to be the value where 1 changes to 2 in "a"), and that is flexible enough to handle a varying number of rows and a varying number of times 0 shows up in column "b".

Much appreciate your thoughts.

Andras
Your specification is a bit unclear to me, so I'm not sure the below is really what you want. For example, your example seems to imply that a and b must be of the same length, but I do not see that your description requires this.

So the following may not be exactly what you want, but one way to do this (there may be cleverer ones!) is to make use of ?rep. Everything else is just fussy detail. (Your example suggests that you should also learn about ?seq. Both of these should be covered in any good R tutorial, which you should probably spend time with if you haven't already.)

Anyway...

## WARNING: Not thoroughly tested! May (probably :-( ) contain bugs.

f <- function(x, y, switch_val = 0)
{
   wh <- which(y == switch_val)   # positions where y equals the switch value
   len <- length(wh)
   len_x <- length(x)
   if(!len) return(x)             # no switch value found: leave x unchanged
   if(wh[1] == 1){
      if(len == 1) return(rep(x[1], len_x))
      else {
         wh <- wh[-1]             # a switch at position 1 just starts the first run
         len <- len - 1
      }
   }
   count <- c(wh[1] - 1, diff(wh))   # lengths of all runs except the last
   if(wh[len] == len_x) count <- c(count, 1)
   else count <- c(count, len_x - wh[len] + 1)   # length of the last run
   rep(x[seq_along(count)], times = count)
}

> a <- c(1:5,1:8)
> b <- c(0:4,0:7)
> f(a,b)
 [1] 1 1 1 1 1 2 2 2 2 2 2 2 2

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Sun, Aug 6, 2017 at 4:10 AM, Andras Farkas via R-help <r-help at r-project.org> wrote:
> [...]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
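
For readers who have not yet met ?rep and ?seq, here is a minimal sketch of how the two vectors from the original example could be built directly. It only illustrates the `times` argument of rep and the `:` shorthand for seq; the run lengths 5 and 8 are read off the example by hand, and the sketch is not a general solution.

```r
# Column b from the example: 0..4 followed by 0..7.
b <- c(0:4, 0:7)

# Desired column a: the value 1 repeated 5 times, then 2 repeated 8 times.
# `times` gives one repeat count per element of the first argument.
a <- rep(1:2, times = c(5, 8))

df <- data.frame(a = a, b = b)
df$a
# [1] 1 1 1 1 1 2 2 2 2 2 2 2 2
```

The work in any general solution is computing the `times` vector from the positions of the switch value instead of typing it in.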
Hi Andras,

assuming that the increment is always indicated by the same value (in your example 0), this could work:

df$a <- cumsum(seq_along(df$b) %in% which(df$b == 0))
df

HTH,
Ulrik

On Sun, 6 Aug 2017 at 18:06 Bert Gunter <bgunter.4567 at gmail.com> wrote:
> [...]
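
The cumsum idea above can be wrapped so that the switch value is a parameter, which is what the original question asked for. This is a minimal sketch: the helper name `group_id` and its `switch_val` argument are made up for illustration, and it assumes `b` contains no NA values. It also uses `cumsum(b == switch_val)`, which gives the same result as the `seq_along(...) %in% which(...)` form for NA-free input.

```r
# Hypothetical helper: for each row, count how many times `switch_val`
# has appeared in `b` up to and including that row.
group_id <- function(b, switch_val = 0) {
  cumsum(b == switch_val)
}

df <- data.frame(a = c(1:5, 1:8), b = c(0:4, 0:7))

# Switching on 0 reproduces the requested result.
df$a <- group_id(df$b)
df$a
# [1] 1 1 1 1 1 2 2 2 2 2 2 2 2

# Switching on 3 instead: rows before the first 3 get 0.
group_id(df$b, switch_val = 3)
# [1] 0 0 0 1 1 1 1 1 2 2 2 2 2
```

Note that rows before the first occurrence of the switch value are labelled 0 rather than 1; whether that is the desired behaviour depends on how the specification is read.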