Lijun Zhao
2020-Feb-19 06:56 UTC
[R] How to index the occasions in a vector repeatedly under condition 1? if not, it will give a new index.
Dear All,

Could you please help me get the output shown in the following example?

x <- c(543, 543, 543, 543, 551, 551, 1128, 1197, 1197)
diff <- x - lag(x)
diff
[1] NA 0 0 0 8 0 577 69 0

How can I index the occasions in x so that the same index repeats while the diff is at most 15, and a new index is given whenever the diff > 15?

I want the output to look like y:

y <- c(1, 1, 1, 1, 1, 1, 2, 3, 3)

Thanks,
Lijun
Rui Barradas
2020-Feb-19 07:13 UTC
[R] How to index the occasions in a vector repeatedly under condition 1? if not, it will give a new index.
Hello,

First of all, a note about your reproducible example. When you write diff <- x - lag(x), there are two things to be said.

1. There is a base R function named 'diff', so it is better to use another name for the variable.

diff(x)
#[1] 0 0 0 8 0 577 69 0

2. There are also several functions named 'lag', one of them in the base package stats.

x - lag(x)
#[1] 0 0 0 0 0 0 0 0 0
#attr(,"tsp")
#[1] 0 8 1

This is not the one you are using.

x - dplyr::lag(x)
#[1] NA 0 0 0 8 0 577 69 0

That's the one. When your code depends on a package loaded in your session, please start your scripts with library(<pkgname>), in this case library(dplyr).

Now for the question's problem. I will use a different name, 'd', not 'diff', and qualify the function name with the package name prefix.

The main problem is the NA in the first element of 'd'; without it, cumsum(d > 15) would be enough. This works because the logical values FALSE/TRUE are coded as 0/1, so their cumulative sum goes up every time a TRUE is found. Adding is.na(d) to the condition turns the leading NA into TRUE, which also makes the index start at 1.

d <- x - dplyr::lag(x)
cumsum(is.na(d) | d > 15)
#[1] 1 1 1 1 1 1 2 3 3

Hope this helps,

Rui Barradas
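The cumulative-sum idea described in this reply can be traced step by step. The following is only a sketch in base R: it assumes that shifting the vector with c(NA, head(x, -1)) is an acceptable stand-in for dplyr::lag(x), and the names 'prev' and 'new_group' are introduced here for illustration and do not appear in the thread.

```r
x <- c(543, 543, 543, 543, 551, 551, 1128, 1197, 1197)

# Assumed base-R stand-in for dplyr::lag(x): shift right by one, pad with NA.
prev <- c(NA, head(x, -1))
d <- x - prev

# TRUE marks the start of a new group: the first element (NA difference)
# or any jump larger than 15.
new_group <- is.na(d) | d > 15

# Summing the logicals (FALSE = 0, TRUE = 1) gives the running group index.
y <- cumsum(new_group)
y
#[1] 1 1 1 1 1 1 2 3 3
```

Printing 'new_group' before taking the cumulative sum is a quick way to check that the breaks fall where expected.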
PIKAL Petr
2020-Feb-19 10:36 UTC
[R] How to index the occasions in a vector repeatedly under condition 1? if not, it will give a new index.
Hi,

You could get a similar result using the diff function that Rui suggested. Because diff(x) is one element shorter than x, a leading 1 is prepended for the first element and 1 is added to the cumulative sum:

c(1, cumsum(diff(x) > 15) + 1)
[1] 1 1 1 1 1 1 2 3 3

Cheers
Petr
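The diff-based approach could be wrapped in a small reusable function. This is a sketch under stated assumptions, not code from the thread: the name 'group_index' and the 'threshold' argument are hypothetical, the leading TRUE replaces the "prepend 1, add 1" step, and missing values in x are not handled.

```r
# Hypothetical helper (not from the thread): label runs of x, starting a new
# label whenever the jump from the previous element exceeds `threshold`.
group_index <- function(x, threshold = 15) {
  # An empty input has no groups; without this guard the leading TRUE
  # below would produce a spurious index of 1.
  if (length(x) == 0L) return(integer(0))
  # The leading TRUE opens the first group; each later TRUE opens a new one.
  cumsum(c(TRUE, diff(x) > threshold))
}

x <- c(543, 543, 543, 543, 551, 551, 1128, 1197, 1197)
group_index(x)
#[1] 1 1 1 1 1 1 2 3 3

# The labels can then drive grouped summaries, e.g. the mean of each run.
tapply(x, group_index(x), mean)
```

Keeping the threshold as an argument makes it easy to see how the grouping changes with other cut-offs, for example group_index(x, threshold = 5).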