Hi all,

1. I want to sort the data by ID and y2, then number the rows within each ID: assign a "flag" variable to each row, counting from the first row to the last. For instance, in the following data ID "1" has three rows, and those rows are assigned flag 1, 2, 3 in sequence.

2. In the second step, within each ID, I want to get the difference between consecutive rows for y1 and for y2 (a date). Within each ID the first value of y1dif and y2dif is always 0. Each later value is the current row minus the previous row.

lag<-read.table(text=" ID, y1, y2
1,0,12/25/2014
1,125,9/15/2015
1,350,1/30/2016
2,0,12/25/2012
2,450,9/15/2014
2,750,1/30/2016
2, 656, 11/30/2016
",sep=",",header=TRUE)

The output should look as follows:

ID,flag,y1,y2,y1dif,y2dif
1,1,0,12/25/2014,0,0
1,2,125,9/15/2015,125,264
1,3,350,1/30/2016,225,137
2,1,0,12/25/2012,0,0
2,2,450,9/15/2014,450,629
2,3,750,1/30/2016,300,502
2,4,656,11/30/2016,-94,305

Thank you
Hello,

Try the following.

lag<-read.table(text=" ID, y1, y2
1,0,12/25/2014
1,125,9/15/2015
1,350,1/30/2016
2,0,12/25/2012
2,450,9/15/2014
2,750,1/30/2016
2, 656, 11/30/2016
",sep=",",header=TRUE)

str(lag)
# convert y2 from month/day/year text to class Date
lag$y2 <- as.Date(lag$y2, format = "%m/%d/%Y")
str(lag)

# 1) row counter within each ID
#    (assumes the rows are already sorted by ID and y2)
flag <- ave(lag$ID, lag$ID, FUN = seq_along)
lag2 <- cbind(lag[1], flag, lag[-1])

# 2) within each ID: 0 for the first row, then current row minus previous row
y1dif <- ave(lag2$y1, lag2$ID, FUN = function(y) c(0, y[-1] - y[-length(y)]))
# y2 is of class Date, so these differences are numbers of days;
# unlist(tapply()) returns the values grouped by ID, so the rows
# must already be ordered by ID for them to line up
y2dif <- unlist(tapply(lag2$y2, lag2$ID, FUN = function(y) c(0, y[-1] - y[-length(y)])))

lag2 <- cbind(lag2, y1dif, y2dif)
lag2

Hope this helps,

Rui Barradas

On 15-10-2016 17:57, Val wrote:
> Hi all,
>
> I want sort the data by ID and Y2 then count the number of rows within
> IDs.
> [...]
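The two difference columns can also be sketched with diff() and ave() alone. This is only a minimal sketch under stated assumptions, not the code from the reply: dat and lagdiff are names made up here, the rows within each ID are assumed to be in date order already, and the date column is turned into day counts with as.numeric() so that ave() hands back plain numbers.

```r
# The poster's data, read as character so the result is the same on any R version.
dat <- read.table(text = "ID,y1,y2
1,0,12/25/2014
1,125,9/15/2015
1,350,1/30/2016
2,0,12/25/2012
2,450,9/15/2014
2,750,1/30/2016
2,656,11/30/2016", sep = ",", header = TRUE, stringsAsFactors = FALSE)

dat$y2 <- as.Date(dat$y2, format = "%m/%d/%Y")

# Hypothetical helper: 0 for the first element, then current minus previous.
lagdiff <- function(v) c(0, diff(v))

# ave() applies the helper within each ID and keeps the original row order.
dat$y1dif <- ave(dat$y1, dat$ID, FUN = lagdiff)
# as.numeric() on a Date gives days since 1970-01-01, so the differences are in days.
dat$y2dif <- ave(as.numeric(dat$y2), dat$ID, FUN = lagdiff)

dat
```

Because ave() writes each group's result back into the positions of that group, this version does not need the IDs to be stored in contiguous blocks, although the rows still have to be in date order within each ID for the differences to mean anything.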
I forgot about the sorting part and assumed the data.frame was already sorted. If it is not, sort it after converting y2 to class Date and before computing flag and the differences:

lag <- lag[order(lag$ID, lag$y2), ]

Rui Barradas

On 15-10-2016 19:45, Rui Barradas wrote:
> Hello,
>
> Try the following.
> [...]
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
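With the sort put first, step 1 might look like the sketch below on deliberately shuffled rows. Assumptions: dat is a name made up here, the rows are the poster's data in a different order, and stringsAsFactors = FALSE is set so the dates are read as character on any R version.

```r
# The poster's rows, shuffled on purpose to show the effect of sorting.
dat <- read.table(text = "ID,y1,y2
2,750,1/30/2016
1,350,1/30/2016
2,0,12/25/2012
1,0,12/25/2014
2,656,11/30/2016
1,125,9/15/2015
2,450,9/15/2014", sep = ",", header = TRUE, stringsAsFactors = FALSE)

# Convert first, so that order() compares real dates rather than text.
dat$y2 <- as.Date(dat$y2, format = "%m/%d/%Y")

# Sort by ID, then by date within ID, and reset the row names.
dat <- dat[order(dat$ID, dat$y2), ]
rownames(dat) <- NULL

# Row counter within each ID, computed only after sorting.
dat$flag <- ave(dat$ID, dat$ID, FUN = seq_along)

dat
```

Sorting before the date conversion would order y2 as text rather than chronologically, which is why the conversion comes first.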