Hi all,

I have a data set like

dat <- read.table(text="y1,flag
39958,S
40058,R
40105,X
40294,H
40332,S
40471,R
40493,R
40533,X
40718,H
40771,S
40829,R
40892,X
41056,H
41110,S
41160,R
41222,R
41250,R
41289,R
41324,X
41355,R
41415,X
41562,X
41562,H
41586,S
",sep=",",header=TRUE)

First sort the data by y1.
Then I want to create two columns.

1. The first new column is (x1): if flag is "S" then x1=1, and the
   following rows are assigned 1 as well. When we reach the next "S",
   x1=2, and the rows after it are assigned 2, and so on.

2. The second variable is (z2): within each x1, find the difference
   between each y1 value and the first y1 of that group.

Example for the first few rows

y1, flag, x1, z2
39958, S, 1, 0      z2 is calculated as 39958 - 39958
40058, R, 1, 100    z2 is calculated as 40058 - 39958
40105, X, 1, 147    z2 is calculated as 40105 - 39958
40294, H, 1, 336    z2 is calculated as 40294 - 39958
40332, S, 2, 0      z2 is calculated as 40332 - 40332
etc.

Here is the complete output for the sample data

39958,S,1,0
40058,R,1,100
40105,X,1,147
40294,H,1,336
40332,S,2,0
40471,R,2,139
40493,R,2,161
40533,X,2,201
40718,H,2,386
40771,S,3,0
40829,R,3,58
40892,X,3,121
41056,H,3,285
41110,S,4,0
41160,R,4,50
41222,R,4,112
41250,R,4,140
41289,R,4,179
41324,X,4,214
41355,R,4,245
41415,X,4,305
41562,X,4,452
41562,H,4,452
41586,S,5,0

Val

Hello,

Seems simple:

# 1) Group counter: goes up by one at every "S" row
dat$x1 <- cumsum(dat$flag == "S")

# 2) Within each x1 group, subtract the group's first y1
dat$z2 <- unlist(tapply(dat$y1, dat$x1, function(y) y - y[1]))

Hope this helps,

Rui Barradas

On 12-10-2016 21:15, Val wrote:
> First sort the data by y1.
> Then I want to create two columns.
>
> 1. The first new column is (x1): if flag is "S" then x1=1, and the
>    following rows are assigned 1 as well. When we reach the next "S",
>    x1=2, and the rows after it are assigned 2, and so on.
>
> 2. The second variable is (z2): within each x1, find the difference
>    between each y1 value and the first y1 of that group.
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
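
For anyone who wants to run the two steps above end to end, here is a minimal base-R sketch on the sample data from the original post. The explicit order() call (the question asks for a sort first) and the expected_z2 check vector, copied from the complete output in the post, are additions for illustration and not part of Rui's reply.

```r
# Sample data from the original post, one record per line.
dat <- read.table(text = "y1,flag
39958,S
40058,R
40105,X
40294,H
40332,S
40471,R
40493,R
40533,X
40718,H
40771,S
40829,R
40892,X
41056,H
41110,S
41160,R
41222,R
41250,R
41289,R
41324,X
41355,R
41415,X
41562,X
41562,H
41586,S", sep = ",", header = TRUE)

# Sort by y1 first; order() keeps tied rows in their original order.
dat <- dat[order(dat$y1), ]

# Step 1: group counter that increases at every "S" row.
dat$x1 <- cumsum(dat$flag == "S")

# Step 2: difference from the first y1 of each group.
# tapply() returns the groups in increasing x1 order, which matches the
# row order here because x1 never decreases.
dat$z2 <- unlist(tapply(dat$y1, dat$x1, function(y) y - y[1]),
                 use.names = FALSE)

# z2 values copied from the complete output in the original post.
expected_z2 <- c(0, 100, 147, 336,
                 0, 139, 161, 201, 386,
                 0, 58, 121, 285,
                 0, 50, 112, 140, 179, 214, 245, 305, 452, 452,
                 0)
stopifnot(all(dat$z2 == expected_z2))
```

Note that the unlist(tapply(...)) idiom relies on the rows of each group being contiguous and in group order. That holds for a cumsum() counter, but it would not hold for an arbitrary grouping column.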

Rui,
Thank you!

The second one gave me NULL.

dat$z2 <- unlist(tapply(dat$y1, dat$x1, function(y) y - y[1]))
dat$z2
NULL

On Wed, Oct 12, 2016 at 3:34 PM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
> Hello,
>
> Seems simple:
>
> # 1)
> dat$x1 <- cumsum(dat$flag == "S")
>
> # 2)
> dat$z2 <- unlist(tapply(dat$y1, dat$x1, function(y) y - y[1]))
>
> Hope this helps,
>
> Rui Barradas
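
On the NULL: in base R, assigning NULL to a data frame column removes that column, so dat$z2 printing NULL afterwards suggests that the right-hand side evaluated to NULL in that session. The thread so far does not show why. As a hedged sketch only, the helper below (add_group_offsets is a hypothetical name, not something from the thread) first checks that the expected columns exist and then uses ave() instead of unlist(tapply(...)). ave() returns a vector of the same length as its first argument, so it cannot silently hand back NULL for a non-empty column.

```r
# Hypothetical helper, not from the thread: adds x1 and z2 to a data frame
# that has a numeric column y1 and a flag column.
add_group_offsets <- function(dat) {
  # Fail early if the input is not what the two steps assume.
  stopifnot(is.data.frame(dat),
            c("y1", "flag") %in% names(dat),
            nrow(dat) > 0)

  # Sort by y1, as the original question asks.
  dat <- dat[order(dat$y1), , drop = FALSE]

  # Group counter: increases by one at every "S" row.
  dat$x1 <- cumsum(dat$flag == "S")

  # Difference from the first y1 within each x1 group.
  dat$z2 <- ave(dat$y1, dat$x1, FUN = function(y) y - y[1])
  dat
}

# First six rows of the sample data, typed in directly.
small <- data.frame(y1   = c(39958, 40058, 40105, 40294, 40332, 40471),
                    flag = c("S", "R", "X", "H", "S", "R"))
res <- add_group_offsets(small)
print(res)
```

If the stopifnot() line fails on the real data, looking at names(dat) and str(dat) would be the next step. Also note that any rows before the first "S" end up in group 0 with this counter.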