Rui,
Thank You!

the second one gave me NULL.
dat$z2 <- unlist(tapply(dat$y1, dat$x1, function(y) y - y[1]))

dat$z2
NULL

On Wed, Oct 12, 2016 at 3:34 PM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
> Hello,
>
> Seems simple:
>
> # 1)
> dat$x1 <- cumsum(dat$flag == "S")
>
> # 2)
> dat$z2 <- unlist(tapply(dat$y1, dat$x1, function(y) y - y[1]))
>
> Hope this helps,
>
> Rui Barradas
>
> On 12-10-2016 21:15, Val wrote:
>>
>> Hi all,
>>
>> I have a data set like
>> dat<-read.table(text=" y1, flag
>> 39958,S
>> 40058,R
>> 40105,X
>> 40294,H
>> 40332,S
>> 40471,R
>> 40493,R
>> 40533,X
>> 40718,H
>> 40771,S
>> 40829,R
>> 40892,X
>> 41056,H
>> 41110,S
>> 41160,R
>> 41222,R
>> 41250,R
>> 41289,R
>> 41324,X
>> 41355,R
>> 41415,X
>> 41562,X
>> 41562,H
>> 41586,S
>> ",sep=",",header=TRUE)
>>
>> First sort the data by y1.
>> Then I want to create two columns.
>> 1. The first new column is (x1): if flag is "S" then x1=1, and
>> the following rows are assigned 1 as well. When we reach
>> the next "S" then x1=2 and the subsequent rows are assigned
>> 2, and so on.
>>
>> 2. The second variable is (z2). Within each x1, find the difference
>> between each y1 value and the first y1 of that group.
>>
>> Example for the first few rows
>> y1, flag, x1, z2
>> 39958, S, 1, 0     z2 is calculated as z2 = 39958 - 39958
>> 40058, R, 1, 100   z2 is calculated as z2 = 40058 - 39958
>> 40105, X, 1, 147   z2 is calculated as z2 = 40105 - 39958
>> 40294, H, 1, 336   z2 is calculated as z2 = 40294 - 39958
>> 40332, S, 2, 0     z2 is calculated as z2 = 40332 - 40332
>> etc
>>
>> Here is the complete output for the sample data
>> 39958,S,1,0
>> 40058,R,1,100
>> 40105,X,1,147
>> 40294,H,1,336
>> 40332,S,2,0
>> 40471,R,2,139
>> 40493,R,2,161
>> 40533,X,2,201
>> 40718,H,2,386
>> 40771,S,3,0
>> 40829,R,3,58
>> 40892,X,3,121
>> 41056,H,3,285
>> 41110,S,4,0
>> 41160,R,4,50
>> 41222,R,4,112
>> 41250,R,4,140
>> 41289,R,4,179
>> 41324,X,4,214
>> 41355,R,4,245
>> 41415,X,4,305
>> 41562,X,4,452
>> 41562,H,4,452
>> 41586,S,5,0
>>
>> Val
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

Hello,

You must run the code to create x1 first, part 1), then part 2).
I've tested with your data and all went well, the result is the following.

> dput(dat)
structure(list(y1 = c(39958L, 40058L, 40105L, 40294L, 40332L,
40471L, 40493L, 40533L, 40718L, 40771L, 40829L, 40892L, 41056L,
41110L, 41160L, 41222L, 41250L, 41289L, 41324L, 41355L, 41415L,
41562L, 41562L, 41586L), flag = structure(c(3L, 2L, 4L, 1L, 3L,
2L, 2L, 4L, 1L, 3L, 2L, 4L, 1L, 3L, 2L, 2L, 2L, 2L, 4L, 2L, 4L,
4L, 1L, 3L), .Label = c("H", "R", "S", "X"), class = "factor"),
    x1 = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L,
    4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L), z2 = c(0L, 100L,
    147L, 336L, 0L, 139L, 161L, 201L, 386L, 0L, 58L, 121L, 285L,
    0L, 50L, 112L, 140L, 179L, 214L, 245L, 305L, 452L, 452L,
    0L)), .Names = c("y1", "flag", "x1", "z2"), row.names = c(NA,
-24L), class = "data.frame")

Rui Barradas

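For anyone following along, the two steps can be run end to end as below. This is only a sketch on the first nine rows of the sample data, and it swaps in ave() for the unlist(tapply(...)) call from the thread; that substitution is my assumption, not what was posted. The tapply version returns values group by group, which lines up with the rows here because x1 never decreases, while ave() puts each result back in the position of its own row.

```r
# First nine rows of the sample data from the thread (subset, to keep this short)
dat <- read.table(text = "y1,flag
39958,S
40058,R
40105,X
40294,H
40332,S
40471,R
40493,R
40533,X
40718,H", sep = ",", header = TRUE)

# Sort by y1 first, as the original post asks
dat <- dat[order(dat$y1), ]

# 1) group counter: goes up by one at every "S" row
dat$x1 <- cumsum(dat$flag == "S")

# 2) difference from the first y1 within each x1 group;
#    ave() returns one value per row, in row order
dat$z2 <- ave(dat$y1, dat$x1, FUN = function(y) y - y[1])

dat
```

Step 2 reads dat$x1, so it has to run after step 1; running it on its own cannot produce z2.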
Thank you Rui,

It Worked!

How about if the first variable is date format? Like the following

dat<-read.table(text=" y1, flag
24-01-2016,S
24-02-2016,R
24-03-2016,X
24-04-2016,H
24-01-2016,S
24-11-2016,R
24-10-2016,R
24-02-2016,X
24-01-2016,H
24-11-2016,S
24-02-2016,R
24-10-2016,X
24-03-2016,H
24-04-2016,S
",sep=",",header=TRUE)
dat
dat$x1 <- cumsum(dat$flag == "S")
dat$z2 <- unlist(tapply(dat$y1, dat$x1, function(y) y - y[1]))

error message
In Ops.factor(y, y[1]) : ‘-’ not meaningful for factors

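One possible way to handle the date case, offered as a sketch rather than a reply from the thread: the message comes from y1 being read as a factor (read.table converted strings to factors by default in R versions before 4.0.0), and subtraction is not defined for factors. Converting y1 with as.Date() and the "%d-%m-%Y" format, which I am assuming from the look of the sample values, makes the subtraction meaningful. The example uses only the first six rows and, as in the post, leaves them unsorted; z2 is then the number of days since the first date in each group.

```r
# First six rows of the date example from the thread (subset)
dat <- read.table(text = "y1,flag
24-01-2016,S
24-02-2016,R
24-03-2016,X
24-04-2016,H
24-01-2016,S
24-11-2016,R", sep = ",", header = TRUE, stringsAsFactors = FALSE)

# Parse day-month-year strings into Date objects (format is an assumption)
dat$y1 <- as.Date(dat$y1, format = "%d-%m-%Y")

# 1) group counter, same as before
dat$x1 <- cumsum(dat$flag == "S")

# 2) days elapsed since the first date of each group;
#    as.numeric() turns each Date into a day count so ave() returns plain numbers
dat$z2 <- ave(as.numeric(dat$y1), dat$x1, FUN = function(d) d - d[1])

dat
```

If a difftime column is preferred over plain numbers, as.difftime(dat$z2, units = "days") should convert it afterwards.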