Rui,
Thank you!

The second one gave me NULL.

dat$z2 <- unlist(tapply(dat$y1, dat$x1, function(y) y - y[1]))

dat$z2
NULL

On Wed, Oct 12, 2016 at 3:34 PM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
> Hello,
>
> Seems simple:
>
> # 1)
> dat$x1 <- cumsum(dat$flag == "S")
>
> # 2)
> dat$z2 <- unlist(tapply(dat$y1, dat$x1, function(y) y - y[1]))
>
> Hope this helps,
>
> Rui Barradas
>
> On 12-10-2016 21:15, Val wrote:
>>
>> Hi all,
>>
>> I have a data set like
>> dat<-read.table(text=" y1, flag
>> 39958,S
>> 40058,R
>> 40105,X
>> 40294,H
>> 40332,S
>> 40471,R
>> 40493,R
>> 40533,X
>> 40718,H
>> 40771,S
>> 40829,R
>> 40892,X
>> 41056,H
>> 41110,S
>> 41160,R
>> 41222,R
>> 41250,R
>> 41289,R
>> 41324,X
>> 41355,R
>> 41415,X
>> 41562,X
>> 41562,H
>> 41586,S
>> ",sep=",",header=TRUE)
>>
>> First sort the data by y1.
>> Then I want to create two columns.
>> 1. The first new column is (x1): if flag is "S" then x1=1, and
>> the following/subsequent rows are assigned 1 as well. When we reach
>> the next "S" then x1=2 and the subsequent rows will be assigned
>> 2.
>>
>> 2. The second variable (z2): within each x1, find the difference
>> between the first y1 and subsequent y1 values.
>>
>> Example for the first few rows
>> y1, flag, x1, z2
>> 39958, S, 1, 0     z2 is calculated as z2 = 39958 - 39958
>> 40058, R, 1, 100   z2 is calculated as z2 = 40058 - 39958
>> 40105, X, 1, 147   z2 is calculated as z2 = 40105 - 39958
>> 40294, H, 1, 336   z2 is calculated as z2 = 40294 - 39958
>> 40332, S, 2, 0     z2 is calculated as z2 = 40332 - 40332
>> etc
>>
>> Here is the complete output for the sample data
>> 39958,S,1,0
>> 40058,R,1,100
>> 40105,X,1,147
>> 40294,H,1,336
>> 40332,S,2,0
>> 40471,R,2,139
>> 40493,R,2,161
>> 40533,X,2,201
>> 40718,H,2,386
>> 40771,S,3,0
>> 40829,R,3,58
>> 40892,X,3,121
>> 41056,H,3,285
>> 41110,S,4,0
>> 41160,R,4,50
>> 41222,R,4,112
>> 41250,R,4,140
>> 41289,R,4,179
>> 41324,X,4,214
>> 41355,R,4,245
>> 41415,X,4,305
>> 41562,X,4,452
>> 41562,H,4,452
>> 41586,S,5,0
>>
>> Val
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
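
For readers following the thread, here is a minimal sketch of why step 1) yields the group counter described in the question. The short `flag` vector is a made-up toy input, not the poster's data. Comparing against "S" gives TRUE at each group start, and `cumsum()` counts how many starts have been seen so far.

```r
# Toy input (hypothetical, for illustration only)
flag <- c("S", "R", "X", "H", "S", "R", "S")

# TRUE wherever a new group starts
is_start <- flag == "S"

# Running count of group starts: each row gets the number of the
# most recent "S" at or above it
x1 <- cumsum(is_start)
print(x1)
```

Note that any rows appearing before the first "S" would get 0 under this scheme.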
Hello,
You must run the code to create x1 first, part 1), then part 2).
I've tested with your data and all went well, the result is the following.
> dput(dat)
structure(list(y1 = c(39958L, 40058L, 40105L, 40294L, 40332L,
40471L, 40493L, 40533L, 40718L, 40771L, 40829L, 40892L, 41056L,
41110L, 41160L, 41222L, 41250L, 41289L, 41324L, 41355L, 41415L,
41562L, 41562L, 41586L), flag = structure(c(3L, 2L, 4L, 1L, 3L,
2L, 2L, 4L, 1L, 3L, 2L, 4L, 1L, 3L, 2L, 2L, 2L, 2L, 4L, 2L, 4L,
4L, 1L, 3L), .Label = c("H", "R", "S", "X"), class = "factor"),
x1 = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L), z2 = c(0L, 100L,
147L, 336L, 0L, 139L, 161L, 201L, 386L, 0L, 58L, 121L, 285L,
0L, 50L, 112L, 140L, 179L, 214L, 245L, 305L, 452L, 452L,
0L)), .Names = c("y1", "flag", "x1", "z2"), row.names = c(NA,
-24L), class = "data.frame")
Rui Barradas
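
A possible alternative for step 2), sketched here on the first six rows of the sample data: `unlist(tapply(...))` returns the values group by group, so it lines up with the rows only because x1 never decreases. Base R's `ave()` returns its result in the original row order, so it does not rely on that. This is a swapped-in variant, not the code posted above.

```r
# First six rows of the sample data from the thread
dat <- data.frame(
  y1   = c(39958L, 40058L, 40105L, 40294L, 40332L, 40471L),
  flag = c("S", "R", "X", "H", "S", "R")
)

# Step 1) has to run first: it creates the grouping column x1
dat$x1 <- cumsum(dat$flag == "S")

# Step 2) with ave(): subtract the first y1 of each x1 group,
# with the result kept in the original row order
dat$z2 <- ave(dat$y1, dat$x1, FUN = function(y) y - y[1])

print(dat)
```

On data like this, where the groups are contiguous, both approaches should give the same z2.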
Thank you Rui,

It worked! How about if the first variable is in date format? Like the following:

dat<-read.table(text=" y1, flag
24-01-2016,S
24-02-2016,R
24-03-2016,X
24-04-2016,H
24-01-2016,S
24-11-2016,R
24-10-2016,R
24-02-2016,X
24-01-2016,H
24-11-2016,S
24-02-2016,R
24-10-2016,X
24-03-2016,H
24-04-2016,S
",sep=",",header=TRUE)
dat
dat$x1 <- cumsum(dat$flag == "S")
dat$z2 <- unlist(tapply(dat$y1, dat$x1, function(y) y - y[1]))

Error message:
In Ops.factor(y, y[1]) : '-' not meaningful for factors

On Thu, Oct 13, 2016 at 5:30 AM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
> Hello,
>
> You must run the code to create x1 first, part 1), then part 2).
> I've tested with your data and all went well, the result is the following.
>
> > dput(dat)
> structure(list(y1 = c(39958L, 40058L, 40105L, 40294L, 40332L,
> 40471L, 40493L, 40533L, 40718L, 40771L, 40829L, 40892L, 41056L,
> 41110L, 41160L, 41222L, 41250L, 41289L, 41324L, 41355L, 41415L,
> 41562L, 41562L, 41586L), flag = structure(c(3L, 2L, 4L, 1L, 3L,
> 2L, 2L, 4L, 1L, 3L, 2L, 4L, 1L, 3L, 2L, 2L, 2L, 2L, 4L, 2L, 4L,
> 4L, 1L, 3L), .Label = c("H", "R", "S", "X"), class = "factor"),
> x1 = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L,
> 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L), z2 = c(0L, 100L,
> 147L, 336L, 0L, 139L, 161L, 201L, 386L, 0L, 58L, 121L, 285L,
> 0L, 50L, 112L, 140L, 179L, 214L, 245L, 305L, 452L, 452L,
> 0L)), .Names = c("y1", "flag", "x1", "z2"), row.names = c(NA,
> -24L), class = "data.frame")
>
> Rui Barradas
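
One possible way to handle the date case, sketched on the first six rows of the data above. The error arises because read.table() stored y1 as a factor, and subtraction is not defined for factors. The sketch assumes the strings are day-month-year (hence the format "%d-%m-%Y", which is an assumption about the data), converts them with as.Date(), and takes differences in days. It uses ave() rather than the tapply() call from the thread.

```r
# First six rows of the date example; keep strings as character
dat <- read.table(text = "y1,flag
24-01-2016,S
24-02-2016,R
24-03-2016,X
24-04-2016,H
24-01-2016,S
24-11-2016,R
", sep = ",", header = TRUE, stringsAsFactors = FALSE)

# Assumption: dates are written as day-month-year
dat$y1 <- as.Date(dat$y1, format = "%d-%m-%Y")

# Group counter, exactly as in step 1)
dat$x1 <- cumsum(dat$flag == "S")

# Days since the first date of each group; as.numeric() turns a
# Date into a day count, so plain subtraction works
days <- as.numeric(dat$y1)
dat$z2 <- ave(days, dat$x1, FUN = function(d) d - d[1])

print(dat)
```

Since the dates in the full example are not in increasing order within each group, some differences would come out negative there. Sorting by y1 beforehand would change which rows follow each "S", so whether to sort depends on what the groups are meant to represent.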