Hello R User,

In the sample data given below, time is recorded sequentially for each ID. For the analysis, for each ID, I would like to set the first recorded time to zero and thereafter compute the difference from the previous time. I.e., for ID==1, I would like to see Time = 0, 3, 1, 3, 6. This needs to be applied to a big data set.

Any suggestions are much appreciated!

Thanks,
Bibek

ID   Time
1    3
1    6
1    7
1    10
1    16
2    12
2    18
2    19
2    25
2    28
2    30
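To make the requested transformation concrete, here is a minimal sketch for ID 1 only. It assumes base R's diff(), which returns the successive differences of a vector; the variable names time_id1 and delta_id1 are made up for illustration.

```r
# Times recorded for ID 1 in the sample data
time_id1 <- c(3, 6, 7, 10, 16)

# diff() gives the n - 1 successive differences; prepend 0 for the first record
delta_id1 <- c(0, diff(time_id1))

print(delta_id1)
```

The remaining question is how to apply the same step separately within each ID.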
Hi,

Try this:

dat1 <- read.table(text="
ID Time
1 3
1 6
1 7
1 10
1 16
2 12
2 18
2 19
2 25
2 28
2 30
", sep="", header=TRUE)

# difference from the previous time within each ID; first record of each ID is 0
dat1$Time1 <- ave(dat1$Time, dat1$ID, FUN=function(x) c(0, diff(x)))
head(dat1, 3)
#  ID Time Time1
#1  1    3     0
#2  1    6     3
#3  1    7     1

# or (this variant overwrites Time)
dat2 <- unsplit(lapply(split(dat1, dat1$ID), function(x) {x$Time <- c(0, diff(x[,2])); return(x)}), dat1$ID)
head(dat2, 3)
#  ID Time
#1  1    0
#2  1    3
#3  1    1

A.K.

----- Original Message -----
From: bibek sharma <mbhpathak at gmail.com>
To: R help <r-help at r-project.org>
Cc:
Sent: Friday, December 14, 2012 10:51 AM
Subject: [R] Hello R User

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
dataset <- data.frame(id=c(1,1,2,3,3,3), time=c(3,5,1,2,4,6))
dataset
  id time
1  1    3
2  1    5
3  2    1
4  3    2
5  3    4
6  3    6

ids <- unique(dataset$id)
for(id in ids){
  dataset$time[dataset$id==id] <- c(0, diff(dataset$time[dataset$id==id]))
}
dataset
  id time
1  1    0
2  1    2
3  2    0
4  3    0
5  3    2
6  3    2

Might not be the fastest, though.

On 14.12.2012, at 16:51, bibek sharma wrote:
> Hello R User,
> In the sample data given below, time is recorded for each id
> subsequently.
> [...]
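A loop-free variant of the same idea can be sketched as follows. This is only an illustration and rests on an assumption not stated in the thread: the rows of each id are contiguous (already grouped), so the first row of each id can be found with duplicated(). The column name delta is made up here so that the original time column is left intact.

```r
dataset <- data.frame(id = c(1, 1, 2, 3, 3, 3), time = c(3, 5, 1, 2, 4, 6))

# Differences down the whole column, ignoring id boundaries for the moment
delta <- c(0, diff(dataset$time))

# The first row of each id is the row where that id has not been seen before;
# reset those positions to 0 (assumes rows of each id are contiguous)
first_of_id <- !duplicated(dataset$id)
delta[first_of_id] <- 0

dataset$delta <- delta
print(dataset)
```

If the ids were interleaved, the differences at non-first rows would mix values from different ids, so the data would need to be ordered by id first.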
Hi Bibek,

how about this?

dta <- read.table(textConnection("ID Time
1 3
1 6
1 7
1 10
1 16
2 12
2 18
2 19
2 25
2 28
2 30"), header=TRUE)
dta$delta <- with(dta, ave(Time, ID, FUN=function(x) c(0, diff(x))))
dta

hth.

On 14.12.2012 at 16:51, bibek sharma wrote:
> Hello R User,
> In the sample data given below, time is recorded for each id
> subsequently.
> [...]
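One property of ave() that may matter on a larger data set: it returns its results in the original row positions, so the IDs do not have to be contiguous. The sketch below uses a small made-up data frame with interleaved IDs (not from the thread) to illustrate this.

```r
# Hypothetical data with interleaved IDs, for illustration only
dta <- data.frame(ID   = c(1, 2, 1, 2, 1),
                  Time = c(3, 12, 6, 18, 7))

# Differences are taken within each ID, in row order, and written back
# to the rows the values came from
dta$delta <- with(dta, ave(Time, ID, FUN = function(x) c(0, diff(x))))

print(dta)
```

The differences still follow the row order within each ID, so if the times themselves were not in increasing order, the data would have to be sorted by Time within ID beforehand.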
Hi,

You could also use library(data.table) to do this faster.

dat1 <- read.table(text="
ID Time
1 3
1 6
1 7
1 10
1 16
2 12
2 18
2 19
2 25
2 28
2 30
", sep="", header=TRUE)
library(data.table)
dat2 <- data.table(dat1)
res <- dat2[, Time1 := c(0, diff(Time)), by=ID]
head(res, 3)
#   ID Time Time1
#1:  1    3     0
#2:  1    6     3
#3:  1    7     1

# Comparing different approaches:
set.seed(55)
dat3 <- data.frame(ID=rep(1:1000, each=500), Value=sample(1:800, 5e5, replace=TRUE))
dat4 <- data.table(dat3)

# ave()
system.time(dat3$Value1 <- ave(dat3$Value, dat3$ID, FUN=function(x) c(0, diff(x))))
#   user  system elapsed
#  0.312   0.000   0.313

# for loop (note: this overwrites dat3$Value in place)
ids <- unique(dat3$ID)
system.time({
  for(id in ids){
    dat3$Value[dat3$ID==id] <- c(0, diff(dat3$Value[dat3$ID==id]))
  }
})
#   user  system elapsed
# 36.938   0.868  37.873

# data.table
system.time(dat5 <- dat4[, Value1 := c(0, diff(Value)), by=ID])
#   user  system elapsed
#  0.036   0.000   0.037

head(dat5)
#   ID Value Value1
#1:  1   439      0
#2:  1   175   -264
#3:  1    28   -147
#4:  1   634    606
#5:  1   449   -185
#6:  1    60   -389

# Value was overwritten by the loop above, so it now matches Value1
head(dat3)
#  ID Value Value1
#1  1     0      0
#2  1  -264   -264
#3  1  -147   -147
#4  1   606    606
#5  1  -185   -185
#6  1  -389   -389

A.K.
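Before comparing timings it can be worth confirming that the approaches return the same numbers. The following is a rough sketch in base R only, on a smaller simulated data set than the benchmark (the sizes and the names via_ave and via_loop are arbitrary choices for illustration); unlike the benchmark loop, it writes the loop result to a separate vector so the input column is not overwritten.

```r
set.seed(55)
# Smaller than the benchmark data so the loop finishes quickly
dat <- data.frame(ID    = rep(1:50, each = 20),
                  Value = sample(1:800, 1000, replace = TRUE))

# Approach 1: ave() applies the function within each ID
via_ave <- ave(dat$Value, dat$ID, FUN = function(x) c(0, diff(x)))

# Approach 2: explicit loop over IDs, storing results in a new vector
via_loop <- numeric(nrow(dat))
for (id in unique(dat$ID)) {
  rows <- dat$ID == id
  via_loop[rows] <- c(0, diff(dat$Value[rows]))
}

# Compare the two results
print(all.equal(via_ave, via_loop))
```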
Hi,

Try:

dat2[, Time1 := c(0, diff(Time)), by=ID]
dat2[, CumSTime1 := cumsum(Time1), by=ID]
head(dat2, 4)
#   ID Time Time1 CumSTime1
#1:  1    3     0         0
#2:  1    6     3         3
#3:  1    7     1         4
#4:  1   10     3         7

A.K.

----- Original Message -----
From: bibek sharma <mbhpathak at gmail.com>
To: arun <smartpink111 at yahoo.com>
Cc:
Sent: Friday, December 14, 2012 12:56 PM
Subject: Re: [R] Hello R User

Hi Arun,

Great! Once we have Time1, I would again like to add each value to the previous ones, for example to get 0, 3, 4, etc. Can I have a suggestion?

On Fri, Dec 14, 2012 at 9:42 AM, arun <smartpink111 at yahoo.com> wrote:
> Hi,
>
> You could also use library(data.table) to do this faster.
> [...]
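For readers without data.table, the same running total can be sketched in base R with ave() and cumsum(). This is an illustration rather than part of the original replies; the sample data are rebuilt with data.frame() instead of read.table(), and the vector elapsed is a made-up name used to show that the running total equals the time elapsed since the first record of each ID.

```r
# Sample data from the original question
dat1 <- data.frame(ID   = rep(c(1, 2), times = c(5, 6)),
                   Time = c(3, 6, 7, 10, 16, 12, 18, 19, 25, 28, 30))

# Difference from the previous time within each ID (first record is 0)
dat1$Time1 <- ave(dat1$Time, dat1$ID, FUN = function(x) c(0, diff(x)))

# Running total of those differences within each ID
dat1$CumSTime1 <- ave(dat1$Time1, dat1$ID, FUN = cumsum)

# Equivalent shortcut: time minus the first recorded time of the same ID
elapsed <- dat1$Time - ave(dat1$Time, dat1$ID, FUN = function(x) x[1])

print(dat1)
```

Because the differences telescope, the shortcut gives the same values without computing Time1 first.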