Hi,

I guess you meant this:

dat2 <- read.table(text="
patient_id  t  scores
1           0  1.6
1           1  2.6
1           2  2.2
1           3  1.8
2           0  2.3
2           2  2.5
2           4  2.6
2           5  1.5
3           0  1.2
4           0  1.3
4           1  1.8
",sep="",header=TRUE)

library(plyr)
# Fill in every time point between each patient's first and last t
dat2New <- ddply(dat2,.(patient_id),summarize,t=seq(min(t),max(t)))
res <- join(dat2New,dat2,type="full")

# Average within periods of three time steps (t = 1-3, 4-6, ...);
# the t = 0 row is counted in period 1
lst1 <- lapply(split(res,res$patient_id),function(x) {
  x1 <- x[x$t!=0,]
  do.call(rbind,lapply(split(x1,((x1$t-1)%/%3)+1),function(y) {
    y1 <- if(any(y$t==1)) rbind(x[x$t==0,],y) else y
    data.frame(patient_id=unique(y1$patient_id),t=head(y1$t,1),
               scores=mean(y1$scores,na.rm=TRUE))
  }))
})

# Patients with only a t = 0 record come back empty; use their t = 0 row instead
lst1[lapply(lst1,length)==0] <- lapply(lst1[lapply(lst1,length)==0],function(x)
  x <- dat2[unlist(with(dat2,tapply(t,patient_id,
         FUN=function(x) x==0 & length(x)==1)),use.names=FALSE),])

res1 <- do.call(rbind,lst1)
row.names(res1) <- 1:nrow(res1)
res2 <- res1[,-2]
res2$period <- with(res2,ave(patient_id,patient_id,FUN=seq_along))
res2
#  patient_id scores period
#1          1   2.05      1
#2          2   2.40      1
#3          2   2.05      2
#4          3   1.20      1
#5          4   1.55      1

A.K.

________________________________
From: GUANGUAN LUO <guanguanluo at gmail.com>
To: arun <smartpink111 at yahoo.com>
Sent: Wednesday, May 22, 2013 5:42 AM
Subject: calcul of the mean in a period of time

Hello, AK,

This is the code which you wrote.

dat2 <- read.table(text="
patient_id  t  scores
1           0  1.6
1           1  2.6
1           2  2.2
1           3  1.8
2           0  2.3
2           2  2.5
2           4  2.6
2           5  1.5
",sep="",header=TRUE)

library(plyr)
dat2New <- ddply(dat2,.(patient_id),summarize,t=seq(min(t),max(t)))
res <- join(dat2New,dat2,type="full")
res1 <- do.call(rbind,lapply(split(res,res$patient_id),function(x) {
  x1 <- x[x$t!=0,]
  do.call(rbind,lapply(split(x1,((x1$t-1)%/%3)+1),function(y) {
    y1 <- if(any(y$t==1)) rbind(x[x$t==0,],y) else y
    data.frame(patient_id=unique(y1$patient_id),scores=mean(y1$scores,na.rm=TRUE))
  }))
}))
row.names(res1) <- 1:nrow(res1)
res1$period <- with(res1,ave(patient_id,patient_id,FUN=seq))
res1
#  patient_id scores period
#1          1   2.05      1
#2          2   2.40      1
#3          2   2.05      2

For the same problem: in the code you wrote, you select the data with
x[t!=0]. If some patients have only one record, at t=0, can I change the
code a little so that the information at t=0 is retained? That is, when a
patient has only one score, I would treat the score at t=0 as the average
of period 1 for that patient.

Thank you so much for your help. I have never done any programming before,
so I really don't understand much of it. You are really helpful. Thank you
so much.

GG
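
For comparison, here is a minimal base-R sketch of the same period-averaging idea, using the example data from this thread. It is not the code posted above: the period column and the use of pmax() and aggregate() are assumptions made for illustration. It assumes periods of three time steps (t = 1-3, 4-6, ...) with t = 0 folded into period 1, so a patient with only a t = 0 record keeps that score as the period 1 average.

```r
# Example data from the thread
dat2 <- read.table(text="
patient_id t scores
1 0 1.6
1 1 2.6
1 2 2.2
1 3 1.8
2 0 2.3
2 2 2.5
2 4 2.6
2 5 1.5
3 0 1.2
4 0 1.3
4 1 1.8
", header=TRUE)

# Period index: t = 1-3 -> 1, t = 4-6 -> 2, ...
# (t - 1) %/% 3 + 1 gives 0 for t = 0, so pmax() moves t = 0 into period 1
dat2$period <- pmax(1, ((dat2$t - 1) %/% 3) + 1)

# Mean score per patient and period
res <- aggregate(scores ~ patient_id + period, data = dat2, FUN = mean)

# Order by patient, then period, and reset the row names
res <- res[order(res$patient_id, res$period), ]
row.names(res) <- NULL
res
```

On this data the five means are 2.05, 2.40, 2.05, 1.20 and 1.55. One difference to keep in mind: here the period is computed from t itself, so if a patient has no records at all in some period, that period is simply absent from the result rather than renumbered.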