Tasnuva Tabassum
2013-Feb-23 14:28 UTC
[R] Selecting First Incidence from Longitudinal Data
I have longitudinal competing-risk data of the form:

ID  COMPL  SEX  HEREDITY
1   0      1    2
1   0      1    2
1   3      1    2
2   0      0    1
2   1      0    1
2   2      0    1
2   2      0    1
3   0      0    1
3   0      0    1
3   0      0    1
3   0      0    1
3   2      0    1
4   0      1    2
4   0      1    2

Here COMPL is the health complication of diabetic patients, with value labels 0 = no complication, 1 = coronary heart disease, 2 = retinopathy, 3 = nephropathy.

I want to select only the first complication that occurred to each patient. What R function can I use?
Hi,

Try this:

dat1 <- read.table(text="
ID    COMPL  SEX  HEREDITY
1     0      1    2
1     0      1    2
1     3      1    2
2     0      0    1
2     1      0    1
2     2      0    1
2     2      0    1
3     0      0    1
3     0      0    1
3     0      0    1
3     0      0    1
3     2      0    1
4     0      1    2
4     0      1    2
", sep="", header=TRUE)

library(plyr)
dat2 <- dat1[ddply(dat1, .(ID), summarise, COMPL != 0)[, 2], ]
aggregate(. ~ ID, data = dat2, head, 1)
#  ID COMPL SEX HEREDITY
#1  1     3   1        2
#2  2     1   0        1
#3  3     2   0        1

A.K.

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
David Winsemius
2013-Feb-23 18:10 UTC
[R] Selecting First Incidence from Longitudinal Data
On Feb 23, 2013, at 6:28 AM, Tasnuva Tabassum wrote:

> I want to select only the first complication that occurred to each patient.
> What R function can I use?

dat[ with(dat, ave(COMPL, ID, FUN=function(x) cumsum(x>0) ) ) == 1, ]
   ID COMPL SEX HEREDITY
3   1     3   1        2
5   2     1   0        1
12  3     2   0        1

David Winsemius
Alameda, CA, USA
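To see why the cumsum(x>0) test picks out the first complication, it may help to look at the running count it produces within each patient. The following is a minimal sketch, not part of the original reply. The vector name n_events and the extra COMPL > 0 guard are assumptions added for illustration; the guard only matters if a patient's record could return to 0 after a complication, which does not happen in the posted data.

```r
# Example data as posted in the thread
dat <- read.table(text = "
ID COMPL SEX HEREDITY
1 0 1 2
1 0 1 2
1 3 1 2
2 0 0 1
2 1 0 1
2 2 0 1
2 2 0 1
3 0 0 1
3 0 0 1
3 0 0 1
3 0 0 1
3 2 0 1
4 0 1 2
4 0 1 2
", header = TRUE)

# Running count of non-zero COMPL values within each patient
# (n_events is a hypothetical helper vector, not from the thread)
n_events <- ave(dat$COMPL, dat$ID, FUN = function(x) cumsum(x > 0))

# Inspect the count next to the data; for patient 2 it runs 0, 1, 2, 3
print(cbind(dat[, c("ID", "COMPL")], n_events))

# Keep rows where the count first reaches 1. The COMPL > 0 guard is an
# added assumption: it drops any later zero rows that share that count.
first_event <- dat[n_events == 1 & dat$COMPL > 0, ]
print(first_event)
```

Without the guard, a sequence such as 0, 1, 0, 0 would give counts 0, 1, 1, 1 and select three rows for that patient instead of one.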
Hi,

You can also use:

do.call(rbind, lapply(split(dat1, dat1$ID), function(x) head(x[x$COMPL != 0, ], 1)))
#  ID COMPL SEX HEREDITY
#1  1     3   1        2
#2  2     1   0        1
#3  3     2   0        1
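The split/lapply idiom above can also be written without an explicit loop over patients. The sketch below is an alternative that was not proposed in the thread: it filters the non-zero rows and then uses duplicated() to keep the first remaining row per ID. It assumes, as the other replies do, that rows are already in time order within each patient; the names events and first_event are illustrative.

```r
# Example data as posted in the thread
dat1 <- read.table(text = "
ID COMPL SEX HEREDITY
1 0 1 2
1 0 1 2
1 3 1 2
2 0 0 1
2 1 0 1
2 2 0 1
2 2 0 1
3 0 0 1
3 0 0 1
3 0 0 1
3 0 0 1
3 2 0 1
4 0 1 2
4 0 1 2
", header = TRUE)

# Step 1: keep only rows that record a complication
events <- dat1[dat1$COMPL != 0, ]

# Step 2: duplicated() flags every repeat of an ID after its first
# appearance, so negating it keeps the first complication per patient
first_event <- events[!duplicated(events$ID), ]
print(first_event)
```

Like the reply above, this drops patients who never have a complication, because they have no rows left after step 1.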
Hello,

You can use ?aggregate and ?head to do what you want. Try the following.

dat <- read.table(text="
ID    COMPL  SEX  HEREDITY
1     0      1    2
1     0      1    2
1     3      1    2
2     0      0    1
2     1      0    1
2     2      0    1
2     2      0    1
3     0      0    1
3     0      0    1
3     0      0    1
3     0      0    1
3     2      0    1
4     0      1    2
4     0      1    2
", header = TRUE)

aggregate(. ~ ID, data = subset(dat, COMPL != 0), head, 1)

Hope this helps,

Rui Barradas

On 23-02-2013 14:28, Tasnuva Tabassum wrote:
> I want to select only the first complication that occurred to each patient.
> What R function can I use?
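One property of this approach is worth checking: subset() removes every row with COMPL == 0 before aggregate() runs, so a patient with no complication at all does not appear in the result. A minimal sketch for listing such patients follows; the names res and dropped are illustrative and not from the original reply.

```r
# Example data as posted in the thread
dat <- read.table(text = "
ID COMPL SEX HEREDITY
1 0 1 2
1 0 1 2
1 3 1 2
2 0 0 1
2 1 0 1
2 2 0 1
2 2 0 1
3 0 0 1
3 0 0 1
3 0 0 1
3 0 0 1
3 2 0 1
4 0 1 2
4 0 1 2
", header = TRUE)

# First non-zero COMPL row per patient, as in the reply above
res <- aggregate(. ~ ID, data = subset(dat, COMPL != 0), head, 1)

# IDs present in the data but absent from the result
# (patient 4 in the posted data)
dropped <- setdiff(unique(dat$ID), res$ID)
print(dropped)
```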
Hi,

I am not sure why you are getting different results. I couldn't reproduce your problem.

dat1 <- read.table(text="
ID    COMPL  SEX  HEREDITY
1     0      1    2
1     0      1    2
1     3      1    2
2     0      0    1
2     1      0    1
2     2      0    1
2     2      0    1
3     0      0    1
3     0      0    1
3     0      0    1
3     0      0    1
3     2      0    1
4     0      1    2
4     0      1    2
", sep="", header=TRUE)

do.call(rbind, lapply(split(dat1, dat1$ID), function(x) if (any(x$COMPL != 0)) head(x[x$COMPL != 0, ], 1) else head(x, 1)))
#  ID COMPL SEX HEREDITY
#1  1     3   1        2
#2  2     1   0        1
#3  3     2   0        1
#4  4     0   1        2

You could also try:

dat1[with(dat1, ave(COMPL, ID, FUN = function(x) if (any(x != 0)) cumsum(x > 0) else seq_along(x))) == 1, ]  # modification of David's code
#   ID COMPL SEX HEREDITY
#3   1     3   1        2
#5   2     1   0        1
#12  3     2   0        1
#13  4     0   1        2

A.K.

________________________________
From: Tasnuva Tabassum <t.tasnuva at gmail.com>
To: arun <smartpink111 at yahoo.com>
Sent: Sunday, February 24, 2013 12:08 AM
Subject: Re: [R] Selecting First Incidence from Longitudinal Data

Sorry, I tried this, but it gave me this answer:

#   ID COMPL SEX HEREDITY
#1   1     0   1        2
#4   2     0   0        1
#8   3     0   0        1
#13  4     0   1        2

On Sat, Feb 23, 2013 at 8:44 PM, arun <smartpink111 at yahoo.com> wrote:

>Hi,
>Try this:
>#dat1
>do.call(rbind, lapply(split(dat1, dat1$ID), function(x) if (any(x$COMPL != 0)) head(x[x$COMPL != 0, ], 1) else head(x, 1)))
>#  ID COMPL SEX HEREDITY
>#1  1     3   1        2
>#2  2     1   0        1
>#3  3     2   0        1
>#4  4     0   1        2
>A.K.
>
>________________________________
>From: Tasnuva Tabassum <t.tasnuva at gmail.com>
>To: Xiaogang Su <xiaogangsu at gmail.com>
>Cc: arun <smartpink111 at yahoo.com>; R help <r-help at r-project.org>; Rui Barradas <ruipbarradas at sapo.pt>
>Sent: Saturday, February 23, 2013 11:23 PM
>Subject: Re: [R] Selecting First Incidence from Longitudinal Data
>
>Hi
>Thank you very much, but I forgot to mention that I also want to include the patients for whom no complication occurred. That is, for my data I want to include patient no. 4, for whom the COMPL value will be 0.
>
>In that case, what R code should I write?
>
>On Sat, Feb 23, 2013 at 12:23 PM, Xiaogang Su <xiaogangsu at gmail.com> wrote:
>
>>My bad. I didn't try it out with the real data. Here you go. HTH, X
>>
>>dat <- read.table(text="
>>ID    COMPL  SEX  HEREDITY
>>1     0      1    2
>>1     0      1    2
>>1     3      1    2
>>2     0      0    1
>>2     1      0    1
>>2     2      0    1
>>2     2      0    1
>>3     0      0    1
>>3     0      0    1
>>3     0      0    1
>>3     0      0    1
>>3     2      0    1
>>4     0      1    2
>>4     0      1    2
>>", header = TRUE)
>>
>>dat0 <- dat[dat$COMPL!=0, ]
>>dat0$sequence <- as.vector(unlist(lapply(aggregate(dat0$ID, by=list(dat0$ID), FUN=length)$x, FUN=function(x){seq(1, x)})))
>>dat0 <- dat0[dat0$sequence==1, ]
>>dat0
>>
>>On Sat, Feb 23, 2013 at 2:09 PM, arun <smartpink111 at yahoo.com> wrote:
>>
>>>Hi,
>>>Tried your approach:
>>>
>>>dat1$sequence <- as.vector(unlist(lapply(aggregate(dat1$ID, by=list(dat1$ID), FUN=length)$x, FUN=function(x){seq(1, x)})))
>>>dat0 <- dat1[dat1$sequence==1 & dat1$COMPL!= 0, ] #your second solution
>>>dat0
>>>#[1] ID       COMPL    SEX      HEREDITY sequence
>>>#<0 rows> (or 0-length row.names)
>>>
>>>dat1[dat1$sequence==1,] #here the OP wanted first incidence where COMPL!=0
>>>#   ID COMPL SEX HEREDITY sequence
>>>#1   1     0   1        2        1
>>>#4   2     0   0        1        1
>>>#8   3     0   0        1        1
>>>#13  4     0   1        2        1
>>>A.K.
>>>
>>>----- Original Message -----
>>>From: Xiaogang Su <xiaogangsu at gmail.com>
>>>To: Rui Barradas <ruipbarradas at sapo.pt>
>>>Cc: r-help at r-project.org
>>>Sent: Saturday, February 23, 2013 2:15 PM
>>>Subject: Re: [R] Selecting First Incidence from Longitudinal Data
>>>
>>>Try this:
>>>dat$sequence <- as.vector(unlist(lapply(aggregate(dat$ID, by=list(dat$ID),
>>>FUN=length)$x, FUN=function(x){seq(1, x)})))
>>>dat0 <- dat[dat$sequence==1, ]
>>>
>>>HTH, X
>>>
>>>--
>>>=============================
>>>Xiaogang Su, Ph.D.
>>>Associate Professor & Statistician
>>>School of Nursing, University of Alabama
>>>Birmingham, AL 35294-1210
>>>(205) 934-2355 [Office]
>>>xgsu at uab.edu
>>>xiaogangsu at gmail.com
>>>https://sites.google.com/site/xgsu00/
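The requirement that emerged over the course of this thread is one row per patient: the first row with a non-zero COMPL where one exists, and otherwise the patient's first row. A minimal sketch of that rule as a reusable function follows. The function name first_incidence and its arguments are hypothetical and not from any package; the sketch assumes rows are in time order within each ID and that COMPL has no missing values.

```r
# Example data as posted in the thread
dat <- read.table(text = "
ID COMPL SEX HEREDITY
1 0 1 2
1 0 1 2
1 3 1 2
2 0 0 1
2 1 0 1
2 2 0 1
2 2 0 1
3 0 0 1
3 0 0 1
3 0 0 1
3 0 0 1
3 2 0 1
4 0 1 2
4 0 1 2
", header = TRUE)

# Hypothetical helper: returns one row per patient.
# d     : data frame, rows in time order within each patient
# id    : name of the patient identifier column
# event : name of the complication column (0 = none, no NAs assumed)
first_incidence <- function(d, id = "ID", event = "COMPL") {
  # Row positions grouped by patient
  rows_by_id <- split(seq_len(nrow(d)), d[[id]])
  pick <- vapply(rows_by_id, function(i) {
    hit <- i[d[[event]][i] != 0]            # rows with a complication
    if (length(hit) > 0) hit[1] else i[1]   # fall back to first row
  }, integer(1))
  d[unname(pick), ]
}

res <- first_incidence(dat)
print(res)
```

Working with row positions rather than sub-data-frames keeps the original row names (3, 5, 12 and 13 for the posted data), which makes the result easy to compare with the ave()-based answer earlier in the thread.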