I have the complete data like

id time censor
1  10   0
1  20   0
1  30   0
2  10   0
2  20   1
2  30   0
2  40   0
3  10   0
3  20   0
3  30   1
....

For id 1, I want to select the last row, since every censor indicator is 0; for id 2, I want to select the row where censor == 1; for id 3, I also want to select the row where censor == 1. So if an id has a row with censor equal to 1, I want to select that row; otherwise I want to select the last observation for that id. Is there a quick way to do this?
gallon li wrote:
> for id 1, i want to select the last row since all censor indicator is 0; for
> id 2, i want to select the row where censor ==1; for id 3, i also want to
> select the row where censor==1. So if there is a 1 for censor, then I want
> to select such a row, otherwise I want to select the last obs. for this id.
> I am wondering if there is a quick way to solve this?

?subset

and please do read an introductory text....
---------- Forwarded message ----------
From: gallon li <gallon.li@gmail.com>
Date: Tue, Nov 25, 2008 at 1:58 PM
Subject: Re: [R] select a subset
To: Stefan Grosse <singularitaet@gmx.net>

I am sorry, but my question cannot be solved with subset alone. The selection criterion differs from id to id, and that is not easy to express in the subset function. Yes, I did think about this function before, but I couldn't find a way to use it.

On Mon, Nov 24, 2008 at 8:42 PM, Stefan Grosse <singularitaet@gmx.net> wrote:
> gallon li wrote:
> > for id 1, i want to select the last row since all censor indicator is 0; for
> > id 2, i want to select the row where censor ==1; for id 3, i also want to
> > select the row where censor==1. So if there is a 1 for censor, then I want
> > to select such a row, otherwise I want to select the last obs. for this id.
> > I am wondering if there is a quick way to solve this?
>
> ?subset
>
> and please do read an introductory text....
How about something like:

censor_choose <- function(fr)
  do.call(rbind,
          lapply(split(fr, fr$id),
                 function(sub)
                   # if the group has a censor == 1 row, pick it;
                   # otherwise pick the row with the largest time
                   sub[which.max(if (max(sub$censor)) sub$censor else sub$time), ]))

Using your data,

itc <- data.frame(id     = c(1,1,1,2,2,2,2,3,3,3),
                  time   = c(1,2,3,1,2,3,4,1,2,3),
                  censor = c(0,0,0,0,1,0,0,0,0,1))

we get

censor_choose(itc) =>
  id time censor
1  1    3      0
2  2    2      1
3  3    3      1

Modularizing the above a bit better, I get:

choose_row_from_groups <- function(frame, grouping, filter)
  do.call(rbind,
          lapply(split(frame, grouping),
                 function(sub) sub[filter(sub), ]))

choose_row_from_groups(itc, itc$id,
                       function(sub)
                         which.max(if (max(sub$censor)) sub$censor else sub$time))

But there must be some more standard R way to do choose_row_from_groups?

-s

On Mon, Nov 24, 2008 at 4:15 AM, gallon li <gallon.li at gmail.com> wrote:
> I have the complete data like
>
> id time censor
> 1 10 0
> 1 20 0
> 1 30 0
> 2 10 0
> 2 20 1
> 2 30 0
> 2 40 0
> 3 10 0
> 3 20 0
> 3 30 1
> ....
>
> for id 1, i want to select the last row since all censor indicator is 0; for
> id 2, i want to select the row where censor ==1; for id 3, i also want to
> select the row where censor==1. So if there is a 1 for censor, then I want
> to select such a row, otherwise I want to select the last obs. for this id.
> I am wondering if there is a quick way to solve this?
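
On the closing question about a more standard approach: one common base-R idiom is to sort the rows and then keep the first row of each id with duplicated(). The following is only a sketch, not a definitive answer. It assumes that "last observation" means the largest time within an id and that each id has at most one row with censor == 1 (with several such rows it would keep the latest one, whereas the which.max() version keeps the first). The names dat, sorted and picked are made up for this illustration; the data are the values from the original question.

```r
# Data as posted in the original question (times 10, 20, ...).
dat <- data.frame(
  id     = c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3),
  time   = c(10, 20, 30, 10, 20, 30, 40, 10, 20, 30),
  censor = c(0, 0, 0, 0, 1, 0, 0, 0, 0, 1)
)

# Within each id, put censored rows first and, among rows with the
# same censor value, put later times first.
sorted <- dat[order(dat$id, -dat$censor, -dat$time), ]

# Keep only the first row of each id in that ordering.
picked <- sorted[!duplicated(sorted$id), ]
print(picked)
```

Because the whole selection is expressed as one sort plus one logical index, there is no per-group function call, which tends to matter on large data frames.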