Andrzej Bienczak
2013-Nov-21 18:59 UTC
[R] How to add unique occasions based on date within a subject in R?
Hi All,

I'm trying to figure out how to add a column to my data set that counts unique occasions based on date. Here is part of my data set:

       trialno     event       date  time
    3    11301 pm_intake 2010-11-24 19:00
    4    11301 am_intake 2010-11-25 07:00
    5    11301       pk1 2010-11-25 10:30
    6    11301 pm_intake 2010-12-22 19:00
    7    11301 am_intake 2010-12-23 07:00
    8    11301       pk1 2010-12-23 09:54
    9    11301       pk2 2010-12-23 13:07
    10   11301 pm_intake 2011-02-02 19:00
    11   11301 am_intake 2011-02-03 07:00
    12   11301       pk1 2011-02-03 11:30

Basically, each date within each patient indicates a new occasion. If a patient has just a drug administration, that is one occasion, but if a patient had a drug administration and two measurements on the same day, they all count as the same occasion. The data set does not have a regular pattern (each patient has a different number of events on each date and a different number of events in total).

What I'm trying to achieve is:

       trialno     event       date  time OCC
    3    11301 pm_intake 2010-11-24 19:00   1
    4    11301 am_intake 2010-11-25 07:00   2
    5    11301       pk1 2010-11-25 10:30   2
    6    11301 pm_intake 2010-12-22 19:00   3
    7    11301 am_intake 2010-12-23 07:00   4
    8    11301       pk1 2010-12-23 09:54   4
    9    11301       pk2 2010-12-23 13:07   4
    10   11301 pm_intake 2011-02-02 19:00   5
    11   11301 am_intake 2011-02-03 07:00   6
    12   11301       pk1 2011-02-03 11:30   6

I think I should apply some kind of loop to identify the unique dates within each patient and count them. I thought about splitting the whole data set into patients using the split function:

    splitData <- split(data, data$trialno)

and then applying lapply and transform to add a new column OCC (occasion), but I don't know how to count the occasions as integers. I was thinking:

    splitData <- lapply(splitData, function(df) {
      transform(df, OCC = ???)
    })
    do.call("rbind", splitData)

I know how to do it in Excel:

    =IF(D5=D4, E4, E4+1)

(If the value in the neighbouring cell is the same as in the cell above it, then the value in my cell is the same as in the one above; otherwise it is one greater.) This way the first cell in column E has to be 1, and the others are integers that increase with each new date.

Help much appreciated!

Andrzej
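The Excel rule described above (keep the value from the row above when the date is unchanged, otherwise add one) can be sketched in R as a cumulative sum of date changes within each patient. This is only an illustrative sketch: the helper name occ_from_changes is hypothetical, the rows for patient 11302 are made up, and it assumes the rows of each patient are already sorted by date.

```r
# Illustrative data only: patient 11301 is taken from the post, 11302 is made up.
dat <- data.frame(
  trialno = rep(c(11301L, 11302L), times = c(10, 3)),
  date = c("2010-11-24", "2010-11-25", "2010-11-25", "2010-12-22",
           "2010-12-23", "2010-12-23", "2010-12-23", "2011-02-02",
           "2011-02-03", "2011-02-03",
           "2010-11-24", "2010-11-24", "2010-11-25"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: mirrors =IF(D5=D4, E4, E4+1).
# The first row always starts occasion 1; every later row that differs
# from the row above it starts a new occasion.
occ_from_changes <- function(d) {
  changed <- c(TRUE, d[-1] != d[-length(d)])
  cumsum(changed)
}

# Apply the rule separately within each patient, keeping the original row order.
dat$OCC <- ave(seq_len(nrow(dat)), dat$trialno,
               FUN = function(i) occ_from_changes(dat$date[i]))
dat$OCC
# [1] 1 2 2 3 4 4 4 5 6 6 1 1 2
```

Because the rule only compares neighbouring rows, a date that reappears later in unsorted data would be counted as a new occasion, just as it would in the spreadsheet.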
arun
2013-Nov-21 19:42 UTC
[R] How to add unique occasions based on date within a subject in R?
Hi,

Maybe you can try:

    ### Use dput() to share example data
    dat1 <- structure(list(
      trialno = c(11301L, 11301L, 11301L, 11301L, 11301L, 11301L, 11301L,
                  11301L, 11301L, 11301L, 11302L, 11302L, 11302L, 11302L,
                  11302L, 11302L, 11302L, 11302L, 11302L, 11302L),
      event = c("pm_intake", "am_intake", "pk1", "pm_intake", "am_intake",
                "pk1", "pk2", "pm_intake", "am_intake", "pk1",
                "pm_intake", "am_intake", "pk1", "pm_intake", "am_intake",
                "pk1", "pk2", "pm_intake", "am_intake", "pk1"),
      date = c("2010-11-24", "2010-11-25", "2010-11-25", "2010-12-22",
               "2010-12-23", "2010-12-23", "2010-12-23", "2011-02-02",
               "2011-02-03", "2011-02-03", "2010-11-24", "2010-11-25",
               "2010-11-25", "2010-12-22", "2010-12-23", "2010-12-23",
               "2010-12-23", "2011-02-02", "2011-02-03", "2011-02-03"),
      time = c("19:00", "07:00", "10:30", "19:00", "07:00", "09:54", "13:07",
               "19:00", "07:00", "11:30", "19:00", "07:00", "10:30", "19:00",
               "07:00", "09:54", "13:07", "19:00", "07:00", "11:30")),
      .Names = c("trialno", "event", "date", "time"),
      class = "data.frame",
      row.names = c("3", "4", "5", "6", "7", "8", "9", "10", "11", "12",
                    "13", "14", "15", "16", "17", "18", "19", "20", "21", "22"))

    splitData <- split(dat1, dat1$trialno)  # using your code

    # Within each patient, flag the first row of every date and take the
    # cumulative sum of those flags to get the occasion number.
    res <- unsplit(
      lapply(splitData, function(x)
        within(x, OCC <- cumsum(ave(seq_along(date), date, FUN = seq_along) == 1))),
      dat1$trialno)

    res$OCC
    # [1] 1 2 2 3 4 4 4 5 6 6 1 2 2 3 4 4 4 5 6 6

A.K.
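As a possible variation on the split()/unsplit() approach in the reply above, the same numbering can be sketched with a single ave() call that ranks each date by its first appearance within a patient. This is an untested-on-real-data sketch, not the method from the thread: the function name add_occ and the toy data frame are hypothetical, and only the column names trialno and date come from the posts.

```r
# Hypothetical helper: adds an OCC column numbering the distinct dates of
# each patient in order of first appearance.
add_occ <- function(df) {
  df$OCC <- ave(seq_len(nrow(df)), df$trialno,
                FUN = function(i) match(df$date[i], unique(df$date[i])))
  df
}

# Made-up toy data with the rows of two patients interleaved.
toy <- data.frame(
  trialno = c(1L, 1L, 1L, 2L, 1L, 2L),
  date = c("2010-11-24", "2010-11-25", "2010-11-25",
           "2010-11-24", "2010-12-22", "2010-11-25"),
  stringsAsFactors = FALSE
)

add_occ(toy)$OCC
# [1] 1 2 2 1 3 2
```

Since ave() writes each group's result back into the original positions, the rows do not need to be grouped by patient beforehand. A date that reappears later for the same patient keeps its earlier occasion number, which matches the behaviour of the cumsum(... == 1) code in the reply.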