Dear R help,

I have the following data frame:

structure(list(prochi = c("ind_1", "ind_1", "ind_1", "ind_1", "ind_1",
"ind_1", "ind_1", "ind_1", "ind_1", "ind_1"), date_1st_event = structure(c(14784,
14784, 14784, 14784, 14784, 14784, 14784, 14784, 14784, 14784
), class = "Date"), bp_date = structure(c(12660, 14571, 13392,
13080, 12012, 13080, 13894, 14622, 12654, 13894), class = "Date"),
    SBP = c(135L, 160L, 135L, 153L, 150L, 153L, 151L, 126L, 150L,
    151L), DBP = c(85L, 80L, NA, 79L, 82L, 79L, 76L, 60L, 82L,
    91L)), .Names = c("prochi", "date_1st_event", "bp_date", "SBP",
"DBP"), row.names = 108:117, class = "data.frame")

It consists of repeated measures for the same individual. What I want to do is find the two most recent blood pressure readings (SBP and DBP) using date_1st_event and bp_date. What I would do to find the most recent date is to subtract date_1st_event - bp_date and then aggregate by min. I'm not sure how to find the two most recent dates.

Are there some functions that can help me, or will I have to write a function from scratch? Any help just to point me in the right direction.

Thanks,
Natalie

--
View this message in context: http://r.789695.n4.nabble.com/Finding-the-two-most-recent-dates-tp2528185p2528185.html
Sent from the R help mailing list archive at Nabble.com.
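The "subtract and aggregate by min" idea described in the question can be sketched in base R roughly as follows. This is only an illustration: the three rows are copied from the posted data, while the column name gap_days and the object min_gap are made up for the example.

```r
# Three rows taken from the posted data frame (a stand-in, not the full data).
dt <- data.frame(
  prochi         = c("ind_1", "ind_1", "ind_1"),
  date_1st_event = as.Date(c("2010-06-24", "2010-06-24", "2010-06-24")),
  bp_date        = as.Date(c("2004-08-30", "2009-11-23", "2010-01-13")),
  SBP            = c(135L, 160L, 126L),
  DBP            = c(85L, 80L, 60L),
  stringsAsFactors = FALSE
)

# Days from each reading to the first event (hypothetical helper column).
# Subtracting two Date vectors gives a difftime in days.
dt$gap_days <- as.numeric(dt$date_1st_event - dt$bp_date)

# Smallest gap per individual, i.e. the single most recent reading.
min_gap <- aggregate(gap_days ~ prochi, data = dt, FUN = min)
print(min_gap)
```

Because min() returns one value per group, this only identifies the single closest reading; picking the two closest needs an ordering step rather than an aggregate by min.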
Hi Natalie,

By far the easiest thing to do is to convert the date to a special date class. See as.POSIXct for example. I'm not sure what 14784 means, nor what the data in the bp_date column says. Probably the two combine into a specific date? Once you've converted the columns into POSIXct objects, you can use the min() function to find the minimum.

cheers,
Paul

On 09/06/2010 12:45 PM, Newbie19_02 wrote:
> It consists of repeated measures for the same individual. What I want to do
> is find the two most recent blood pressure readings (SBP and DBP) using
> date_1st_event and bp_date.

--
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone: +3130 253 5773
http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770
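On the question of what 14784 means: in the posted dput() output both date columns carry class = "Date", and R stores a Date as the number of days since 1970-01-01. The short sketch below, using a few of the posted values, shows how the integers map to calendar dates and that max() and sort() work on them directly. The object names event, bp and latest_two are made up for the example.

```r
# Values copied from the posted dput() output; the integers are days since 1970-01-01.
event <- structure(14784, class = "Date")
bp    <- structure(c(12660, 14571, 14622), class = "Date")

class(event)     # "Date"
format(event)    # "2010-06-24"

# The same date built explicitly from the day count.
same_date <- as.Date(14784, origin = "1970-01-01") == event

# Date objects can be compared and sorted like numbers.
latest     <- max(bp)
latest_two <- head(sort(bp, decreasing = TRUE), 2)
print(latest_two)
```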
Natalie,

your method of sending sample data is fine.

dt = structure(list(prochi = c("ind_1", "ind_1", "ind_1", "ind_1", "ind_1",
"ind_1", "ind_1", "ind_1", "ind_1", "ind_1"), date_1st_event = structure(c(14784,
14784, 14784, 14784, 14784, 14784, 14784, 14784, 14784, 14784
), class = "Date"), bp_date = structure(c(12660, 14571, 13392,
13080, 12012, 13080, 13894, 14622, 12654, 13894), class = "Date"),
    SBP = c(135L, 160L, 135L, 153L, 150L, 153L, 151L, 126L, 150L,
    151L), DBP = c(85L, 80L, NA, 79L, 82L, 79L, 76L, 60L, 82L,
    91L)), .Names = c("prochi", "date_1st_event", "bp_date", "SBP",
"DBP"), row.names = 108:117, class = "data.frame")

# The most recent date
iRecent = which.max(dt$bp_date) # index of record with most recent date
dt[iRecent,]
# ind_1 2010-06-24 2010-01-13 126 60

# To Paul Hiemstra: we have the integer representation of date here
as.integer(dt$bp_date[iRecent])

I believe that the description of your problem is not complete (homework?), because there is only one subject and one 1st event. So please try to restate the rest of the problem.

Dieter
Here is one way of doing it:

> x
    prochi date_1st_event    bp_date SBP DBP
108  ind_1     2010-06-24 2004-08-30 135  85
109  ind_1     2010-06-24 2009-11-23 160  80
110  ind_1     2010-06-24 2006-09-01 135  NA
111  ind_1     2010-06-24 2005-10-24 153  79
112  ind_1     2010-06-24 2002-11-21 150  82
113  ind_1     2010-06-24 2005-10-24 153  79
114  ind_1     2010-06-24 2008-01-16 151  76
115  ind_1     2010-06-24 2010-01-13 126  60
116  ind_1     2010-06-24 2004-08-24 150  82
117  ind_1     2010-06-24 2008-01-16 151  91
> # find the two most recent dates for an individual
> mostRecent <- lapply(split(x, x$prochi), function(.ind){
+     # get two most recent (or 1 if only one exists)
+     index <- head(order(.ind$bp_date, decreasing=TRUE), 2)
+     .ind[index,]  # return
+ })
> # put back into a dataframe
> do.call(rbind, mostRecent)
          prochi date_1st_event    bp_date SBP DBP
ind_1.115  ind_1     2010-06-24 2010-01-13 126  60
ind_1.109  ind_1     2010-06-24 2009-11-23 160  80
>

On Mon, Sep 6, 2010 at 6:45 AM, Newbie19_02 <nvanzuydam at gmail.com> wrote:
> It consists of repeated measures for the same individual. What I want to do
> is find the two most recent blood pressure readings (SBP and DBP) using
> date_1st_event and bp_date.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
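A possible variant of the split/order approach above, sketched for more than one individual. Several things here are assumptions rather than part of the thread: the ind_2 row is invented to show a group with a single reading, the helper name two_latest is hypothetical, and the filter that keeps only readings taken on or before date_1st_event is a guess at what "using date_1st_event and bp_date" was meant to achieve.

```r
# ind_1 rows copy values from the thread; the ind_2 row is invented for illustration.
x <- data.frame(
  prochi         = c("ind_1", "ind_1", "ind_1", "ind_2"),
  date_1st_event = as.Date(c("2010-06-24", "2010-06-24", "2010-06-24", "2009-03-01")),
  bp_date        = as.Date(c("2004-08-30", "2009-11-23", "2010-01-13", "2008-12-15")),
  SBP            = c(135L, 160L, 126L, 140L),
  DBP            = c(85L, 80L, 60L, 90L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: up to n most recent readings for one individual.
two_latest <- function(d, n = 2) {
  # Assumption: only readings on or before the first event are of interest.
  d <- d[d$bp_date <= d$date_1st_event, , drop = FALSE]
  # head() returns at most n indices, so short groups are not padded with NA rows.
  d[head(order(d$bp_date, decreasing = TRUE), n), , drop = FALSE]
}

res <- do.call(rbind, lapply(split(x, x$prochi), two_latest))
print(res)
```

An individual with a single reading contributes one row to the result, and an individual with no readings before the event contributes none.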