Hi to all members of the list,

I have a data frame with subjects who can get into a certain study from 2010-01-01 onwards. Small example:

DF <- data.frame(id=as.factor(1:3), born=as.Date(c("1939/10/28", "1946/02/23", "1948/02/29")))

  id       born
1  1 1939-10-28
2  2 1946-02-23
3  3 1948-02-29

Now, I add a new column "enter" as follows:

1) If the subject is 65 years old before 2010-01-01, then enter=2010-01-01.
2) If the subject is NOT 65 years old before 2010-01-01, then enter="Date on which the subject reaches 65".

DF_new <- data.frame(DF,
    enter= as.Date( ifelse(unclass(round(difftime(open, DF$born)/365.25,1))<=65,
        paste(year(DF$born)+65,substr(DF$born,6,10),sep="-"), paste(open))) )

The problem is that the DF_new output has an NA for subject id=3:

  id       born      enter
1  1 1939-10-28 2010-01-01
2  2 1946-02-23 2011-02-23
3  3 1948-02-29       <NA>

I suspect (though I'm not really sure) that the cause is that subject id=3 would reach 65 on 2013-02-29, but this date does not exist, so R gives a missing value.

Can anyone help me?

Thank you!!!
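The suspicion about the non-existent date can be checked directly in base R. This is only a minimal sketch: the variable names are made up for illustration, and the format is passed explicitly so that an unparseable string yields NA.

```r
# 2013 is not a leap year, so 29 February 2013 is not a valid calendar date
bad  <- as.Date("2013-02-29", format = "%Y-%m-%d")
# 2012 is a leap year, so the same day one year earlier parses fine
good <- as.Date("2012-02-29", format = "%Y-%m-%d")

is.na(bad)   # TRUE
is.na(good)  # FALSE
```

So pasting "02-29" onto a non-leap year does produce the NA seen for id=3.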
Richard M. Heiberger
2014-Sep-18 18:53 UTC
[R] Data frame which includes a non-existent date
Frank,

Dates are extremely difficult. I recommend you do not attempt to do your own date computations with paste(). Use the lubridate package.

> install.packages("lubridate")
> library(lubridate)

Read the end section of

> vignette("lubridate")

From that you will most likely be wanting one of these

> ymd("19480229") %m+% years(65)
[1] "2013-02-28 UTC"
> daydiff <- ymd("19480229") - floor_date(ymd("19480229"), "month")
> floor_date(ymd("19480229"), "month") + years(65) + daydiff
[1] "2013-03-01 UTC"

Rich

On Thu, Sep 18, 2014 at 11:22 AM, Frank S. <f_j_rod at hotmail.com> wrote:
> The problem is that the DF_new output has a NA in subject id=3:
>
> I'm afraid (I'm not really sure) that the matter is that subject id=3 would reach 65 yr at 2013-02-29, but this date does not exist,
> so R gives a missing.
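Applied to the whole data frame from the original post, the %m+% approach might look like the sketch below. It assumes that `open` is the study opening date 2010-01-01 (it is not defined in the posted code) and that rolling a 29 February birthday back to 28 February is acceptable; the name `turns65` is introduced here only for illustration.

```r
library(lubridate)

DF <- data.frame(id = as.factor(1:3),
                 born = as.Date(c("1939/10/28", "1946/02/23", "1948/02/29")))
open <- as.Date("2010-01-01")  # assumed opening date, taken from the problem statement

# Date of the 65th birthday; %m+% rolls 29 Feb back to 28 Feb in non-leap years
turns65 <- DF$born %m+% years(65)

# Start from the 65th birthday, then overwrite with the opening date
# for subjects who were already 65 before the study opened
DF$enter <- turns65
DF$enter[turns65 < open] <- open

DF
```

With the three example subjects this gives 2010-01-01, 2011-02-23 and 2013-02-28. If the 1 March convention is preferred, the floor_date() variant can be used to build `turns65` instead.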