Hello All,

I've been reading books about R for a while now and am in the process of replicating the SAS analyses from an old report. I want to be sure that I can do all the things I need to in R before using it in my daily work.

So far, I've managed to read in all my data and have done some data manipulation. I'm having trouble fixing an error in a date variable though, and was hoping someone could help.

One of the patients in my data has a DOB incorrectly entered as:

    '11/23/21931'

Their DOB should be:

    '11/23/1931'

How can I correct this problem before calculating age in the code below?

DOB starts out as a factor in the Demo dataframe but then is converted into a date. So I had thought the ifelse that follows could be used to correct the problem, but this doesn't seem to be the case.

Thanks,

Paul

    Demo_Char <- within(Demo, {
      DateCompleted <- as.Date(DateCompleted, format = "%m/%d/%Y")
      DOB <- as.Date(DOB, format = "%m/%d/%Y")
      DOB <- ifelse(Subject==108945, as.Date("1931-11-23"), DOB)
      Age <- as.integer((DateCompleted - DOB) / 365.25)
    })
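One likely reason the ifelse() attempt misbehaves is that ifelse() drops attributes, including the Date class, so the result is a plain number of days rather than a date. The sketch below illustrates this on made-up values; `subject`, `dob`, `fixed`, and `dob2` are hypothetical names, not part of the poster's data.

```r
# Toy vectors standing in for Demo$Subject and Demo$DOB (values are made up).
subject <- c(108945, 200001)
dob <- as.Date(c("1930-01-01", "1945-02-10"))

# ifelse() returns a value shaped like its test, without the Date class.
fixed <- ifelse(subject == 108945, as.Date("1931-11-23"), dob)
class(dob)    # "Date"
class(fixed)  # "numeric"

# Indexed assignment changes only the selected element and keeps the class.
dob2 <- dob
dob2[subject == 108945] <- as.Date("1931-11-23")
class(dob2)   # "Date"
```

Indexed assignment is used here as one alternative that keeps the column a Date; other approaches would work too.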
On Nov 7, 2011, at 9:39 AM, Paul Miller wrote:

> [...]
> One of the patients in my data has a DOB incorrectly entered as:
>
> '11/23/21931'
>
> Their DOB should be:
>
> '11/23/1931'
>
> How can I correct this problem before calculating age in the code
> below?
> DOB starts out as a factor in the Demo dataframe but then is
> converted into a date. So I had thought the ifelse that follows
> could be used to correct the problem, but this doesn't seem to be
> the case.
>
> Thanks,
>
> Paul

Why not fix the single error first?

    Demo[Demo$Subject == 108945, "DOB"] <- '11/23/1931'

Then you can skip the time-consuming ifelse() inside the within() call. Because DOB is still a factor at this point, convert it with as.character() before the assignment; otherwise '11/23/1931' is not an existing level and the entry becomes NA.

--
David

> Demo_Char <- within(Demo, {
>   DateCompleted <- as.Date(DateCompleted, format = "%m/%d/%Y")
>   DOB <- as.Date(DOB, format = "%m/%d/%Y")
>   DOB <- ifelse(Subject==108945, as.Date("1931-11-23"), DOB)
>   Age <- as.integer((DateCompleted - DOB) / 365.25)
> })
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT
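A minimal sketch of the fix-the-raw-value-first idea, applied to the DOB column (the one holding the bad entry). The `Demo` data frame here is a made-up stand-in with only the columns named in the thread. It assumes DOB is a factor, so it is converted to character before the assignment.

```r
# Hypothetical stand-in for the poster's Demo data frame.
Demo <- data.frame(
  Subject = c(108945, 200001),
  DOB = factor(c("11/23/21931", "02/10/1945")),
  DateCompleted = factor(c("03/15/2010", "04/01/2010"))
)

# Work on character values: assigning a string that is not an existing
# factor level would give NA with a warning.
Demo$DOB <- as.character(Demo$DOB)
Demo[Demo$Subject == 108945, "DOB"] <- "11/23/1931"

# Only now convert the corrected strings to Date.
Demo$DOB <- as.Date(Demo$DOB, format = "%m/%d/%Y")
```

Correcting the string before calling as.Date() means the malformed year is never parsed at all.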
I think you are making the transform much more complicated than it needs to be.

Suppose you have a data frame df with a column DOB holding things that look like dates but are really factors. Then the following transform should work from factor to Date:

    df$DOB <- as.Date(as.character(df$DOB), format = "%m/%d/%Y")

and to address the mistyped element:

    df$DOB[df$DOB == "11/23/21931"] <- "11/23/1931"

You should probably do this before the conversion to Date type, on the character values rather than on the factor, since the corrected string is not an existing factor level.

If you want to do it in a lookup-ish sort of way, this is probably better:

    within(Demo, DOB[Subject == 108945] <- "11/23/1931")

Michael

On Mon, Nov 7, 2011 at 9:39 AM, Paul Miller <pjmiller_57 at yahoo.com> wrote:

> [...]
> One of the patients in my data has a DOB incorrectly entered as:
>
> '11/23/21931'
>
> Their DOB should be:
>
> '11/23/1931'
>
> How can I correct this problem before calculating age in the code below?
> [...]
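Putting the suggestions in this thread together, the whole transform might look like the sketch below. The `Demo` data frame is a made-up stand-in, and the as.numeric(..., units = "days") step is an addition not in the original code, used so that the division operates on a plain number of days.

```r
# Hypothetical stand-in for the poster's Demo data frame.
Demo <- data.frame(
  Subject = c(108945, 200001),
  DOB = factor(c("11/23/21931", "02/10/1945")),
  DateCompleted = factor(c("03/15/2010", "04/01/2010"))
)

Demo_Char <- within(Demo, {
  # Correct the bad entry while DOB is still text, then convert to Date.
  DOB <- as.character(DOB)
  DOB[Subject == 108945] <- "11/23/1931"
  DOB <- as.Date(DOB, format = "%m/%d/%Y")

  DateCompleted <- as.Date(as.character(DateCompleted), format = "%m/%d/%Y")

  # Age in whole years, truncated, using the 365.25-day approximation
  # from the original post.
  Age <- as.integer(as.numeric(DateCompleted - DOB, units = "days") / 365.25)
})

Demo_Char[, c("Subject", "DOB", "DateCompleted", "Age")]
```

With these made-up completion dates the corrected subject comes out at 78 years and the other at 65.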