On Fri, Jun 1, 2012 at 11:02 AM, Rick Admiraal <admiraal.rick at gmail.com> wrote:
> Dear all,
>
> As a novice user of R I ran into a problem that's quite hard for me to
> resolve. I have a database containing data of a clinical trial in which
> patients are included that survived or died:
>
> x <- matrix(data=c(1:5,0, "1/1/2012 00:00:00",0,0,"1/7/2012 00:00:00"),
> nrow=5, ncol=2, dimnames= list(NULL,c("ID", "dateofdeath")))
>
> My file is a .csv so I don't have a problem with the brackets, don't know
> how to show you the idea without the brackets...
>
> I want to be able to do calculations with the values in the column
> x$dateofdeath, so I use the chron package for this:
>
> library(chron)
> x$dateofdeath<- as.character(x$dateofdeath)
> dateofdeath1 <- t(as.data.frame(strsplit(x$dateofdeath,' ')))
> row.names(dateofdeath1) <- NULL
> x$dateofdeath <- chron(dates=dateofdeath1[,1], times=dateofdeath1[,2],
>                        format=c('d-m-y','h:m:s'))
>
> When I run this script the database looks like this:
>
>> x
>   ID         dateofdeath
> 1  1             (NA NA)
> 2  2 (01-01-12 00:00:00)
> 3  3             (NA NA)
> 4  4             (NA NA)
> 5  5 (01-07-12 00:00:00)
>
> The times are how I want them to be; the zeros, on the other hand, should
> stay that way. In other words: how can I make the database look like this:
>
>> x
>   ID         dateofdeath
> 1  1                   0
> 2  2 (01-01-12 00:00:00)
> 3  3                   0
> 4  4                   0
> 5  5 (01-07-12 00:00:00)
>
> I already tried replacing the NA's with 0 which gives me 1-1-1970 (the
> default date by chron) and making a subset and only replacing values that
> are not equal to zero doesn't work either:
>
> d <- x
> ## above coding for chron with x replaced with d ##
> mat <- match(x$ID, d$ID)
> x$dateofdeath <- d$dateofdeath[mat]
>
>
> Does anybody know how to resolve this issue?
> Thanks in advance!
> Cheers, Rick Admiraal
>
Every cell of a matrix object must be of the same class, whereas only
each column of a data.frame need be so, so you likely want a data.frame
and not a matrix.
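
To illustrate the point about classes, here is a small sketch (the
two-row objects m and df are made up for this example and are not your
data). Mixing numbers and strings in a matrix turns everything into
character, while a data.frame keeps a class per column:

```r
# A matrix holds a single type: combining numbers and strings
# coerces every cell to character.
m <- matrix(c(1:2, "0", "1/1/2012 00:00:00"), nrow = 2,
            dimnames = list(NULL, c("ID", "dateofdeath")))
typeof(m)

# A data.frame stores each column separately, so ID can stay
# integer while dateofdeath is character.
df <- data.frame(ID = 1:2,
                 dateofdeath = c("0", "1/1/2012 00:00:00"),
                 stringsAsFactors = FALSE)
sapply(df, class)
```

Note also that the $ operator works on data frames but not on a matrix,
so x$dateofdeath needs x to be a data.frame.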
Since there are no times it would be better to use "Date" class than
chron. See R News 4/1.
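
As a rough sketch of what the "Date" class gives you (assuming your
strings are day/month/year, which is my reading of the 'd-m-y' format
in your chron call; swap in "%m/%d/%Y" if the month comes first):

```r
# Two of the date strings from the post. With an explicit format,
# as.Date() parses the date part and ignores the trailing time.
s <- c("1/1/2012 00:00:00", "1/7/2012 00:00:00")
d <- as.Date(s, format = "%d/%m/%Y")
d
class(d)

# Date objects support arithmetic directly: days between the two dates.
diff(d)
```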
To compute with the dates you don't want 0's since 0 is not a valid
value for Date class. If you want 0's for presentation that is
another matter and it can be done at the end after all your
calculations by converting the dates to character and then sticking in
"0" where needed.
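
For instance, with NA for the survivors the calculations stay simple.
This is only a sketch: the start date below is a hypothetical inclusion
date that I made up, not something from your data.

```r
# Dates of death, with NA for patients who survived.
dod <- as.Date(c(NA, "2012-01-01", NA, NA, "2012-07-01"))

# Hypothetical inclusion date, assumed identical for all patients.
start <- as.Date("2011-06-01")

# Logical indicator of death, derived from the missing values.
died <- !is.na(dod)

# Days from inclusion to death; stays NA for survivors.
days <- as.numeric(dod - start)

# Summaries can skip the survivors explicitly.
mean(days, na.rm = TRUE)
```

Had the survivors been coded as 0, they would have been treated as a
real date (1970-01-01) in every such calculation.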
If x is the matrix you posted, i.e.
x <- matrix(data=c(1:5,0, "1/1/2012 00:00:00",0,0,"1/7/2012 00:00:00"),
  nrow=5, ncol=2, dimnames= list(NULL,c("ID", "dateofdeath")))
then:
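# data frame with numeric and Date columns
# (the format assumes day/month/year, as in the 'd-m-y' chron call;
# use "%m/%d/%Y" instead if the month comes first)
DF <- data.frame(
  ID = as.numeric(x[, 1]),
  dateofdeath = as.Date(ifelse(x[, 2] == "0", NA, x[, 2]), format = "%d/%m/%Y")
)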
# for presentation only: convert dateofdeath
# to character with "0" replacing NA
DF2 <- transform(DF, dateofdeath = ifelse(is.na(dateofdeath), "0",
  as.character(dateofdeath)))
--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com