Hello R users,

I have a data set from a longitudinal study (sample below) in which subjects are followed over time. The second column (status) records whether the subject is dead or still in the study, and the third column is time measured in weeks. Here is what I need: if the status never becomes dead or unknown, take the last week; if the status is dead or unknown, I need the corresponding (first) week.

Desired results:

1 no 7
2 yes 5
3 Unknown 4

Sample data:

id status week
1 no 1
1 no 2
1 no 3
1 no 4
1 no 5
1 no 6
1 no 7
2 no 1
2 no 2
2 no 3
2 no 4
2 yes 5
2 yes 6
2 na 7
2 na 8
2 na 9
3 no 1
3 no 2
3 no 3
3 Unknown 4
3 unknown 5
3 na 6
3 na 7
3 na 8

Any suggestion is much appreciated!

Thank you.
Bibek
On Fri, 18 Jan 2013, bibek sharma wrote:

> I have a data set from a longitudinal study ( sample below) where subjects
> are followed over time. Second column (status) contains info about if
> subject is dead or still in the study and third column is time measured in
> the week. Here is what I need: if status is not dead or unknown take the
> last week, if status is dead or unknown I need to have corresponding week.
>
> Desired results:
>
> 1 no 7
> 2 yes 5
> 3 Unknown 4

Looks like a survival analysis situation. I know there are R packages for this.

Rich
Hi,

Maybe this helps:

dat1 <- read.table(text="
id status week
1 no 1
1 no 2
1 no 3
1 no 4
1 no 5
1 no 6
1 no 7
2 no 1
2 no 2
2 no 3
2 no 4
2 yes 5
2 yes 6
2 na 7
2 na 8
2 na 9
3 no 1
3 no 2
3 no 3
3 Unknown 4
3 unknown 5
3 na 6
3 na 7
3 na 8
", sep="", header=TRUE, stringsAsFactors=FALSE, na.strings="na")
# drop the rows whose status is NA
dat2 <- dat1[complete.cases(dat1),]
# per subject: the last row if every status is "no",
# otherwise the first row with status "yes" or "Unknown"
res <- do.call(rbind, lapply(split(dat2, dat2$id), function(x)
  rbind(tail(x[all(x[,2]=="no"),], 1),
        head(x[x[,2]=="yes" | x[,2]=="Unknown",], 1))))
res
#  id  status week
#1  1      no    7
#2  2     yes    5
#3  3 Unknown    4

A.K.

----- Original Message -----
From: Rich Shepard <rshepard at appl-ecosys.com>
To: R help <r-help at r-project.org>
Cc:
Sent: Friday, January 18, 2013 12:18 PM
Subject: Re: [R] longitudinal study

On Fri, 18 Jan 2013, bibek sharma wrote:

> I have a data set from a longitudinal study ( sample below) where subjects
> are followed over time. Second column (status) contains info about if
> subject is dead or still in the study and third column is time measured in
> the week. Here is what I need: if status is not dead or unknown take the
> last week, if status is dead or unknown I need to have corresponding week.
>
> Desired results:
>
> 1 no 7
> 2 yes 5
> 3 Unknown 4

Looks like a survival analysis situation. I know there are R packages for this.

Rich

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
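
For the week-selection question at the top of the thread, the rule can also be written as one small function per subject. This is only a minimal base-R sketch, not the method posted above. It assumes the rows are already sorted by week within each id, that "yes" and "Unknown" in any letter case both count as the event, and that "na" rows carry no information. The name pick_row is a hypothetical helper introduced here for illustration.

```r
dat <- read.table(text = "
id status week
1 no 1
1 no 2
1 no 3
1 no 4
1 no 5
1 no 6
1 no 7
2 no 1
2 no 2
2 no 3
2 no 4
2 yes 5
2 yes 6
2 na 7
2 na 8
2 na 9
3 no 1
3 no 2
3 no 3
3 Unknown 4
3 unknown 5
3 na 6
3 na 7
3 na 8
", header = TRUE, stringsAsFactors = FALSE, na.strings = "na")

# Assumption: rows with a missing status are not needed for the result
dat <- dat[!is.na(dat$status), ]

# Hypothetical helper: the first event row if there is one, else the last row
pick_row <- function(x) {
  event <- which(tolower(x$status) %in% c("yes", "unknown"))
  if (length(event) > 0) x[event[1], ] else x[nrow(x), ]
}

res <- do.call(rbind, lapply(split(dat, dat$id), pick_row))
rownames(res) <- NULL
res
```

Comparing on tolower(status) means the mixed "Unknown"/"unknown" spellings in the sample are treated alike, while the original spelling is kept in the returned row.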
Hi Bibek,
You can do this in different ways:
set.seed(15)
dat1<-as.data.frame(matrix(sample(c("Yes","No"),20,replace=TRUE),ncol=5),stringsAsFactors=FALSE)
dat1[dat1=="Yes"]<-1
dat1[dat1=="No"]<-0
dat1[]<- sapply(dat1,as.numeric)
dat1
# (output as originally posted; the values may differ under R >= 3.6.0,
#  where the default sample() algorithm changed)
#  V1 V2 V3 V4 V5
#1  0  1  0  0  1
#2  1  0  0  0  0
#3  0  0  1  0  1
#4  0  1  0  0  0
#or
set.seed(15)
dat2<-as.data.frame(matrix(sample(c("Yes","No"),20,replace=TRUE),ncol=5),stringsAsFactors=TRUE)
dat2[]<-sapply(dat2,as.numeric)-1
dat2
#  V1 V2 V3 V4 V5
#1  0  1  0  0  1
#2  1  0  0  0  0
#3  0  0  1  0  1
#4  0  1  0  0  0
identical(dat1,dat2)
#[1] TRUE
A.K.
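
One caveat on the factor-based variant: it depends on the alphabetical level order ("No" before "Yes") and on both values occurring in every column. A column containing only "Yes" would have a single level and come out as 0. The sketch below is one hedged way around that, assuming the real data may mix letter case and contain missing values. The name recode_yes_no is a hypothetical helper, not something from the thread.

```r
# Hypothetical helper: map "yes"/"no" (any letter case) to 1/0.
# Anything else, including NA, becomes NA.
recode_yes_no <- function(x) {
  x <- tolower(trimws(as.character(x)))
  out <- rep(NA_integer_, length(x))
  out[x %in% "yes"] <- 1L
  out[x %in% "no"] <- 0L
  out
}

# Small made-up example; column b contains only "yes"
dat <- data.frame(a = c("Yes", "no", NA),
                  b = c("yes", "yes", "yes"),
                  stringsAsFactors = FALSE)

# lapply() keeps one result per column; dat[] preserves the data frame shape
dat[] <- lapply(dat, recode_yes_no)
dat
```

Because each value is matched explicitly, the result does not depend on which values happen to be present in a given column.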
----- Original Message -----
From: bibek sharma <mbhpathak at gmail.com>
To: arun <smartpink111 at yahoo.com>
Cc:
Sent: Monday, February 4, 2013 12:54 PM
Subject: Re: [R] longitudinal study
Dear Arun,
I have a data set of dimension 1200 by 56 in which the variables have a
binary response (yes or no). I want to replace every "yes" with 1 and every
"no" with 0 in the entire data set. Is there a shortcut for this recoding?
Thanks in advance for your help.
Bibek
On Fri, Jan 18, 2013 at 11:54 AM, arun <smartpink111 at yahoo.com> wrote:
> Dear Bibek,
>
> No problem.
> Arun
>
>
>
>
> ----- Original Message -----
> From: bibek sharma <mbhpathak at gmail.com>
> To: arun <smartpink111 at yahoo.com>
> Cc:
> Sent: Friday, January 18, 2013 1:56 PM
> Subject: Re: [R] longitudinal study
>
> Arun,
> Thank you friend.
> You helped me a lot.
> I really appreciate it.
> Thanks,
>
>
> On Fri, Jan 18, 2013 at 10:25 AM, arun <smartpink111 at yahoo.com> wrote:
>> Hi,
>>
>> In my reply, I deleted the "na" values. The reason was that your dataset
>> had both "na" and "Unknown", and the `Desired results` didn't have "na".
>>
>> Arun