John-Paul Bogers
2010-Jun-26 12:37 UTC
[R] Recoding dates to session id in a longitudinal dataset
Hi,

I'm fairly new to R, but I have a large dataset (300000 obs) containing patient material. Some patients came 2-9 times during the three-year observation period. The patients are identified by a unique idnr; the sessions can be distinguished using the session date. How can I recode the date of the session to a session id (1-9)? This would be necessary to obtain information and do some analysis on the first occurrence of a specific patient, or to look for trends.

Thanks

JP Bogers
University of Antwerp
John-Paul Bogers
2010-Jun-27 05:56 UTC
[R] Recoding dates to session id in a longitudinal dataset
---------- Forwarded message ----------
From: John-Paul Bogers <john-paul.bogers@ua.ac.be>
Date: Sat, Jun 26, 2010 at 10:14 PM
Subject: Re: [R] Recoding dates to session id in a longitudinal dataset
To: jim holtman <jholtman@gmail.com>

Dear Jim,

The data concerns HPV screening. It looks as follows:

pat1 sampledate1 HPV16 0.3
pat2 sampledate2 HPV16 0
pat3 sampledata3 HPV16 0.5
pat1 sampledate4 HPV16 0.6
pat4 sampledate5 HPV16 0
pat2 sampledate6 HPV16 0
pat1 sampledate7 HPV16 0

What I would like is:

pat1 1 HPV16 0.3
pat2 1 HPV16 0
pat3 1 HPV16 0.5
pat1 2 HPV16 0.6
pat4 1 HPV16 0
pat2 2 HPV16 0
pat1 3 HPV16 0

I would like to recode sampledate (a real date, in date format) to a session sequence (first sample of this patient, second sample of this patient, ...).

I hope this makes it clear.

Thanks
JP

PS: I answered this as a reply to your private mail; how do I get this onto the mailing list?

On Sat, Jun 26, 2010 at 7:59 PM, jim holtman <jholtman@gmail.com> wrote:
> It would be useful if you could provide an example of what the data
> looks like now and what you would like it to look like; otherwise it
> is impossible to help.
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?
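A minimal sketch of the recoding described above, assuming the data sit in a data frame with a patient id column and a column of class Date. The data frame name `hpv`, the column names, and the dates themselves are hypothetical stand-ins for the placeholders in the message. It ranks the dates within each patient with `ave()`, so the result does not depend on the row order of the data.

```r
# Hypothetical example data; column names and dates are assumptions,
# chosen to mirror the seven rows shown in the message.
hpv <- data.frame(
  idnr       = c("pat1", "pat2", "pat3", "pat1", "pat4", "pat2", "pat1"),
  sampledate = as.Date(c("2008-01-10", "2008-02-03", "2008-02-20",
                         "2008-09-15", "2008-10-01", "2009-03-12",
                         "2009-06-30")),
  type       = "HPV16",
  value      = c(0.3, 0, 0.5, 0.6, 0, 0, 0),
  stringsAsFactors = FALSE
)

# Rank the dates within each patient: the earliest visit gets 1, the next 2, ...
# ties.method = "first" keeps the numbering unique if two samples share a date.
hpv$session <- ave(as.numeric(hpv$sampledate), hpv$idnr,
                   FUN = function(d) rank(d, ties.method = "first"))

print(hpv[, c("idnr", "session", "type", "value")])
```

Because `ave()` returns a vector of the same length and order as its input, the rows stay in their original positions, which matches the layout requested in the message.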
JP Bogers
2010-Jun-27 06:59 UTC
[R] Recoding dates to session id in a longitudinal dataset
Hi Jim,

Thanks for the answer. What I actually want is a session sequence 1, 2, ... per patient. This would be very useful for looking at trends in HPV infections from the first to the second sample, and so on. It would also allow me to extract the HPV data of the first sample (session 1).

Thx
JP

On Sat, Jun 26, 2010 at 10:23 PM, jim holtman <jholtman@gmail.com> wrote:
> Here is one way of doing it:
>
> > # split the data by patient, then number the rows sequentially
> > x
>     V1          V2    V3  V4
> 1 pat1 sampledate1 HPV16 0.3
> 2 pat2 sampledate2 HPV16 0.0
> 3 pat3 sampledata3 HPV16 0.5
> 4 pat1 sampledate4 HPV16 0.6
> 5 pat4 sampledate5 HPV16 0.0
> 6 pat2 sampledate6 HPV16 0.0
> 7 pat1 sampledate7 HPV16 0.0
> > x.s <- split(x, x$V1)
> > # now put in the ids
> > do.call(rbind, lapply(x.s, function(.pat){
> +     .pat$V2 <- seq(nrow(.pat))
> +     .pat
> + }))
>          V1 V2    V3  V4
> pat1.1 pat1  1 HPV16 0.3
> pat1.4 pat1  2 HPV16 0.6
> pat1.7 pat1  3 HPV16 0.0
> pat2.2 pat2  1 HPV16 0.0
> pat2.6 pat2  2 HPV16 0.0
> pat3   pat3  1 HPV16 0.5
> pat4   pat4  1 HPV16 0.0
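The quoted solution numbers rows in whatever order they already have within each patient, so it yields chronological session numbers only if the data are sorted by date beforehand. The sketch below sorts first and then covers the two uses mentioned in the reply: pulling out each patient's first sample, and comparing the first with the second. The data frame `x`, its column names, and the dates are hypothetical; this is one possible approach, not the method used in the thread.

```r
# Hypothetical data, deliberately not in chronological row order.
x <- data.frame(
  idnr       = c("pat1", "pat2", "pat1", "pat3", "pat1", "pat2"),
  sampledate = as.Date(c("2009-06-30", "2008-02-03", "2008-01-10",
                         "2008-02-20", "2008-09-15", "2009-03-12")),
  value      = c(0, 0, 0.3, 0.5, 0.6, 0),
  stringsAsFactors = FALSE
)

# Sort by patient, then by date, so rows within a patient are chronological.
x <- x[order(x$idnr, x$sampledate), ]

# Number the rows within each patient: 1, 2, ...
x$session <- ave(seq_along(x$idnr), x$idnr, FUN = seq_along)

# Data from each patient's first sample (session 1).
first <- x[x$session == 1, ]

# Change in value from session 1 to session 2, for patients who have both.
s1 <- x[x$session == 1, c("idnr", "value")]
s2 <- x[x$session == 2, c("idnr", "value")]
trend <- merge(s1, s2, by = "idnr", suffixes = c(".s1", ".s2"))
trend$change <- trend$value.s2 - trend$value.s1

print(first)
print(trend)
```

Patients with a single visit drop out of `trend` because `merge()` keeps only ids present in both inputs by default.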