Zhixin Liu
2008-Dec-15 01:56 UTC
[R] how to create duplicated ID in multi-records per subject dataset
Hi R helpers,

If I have a dataset that looks like:

ID  record
1   20
.   30
.   25
2   26
.   15
3   21
4.....................

And I want it to become:

ID  record
1   20
1   30
1   25
2   26
2   15
3   21
4.....................

That is, I have to duplicate the IDs for subjects with multiple records. I am wondering whether this can be done in R, and I would be grateful if you could point me in the right direction.

Many thanks!

Zhixin
markleeds at verizon.net
2008-Dec-15 02:12 UTC
[R] how to create duplicated ID in multi-records per subject dataset
Hi: change your dots to NAs and then use na.locf in the zoo package. I didn't test it, but I think that should work.

    library(zoo)
    DF$ID[DF$ID == "."] <- NA   # the dot must be quoted; a bare . is not valid here
    DF$ID <- na.locf(DF$ID)     # carry the last non-missing ID forward
andrew
2008-Dec-15 02:19 UTC
[R] how to create duplicated ID in multi-records per subject dataset
If the records are in the file dupIDs.txt, then when you read them in, the IDs become a factor. Coercing the factor to numeric returns the integer code of each level. The "." level sorts first, so subtracting 1 turns every dot into 0 and cummax then carries the last ID forward.

So, you could try the following:

    dupIDs <- read.table("dupIDs.txt", header = TRUE)
    dupIDs$ID2 <- cummax(as.numeric(dupIDs$ID) - 1)

    > dupIDs
      ID record ID2
    1  1     20   1
    2  .     30   1
    3  .     25   1
    4  2     26   2
    5  .     15   2
    6  3     21   3

HTH,

Andrew.
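The factor-code trick above can be checked on an inline copy of the example data. This is only a sketch: the `txt` string stands in for dupIDs.txt, and `stringsAsFactors = TRUE` is an added assumption, needed because `read.table` no longer creates factors by default from R 4.0.0 onward.

```r
# Inline stand-in for dupIDs.txt (hypothetical; the real file was not posted).
txt <- "ID record
1 20
. 30
. 25
2 26
. 15
3 21"

# stringsAsFactors = TRUE reproduces the 2008 behaviour on current R versions.
dupIDs <- read.table(text = txt, header = TRUE, stringsAsFactors = TRUE)

levels(dupIDs$ID)                  # the "." level is expected to sort first
codes <- as.numeric(dupIDs$ID)     # integer level codes, not the printed IDs
dupIDs$ID2 <- cummax(codes - 1)    # dots become 0, so cummax keeps the last ID
dupIDs
```

Note that this recovers the IDs only because they happen to be 1, 2, 3, ... in the same order as the sorted factor levels. With IDs such as 5 and 7, or with IDs of 10 and above (levels sort as strings, so "10" comes before "2"), the level codes would no longer match the ID values.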
markleeds at verizon.net
2008-Dec-15 02:52 UTC
[R] how to create duplicated ID in multi-records per subject dataset
andrew has a point which makes my solution wrong. you'd have to change the factors to numerics and I'm not sure what would happen when you did that. if you want to send a sample file of your data, that would be best but andrew's suggestion may work right off the bat.
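As for what happens when the factor is changed to numeric, a small sketch may help. The vector `f` below is a hypothetical stand-in for the ID column as it would have been read in; it is not taken from the poster's actual file.

```r
# Hypothetical factor mimicking the ID column with dots for repeated subjects.
f <- factor(c("1", ".", ".", "2", ".", "3"))

# Direct coercion returns the internal level codes, not the ID values.
codes <- as.numeric(f)

# Converting through character first gives the real numbers; "." cannot be
# parsed, so it becomes NA (R warns about this, suppressed here).
id <- suppressWarnings(as.numeric(as.character(f)))
id
```

After this conversion the column holds numeric IDs with NA in place of the dots, which is the form that na.locf expects.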
Henrique Dallazuanna
2008-Dec-15 10:47 UTC
[R] how to create duplicated ID in multi-records per subject dataset
Try this:

    Lines <- "ID;record
    1;20
    ;30
    ;25
    2;26
    ;15
    3;21"
    x <- read.table(textConnection(Lines), sep = ";", header = TRUE)
    library(zoo)
    x$ID <- na.locf(x$ID)
Jagat.K.Sheth at wellsfargo.com
2008-Dec-15 17:32 UTC
[R] how to create duplicated ID in multi-records per subject dataset
You can also try ?rep and something like

    dat <- read.table(textConnection("ID record
    1 20
    . 30
    . 25
    2 26
    . 15
    3 21
    "), header = TRUE, na.strings = ".")
    ind <- !is.na(dat$ID)
    id <- dat$ID[ind]
    reps <- diff(c(seq_len(nrow(dat))[ind], nrow(dat) + 1))
    dat$new.id <- rep(id, reps)

    > dat
      ID record new.id
    1  1     20      1
    2 NA     30      1
    3 NA     25      1
    4  2     26      2
    5 NA     15      2
    6  3     21      3
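The same fill can probably be written more compactly in base R by indexing with a running count of non-missing IDs. This is a swapped-in variant rather than the code posted above, and it assumes the first row always carries an ID. The data frame is typed in by hand to match the example.

```r
# Example data entered directly, with NA where the original file had dots.
dat <- data.frame(ID     = c(1, NA, NA, 2, NA, 3),
                  record = c(20, 30, 25, 26, 15, 21))

ind <- !is.na(dat$ID)

# cumsum(ind) counts the non-missing IDs seen so far, so for every row it
# gives the position of the most recent ID within dat$ID[ind].
dat$new.id <- dat$ID[ind][cumsum(ind)]
dat
```

If the first row were missing, cumsum(ind) would start at 0 and the indexing would drop that row, so the assignment would fail with a length mismatch.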