Steven Ranney
2013-Oct-11 13:26 UTC
[R] Create sequential vector for values in another column
Hello all -

I have an example column in a data frame:

id.name
123.45
123.45
123.45
123.45
234.56
234.56
234.56
234.56
234.56
234.56
234.56
345.67
345.67
345.67
456.78
456.78
456.78
456.78
456.78
456.78
456.78
456.78
456.78
...
[truncated]

And I'd like to create a second vector of sequential values (i.e., 1:N) for each unique id.name value. In other words, I need

id.name  x
123.45   1
123.45   2
123.45   3
123.45   4
234.56   1
234.56   2
234.56   3
234.56   4
234.56   5
234.56   6
234.56   7
345.67   1
345.67   2
345.67   3
456.78   1
456.78   2
456.78   3
456.78   4
456.78   5
456.78   6
456.78   7
456.78   8
456.78   9

The number of rows differs between unique id.name values; for some values, nrow() may be 42 and for others it may be 36, etc.

The only way I could think of to do this is with two nested for loops. I tried it, but because this data set is so large (nrow = 112,679 with 2,161 unique values of id.name), it took several hours to run.

Is there an easier way to create this vector? I'd appreciate your thoughts.

Thanks -

SR
Steven H. Ranney
Also, it might be faster to use data.table:

library(data.table)
dt1 <- data.table(dat1, key = 'id.name')
# .N is the number of rows in the current group
dt1[, x := seq_len(.N), by = 'id.name']

A.K.

arun <smartpink111 at yahoo.com> wrote:

Hi,
Try:

dat1 <- structure(list(id.name = c(123.45, 123.45, 123.45, 123.45,
234.56, 234.56, 234.56, 234.56, 234.56, 234.56, 234.56, 345.67,
345.67, 345.67, 456.78, 456.78, 456.78, 456.78, 456.78, 456.78,
456.78, 456.78, 456.78)), .Names = "id.name", class = "data.frame",
row.names = c(NA, -23L))

# seq_along() rather than seq(): for a group with a single row,
# seq(123.45) would expand to 1:123 instead of returning 1
dat1$x <- with(dat1, ave(id.name, id.name, FUN = seq_along))

A.K.
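The ave() idea can be sketched in base R on a tiny made-up vector. The data frame name `d` and its values are invented for illustration and are not from the original data set. The sketch uses seq_along() rather than seq(), because seq() applied to a single number n returns 1:n, which would go wrong for any id.name that occurs only once. It also suggests that ave() does not need the rows to be sorted, since it writes each group's result back into that group's original positions.

```r
# Hypothetical toy data: deliberately unsorted, with one single-row group
d <- data.frame(id.name = c(234.56, 123.45, 234.56, 345.67, 123.45, 234.56))

# Number the rows within each id.name group, in order of appearance.
# The first argument is just a row index, so FUN never sees the id values.
d$x <- ave(seq_along(d$id.name), d$id.name, FUN = seq_along)

print(d)
```

Because the counter is built from the row index, the id.name column itself is used only as the grouping key.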
PIKAL Petr
2013-Oct-11 13:49 UTC
[R] Create sequential vector for values in another column
Hi

I named your data test

test$x <- 1
test$x <- unlist(lapply(split(test$x, test$id.name), cumsum))

> test
   id.name x
1   123.45 1
2   123.45 2
3   123.45 3
4   123.45 4
5   234.56 1
6   234.56 2
7   234.56 3
8   234.56 4
9   234.56 5
10  234.56 6
11  234.56 7
12  345.67 1
13  345.67 2
14  345.67 3
15  456.78 1
16  456.78 2
17  456.78 3
18  456.78 4
19  456.78 5
20  456.78 6
21  456.78 7
22  456.78 8
23  456.78 9
>

Two comments:

This works only when your data are sorted
Beware of FAQ 7.31

Regards
Petr
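Both caveats can be illustrated with a short base R sketch. All values and variable names below (`a`, `b`, `ids`, `ones`, `key`) are invented for illustration. The first part shows the kind of floating-point surprise that FAQ 7.31 describes; the second shows why the split()/unlist() approach depends on sorted input, and one possible workaround, assuming the ids are meaningful to two decimal places.

```r
# FAQ 7.31: doubles that look equal need not compare equal
a <- 0.1 + 0.2
b <- 0.3
print(a == b)                   # exact comparison
print(isTRUE(all.equal(a, b)))  # comparison within a tolerance

# Hypothetical unsorted ids
ids  <- c(123.45, 234.56, 123.45, 234.56, 234.56)
ones <- rep(1, length(ids))

# split() groups by key, so unlist() returns the counters group by group,
# not in the original row order
by_group <- unlist(lapply(split(ones, ids), cumsum), use.names = FALSE)

# ave() puts each group's counters back in the rows they came from
in_place <- ave(ones, ids, FUN = cumsum)

# Assumption: two decimals identify an id, so a text key sidesteps
# exact comparison of doubles
key <- sprintf("%.2f", ids)
in_place_keyed <- ave(ones, key, FUN = cumsum)

print(rbind(by_group, in_place, in_place_keyed))
```

On sorted data the two approaches agree; on unsorted data only the ave() result lines up with the rows.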
Berend Hasselman
2013-Oct-11 13:58 UTC
[R] Create sequential vector for values in another column
On 11-10-2013, at 15:26, Steven Ranney <steven.ranney at gmail.com> wrote:

> Is there an easier way to create this vector? I'd appreciate your thoughts.

I named your dataframe dat1.
You can also do this

unlist(sapply(rle(dat1$id.name)$lengths, function(k) 1:k ))

And as Petr told you: beware of FAQ 7.31

Berend
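A small sketch of the run-length idea, using a made-up vector `ids` that is not from the original data. rle() counts consecutive runs of equal values, so this numbering restarts whenever the value changes; it matches the per-id numbering only if all rows for an id are adjacent. Base R's sequence() builds the same result from the run lengths directly.

```r
# Hypothetical ids, already arranged so that equal values are adjacent
ids <- c(123.45, 123.45, 123.45, 234.56, 234.56, 345.67)

runs <- rle(ids)  # runs$lengths holds the length of each consecutive run

# One counter per run, concatenated; lapply() always returns a list,
# so unlist() gives a plain vector even when all runs have equal length
x1 <- unlist(lapply(runs$lengths, seq_len))

# sequence() does the same thing in one call
x2 <- sequence(runs$lengths)

print(data.frame(id.name = ids, x1 = x1, x2 = x2))
```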