I'm running into a problem I can't seem to find a solution for. I'm attempting to add sequences to an existing data set based on subsets of the data. I've done this using a for loop with a small subset of data, but the same process on the real data (200k rows) is taking way too long.

Here is some sample data and my ultimate goal:

> row1<-c(0,1,2,3,4,5,1,2,3,4)
> row2<-c(1,1,1,1,1,1,2,2,2,2)
> stuff<-data.frame(row1=row1,row2=row2)
> stuff
   row1 row2
1     0    1
2     1    1
3     2    1
4     3    1
5     4    1
6     5    1
7     1    2
8     2    2
9     3    2
10    4    2

I need to derive 2 columns. I need a sequence for each unique row2 (row3), and then I need a sequence that restarts based on a cutoff value for row1 within each unique row2 (row4). The following table is what it should look like using a cutoff of 3 for row4:

   row1 row2 row3 row4
1     0    1    1    1
2     1    1    2    2
3     2    1    3    3
4     3    1    4    1
5     4    1    5    2
6     5    1    6    3
7     1    2    1    1
8     2    2    2    2
9     3    2    3    1
10    4    2    4    2

I need something like row3<-sequence(nrow(unique(stuff$row2))) that actually works :-) Here is the for loop that functions properly for row3:

stuff$row3<-c(1)
for (i in 2:nrow(stuff)) {
  if (stuff$row2[i] == stuff$row2[i-1]) {
    stuff$row3[i] = stuff$row3[i-1]+1
  }
}

Thanks!

Jason Baucom
Ateb, Inc.
www.ateb.com <http://www.ateb.com/>
Henrique Dallazuanna
2009-Aug-27 15:02 UTC
[R] generating multiple sequences in subsets of data
Try this:

stuff$row3 <- with(stuff, ave(row1, row2, FUN = seq))

I don't understand the fourth column.
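A note on the call above: it gives the expected counter for the sample data, but seq() applied to a single number n returns 1:n, so a group that contains only one row can give unexpected results. Below is a minimal sketch of the same idea using seq_along(), which always counts positions. It rebuilds the sample data from the original post so that it runs on its own; it is one possible variant, not the poster's exact code.

```r
# Sample data from the original post
stuff <- data.frame(row1 = c(0, 1, 2, 3, 4, 5, 1, 2, 3, 4),
                    row2 = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2))

# Per-group counter: ave() splits row1 by row2 and seq_along()
# numbers the elements of each group 1, 2, 3, ...
stuff$row3 <- with(stuff, ave(row1, row2, FUN = seq_along))

# The difference between the two functions on a length-one input:
seq(5)        # 1 2 3 4 5
seq_along(5)  # 1

stuff
```

Because ave() works on whole groups at once rather than row by row, it avoids the explicit for loop from the original post.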
--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
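For the fourth column, one reading of the expected table in the original post is that the counter restarts once row1 reaches the cutoff, within each value of row2. Under that assumption (it is an interpretation of the table, not something the thread confirms), a sketch is to add the condition row1 >= cutoff as a second grouping variable to ave(). The name cutoff is introduced here for illustration only.

```r
# Sample data from the original post
stuff <- data.frame(row1 = c(0, 1, 2, 3, 4, 5, 1, 2, 3, 4),
                    row2 = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2))

cutoff <- 3  # hypothetical variable name; the value 3 is from the post

# Group by row2 AND by which side of the cutoff row1 falls on,
# then number the rows inside each of those groups.
# Assumption: the counter restarts once, when row1 reaches the cutoff.
stuff$row4 <- with(stuff,
                   ave(row1, row2, row1 >= cutoff, FUN = seq_along))

stuff
```

On the sample data this reproduces the row4 values shown in the original post (1 2 3 1 2 3 for row2 == 1 and 1 2 1 2 for row2 == 2). If the real data should restart more than once per group, or on a different rule, the grouping condition would need to change accordingly.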