Hello,

I have a very long (~50,000) sequence of repeating numbers. The first 100 are:

 [1]     0     0     0     0     0     0     0     0     0     0     0   429
[13]   429   429   429   429   429   429   429   858   858   858   858   858
[25]   858  1287  1287  1287  1287  1287  1716  2145  2145  2574  2574  3003
[37]  3003  3432  3432  3861  4290  4719  5148  5577  5577  6006  6006  6006
[49]  6435  6435  6435  6864  6864  7293  7293  7293  7722  7722  7722  7722
[61]  8151  8151  8151  8580  8580  8580  9009  9009  9009  9009  9438  9438
[73]  9438  9438  9867  9867  9867 10296 10296 10296 10725 10725 10725 10725
[85] 11154 11154 11154 11154 11154 11583 11583 11583 11583 12012 12012 12012
[97] 12012 12441 12441 12441

What I want is to produce a vector of lengths for each contiguous run of numbers, i.e. for the above example, the first three items of the vector returned would be:

11 8 6

...to represent the counts of 0, 429, and 858, respectively. I could do this with unique() and a for loop, but this would be very inefficient. Any advice on how to do this efficiently would be most appreciated.

thanks
Tony
Henrique Dallazuanna
2010-Feb-22 17:32 UTC
[R] counting repeating sequence lengths in a vector
Try this:

rle(x)$lengths

On Mon, Feb 22, 2010 at 2:27 PM, Larson, TR <trl1 at york.ac.uk> wrote:
> What I want is to produce a vector of lengths for each contiguous run of
> numbers.

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
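
For anyone reading this thread later, here is a minimal sketch of what rle() returns. The vector x below is only a hypothetical stand-in built from the first 25 values in the question, not the full ~50,000-element data. rle() gives back a list with two components, lengths (the run counts) and values (the value of each run), and base R's inverse.rle() rebuilds the original vector from that list.

```r
# Hypothetical stand-in for the real data: the first 25 values from the question
x <- c(rep(0, 11), rep(429, 8), rep(858, 6))

# Run-length encoding: a list with components 'lengths' and 'values'
runs <- rle(x)

runs$lengths   # 11 8 6
runs$values    # 0 429 858

# inverse.rle() reverses the encoding, which is a quick sanity check
identical(inverse.rle(runs), x)
```

rle() is vectorised, so it should avoid the unique()-plus-for-loop approach mentioned in the question. Note that it counts contiguous runs only: a value that reappears later in the vector starts a new run rather than adding to the earlier count.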