thr3ads.net - R help - [R] Numbering sequences of non-NAs in a vector [Jul 2009]

Krishna Tateneni

2009-Jul-07 21:08 UTC

[R] Numbering sequences of non-NAs in a vector

Greetings, I have a vector of the form:
[10,8,1,3,0,8,NA,NA,NA,NA,2,1,6,NA,NA,NA,0,5,1,9...]  That is, a combination
of sequences of non-missing values and missing values, with each sequence
possibly of a different length.

I'd like to create another vector which will help me pick out the sequences
of non-missing values.  For the example above, this would be:
[1,1,1,1,1,1,NA,NA,NA,NA,2,2,2,NA,NA,NA,3,3,3,3...].  The goal ultimately is
to calculate means separately for each sequence.

Your help is appreciated.  If I'm making this more complicated than
necessary, I'd appreciate knowing that as well!

Many thanks.

Jorge Ivan Velez

2009-Jul-07 21:43 UTC

Dear Krishna,
Here is one way. It is not very elegant, but seems to work:

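# x is the vector you want to change
foo <- function(x){
   R <- rle(!is.na(x))            # runs of non-NA (TRUE) and NA (FALSE)
   len <- R$lengths[R$values]     # lengths of the non-NA runs only
   # seq_along() also behaves when x has no non-NA values at all
   x[!is.na(x)] <- rep(seq_along(len), len)
   x
}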

# Example
x <- c(10, 8, 1, 3, 0, 8, NA, NA, NA, NA, 2, 1, 6, NA, NA, NA, 0, 5, 1, 9)
foo(x)
# [1]  1  1  1  1  1  1 NA NA NA NA  2  2  2 NA NA NA  3  3  3  3
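Since the stated goal is a mean for each sequence, run labels like the ones foo() returns can serve directly as the grouping argument to tapply(). The following is only a minimal sketch in base R; the names run_id and run_means are illustrative, and the labels are rebuilt inline so the snippet runs on its own:

```r
x <- c(10, 8, 1, 3, 0, 8, NA, NA, NA, NA, 2, 1, 6, NA, NA, NA, 0, 5, 1, 9)

# Label each run of non-NA values 1, 2, 3, ...; NA positions stay NA
r <- rle(!is.na(x))
run_id <- rep(NA_integer_, length(x))
run_id[!is.na(x)] <- rep(seq_len(sum(r$values)), r$lengths[r$values])

# tapply() ignores elements whose group label is NA, so only the
# non-missing runs contribute to the result
run_means <- tapply(x, run_id, mean)
run_means
# means for runs 1, 2, 3 are 5, 3 and 3.75
```

The result is named by run label, so run_means["2"] picks out the mean of the second sequence.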

HTH,

Jorge



Stavros Macrakis

2009-Jul-07 21:52 UTC

Here's one possibility:

> vv <- c(10,8,1,3,0,8,NA,NA,NA,NA,2,1,6,NA,NA,NA,0,5,1,9)
> (1+cumsum(diff(is.na(c(vv[1],vv)))==1)) * !is.na(vv)
 [1] 1 1 1 1 1 1 0 0 0 0 2 2 2 0 0 0 3 3 3 3
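The one-liner above packs several steps together and marks the missing positions with 0 rather than NA. As a rough sketch of the same idea unpacked step by step (the intermediate names miss, na_starts and ids are illustrative, not part of the original reply), with the zeros swapped for NA to match the requested output:

```r
vv <- c(10, 8, 1, 3, 0, 8, NA, NA, NA, NA, 2, 1, 6, NA, NA, NA, 0, 5, 1, 9)

miss <- is.na(vv)

# TRUE wherever a run of NAs begins right after a non-NA value;
# prepending miss[1] keeps the result the same length as vv
na_starts <- diff(c(miss[1], miss)) == 1

# each time a non-NA run ends, the counter for the next run goes up by one
ids <- 1 + cumsum(na_starts)

# mark the missing positions as NA instead of 0
ids[miss] <- NA
ids
```

Because the counter only increases at a non-NA to NA transition, a vector that begins with NAs should still label its first non-missing run as 1.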




Marc Schwartz

2009-Jul-07 21:53 UTC

Here is one possibility:

Vec <- c(10,8,1,3,0,8,NA,NA,NA,NA,2,1,6,NA,NA,NA,0,5,1,9)

 > Vec
  [1] 10  8  1  3  0  8 NA NA NA NA  2  1  6 NA NA NA  0  5  1  9


Use rle() to get the runs of NA and non-NA values. See ?rle

Runs <- rle(is.na(Vec))

 > Runs
Run Length Encoding
   lengths: int [1:5] 6 4 3 3 4
   values : logi [1:5] FALSE TRUE FALSE TRUE FALSE


Create grouping values for each run:

Grps <- rep(seq(length(Runs$lengths)), Runs$lengths)

 > Grps
  [1] 1 1 1 1 1 1 2 2 2 2 3 3 3 4 4 4 5 5 5 5


Now get the means for each run, split by Grps. See ?aggregate

 > aggregate(Vec, list(Grps = Grps), mean)
   Grps    x
1    1 5.00
2    2   NA
3    3 3.00
4    4   NA
5    5 3.75


If you don't want the NA runs included in the result, you could use  
subset():

 > subset(aggregate(Vec, list(Grps = Grps), mean), !is.na(x))
   Grps    x
1    1 5.00
3    3 3.00
5    5 3.75
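As a possible alternative to the subset() step, the formula interface of aggregate() drops rows containing NA by default (its na.action argument defaults to na.omit), so the NA runs never reach mean(). A minimal sketch, assuming the same Vec and Grps as above; the name res is illustrative:

```r
Vec <- c(10, 8, 1, 3, 0, 8, NA, NA, NA, NA, 2, 1, 6, NA, NA, NA, 0, 5, 1, 9)
Runs <- rle(is.na(Vec))
Grps <- rep(seq_along(Runs$lengths), Runs$lengths)

# The formula method applies na.action = na.omit by default, so rows
# where Vec is NA are removed before the group means are computed
res <- aggregate(Vec ~ Grps, data = data.frame(Vec, Grps), FUN = mean)
res
# groups 1, 3 and 5 remain, with means 5, 3 and 3.75
```

Note that the surviving group labels keep their original run numbers (1, 3, 5) rather than being renumbered consecutively.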


HTH,

Marc Schwartz
