Happy Friday!
Using this function:
fixSeq <- function(df) {
  # shift the run values right by one position, padding the front with 1
  shift1 <- function(x) c(1, x[-length(x)])
  df$state_shift <- df$state
  df.rle <- rle(df$state_shift)
  repeat {
    shifted.sf <- shift1(df.rle$values)
    # a run changes when it and the run before it are both >= 4 but differ
    change <- df.rle$values >= 4 & shifted.sf >= 4 &
      shifted.sf != df.rle$values
    if (any(change))
      df.rle$values[change] <- shifted.sf[change] else break
  }
  gc()
  df$state_shift <- inverse.rle(df.rle)
  return(df)
}
I would like each removed NA to split the run it interrupts into two
separate runs, instead of letting the values on either side merge.
To illustrate with a short example:
> dat <- data.frame(id=1, state=c(1,2,4,4,5,NA,5,5,1))
>
> fixSeq(dat)
Error in df.rle$values[change] <- shifted.sf[change] :
  NAs are not allowed in subscripted assignments
> fixSeq(na.omit(dat))
  id state state_shift
1  1     1           1
2  1     2           2
3  1     4           4
4  1     4           4
5  1     5           4
7  1     5           4
8  1     5           4
9  1     1           1
rather than the desired state_shift of 1 2 4 4 4 5 5 1. The NA should
make the second pair of 5s a new state of its own rather than a
continuation of the previous state 4. Is this best accomplished by
recoding NA to a sentinel value like -99, or do I have other options?
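
For reference, here is a minimal sketch of the -99 idea, so it is clear what I mean. The wrapper name fixSeqNA and its sentinel argument are hypothetical (made up for this post), and the sketch assumes the sentinel is below 4 and never occurs as a real state, so that it can never satisfy the ">= 4" test. fixSeq is repeated (minus the gc() call) only so the block runs on its own.

```r
# fixSeq() as posted above, repeated so this block is self-contained.
fixSeq <- function(df) {
  shift1 <- function(x) c(1, x[-length(x)])
  df$state_shift <- df$state
  df.rle <- rle(df$state_shift)
  repeat {
    shifted.sf <- shift1(df.rle$values)
    change <- df.rle$values >= 4 & shifted.sf >= 4 &
      shifted.sf != df.rle$values
    if (any(change)) df.rle$values[change] <- shifted.sf[change] else break
  }
  df$state_shift <- inverse.rle(df.rle)
  df
}

# Hypothetical wrapper: recode NA to a sentinel, run fixSeq(), then drop
# the rows that were NA. The sentinel forms its own run, so it separates
# the runs on either side of it.
fixSeqNA <- function(df, sentinel = -99) {
  # assumptions: sentinel can never pass the ">= 4" test and is not a real state
  stopifnot(sentinel < 4, !any(df$state == sentinel, na.rm = TRUE))
  was.na <- is.na(df$state)
  df$state[was.na] <- sentinel
  out <- fixSeq(df)
  out[!was.na, ]
}

dat <- data.frame(id = 1, state = c(1, 2, 4, 4, 5, NA, 5, 5, 1))
res <- fixSeqNA(dat)
res$state_shift
# 1 2 4 4 4 5 5 1
```

On the example data this gives the state_shift I am after, and the row names (1 to 5, then 7 to 9) still show where the NA row was dropped.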