jeff6868
2014-Jul-10 12:34 UTC
find & remove sequences of at least N values for a specific value
Hi everybody,
I have a small problem with a function that should remove short sequences
of identical numeric values.
As an example, consider this data, which contains only 0s and 1s:
test <- data.frame(x=c(0,0,1,1,1,0,0,0,0,1,1,1,1,1,1,1,1))
The aim is simply to replace each sequence of 1s shorter than 5 with NA,
and to keep the sequences of 1s of length 5 or more.
So my final data should look like this:
final <- data.frame(x=c(0,0,NA,NA,NA,0,0,0,0,1,1,1,1,1,1,1,1))
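To make the "sequences" explicit, here is a small illustration of how the example breaks down into runs, using base R's rle(). The variable names `x` and `runs` are only for illustration:

```r
# Example vector from above
x <- c(0,0,1,1,1,0,0,0,0,1,1,1,1,1,1,1,1)

# rle() splits a vector into consecutive runs of equal values
runs <- rle(x)
runs$lengths  # 2 3 4 8
runs$values   # 0 1 0 1
```

So the data contains two runs of 1s, of length 3 and 8, and only the first one should be turned into NA.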
For the moment, I have this function:
foo <- function(X,N){
  tab <- table(X[X==1])
  under.n <- as.numeric(names(tab)[tab<N])
  ind <- X %in% under.n
  Ind.sup <- which(ind)
  X <- ifelse(ind,NA,X)
}
test$x <- apply(as.data.frame(test$x),2,function(x) foo(x,5))
The problem is that the function doesn't consider each sequence separately:
table() counts all the 1s together, as if they formed a single sequence.
I think that using rle() instead of table() in my function should do the
trick, but I haven't managed to make it work yet.
Does someone have an idea about fixing this problem?
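
For reference, one possible way to apply the rle() idea is sketched below. This is only a minimal sketch under some assumptions: the helper name `remove_short_runs` and its `value` argument are hypothetical, and "short" is taken to mean a run length strictly below N. It relies only on base R's rle() and inverse.rle().

```r
# Hypothetical helper: replace every run of `value` shorter than N with NA
remove_short_runs <- function(x, N, value = 1) {
  r <- rle(x)                       # run lengths and run values
  # which() drops any NA comparisons, so existing NAs are left untouched
  short <- which(r$values == value & r$lengths < N)
  r$values[short] <- NA             # mark the short runs
  inverse.rle(r)                    # expand back to a full-length vector
}

test <- data.frame(x=c(0,0,1,1,1,0,0,0,0,1,1,1,1,1,1,1,1))
test$x <- remove_short_runs(test$x, 5)
```

On the example data this gives the same vector as `final$x` above: the run of three 1s becomes NA and the run of eight 1s is kept.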
--
View this message in context:
http://r.789695.n4.nabble.com/find-remove-sequences-of-at-least-N-values-for-a-specific-value-tp4693810.html
Sent from the R help mailing list archive at Nabble.com.