jeff6868
2014-Jul-10 12:34 UTC
find & remove sequences of at least N values for a specific value
Hi everybody,

I have a small problem in a function that removes short sequences of identical numeric values. As an example, consider this data, containing only "0" and "1":

    test <- data.frame(x=c(0,0,1,1,1,0,0,0,0,1,1,1,1,1,1,1,1))

The aim is simply to replace each sequence of "1" shorter than 5 with NA, and to keep sequences of "1" with a length of 5 or more. So my final data should look like this:

    final <- data.frame(x=c(0,0,NA,NA,NA,0,0,0,0,1,1,1,1,1,1,1,1))

For the moment, I have this function:

    foo <- function(X,N){
      tab <- table(X[X==1])
      under.n <- as.numeric(names(tab)[tab<N])
      ind <- X %in% under.n
      Ind.sup <- which(ind)
      X <- ifelse(ind,NA,X)
    }

    test$x <- apply(as.data.frame(test$x),2,function(x) foo(x,5))

The problem is that the function doesn't consider each sequence separately: table() counts all the "1" values in the column together, as if they were one sequence. I think that using rle() instead of table() in my function should do the trick, but I haven't got it working yet.

Does someone have an idea about fixing this problem?
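
For reference, here is a minimal sketch of the rle() idea mentioned above. It is only one possible approach, not a definitive fix. It assumes that runs of exactly N values are kept, and the function name remove_short_runs and its value argument are hypothetical, introduced only for illustration.

```r
# Hypothetical helper: replace every run of `value` shorter than n with NA.
remove_short_runs <- function(x, n, value = 1) {
  r <- rle(x)  # run-length encoding: r$values and r$lengths, one entry per run
  # Indices of runs that consist of `value` and are shorter than n;
  # which() drops any NA comparisons if x already contains NA
  short <- which(r$values == value & r$lengths < n)
  r$values[short] <- NA
  inverse.rle(r)  # expand the runs back to a vector of the original length
}

test <- data.frame(x = c(0,0,1,1,1,0,0,0,0,1,1,1,1,1,1,1,1))
test$x <- remove_short_runs(test$x, 5)
```

Because rle() treats each run separately, the run of three "1" is replaced while the run of eight "1" is left untouched, which table() cannot do since it pools all the "1" values together.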