Rosa
2012-Aug-01 22:52 UTC
[R] how to use function of rle approx ifelse etc. in data frame
Hello R help,

I have a data frame M2[160000,5] with NAs. A simple example would be:

    set.seed(1234)
    M2 <- expand.grid(ID=182:183, year=2012, month=1:3, day=1:3, KEEP.OUT.ATTRS=FALSE)
    M2 <- M2[with(M2, order(ID, year, month, day)),] #sort the data
    M2$value <- sample(c(NA, rnorm(100)), nrow(M2), prob=c(0.5, rep(0.5/100, 100)), replace=TRUE)

M2:

        ID year month day      value
    1  182 2012     1   1 -0.5012581
    7  182 2012     1   2  1.1022975
    13 182 2012     1   3         NA
    3  182 2012     2   1 -0.1623095
    9  182 2012     2   2  1.1022975
    15 182 2012     2   3 -1.2519859
    5  182 2012     3   1         NA
    11 182 2012     3   2         NA
    17 182 2012     3   3         NA
    2  183 2012     1   1  0.9729168
    8  183 2012     1   2  0.9594941
    14 183 2012     1   3         NA
    4  183 2012     2   1         NA
    10 183 2012     2   2 -1.1088896
    16 183 2012     2   3  0.9594941
    6  183 2012     3   1 -0.4027320
    12 183 2012     3   2 -0.0151383
    18 183 2012     3   3 -1.0686427

In this example the longest run of consecutive NAs is 3, while my real data can have runs of more than 10 NAs. What I need to do is:

1, split the data according to ID, year and month;
2, in each subset, if there are fewer than 5 consecutive NAs, repeat the prior value; if there are 5-10 NAs, do a linear interpolation; and if there are more than 10 NAs, delete the whole month;
3, if the first day of the month is NA, apply the fill backward (use the next available value).

So far, thanks to sebastian-c, the part for more than 10 NAs is done:

    library(zoo)
    NA_run <- function(x, maxlen){
      runs <- rle(is.na(x$value))
      if(any(runs$lengths[runs$values] >= maxlen)) NULL else x
    }
    library(plyr)
    rem <- ddply(M2, .(ID, year, month), NA_run, 10)

As to the other two parts, I figured out that for fewer than 5 NAs I can use:

    na.locf(rem$value, na.rm=FALSE, maxgap=5)

and for 5 to 10 NAs:

    approx(rem$value, n=length(rem$value))$y

However, when I put them into an if/else, it keeps failing. Is it because the values are in a data frame? I have checked many posts on this issue, but none of the solutions work on my data. Any help would be appreciated, thanks.
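One possible way to combine the three rules is sketched below in base R only (no zoo or plyr), so it is not the na.locf/ddply approach from the post. The function names fill_values and fill_by_month and the arguments short and long are hypothetical. The thresholds follow the wording of the steps above: runs shorter than 5 are carried forward, longer runs of up to 10 are interpolated, and runs longer than 10 drop the month. The sketch also assumes each month is already sorted by day, and that a leading NA run should take the next observed value.

```r
# Fill NAs in one month's values; returns NULL when the month should be dropped.
# 'short' and 'long' are hypothetical parameter names for the two thresholds.
fill_values <- function(v, short = 5, long = 10) {
  na <- is.na(v)
  runs <- rle(na)
  # Any NA run longer than 'long': signal that the whole month is deleted
  if (any(runs$lengths[runs$values] > long)) return(NULL)
  # Nothing to fill, or no observed value to fill from: return unchanged
  if (!any(na) || all(na)) return(v)

  idx <- seq_along(v)
  # Length of the run that each position belongs to
  run_len <- rep(runs$lengths, runs$lengths)

  # Index of the most recent observed value (0 before the first one)
  last_obs <- cummax(ifelse(na, 0L, idx))
  # Carry the last observation forward; leading NAs take the first observed value
  first_val <- v[which(!na)[1]]
  locf <- c(first_val, v)[last_obs + 1L]

  # Linear interpolation over positions; rule = 2 extends the end values.
  # approx() needs at least two observed points, otherwise fall back to locf.
  lin <- if (sum(!na) >= 2) {
    approx(idx[!na], v[!na], xout = idx, rule = 2)$y
  } else {
    locf
  }

  short_gap <- na & run_len < short
  long_gap <- na & !short_gap
  out <- v
  out[short_gap] <- locf[short_gap]
  out[long_gap] <- lin[long_gap]
  out
}

# Apply fill_values to every (ID, year, month) subset of a data frame
# that has the columns ID, year, month, day and value.
fill_by_month <- function(df) {
  groups <- split(df, list(df$ID, df$year, df$month), drop = TRUE)
  filled <- lapply(groups, function(g) {
    new_value <- fill_values(g$value)
    if (is.null(new_value)) return(NULL)  # month is dropped
    g$value <- new_value
    g
  })
  kept <- unname(Filter(Negate(is.null), filled))
  if (length(kept) == 0) return(df[0, ])
  out <- do.call(rbind, kept)
  out <- out[order(out$ID, out$year, out$month, out$day), ]
  rownames(out) <- NULL
  out
}
```

With the example data, fill_by_month(M2) would be the call to try. A month in which every value is NA but the run is not long enough to be dropped (such as ID 182, month 3 in the example) is returned unchanged, because there is nothing to fill from. Working on the value vector of one subset at a time, rather than on rem$value for the whole data frame, keeps fills from leaking across months.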