Hello again,
I believe we are all missing something. Isn't it possible to have NAs as the
first values of 'y'?
And isn't it also possible to have x[1] > 3?
Here is my point (I have changed function 'f2' to allow for such cases;
'f1' is rubbish).
# Rui
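f3 <- function(x, y){
  # positions where x > 3 and y is NA; intersect() indexes 'y' directly,
  # so the positions cannot get out of step with each other
  idx <- intersect(which(x > 3), which(is.na(y)))
  # position 1 has no predecessor, so it is skipped and stays NA
  for(i in idx[idx > 1]) y[i] <- y[i-1] + 2L
  y
}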
# Jim's, as a function; 'na.rm' option added, or else 'df3'
# would produce an error
require(zoo)
f4 <- function(x, y){
y <- na.locf(y, na.rm=FALSE)
inc <- cumsum(x > 3) * 2
y + inc
}
df <- data.frame(x = c(1,2,3,4,5), y = c(10,20,30,NA,NA))
df
df2 <- data.frame(x = c(1,2,3,4,5), y = c(10,20,NA,40,NA))
df2
df3 <- data.frame(x = c(1,2,3,4,5), y = rev(c(10,20,30,NA,NA)))
df3
# Joshua
f(df$x, df$y) # works
f(df2$x, df2$y) # infinite loop
f(df3$x, df3$y) # infinite loop
# Rui
f3(df$x, df$y) # works
f3(df2$x, df2$y) # works as expected?
f3(df3$x, df3$y) # works as expected?
# Jim
f4(df$x, df$y) # works
f4(df2$x, df2$y) # works as expected?
f4(df3$x, df3$y) # works as expected?
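
To make the "works as expected?" question concrete, here is a small check of what the two approaches return for the 'df2' data. This is only a sketch: 'f3' is restated with intersect(), which gives the same results as above on these inputs, and the names 'y2', 'r3' and 'r4' are mine. Which of the two results is the wanted one depends on the original question.

```r
library(zoo)  # provides na.locf(), as used in 'f4'

# Loop-based approach: only touches positions where x > 3 AND y is NA
f3 <- function(x, y) {
  idx <- intersect(which(x > 3), which(is.na(y)))
  for (i in idx[idx > 1]) y[i] <- y[i - 1] + 2
  y
}

# Vectorised approach: fills every NA, then adds 2 per x > 3 seen so far
f4 <- function(x, y) {
  y <- na.locf(y, na.rm = FALSE)
  y + cumsum(x > 3) * 2
}

x  <- c(1, 2, 3, 4, 5)
y2 <- c(10, 20, NA, 40, NA)  # same values as 'df2'

r3 <- f3(x, y2)
r4 <- f4(x, y2)
print(rbind(f3 = r3, f4 = r4))
# f3 gives 10 20 NA 40 42
# f4 gives 10 20 20 42 44
```

So the two agree on 'df' but not on 'df2': 'f4' also fills the NA at x = 3 and adds the increment to the non-NA value at x = 4, while 'f3' leaves both alone.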
If this makes sense, the performance tests are very much in favour of Jim's
solution.
# If this is what is asked for, test the performance
# with large enough N
N <- 1.e5
dftest <- data.frame(x=1:N, y=c(sample(c(rep(NA, 5), 10*1:5), N,
replace=TRUE)))
sum(is.na(dftest))/N # proportion of NAs in 'dftest'
t2 <- system.time(invisible(apply(dftest, 2, f2)))[c(1, 3)]
t3 <- system.time(invisible(f3(dftest$x, dftest$y)))[c(1, 3)]
t4 <- system.time(invisible(f4(dftest$x, dftest$y)))[c(1, 3)]
rbind(t2=t2, t3=t3, t4=t4, t2.t3=t2/t3, t2.t4=t2/t4, t3.t4=t3/t4)
Sample output:
      user.self   elapsed
t2      2.93000   2.95000
t3      0.22000   0.22000
t4      0.01000   0.01000
t2.t3  13.31818  13.40909
t2.t4 293.00000 295.00000
t3.t4  22.00000  22.00000
That is a factor of about 300 over the initial solution, and of 20+ over the
other loop-based one.
The downside is that it needs an extra package loaded, but 'zoo' is rather
commonplace.
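
If the extra package is a concern, the "last observation carried forward" step can probably be done in base R as well. The following is a sketch under that assumption, not a drop-in replacement for na.locf(): 'locf_base' and 'f4_base' are hypothetical names, and this version was not part of the timings above.

```r
# Hypothetical helper: last observation carried forward, base R only.
locf_base <- function(y) {
  ok  <- !is.na(y)
  # cumsum(ok) counts the non-NA values seen so far, which identifies
  # the most recent non-NA value at each position
  pos <- cumsum(ok)
  # leading NAs have pos == 0; the prepended NA makes index pos + 1
  # return NA for them, like na.locf(y, na.rm = FALSE)
  c(NA, y[ok])[pos + 1]
}

# Same arithmetic as 'f4', with the base R helper in place of na.locf()
f4_base <- function(x, y) {
  locf_base(y) + cumsum(x > 3) * 2
}

x <- c(1, 2, 3, 4, 5)
res1 <- f4_base(x, c(10, 20, 30, NA, NA))       # same values as 'df'
res3 <- f4_base(x, rev(c(10, 20, 30, NA, NA)))  # same values as 'df3'
print(res1)  # 10 20 30 32 34
print(res3)  # NA NA 30 22 14
```

The helper stays fully vectorised, so it should scale in a similar way to the 'zoo' version, but that would need to be measured.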
Rui Barradas
--
View this message in context:
http://r.789695.n4.nabble.com/Conditionally-adding-a-constant-tp4253049p4254470.html
Sent from the R help mailing list archive at Nabble.com.