Sergey Goriatchev
2009-Jun-19 07:36 UTC
[R] (FULL) Need help to optimize a piece of code involving zoo objects
(Sorry, sent the message before I finished it) Hello, everyone I have a long script that uses zoo objects. In this script I used simple moving averages and these I can very efficiently calculate with filter() functions. Now, I have to use special "exponential" moving averages, and the only way I could write the code was with a for-loop, which makes everything extremely slow. I don't know how to optimize the code, but I need to find a solution. I hope someone can help me. The special moving average is calculated in the following way: EMA = ( K x ( C - P ) ) + P where, C = Current Value P = Previous periods EMA ? ?(A SMA is used for the first period's calculation) K = Exponential smoothing constant K = 2 / ( 1 + Periods ) Below is the code with the for-loop. -"temp" contains C -Periods is variable "j" in the for loop (so K varies) - I first produce a vector of simple equally weighted moving average, and use the first non-NA value to initiate the second for-loop x.Date <- as.Date("2003-02-01") + seq(1,1100) - 1 temp <- zoo(rnorm(1100, 0, 10)+100, x.Date) start.time <- proc.time() for(j in seq(5,100,by=5)){ ? ? ? ?#PRODUCE FAST MOVING AVERAGE ? ? ? ?#Create equally weighted MA vector (we need only the first value) ? ? ? ?smafast <- zoo(coredata(filter(coredata(temp[,1]), filter=rep(1/j, j), sides=1)), order.by=time(temp)) ? ? ? ?#index of first non-NA value, which is the first SMA needed ? ? ? ?#which(is.na(smafast))[length(which(is.na(smafast)))]+1 ? ? ? ?#Calculate decay factor K ? ? ? ?#number of periods is j ? ? ? ?K <- 2/(1+j) ? ? ? ?#Calculate recursively the EMA for the fast index (starting with second non-NA value) ? ? ? ?for (k in (which(is.na(smafast))[length(which(is.na(smafast)))]+2):length(smafast)) { ? ? ? ? ? ? ? ?smafast[k] <- coredata(smafast[k-1])+K*(coredata(temp[k,1])-coredata(smafast[k-1])) ? ? ? ?} ? ? ? ?#PRODUCE SLOW MOVING AVERAGE ? ? ? ?#Create equally weighted MA vector (we need only the first value) ? ? ? ?smaslow <- zoo(coredata(filter(coredata(temp[,1]), filter=rep(1/(j*4), (j*4)), sides=1)), order.by=time(temp)) ? ? ? ?K <- 2/(1+j*4) #Calculate EMA ? ? ? ?for (k in (which(is.na(smaslow))[length(which(is.na(smaslow)))]+2):length(smaslow)) { ? ? ? ? ? ? ? ?smaslow[k] <- coredata(smaslow[k-1])+K*(coredata(temp[k,1])-coredata(smaslow[k-1])) ? ? ? ?} ? ? ? ?#COMBINE DIFFERENCES OF FAST AND SLOW ? ? ? ?temp <- ? ? ? ? merge(temp, ma=smafast-smaslow) } proc.time()-start.time -------------------------------------------------------------------------------------- Could someone help me to optimize the two EMA for-loops within the bigger for-loop? I need to cut the execution time down by at least half. Thank you in advance for your help! Best, Sergey
Gabor Grothendieck
2009-Jun-19 09:38 UTC
[R] (FULL) Need help to optimize a piece of code involving zoo objects
Check out method = "recursive" in filter(). On Fri, Jun 19, 2009 at 3:36 AM, Sergey Goriatchev<sergeyg at gmail.com> wrote:> (Sorry, sent the message before I finished it) > Hello, everyone > > I have a long script that uses zoo objects. In this script I used > simple moving averages and these I can very efficiently calculate with > filter() functions. > Now, I have to use special "exponential" moving averages, and the only > way I could write the code was with a for-loop, which makes everything > extremely slow. > I don't know how to optimize the code, but I need to find a solution. > I hope someone can help me. > > The special moving average is calculated in the following way: > > EMA = ( K x ( C - P ) ) + P > > where, > > C = Current Value > P = Previous periods EMA ? ?(A SMA is used for the first period's calculation) > K = Exponential smoothing constant > > K = 2 / ( 1 + Periods ) > > Below is the code with the for-loop. > > -"temp" contains C > -Periods is variable "j" in the for loop (so K varies) > - I first produce a vector of simple equally weighted moving average, > and use the first non-NA value to initiate the second for-loop > > x.Date <- as.Date("2003-02-01") + seq(1,1100) - 1 > temp <- zoo(rnorm(1100, 0, 10)+100, x.Date) > > start.time <- proc.time() > > for(j in seq(5,100,by=5)){ > > ? ? ? ?#PRODUCE FAST MOVING AVERAGE > ? ? ? ?#Create equally weighted MA vector (we need only the first value) > ? ? ? ?smafast <- zoo(coredata(filter(coredata(temp[,1]), filter=rep(1/j, > j), sides=1)), order.by=time(temp)) > > ? ? ? ?#index of first non-NA value, which is the first SMA needed > ? ? ? ?#which(is.na(smafast))[length(which(is.na(smafast)))]+1 > > ? ? ? ?#Calculate decay factor K > ? ? ? ?#number of periods is j > ? ? ? ?K <- 2/(1+j) > > ? ? ? ?#Calculate recursively the EMA for the fast index (starting with > second non-NA value) > ? ? ? ?for (k in > (which(is.na(smafast))[length(which(is.na(smafast)))]+2):length(smafast)) > { > ? ? ? ? ? ? ? ?smafast[k] <- > coredata(smafast[k-1])+K*(coredata(temp[k,1])-coredata(smafast[k-1])) > ? ? ? ?} > > ? ? ? ?#PRODUCE SLOW MOVING AVERAGE > ? ? ? ?#Create equally weighted MA vector (we need only the first value) > ? ? ? ?smaslow <- zoo(coredata(filter(coredata(temp[,1]), > filter=rep(1/(j*4), (j*4)), sides=1)), order.by=time(temp)) > ? ? ? ?K <- 2/(1+j*4) > #Calculate EMA > ? ? ? ?for (k in > (which(is.na(smaslow))[length(which(is.na(smaslow)))]+2):length(smaslow)) > { > ? ? ? ? ? ? ? ?smaslow[k] <- > coredata(smaslow[k-1])+K*(coredata(temp[k,1])-coredata(smaslow[k-1])) > ? ? ? ?} > > ? ? ? ?#COMBINE DIFFERENCES OF FAST AND SLOW > ? ? ? ?temp <- ? ? ? ? merge(temp, ma=smafast-smaslow) > } > > proc.time()-start.time > > -------------------------------------------------------------------------------------- > Could someone help me to optimize the two EMA for-loops within the > bigger for-loop? > I need to cut the execution time down by at least half. > > Thank you in advance for your help! > > Best, > Sergey > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >