Hello I use R to run a simple model of rainfall interception by vegetation: rainfall falls on vegetation, some is retained by the vegetation (part of which can evaporate), the rest falls on the ground (quite crude but very similar to those used in SWAT or MikeSHE, for the hydrologists among you). It uses a loop on zoo time-series of rainfall and potential evapotranspiration. Unfortunately I did not find a way to vectorize it and it takes ages to run on long datasets. Could anybody help me to make it run faster? library(zoo) set.seed(1) # artificial potential evapotranspiration time-series ETmax<-zoo(runif(10, min=1, max=6), c(1:10)) # artificial rainfall time-series RR<-zoo(runif(10, min=0, max=6), c(1:10)) ## create empty time-series to fill # effective rainfall (i.e. rainfall minus intercepted rainfall) RReff<-zoo(NA, c(1:10)) # intercepted rainfall int<-zoo(NA, c(1:10)) # residual potential evapotranspiration (ie ETmax minus evaporation from interception) ETres<-zoo(NA, c(1:10)) # define maximum interception storage capacity (maximum volume of rainfall that can be intercepted per time step, provided the interception store is empty at start of time-step) Smax<-3 # volume of water in interception storage at start of computation storage<-0 for (i in 1:length(ETmax)) { # compute interception capacity for time step i (maximum interception capacity minus any water intercepted but not evaporated during previous time-step). int[i]<-Smax-storage # compute intercepted rainfall: equal to rainfall if smaller than interception capacity, and to interception capacity if larger. if(RR[i]<int[i]) int[i]<-RR[i] # compute effective rainfall (rainfall minus intercepted rainfall). RReff[i]<-RR[i]-int[i] # update interception storage: initial interception storage + intercepted rainfall. storage<-storage+coredata(int[i]) # compute evaporation from interception storage: equal to potential evapotranspiration if the latter is smaller than interception storage, and to interception storage if larger. if(storage>coredata(ETmax[i])) evap<-coredata(ETmax[i]) else evap<-storage # compute residual potentiel evapotranspiration: potential evapotranspiration minus evaporation from interception storage. ETres[i]<-ETmax[i]-evap # update interception storage, to be carried over to next time-step: interception storage minus evaporation from interception storage. storage<-storage-evap } Many thanks for your help! Arnaud UCL Department of Geography, UK -- View this message in context: http://r.789695.n4.nabble.com/vectorization-of-rolling-function-tp4700487.html Sent from the R help mailing list archive at Nabble.com.
Please don't post in HTML... you may not recognize it, but the receiving end does not necessarily (and in this case did not) look like the sending end, and the cleanup can impede answers you are hoping to get. In many cases, loops can be vectorized. However, near as I can tell this is an example of an algorithm that simply needs a loop [1]. One bit of advice: the coredata function is horribly slow. Just converting your time series objects to numeric vectors for the purpose of this computation sped up the algorithm by 500x on 10000 point series. Converting it to inline C++ as below sped it up by yet another factor of 40x. 20000x is nothing to sneeze at. ####### ## optional temporary setup for windows ## assumes you have installed Rtools gcc <- "C:\\Rtools\\bin" rtools <- "C:\\Rtools\\gcc-4.6.3\\bin" path <- strsplit(Sys.getenv("PATH"), ";")[[1]] new_path <- c(rtools, gcc, path) new_path <- new_path[!duplicated(tolower(new_path))] Sys.setenv(PATH = paste(new_path, collapse = ";")) ## end of optional library(Rcpp) cppFunction( "DataFrame EvapSimRcpp( NumericVector RR , NumericVector ETmax , const double Smax , const double initialStorage ) { int n = RR.size(); // create empty time-series to fill // effective rainfall (i.e. rainfall minus intercepted rainfall) NumericVector RReff( n ); // intercepted rainfall( n ); NumericVector Rint( n ); // residual potential evapotranspiration (ie ETmax minus // evaporation from interception) NumericVector ETres( n, NA_REAL ); double evap; // volume of water in interception storage at start of // computation double storage = initialStorage; for ( int i=0; i<n; i++ ) { // compute interception capacity for time step i (maximum // interception capacity minus any water intercepted but not // evaporated during previous time-step). Rint[ i ] = Smax - storage; // compute intercepted rainfall: equal to rainfall if smaller // than interception capacity, and to interception capacity if // larger. if ( RR[ i ] < Rint[ i ] ) Rint[ i ] = RR[ i ]; // compute effective rainfall (rainfall minus intercepted // rainfall). RReff[ i ] = RR[ i ] - Rint[ i ]; // update interception storage: initial interception storage + // intercepted // rainfall. storage = storage + Rint[ i ]; // compute evaporation from interception storage: equal to // potential evapotranspiration if the latter is smaller than // interception storage, and to interception storage if larger. if ( storage > ETmax[ i ] ) evap = ETmax[ i ]; else evap = storage; // compute residual potentiel evapotranspiration: potential // evapotranspiration minus evaporation from interception // storage. ETres[ i ] = ETmax[ i ] - evap; // update interception storage, to be carried over to next // time-step: interception storage minus evaporation from // interception storage. storage = storage - evap; } DataFrame DF = DataFrame::create( Named( \"int\" ) = Rint , Named( \"RReff\" ) = RReff , Named( \"ETres\" ) = ETres ); return DF; } ") # Assumes your initial variables are already defined EvapSimRcpp( RR, ETmax, Smax, 0 ) ####### [1] http://stackoverflow.com/questions/7153586/can-i-vectorize-a-calculation-which-depends-on-previous-elements On Sat, 6 Dec 2014, A Duranel wrote:> Hello > I use R to run a simple model of rainfall interception by vegetation: > rainfall falls on vegetation, some is retained by the vegetation (part of > which can evaporate), the rest falls on the ground (quite crude but very > similar to those used in SWAT or MikeSHE, for the hydrologists among you). > It uses a loop on zoo time-series of rainfall and potential > evapotranspiration. Unfortunately I did not find a way to vectorize it and > it takes ages to run on long datasets. Could anybody help me to make it run > faster? > > library(zoo) > set.seed(1) > # artificial potential evapotranspiration time-series > ETmax<-zoo(runif(10, min=1, max=6), c(1:10)) > # artificial rainfall time-series > RR<-zoo(runif(10, min=0, max=6), c(1:10)) > > ## create empty time-series to fill > # effective rainfall (i.e. rainfall minus intercepted rainfall) > RReff<-zoo(NA, c(1:10)) > # intercepted rainfall > int<-zoo(NA, c(1:10)) > # residual potential evapotranspiration (ie ETmax minus evaporation from > interception) > ETres<-zoo(NA, c(1:10)) > > # define maximum interception storage capacity (maximum volume of rainfall > that can be intercepted per time step, provided the interception store is > empty at start of time-step) > Smax<-3 > # volume of water in interception storage at start of computation > storage<-0 > > for (i in 1:length(ETmax)) { > # compute interception capacity for time step i (maximum interception > capacity minus any water intercepted but not evaporated during previous > time-step). > int[i]<-Smax-storage > # compute intercepted rainfall: equal to rainfall if smaller than > interception capacity, and to interception capacity if larger. > if(RR[i]<int[i]) int[i]<-RR[i] > # compute effective rainfall (rainfall minus intercepted rainfall). > RReff[i]<-RR[i]-int[i] > # update interception storage: initial interception storage + intercepted > rainfall. > storage<-storage+coredata(int[i]) > # compute evaporation from interception storage: equal to potential > evapotranspiration if the latter is smaller than interception storage, and > to interception storage if larger. > if(storage>coredata(ETmax[i])) evap<-coredata(ETmax[i]) else evap<-storage > # compute residual potentiel evapotranspiration: potential > evapotranspiration minus evaporation from interception storage. > ETres[i]<-ETmax[i]-evap > # update interception storage, to be carried over to next time-step: > interception storage minus evaporation from interception storage. > storage<-storage-evap > } > > Many thanks for your help! > > Arnaud > UCL Department of Geography, UK > > > > -- > View this message in context: http://r.789695.n4.nabble.com/vectorization-of-rolling-function-tp4700487.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >--------------------------------------------------------------------------- Jeff Newmiller The ..... ..... Go Live... DCN:<jdnewmil at dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/Batteries O.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k
Great, many thanks for your help Jeff. Apologies for the HTML format, I'll be more careful next time. Arnaud On 08/12/2014 08:25, Jeff Newmiller wrote:> Please don't post in HTML... you may not recognize it, but the > receiving end does not necessarily (and in this case did not) look > like the sending end, and the cleanup can impede answers you are > hoping to get. > > In many cases, loops can be vectorized. However, near as I can tell > this is an example of an algorithm that simply needs a loop [1]. > > One bit of advice: the coredata function is horribly slow. Just > converting your time series objects to numeric vectors for the purpose > of this computation sped up the algorithm by 500x on 10000 point > series. Converting it to inline C++ as below sped it up by yet another > factor of 40x. 20000x is nothing to sneeze at. > > ####### > ## optional temporary setup for windows > ## assumes you have installed Rtools > gcc <- "C:\\Rtools\\bin" > rtools <- "C:\\Rtools\\gcc-4.6.3\\bin" > path <- strsplit(Sys.getenv("PATH"), ";")[[1]] > new_path <- c(rtools, gcc, path) > new_path <- new_path[!duplicated(tolower(new_path))] > Sys.setenv(PATH = paste(new_path, collapse = ";")) > ## end of optional > > library(Rcpp) > > cppFunction( > "DataFrame EvapSimRcpp( NumericVector RR > , NumericVector ETmax > , const double Smax > , const double initialStorage ) { > int n = RR.size(); > > // create empty time-series to fill > // effective rainfall (i.e. rainfall minus intercepted rainfall) > NumericVector RReff( n ); > // intercepted rainfall( n ); > NumericVector Rint( n ); > // residual potential evapotranspiration (ie ETmax minus > // evaporation from interception) > NumericVector ETres( n, NA_REAL ); > double evap; > > // volume of water in interception storage at start of > // computation > double storage = initialStorage; > > for ( int i=0; i<n; i++ ) { > // compute interception capacity for time step i (maximum > // interception capacity minus any water intercepted but not > // evaporated during previous time-step). > Rint[ i ] = Smax - storage; > // compute intercepted rainfall: equal to rainfall if smaller > // than interception capacity, and to interception capacity if > // larger. > if ( RR[ i ] < Rint[ i ] ) Rint[ i ] = RR[ i ]; > // compute effective rainfall (rainfall minus intercepted > // rainfall). > RReff[ i ] = RR[ i ] - Rint[ i ]; > // update interception storage: initial interception storage + > // intercepted > // rainfall. > storage = storage + Rint[ i ]; > // compute evaporation from interception storage: equal to > // potential evapotranspiration if the latter is smaller than > // interception storage, and to interception storage if larger. > if ( storage > ETmax[ i ] ) > evap = ETmax[ i ]; > else > evap = storage; > // compute residual potentiel evapotranspiration: potential > // evapotranspiration minus evaporation from interception > // storage. > ETres[ i ] = ETmax[ i ] - evap; > // update interception storage, to be carried over to next > // time-step: interception storage minus evaporation from > // interception storage. > storage = storage - evap; > } > DataFrame DF = DataFrame::create( Named( \"int\" ) = Rint > , Named( \"RReff\" ) = RReff > , Named( \"ETres\" ) = ETres > ); > return DF; > } > ") > > # Assumes your initial variables are already defined > EvapSimRcpp( RR, ETmax, Smax, 0 ) > > ####### > > [1] > http://stackoverflow.com/questions/7153586/can-i-vectorize-a-calculation-which-depends-on-previous-elements > > On Sat, 6 Dec 2014, A Duranel wrote: > >> Hello >> I use R to run a simple model of rainfall interception by vegetation: >> rainfall falls on vegetation, some is retained by the vegetation >> (part of >> which can evaporate), the rest falls on the ground (quite crude but very >> similar to those used in SWAT or MikeSHE, for the hydrologists among >> you). >> It uses a loop on zoo time-series of rainfall and potential >> evapotranspiration. Unfortunately I did not find a way to vectorize >> it and >> it takes ages to run on long datasets. Could anybody help me to make >> it run >> faster? >> >> library(zoo) >> set.seed(1) >> # artificial potential evapotranspiration time-series >> ETmax<-zoo(runif(10, min=1, max=6), c(1:10)) >> # artificial rainfall time-series >> RR<-zoo(runif(10, min=0, max=6), c(1:10)) >> >> ## create empty time-series to fill >> # effective rainfall (i.e. rainfall minus intercepted rainfall) >> RReff<-zoo(NA, c(1:10)) >> # intercepted rainfall >> int<-zoo(NA, c(1:10)) >> # residual potential evapotranspiration (ie ETmax minus evaporation from >> interception) >> ETres<-zoo(NA, c(1:10)) >> >> # define maximum interception storage capacity (maximum volume of >> rainfall >> that can be intercepted per time step, provided the interception >> store is >> empty at start of time-step) >> Smax<-3 >> # volume of water in interception storage at start of computation >> storage<-0 >> >> for (i in 1:length(ETmax)) { >> # compute interception capacity for time step i (maximum interception >> capacity minus any water intercepted but not evaporated during previous >> time-step). >> int[i]<-Smax-storage >> # compute intercepted rainfall: equal to rainfall if smaller than >> interception capacity, and to interception capacity if larger. >> if(RR[i]<int[i]) int[i]<-RR[i] >> # compute effective rainfall (rainfall minus intercepted rainfall). >> RReff[i]<-RR[i]-int[i] >> # update interception storage: initial interception storage + >> intercepted >> rainfall. >> storage<-storage+coredata(int[i]) >> # compute evaporation from interception storage: equal to potential >> evapotranspiration if the latter is smaller than interception >> storage, and >> to interception storage if larger. >> if(storage>coredata(ETmax[i])) evap<-coredata(ETmax[i]) else >> evap<-storage >> # compute residual potentiel evapotranspiration: potential >> evapotranspiration minus evaporation from interception storage. >> ETres[i]<-ETmax[i]-evap >> # update interception storage, to be carried over to next time-step: >> interception storage minus evaporation from interception storage. >> storage<-storage-evap >> } >> >> Many thanks for your help! >> >> Arnaud >> UCL Department of Geography, UK >> >> >> >> -- >> View this message in context: >> http://r.789695.n4.nabble.com/vectorization-of-rolling-function-tp4700487.html >> Sent from the R help mailing list archive at Nabble.com. >> >> ______________________________________________ >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > --------------------------------------------------------------------------- > > Jeff Newmiller The ..... ..... Go > Live... > DCN:<jdnewmil at dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go... > Live: OO#.. Dead: OO#.. Playing > Research Engineer (Solar/Batteries O.O#. #.O#. with > /Software/Embedded Controllers) .OO#. .OO#. > rocks...1k > --------------------------------------------------------------------------- > > . >