Occasionally I run some rather trivial timings to get an idea of what might
be the best way to compute some quantities.

The program below gave timings for sums of squares of 100 elements much
greater than those for 1000, which seems surprising. Does anyone know the
cause of this?

This isn't holding up my work. Just causing some head scratching.

JN

> source("sstimer.R")
     n   t(forloop) : ratio      t(sum)   : ratio       t(crossprod)   all.equal
   100    38719.15  : 1.766851   13421.12 : 0.6124391      21914.21    TRUE
  1000    44722.71  : 20.98348    3093.94 : 1.451648        2131.33    TRUE
 10000    420149.9  : 42.10269    27341.6 : 2.739867        9979.17    TRUE
 1e+05     4070469  : 39.89473   343293.5 : 3.364625       102030.2    TRUE
 1e+06    42293696  : 33.27684    3605866 : 2.837109        1270965    TRUE
 1e+07   408123066  : 29.20882   35415106 : 2.534612       13972596    TRUE
>

# crossprod timer
library(microbenchmark)
suml<-function(vv) {
    ss<-0.0
    for (i in 1:length(vv)) {ss<-ss+vv[i]^2}
    ss
}
sums<-function(vv) {
    ss<-sum(vv^2)
    ss
}
sumc<-function(vv) {
    ss<-as.numeric(crossprod(vv))
    ss
}
ll <- c(100, 1000, 10000, 100000, 1000000, 10000000)
cat(" n \t t(forloop) : ratio \t t(sum) : ratio \t t(crossprod) \t all.equal \n")
for (nn in ll ){
    set.seed(1234)
    vv <- runif(nn)
    tsuml<-microbenchmark(sl<-suml(vv), unit="us")
    tsums<-microbenchmark(ss<-sums(vv), unit="us")
    tsumc<-microbenchmark(sc<-sumc(vv), unit="us")
    ml<-mean(tsuml$time)
    ms<-mean(tsums$time)
    mc<-mean(tsumc$time)
    cat(nn,"\t",ml," : ",ml/mc,"\t",ms," : ",ms/mc,"\t",mc,"\t",all.equal(sl, ss, sc),"\n")
}

I think R spends a little bit of time compiling the functions into byte code
the first time you call the function. The hint is that the oddity seems to go
away when running the code repeatedly.

The first timing for the first three values of ll returns this:

   100    65347.45 : 2.491943   24329.94 : 0.9277918   26223.49   TRUE
  1000    71387.79 : 11.49891    4619.35 : 0.74407      6208.22   TRUE
 10000    711115.8 : 16.82563   61732.79 : 1.460653    42263.83   TRUE

The subsequent runs return this:

   100     8349.68 : 3.125699    1821.28 : 0.6817954     2671.3   TRUE
  1000    72191.88 : 11.8359     4712.81 : 0.7726678     6099.4   TRUE
 10000    706826.9 : 15.95787    61574.9 : 1.390162    44293.32   TRUE

   100     8390.46 : 3.159023    1815.71 : 0.683618     2656.03   TRUE
  1000    93731.96 : 15.3112     4630.55 : 0.7564046    6121.79   TRUE
 10000    809345.8 : 18.75048   61903.45 : 1.434145       43164   TRUE

Peter

On Fri, Jan 21, 2022 at 5:51 PM J C Nash <profjcnash at gmail.com> wrote:

> Occasionally I run some rather trivial timings to get an idea of what might
> be the best way to compute some quantities.
>
> The program below gave timings for sums of squares of 100 elements much
> greater than those for 1000, which seems surprising. Does anyone know the
> cause of this?
>
> This isn't holding up my work. Just causing some head scratching.
>
> JN
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

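If the first-call compilation cost suggested in the reply above is indeed the cause, one way to keep it out of the measurements is to byte-compile the functions before timing them. The following is only a minimal sketch under that assumption: the `*_c` names are hypothetical, and reporting medians in place of the script's means is a swapped-in choice, not part of the original program. The timing step runs only if microbenchmark is installed.

```r
library(compiler)  # part of the standard R distribution

# Same three computations as in the original script
suml <- function(vv) {
  ss <- 0.0
  for (i in seq_along(vv)) ss <- ss + vv[i]^2
  ss
}
sums <- function(vv) sum(vv^2)
sumc <- function(vv) as.numeric(crossprod(vv))

# Byte-compile up front so no JIT compilation lands inside a timed call.
# The *_c names are hypothetical, chosen only for this sketch.
suml_c <- cmpfun(suml)
sums_c <- cmpfun(sums)
sumc_c <- cmpfun(sumc)

set.seed(1234)
vv <- runif(100)

# all.equal() compares two objects at a time, so check the results pairwise
ok <- isTRUE(all.equal(suml_c(vv), sums_c(vv))) &&
      isTRUE(all.equal(sums_c(vv), sumc_c(vv)))

# Time only if microbenchmark is available; report medians in microseconds
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  tm <- microbenchmark::microbenchmark(suml_c(vv), sums_c(vv), sumc_c(vv))
  print(summary(tm, unit = "us")[, c("expr", "median")])
}
```

The pairwise check is deliberate: in a call such as `all.equal(sl, ss, sc)`, the third positional argument is matched to `tolerance` by the numeric method, so `sc` is not actually compared with the other two values.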
Just to add a bit more, stripping out most of your test shows that there is
one iteration (the 2nd one) that takes a lot longer than the others because
the sums() function gets bytecode compiled.

library(microbenchmark)

sums <- function(vv) {
  ss <- sum(vv^2)
  ss
}

sums2 <- compiler::cmpfun(sums)

x <- runif(100)

head(as.data.frame(microbenchmark(sums(x), sums2(x))))
      expr    time
1  sums(x)   29455
2  sums(x) 3683091
3 sums2(x)    7108
4  sums(x)    4305
5  sums(x)    2733
6  sums(x)    2797

The paragraph on JIT in the details of ?compiler::compile explains that this
is the default behavior.

Steve

On Fri, 21 Jan 2022 at 20:51, J C Nash <profjcnash at gmail.com> wrote:
>
> Occasionally I run some rather trivial timings to get an idea of what might
> be the best way to compute some quantities.
>
> The program below gave timings for sums of squares of 100 elements much greater
> than those for 1000, which seems surprising. Does anyone know the cause of this?
>
> This isn't holding up my work. Just causing some head scratching.
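To see how much a single compilation step can distort a mean, here is a small worked example using only the five sums(x) times from the head() output in the message above (microbenchmark records its time column in nanoseconds). It covers just those five rows, not the full benchmark, and the variable name `t_sums` is made up for illustration.

```r
# Times (in nanoseconds) for the five sums(x) rows in the head() output
t_sums <- c(29455, 3683091, 4305, 2733, 2797)

mean(t_sums)    # 744476.2, dominated by the one slow iteration
median(t_sums)  # 4305
```

The original script averages `$time` with mean(), so one slow iteration of this kind can inflate the reported figure; a median (or discarding the first few iterations) is less sensitive to it.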