thr3ads.net - R devel - [Rd] reason for odd timings [Jan 2022]

J C Nash

2022-Jan-22 01:51 UTC

[Rd] reason for odd timings

Occasionally I run some rather trivial timings to get an idea of what might
be the best way to compute some quantities.

The program below gave timings for sums of squares of 100 elements much greater
than those for 1000, which seems surprising. Does anyone know the cause of this?

This isn't holding up my work. Just causing some head scratching.

JN
> source("sstimer.R")
 n  	  t(forloop) : ratio 	  t(sum) : ratio 	 t(crossprod) 	 all.equal
100 	 38719.15  :  1.766851 	 13421.12  :  0.6124391 	 21914.21 	 TRUE
1000 	 44722.71  :  20.98348 	 3093.94  :  1.451648 	 2131.33 	 TRUE
10000 	 420149.9  :  42.10269 	 27341.6  :  2.739867 	 9979.17 	 TRUE
1e+05 	 4070469  :  39.89473 	 343293.5  :  3.364625 	 102030.2 	 TRUE
1e+06 	 42293696  :  33.27684 	 3605866  :  2.837109 	 1270965 	 TRUE
1e+07 	 408123066  :  29.20882 	 35415106  :  2.534612 	 13972596 	 TRUE
>
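# crossprod timer
library(microbenchmark)
# sum of squares with an explicit loop
suml<-function(vv) {
    ss<-0.0
    for (i in seq_along(vv)) {ss<-ss+vv[i]^2}
    ss
}
# sum of squares with vectorized sum()
sums<-function(vv) {
  ss<-sum(vv^2)
  ss
}
# sum of squares via crossprod()
sumc<-function(vv) {
  ss<-as.numeric(crossprod(vv))
  ss
}
ll <- c(100, 1000, 10000, 100000, 1000000, 10000000)
cat(" n  \t  t(forloop) : ratio \t  t(sum) : ratio \t t(crossprod) \t all.equal \n")
for (nn in ll ){
   set.seed(1234)
   vv <- runif(nn)
   tsuml<-microbenchmark(sl<-suml(vv), unit="us")
   tsums<-microbenchmark(ss<-sums(vv), unit="us")
   tsumc<-microbenchmark(sc<-sumc(vv), unit="us")
   # $time is recorded in nanoseconds; unit= only affects print() and summary()
   ml<-mean(tsuml$time)
   ms<-mean(tsums$time)
   mc<-mean(tsumc$time)
   # all.equal() compares two objects; a third positional argument would be
   # taken as the tolerance, so compare pairwise
   ok<-isTRUE(all.equal(sl, ss)) && isTRUE(all.equal(sl, sc))
   cat(nn,"\t",ml," : ",ml/mc,"\t",ms," : ",ms/mc,"\t",mc,"\t",ok,"\n")
}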

Peter Langfelder

2022-Jan-22 04:27 UTC

I think R spends a little bit of time compiling the functions into byte
code the first time you call the function. The hint is that the oddity
seems to go away when running the code repeatedly. The first timing for the
first three values of ll returns this:

100 65347.45  :  2.491943 24329.94  :  0.9277918 26223.49 TRUE
1000 71387.79  :  11.49891 4619.35  :  0.74407 6208.22 TRUE
10000 711115.8  :  16.82563 61732.79  :  1.460653 42263.83 TRUE

the subsequent runs return this:

100 8349.68  :  3.125699 1821.28  :  0.6817954 2671.3 TRUE
1000 72191.88  :  11.8359 4712.81  :  0.7726678 6099.4 TRUE
10000 706826.9  :  15.95787 61574.9  :  1.390162 44293.32 TRUE

100 8390.46  :  3.159023 1815.71  :  0.683618 2656.03 TRUE
1000 93731.96  :  15.3112 4630.55  :  0.7564046 6121.79 TRUE
10000 809345.8  :  18.75048 61903.45  :  1.434145 43164 TRUE

Peter
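
If the one-off compile is what inflates the n = 100 row, one way to keep it out of the measurement is to compile or warm up each function before timing it. What follows is only a minimal sketch, assuming the default JIT settings; the names sums_c and tb are made up for illustration and are not part of the original script.

```r
library(compiler)        # ships with R; provides cmpfun()
library(microbenchmark)

sums <- function(vv) {
  ss <- sum(vv^2)
  ss
}

# Option 1: byte-compile explicitly, so the JIT has nothing left to do.
sums_c <- cmpfun(sums)

# Option 2: call the plain closure a few times before timing, so that any
# JIT compilation happens outside the benchmark.
set.seed(1234)
vv <- runif(100)
for (k in 1:5) sums(vv)

tb <- microbenchmark(sums(vv), sums_c(vv), times = 100L)

# Summarise per expression with the median, which is less sensitive than
# the mean to a single slow iteration. tb$time is in nanoseconds.
print(tapply(tb$time, tb$expr, median))
```

With the warm-up in place, the two expressions would be expected to give similar medians, although the actual numbers depend on the machine and the R version.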


Steve Martin

2022-Jan-22 04:38 UTC

Just to add a bit more, stripping out most of your test shows that
there is one iteration (the 2nd one) that takes a lot longer than the
others because the sums() function gets bytecode compiled.

library(microbenchmark)

sums <- function(vv) {
  ss <- sum(vv^2)
  ss
}

sums2 <- compiler::cmpfun(sums)

x <- runif(100)

head(as.data.frame(microbenchmark(sums(x), sums2(x))))
          expr    time
1  sums(x)   29455
2  sums(x) 3683091
3 sums2(x)    7108
4  sums(x)    4305
5  sums(x)    2733
6  sums(x)    2797
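
To see how much a single slow iteration can move the mean that the original script reports, here is a small worked example using only the five sums(x) timings printed above (the time column is in nanoseconds). The name times_ns is made up for illustration.

```r
# The five sums(x) timings (nanoseconds) from the output above.
times_ns <- c(29455, 3683091, 4305, 2733, 2797)

mean(times_ns)    # 744476.2, dominated by the one slow call
median(times_ns)  # 4305
```

For n = 100 the actual work is so small that one compile step can outweigh many ordinary calls in the mean, while for larger n the same one-off cost is a much smaller share of the total.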

The paragraph on JIT in the details of ?compiler::compile explains
that this is the default behavior.

Steve
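
A rough way to check that the JIT is responsible is to switch it off for the duration of a test and look at the first few timings again. This is only a sketch, assuming a default R session; old_level and res are names made up here.

```r
library(compiler)
library(microbenchmark)

# enableJIT() returns the previous JIT level, so it can be restored later.
old_level <- enableJIT(0)   # 0 switches the JIT off

sums <- function(vv) {
  ss <- sum(vv^2)
  ss
}
x <- runif(100)

# With the JIT off, no call to sums() should trigger a compile, so the
# slow second iteration seen above would not be expected here.
res <- as.data.frame(microbenchmark(sums(x)))
print(head(res))

# Restore the previous setting.
enableJIT(old_level)
```

Other sources of a slow early iteration (garbage collection, for instance) remain possible, so this narrows down the cause rather than proving it.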

