I noticed Doug only circulated the following on S-news, but it may be of
interest to R users who don't follow S-news. His findings are certainly
consistent with my own, where my problems often force me into the
element-by-element type of situation.

Paul Gilbert
_______

At the risk of beating this example to death, I went back and compared the
execution time for the element-by-element method with that of the indexing
method. At vector sizes of 50000 the indexing method is close to 500 times
faster.

S-PLUS : Copyright (c) 1988, 1996 MathSoft, Inc.
S : Copyright AT&T.  Version 3.4 Release 1 for Sun SPARC, SunOS 5.3 : 1996
Working data will be in .Data
> x <- rnorm(50000)
> y <- rnorm(50000)
> op <- sample(c("*", "+", "-", "/"), 50000, repl = T)
> foo <-
+ function(x, y, op)
+ {
+     z <- x
+     for(i in seq(along = op))
+         z[i] <- do.call(op[i], list(x[i], y[i]))
+     z
+ }
> unix.time(res1 <- cbind(x*y, x-y, x+y, x/y)[cbind(seq(along = x), match(op, c("*", "-", "+", "/")))])
[1] 0.40000010 0.04999995 0.00000000 0.00000000 0.00000000
> length(res1)
[1] 50000
> unix.time(res2 <- foo(x, y, op))
[1] 196.23   0.29 199.00   0.00   0.00
> length(res2)
[1] 50000
> all.equal(res1, res2)
[1] T

It is interesting that the difference in execution speed when using R is
much less than when using S-PLUS.
On this machine (Sun UltraSparc 1 running Solaris 2.5.1) R is somewhat
slower than S-PLUS for the indexing version, but close to 10 times faster
for the element-by-element version.

R> unix.time(res1 <- cbind(x*y, x-y, x+y, x/y)[cbind(seq(along = x), match(op, c("*", "-", "+", "/")))])
[1] 1.55 0.07 2.00 0.00 0.00
R> unix.time(res2 <- foo(x, y, op))
[1] 23.96  0.05 24.00  0.00  0.00
R> all(res1 == res2)
[1] TRUE

--
Douglas Bates                        bates@stat.wisc.edu
Statistics Department                608/262-2598
University of Wisconsin - Madison    http://www.stat.wisc.edu/~bates/

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
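[Editor's note: a minimal sketch of the indexing trick used above, run on a
tiny vector so the mechanism is visible. The names xx, yy, ops, and all4 are
illustrative additions, not from the original post.]

```r
## Small inputs so every step can be checked by eye.
xx  <- c(1, 2, 3, 4)
yy  <- c(10, 20, 30, 40)
ops <- c("*", "-", "+", "/")

## Compute all four candidate results as columns of a matrix, then index
## with a two-column (row, column) matrix: row i paired with the column
## matching ops[i] picks exactly one value per element.
all4 <- cbind(xx * yy, xx - yy, xx + yy, xx / yy)
res  <- all4[cbind(seq(along = xx), match(ops, c("*", "-", "+", "/")))]
res   # 1*10, 2-20, 3+30, 4/40  ->  10.0 -18.0  33.0   0.1
```

The vectorized form does 4n arithmetic operations and throws three quarters
of them away, yet it wins by orders of magnitude because it avoids n trips
through the interpreter's per-element dispatch.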
From r-devel-owner@stat.math.ethz.ch  Tue Nov 11 05:58 NZD 1997
Date: Mon, 10 Nov 1997 11:56:06 -0500
From: Paul Gilbert <pgilbert@bank-banque-canada.ca>

S> unix.time(res1 <- cbind(x*y, x-y, x+y, x/y)[cbind(seq(along = x), match(op, c("*", "-", "+", "/")))])
[1] 0.40000010 0.04999995 0.00000000 0.00000000 0.00000000
R> unix.time(res1 <- cbind(x*y, x-y, x+y, x/y)[cbind(seq(along = x), match(op, c("*", "-", "+", "/")))])
[1] 1.55 0.07 2.00 0.00 0.00

We have received a patch from someone in CS at Berkeley which will be in
0.60 and should claw back a decent fraction of the difference between R and
S in this comparison. [It eliminates our use of the % operator in the loops
for +, -, *, /, etc.]

The looping speed was an original design goal for R. It's good to know that
it hasn't been fixed (yet).

Ross
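[Editor's note: why a % (modulo) in the inner loop matters. R's elementwise
arithmetic recycles the shorter operand, and a natural C implementation
indexes the shorter vector with i % n on every iteration, paying an integer
division per element even when the lengths match; avoiding that in the
common equal-length case is a plausible reading of the patch Ross mentions,
though the thread does not spell it out. Recycling itself is easy to see at
the R level:]

```r
## The shorter operand is reused cyclically: element i of the result pairs
## position i of the long vector with position ((i - 1) %% 2) + 1 of the
## short one.
c(1, 2, 3, 4, 5, 6) + c(10, 20)
## [1] 11 22 13 24 15 26
```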
On Mon, 10 Nov 1997, Paul Gilbert wrote:

> I noticed Doug only circulated the following on S-news, but it may be of
> interest to R users who don't follow S-news. His findings are certainly
> consistent with my own, where my problems often force me into the
> element by element type of situation.

It is also interesting to look at the allegedly realistic if extreme
benchmark program provided with Matt Calder's partial S compiler.

LMS <- function(M, N) {
    ### Pre-allocate result and filter. ###
    R <- matrix(0, nrow = M, ncol = 5)
    W <- rep(0, 5)
    for (i in 1:M) {
        ### Simulate MA(1) ###
        Z <- rnorm(N + 1)
        X <- Z[2:(N + 1)] + 0.5 * Z[1:N]
        ### Perform LMS ###
        for (j in 5:N) {
            U <- X[j:(j - 4)]
            E <- Z[j + 1] - sum(W * U)
            W <- W + 0.1 * E * U
        }
        ### Save final result ###
        R[i, ] <- W
    }
    return(R)
}

I used this in S, R, and compiled to C on an Ultra 1 (Solaris), and in R
and compiled to C on a Pentium Pro 150 under Linux. The results are quite
interesting. These are CPU times in seconds for a fixed value of N (100, I
believe, but it was some time ago).

          --------- Ultra ---------    ------ Intel ------
   M         R        C        S          R        C
 100       7.24     2.08     11         6.38     0.79
 200      14.19     4.06     25.8      12.36     1.56
 300      21.59     6.3      56.8      18.89     2.31
 400      28.76     8.37     89        24.97     3.09
 800      57.62    16.42    295        50.33     6.16

The time taken by R scales roughly linearly with M; S time grows much
faster. The two are comparable for small M. It's also interesting that the
Intel machine was slightly faster, though this doesn't translate to disk-
and memory-intensive tasks.

Thomas Lumley
------------------------------------------------------+------
Biostatistics                : "Never attribute to malice what  :
Uni of Washington            : can be adequately explained by   :
Box 357232                   : incompetence" - Hanlon's Razor   :
Seattle WA 98195-7232        :                                  :
------------------------------------------------------------
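[Editor's note: Thomas's benchmark can be reproduced directly; here is a
minimal sketch. LMS is repeated verbatim from his message so the snippet
runs on its own; set.seed() and the small sizes are additions for a quick,
reproducible check, not part of the original timings.]

```r
## LMS benchmark, repeated from the message above: for each of M
## replications, simulate an MA(1) series and fit a 5-tap adaptive filter
## by least-mean-squares, saving the final filter weights.
LMS <- function(M, N) {
    R <- matrix(0, nrow = M, ncol = 5)    # pre-allocate result
    W <- rep(0, 5)                        # 5-tap filter
    for (i in 1:M) {
        Z <- rnorm(N + 1)                 # simulate MA(1)
        X <- Z[2:(N + 1)] + 0.5 * Z[1:N]
        for (j in 5:N) {                  # LMS update
            U <- X[j:(j - 4)]
            E <- Z[j + 1] - sum(W * U)
            W <- W + 0.1 * E * U
        }
        R[i, ] <- W                       # save final weights
    }
    return(R)
}

set.seed(1)           # reproducibility only; not in the original
out <- LMS(10, 50)    # small sizes for a quick check
dim(out)              # 10 x 5: one fitted filter per replication
```

The inner j-loop is the part an interpreter cannot vectorize away, since
each weight update depends on the previous one, which is why this benchmark
isolates raw looping speed so well.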