I noticed Doug only circulated the following on S-news, but it may be of
interest to R users who don't follow S-news. His findings are certainly
consistent with my own; my problems often force me into the
element-by-element type of situation.
Paul Gilbert
_______
At the risk of beating this example to death, I went back and compared
the execution time for the element-by-element method with that of the
indexing method. At vector sizes of 50000, the indexing method is
close to 500 times faster.
S-PLUS : Copyright (c) 1988, 1996 MathSoft, Inc.
S : Copyright AT&T.
Version 3.4 Release 1 for Sun SPARC, SunOS 5.3 : 1996
Working data will be in .Data
> x <- rnorm(50000)
> y <- rnorm(50000)
> op <- sample(c("*", "+", "-", "/"), 50000, repl = T)
> foo <-
+ function(x, y, op)
+ {
+     z <- x
+     for(i in seq(along = op))
+         z[i] <- do.call(op[i], list(x[i], y[i]))
+     z
+ }
> unix.time(res1 <- cbind(x*y, x-y, x+y, x/y)[cbind(seq(along = x), match(op, c("*", "-", "+", "/")))])
[1] 0.40000010 0.04999995 0.00000000 0.00000000 0.00000000
> length(res1)
[1] 50000
> unix.time(res2 <- foo(x, y, op))
[1] 196.23 0.29 199.00 0.00 0.00
> length(res2)
[1] 50000
> all.equal(res1, res2)
[1] T
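The indexing version relies on the fact that S and R accept a two-column
matrix of (row, column) pairs as a subscript, extracting one element per
row. A minimal sketch of the trick on toy data (the names vals and cols
are illustrative, not from the session above):

    # Two-column matrix indexing: each row of the index matrix is a
    # (row, column) pair, so exactly one element is picked per row.
    x  <- c(2, 3, 4)
    y  <- c(5, 6, 7)
    op <- c("+", "*", "-")
    vals <- cbind(x * y, x - y, x + y, x / y)   # all four results, column-wise
    cols <- match(op, c("*", "-", "+", "/"))    # which column each element needs
    vals[cbind(seq(along = x), cols)]           # c(2+5, 3*6, 4-7) = c(7, 18, -3)

All four elementwise results are computed up front, but each is a single
fast vectorized operation, so this still wins by a wide margin over
dispatching one operator call per element.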
It is interesting that the difference in execution speed is much
smaller in R than in S-PLUS. On this machine (a Sun UltraSPARC 1
running Solaris 2.5.1), R is somewhat slower than S-PLUS for the
indexing version but close to 10 times faster for the
element-by-element version:
R> unix.time(res1 <- cbind(x*y, x-y, x+y, x/y)[cbind(seq(along = x), match(op, c("*", "-", "+", "/")))])
[1] 1.55 0.07 2.00 0.00 0.00
R> unix.time(res2 <- foo(x, y, op))
[1] 23.96 0.05 24.00 0.00 0.00
R> all(res1 == res2)
[1] TRUE
--
Douglas Bates                        bates@stat.wisc.edu
Statistics Department                608/262-2598
University of Wisconsin - Madison    http://www.stat.wisc.edu/~bates/
From r-devel-owner@stat.math.ethz.ch Tue Nov 11 05:58 NZD 1997
Date: Mon, 10 Nov 1997 11:56:06 -0500
From: Paul Gilbert <pgilbert@bank-banque-canada.ca>
S> unix.time(res1 <- cbind(x*y, x-y, x+y, x/y)[cbind(seq(along = x), match(op, c("*", "-", "+", "/")))])
[1] 0.40000010 0.04999995 0.00000000 0.00000000 0.00000000
R> unix.time(res1 <- cbind(x*y, x-y, x+y, x/y)[cbind(seq(along = x), match(op, c("*", "-", "+", "/")))])
[1] 1.55 0.07 2.00 0.00 0.00
We have received a patch from someone in CS at Berkeley which will
be in 0.60 and should claw back a decent fraction of the difference
between R and S in this comparison. [It eliminates our use of the
% operator in the loops for +, -, *, /, etc.]
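To see why the % operator matters, here is a rough R-level sketch of
operand recycling (illustrative only; the actual arithmetic loops are C
code inside the interpreter): recycling the shorter operand needs an
index that wraps around, and a modulo per element is exactly the kind of
per-iteration cost the patch removes.

    # Hypothetical sketch of recycling via modulo indexing; the %% here
    # mimics the % operator in the old C loops by wrapping the index of
    # the shorter operand around.
    add.with.recycling <- function(x, y)
    {
        n <- max(length(x), length(y))
        z <- numeric(n)
        for(i in 1:n)
            z[i] <- x[(i - 1) %% length(x) + 1] + y[(i - 1) %% length(y) + 1]
        z
    }
    add.with.recycling(1:6, 1:2)   # same result as 1:6 + 1:2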
The looping speed was an original design goal for R. It's good to
know that it hasn't been fixed (yet).
Ross
On Mon, 10 Nov 1997, Paul Gilbert wrote:

> I noticed Doug only circulated the following on S-news, but it may be of
> interest to R users who don't follow S-news. His findings are certainly
> consistent with my own; my problems often force me into the
> element-by-element type of situation.

It is also interesting to look at the allegedly realistic if extreme
benchmark program provided with Matt Calder's partial S compiler.

LMS <- function(M, N) {
    ### Pre-allocate result and filter. ###
    R <- matrix(0, nrow = M, ncol = 5)
    W <- rep(0, 5)
    for (i in 1:M) {
        ### Simulate MA(1) ###
        Z <- rnorm(N + 1)
        X <- Z[2:(N + 1)] + 0.5 * Z[1:N]
        ### Perform LMS ###
        for (j in 5:N) {
            U <- X[j:(j - 4)]
            E <- Z[j + 1] - sum(W * U)
            W <- W + 0.1 * E * U
        }
        ### Save final result ###
        R[i, ] <- W
    }
    return(R)
}

I used this in S, in R, and compiled to C on an Ultra 1 (Solaris), and
in R and compiled to C on a Pentium Pro 150 under Linux. The results
are quite interesting. These are CPU times in seconds for a fixed value
of N (100, I believe, but it was some time ago).

           Ultra                       Intel
   M       R        C        S         R        C
 100     7.24     2.08     11        6.38     0.79
 200    14.19     4.06     25.8     12.36     1.56
 300    21.59     6.3      56.8     18.89     2.31
 400    28.76     8.37     89       24.97     3.09
 800    57.62    16.42    295       50.33     6.16

The time taken by R scales roughly linearly with M; S time grows much
faster. The two are comparable for small M. It's also interesting that
the Intel machine was slightly faster, though this doesn't translate to
disk- and memory-intensive tasks.

Thomas Lumley
------------------------------------------------------+------
Biostatistics            : "Never attribute to malice what      :
Uni of Washington        :  can be adequately explained by      :
Box 357232               :  incompetence" - Hanlon's Razor      :
Seattle WA 98195-7232    :                                      :
------------------------------------------------------------
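A minimal driver for the benchmark, assuming N = 100 as suggested above
(the exact calls used to produce the table are not shown in the post):

    # Hypothetical timing loop over the values of M from the table;
    # N = 100 is an assumption taken from the text.
    for(M in c(100, 200, 300, 400, 800))
        print(unix.time(LMS(M, 100)))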