I think this is an interesting discussion -- I've learned from both
Steve's and Brian's comments, and I'm broadening it to R-help
since I think others will be interested as well.
The problem up for comment is:
result <- apply(array.3D, 1:2, sum)
where array.3D is 3000 by 300 by 3.
The original poster already had a perfectly good replacement for
this problem that was virtually instantaneous. A solution for this
particular problem is not the issue; it is merely the starting point
for cases where there wouldn't be a trivial workaround.
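The thread doesn't show that workaround, but for a 3-slice array an
obvious vectorized version (my illustration, not necessarily the
original poster's code) would be:

result <- array.3D[, , 1] + array.3D[, , 2] + array.3D[, , 3]
## or equivalently, summing over the third dimension:
result <- rowSums(array.3D, dims = 2)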
Steve Karmesin wrote:
SK> As others have said, what apply has to do in this case is loop
SK> over the 900,000 cases and do a 'sum' over three elements each
SK> time. In this case the overhead of calling an S+ function totally
SK> swamps the numeric operations.
SK>
SK> Doing this on smaller datasets (300x30x3) on my machine (2CPU,
SK> 3GHz Xeon running Windows 2000 and S-Plus 6.1) shows an overhead
SK> of about 140 microseconds per call to sum, so I would expect it to
SK> take 100*1e-6*9e5=90 seconds.
SK>
SK> The thing is, it is worse than this. If I do a case with 900x90x3
SK> it takes 300 usec per 'sum'.
SK>
SK> R is fairly stable at just under 15usec per 'sum' on my machine.
SK>
SK> A little more investigation (together with office mate Tony
SK> Plate) provides some insight.
SK>
SK> Using mem.tally.reset() and mem.tally.report() shows that for
SK> this case it is allocating a whopping 1280 bytes for each call to
SK> 'sum'.
SK>
SK> Just touching that much memory is going to be slow. So why would
SK> it do that? Looking at the definition of the apply function shows
SK> that it is allocating a general list for the result, not a
SK> vector-based array or matrix.
SK>
SK> Why? It has a shortcut that lets it use efficient matrices if the
SK> input is a 2D matrix, but this one is 3D, so it uses the general
SK> code, which is much, much slower and uses a lot more memory.
SK>
SK> If you collapse the first two dimensions of the array the times
SK> are stable at <80usec per call to sum and it allocates 8 bytes per
SK> call, which is just the amount of space needed.
SK>
SK> Still, the R code seems to always build a list, and it is about
SK> 15usec per call. Somehow the underlying function call and perhaps
SK> list storage mechanisms are more efficient there.
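For reference, "collapsing the first two dimensions" can be done by
reshaping before the call to apply -- a sketch of the idea only, not
Steve's actual code:

a2 <- array.3D
dim(a2) <- c(3000 * 300, 3)          # treat rows and columns as one dimension
flat <- apply(a2, 1, sum)            # apply now works on a 900,000 by 3 matrix
result <- matrix(flat, 3000, 300)    # restore the 3000 by 300 shape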
Prof Brian Ripley wrote:
BR> There are almost always pros and cons with these issues. S's
BR> sum() is an S4 generic whereas R's is internal *unless* you define
BR> an S4 method for it (which S-PLUS has already done). S needs to
BR> create several frames for what is a nested set of function calls
BR> -- 1280b looks modest for that.
BR>
BR> Also, S has an ability to back out calculations that R does not, and that
BR> costs memory (and can have benefits).
BR>
BR> We know there are overheads in making functions generic, especially
BR> S4-generic, but then there are benefits too. I am not sure designers who
BR> add features take enough account of the costs.
Using R 1.8.1 (precompiled) on SuSE Linux with a 2.4GHz Xeon and 1GB
of memory:
set.seed(2)
jja <- array(rnorm(3000*300*3), c(3000, 300, 3))

## internal (non-generic) sum
gc()
system.time(jjsa <- apply(jja, 1:2, sum))      # takes 30 seconds

## S3 generic wrapping sum
sumS3 <- function(x, ...) UseMethod("sumS3")
sumS3.default <- function(x, ...) sum(x, ...)
gc()
system.time(jjsa3 <- apply(jja, 1:2, sumS3))   # takes 65 seconds

## S4 generic wrapping sum
setGeneric("sumS4", function(x, ...) standardGeneric("sumS4"))
setMethod("sumS4", signature(x = "numeric"), function(x, ...) sum(x, ...))
gc()
system.time(jjsa4 <- apply(jja, 1:2, sumS4))   # takes 58 seconds
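As a sanity check (not part of the timings above), the three versions
should agree exactly:

all.equal(jjsa, jjsa3)    # TRUE
all.equal(jjsa, jjsa4)    # TRUE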
Questions:
It looks to me like the penalty for making the functions generic is
similar to one extra function call. Does the penalty grow as there
are more methods? Are there other types of penalties for making
a function generic?
Is the test with sumS4 still an unfair comparison with S-PLUS?
Are things better with S-PLUS 6.2?
Patrick Burns
Burns Statistics
patrick at burns-stat.com
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")