I noticed that apply is VERY SLOW when applied to a "large" dimension, for
example when computing the row sums of a matrix with thousands of rows.

To demonstrate this, I did some benchmarking of different methods of
computing the row sums of an n x 10 matrix with n = 2000, ..., 10000.

The first method (M1) I used is the normal apply command:
  y <- apply(x, 1, sum)
The second method (M2) uses a for-loop for the computations, where the
memory for the resulting vector has been allocated before. That is, for
n = 2000:
  z <- numeric(2000); for (i in 1:2000) z[i] <- sum(x[i,])
The third method (M3) also uses a for-loop, but the resulting vector is
grown element by element, i.e.
  z1 <- NULL; for (i in 1:2000) z1 <- c(z1, sum(x[i,]))

All computations were made on a Pentium II 233 MHz with 256 MB RAM, R
started as R -v 50. The following table shows the minimum, mean, and
maximum CPU time in seconds, as measured by system.time over 10 runs of
every computation, for different values of n.

      n              M1                  M2                  M3
   2000    4.03   4.16   4.34   0.27  0.40  0.47   0.51  0.63  0.71
   4000   12.65  13.40  14.68   0.73  0.81  0.94   1.78  1.86  1.98
   6000   26.51  28.14  29.50   1.19  1.22  1.38   3.79  3.80  3.80
   8000   52.06  63.43  67.61   1.46  1.63  1.69   6.38  6.41  6.58
  10000   84.06  98.17 118.94   1.93  2.01  2.13   9.78  9.79  9.81

That is, computing the row sums of a 10000 x 10 matrix with apply takes
about 100 sec on average, whereas a simple for-loop does the same job in
about 2 sec.

The next table shows, for every method, the increase in mean time relative
to the n = 2000 case, compared to the increase in the number of rows.

  n/2000      M1      M2      M3
       1       1       1       1
       2   3.221   2.025   2.952
       3   6.764   3.050   6.032
       4  15.247   4.075  10.175
       5  23.599   5.025  15.540

You can see that the time for M2 (for-loop with memory allocated at the
beginning) increases linearly with n, as is to be expected, whereas apply
(M1) scales roughly quadratically with the number of rows. M3 (for-loop,
but with new memory allocated on every iteration) is somewhere in between.
So, obviously, R becomes slow when new memory has to be allocated
repeatedly.

I do not see a quick fix for apply, because apply can be used with
arbitrary functions, which might yield results of different lengths for
different input data. But as long as memory allocation is a performance
bottleneck, one should be aware of this problem, use simple for-loops, and
use apply only on small data (small dimensions) or where it is really
necessary.

Andreas

************************************************************************
* Andreas Weingessel                                                   *
************************************************************************
* Institut für Statistik      * Tel: (+43 1) 58801 4541                *
* Technische Universität Wien * Fax: (+43 1) 504 14 98                 *
* Wiedner Hauptstr. 8-10/1071 * Andreas.Weingessel@ci.tuwien.ac.at     *
* A-1040 Wien, Austria        * http://www.ci.tuwien.ac.at/~weingessel *
************************************************************************
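For readers who want to reproduce the comparison, here is a minimal sketch
of the kind of timing harness described above. The random test matrix and
the single timed run per method are illustrative choices of mine, not the
original benchmark script.

  ## Minimal sketch of the benchmark described above; the test matrix is
  ## random and only one run per method is timed here (illustrative only).
  n <- 2000
  x <- matrix(rnorm(n * 10), nrow = n, ncol = 10)

  ## M1: apply over rows
  t1 <- system.time(y <- apply(x, 1, sum))

  ## M2: for-loop with the result vector preallocated
  t2 <- system.time({
    z <- numeric(n)
    for (i in 1:n) z[i] <- sum(x[i, ])
  })

  ## M3: for-loop growing the result vector one element at a time
  t3 <- system.time({
    z1 <- NULL
    for (i in 1:n) z1 <- c(z1, sum(x[i, ]))
  })

  rbind(M1 = t1, M2 = t2, M3 = t3)   # compare CPU times side by side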
Andreas Weingessel <Andreas.Weingessel@ci.tuwien.ac.at> writes:

> I noticed that apply is VERY SLOW when applied to a "large" dimension,
> for example when computing the row sums of a matrix with thousands of
> rows.
>
> To demonstrate this, I did some benchmarking of different methods of
> computing the row sums of an n x 10 matrix with n = 2000, ..., 10000.
>
> The first method (M1) I used is the normal apply command:
>   y <- apply(x, 1, sum)
> The second method (M2) uses a for-loop for the computations, where the
> memory for the resulting vector has been allocated before. That is, for
> n = 2000:
>   z <- numeric(2000); for (i in 1:2000) z[i] <- sum(x[i,])
> The third method (M3) also uses a for-loop, but the resulting vector is
> grown element by element, i.e.
>   z1 <- NULL; for (i in 1:2000) z1 <- c(z1, sum(x[i,]))
>
> All computations were made on a Pentium II 233 MHz with 256 MB RAM, R
> started as R -v 50. The following table shows the minimum, mean, and
> maximum CPU time in seconds, as measured by system.time over 10 runs of
> every computation, for different values of n.
>
>       n              M1                  M2                  M3
>    2000    4.03   4.16   4.34   0.27  0.40  0.47   0.51  0.63  0.71
>    4000   12.65  13.40  14.68   0.73  0.81  0.94   1.78  1.86  1.98
>    6000   26.51  28.14  29.50   1.19  1.22  1.38   3.79  3.80  3.80
>    8000   52.06  63.43  67.61   1.46  1.63  1.69   6.38  6.41  6.58
>   10000   84.06  98.17 118.94   1.93  2.01  2.13   9.78  9.79  9.81
>
> That is, computing the row sums of a 10000 x 10 matrix with apply takes
> about 100 sec on average, whereas a simple for-loop does the same job in
> about 2 sec.

There are at least two other methods that would be interesting to try.
Because you are interested in the row sums, I imagine by far the fastest
method would be
  (M4)  y <- x %*% rep(1, ncol(x))
You may also find that
  (M5)  y <- apply(t(x), 2, sum)
is faster than M1, because of the way R handles arrays.

We should keep in mind when looking at these tables that the maximum time
in the size-10000 case is about 2 minutes. If the computation is worth
doing, it may be worth waiting 2 minutes for the result.

I remember the days of running S version 2 (i.e. the version before
"New S") on a Vax-11/750. This sort of computation could take many, many
hours on the only computer in the department, so relative differences in
speed between methods were a lot more important. One got used to
rephrasing computations in "efficient" ways. Today I think that clarity
is usually more important than efficiency.
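A short sketch of the two suggestions above, checking that they reproduce
the plain apply result; the random test matrix is an illustrative choice,
not data from the thread.

  ## Sketch of the two alternatives suggested above (M4 and M5) on an
  ## illustrative random matrix; both should agree with apply(x, 1, sum).
  n <- 10000
  x <- matrix(rnorm(n * 10), nrow = n, ncol = 10)

  ## M4: row sums as a matrix product with a vector of ones
  y4 <- as.vector(x %*% rep(1, ncol(x)))

  ## M5: transpose first, then sum over columns
  y5 <- apply(t(x), 2, sum)

  all.equal(y4, apply(x, 1, sum))   # TRUE
  all.equal(y5, apply(x, 1, sum))   # TRUE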
I was computing a function f(u,v) on bivariate point process data of about
4500 points. The computation in R took about 96 hours (4 days!) on a
Pentium Pro 200 under Linux with little or no other load. So I think speed
can be an issue, especially when computing things one will eventually
display with image().

My collaborator and I wound up coding it in C, and now the computation
takes 4 minutes (yes, 4 minutes vs 4 days). It could probably have been
sped up in R, but that would have taken the same effort as coding it in C,
and the result would still have been slower.

Bill Simpson
Maybe my point was not quite clear. I did not want to compute the row sums
of one particular matrix; I just wanted to demonstrate that the task
"apply a certain function to every row of a matrix" can take a very long
time if it is done with the function apply, which is of course the
clearest way to express this task.

I first came across this problem when I ran simulations on a data matrix
with 15000 rows: after adding one simple apply call to my program, it
suddenly seemed not to finish at all.

If I just want to make such a computation once, it might not matter
whether I wait a couple of seconds, or 2 minutes, or 5 minutes (as it
takes for 15000 rows). But if I run a simulation in which these
computations are repeated a hundred times, there is a difference between
200 seconds and 200 minutes.

Andreas
Douglas Bates wrote:
>
> Andreas Weingessel <Andreas.Weingessel@ci.tuwien.ac.at> writes:
> >
> > To demonstrate this, I did some benchmarking of different methods of
> > computing the row sums of an n x 10 matrix with n = 2000, ..., 10000.
> >
> > The first method (M1) I used is the normal apply command:
> >   y <- apply(x, 1, sum)
> > The second method (M2) uses a for-loop for the computations, where the
> > memory for the resulting vector has been allocated before. That is,
> > for n = 2000:
> >   z <- numeric(2000); for (i in 1:2000) z[i] <- sum(x[i,])
> > The third method (M3) also uses a for-loop, but the resulting vector
> > is grown element by element, i.e.
> >   z1 <- NULL; for (i in 1:2000) z1 <- c(z1, sum(x[i,]))
> >
> > All computations were made on a Pentium II 233 MHz with 256 MB RAM, R
> > started as R -v 50. The following table shows the minimum, mean, and
> > maximum CPU time in seconds, as measured by system.time over 10 runs
> > of every computation, for different values of n.
> >
> >       n              M1                  M2                  M3
> >    2000    4.03   4.16   4.34   0.27  0.40  0.47   0.51  0.63  0.71
> >    4000   12.65  13.40  14.68   0.73  0.81  0.94   1.78  1.86  1.98
> >    6000   26.51  28.14  29.50   1.19  1.22  1.38   3.79  3.80  3.80
> >    8000   52.06  63.43  67.61   1.46  1.63  1.69   6.38  6.41  6.58
> >   10000   84.06  98.17 118.94   1.93  2.01  2.13   9.78  9.79  9.81
> >
> > That is, computing the row sums of a 10000 x 10 matrix with apply
> > takes about 100 sec on average, whereas a simple for-loop does the
> > same job in about 2 sec.

It looks like M1 and M3 are O(n^2) but M2 is O(n). I see where that comes
from in M3, but it surprises me for M1.

luke

--
Luke Tierney
University of Minnesota             Phone: 612-625-7843
School of Statistics                Fax:   612-624-8868
206 Church Street                   email: luke@stat.umn.edu
Minneapolis, MN 55455 USA           WWW:   http://www.stat.umn.edu
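One quick way to put numbers on that impression is to regress log mean
time on log n. The data values below are copied from the mean columns of
the earlier table; the regression itself is just an illustrative check,
not part of the original posts.

  ## Empirical scaling exponents from the mean times in the table above:
  ## a slope of log(time) on log(n) near 2 means quadratic, near 1 linear.
  n  <- c(2000, 4000, 6000, 8000, 10000)
  m1 <- c(4.16, 13.40, 28.14, 63.43, 98.17)   # apply
  m2 <- c(0.40,  0.81,  1.22,  1.63,  2.01)   # preallocated for-loop
  m3 <- c(0.63,  1.86,  3.80,  6.41,  9.79)   # growing for-loop

  coef(lm(log(m1) ~ log(n)))[2]   # roughly 2: apply looks O(n^2)
  coef(lm(log(m2) ~ log(n)))[2]   # roughly 1: the preallocated loop is O(n)
  coef(lm(log(m3) ~ log(n)))[2]   # between 1 and 2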
Peter Dalgaard BSA <p.dalgaard@biostat.ku.dk> writes (29 May 1998):

> Oh, dear. Looking at apply, we have right in the middle of it:
>
>     for (i in 1:d2)
>         ans[[i]] <- FUN(array(newX[, i], d.call, dn.call), ...)
>
> Assignment to the tail of a very long list... I think I'm beginning to
> understand why Ross keeps talking about generic vectors. Either that,
> or we need to add some low-level list primitives to the language.

Got it in one. To reach ans[[i]] you have to cdr down the whole list.

The subsetting code would be vastly simplified without the special list
code. Ugliest of all is subscript manipulation for lists with dim
attributes. [I estimate 2 months for a fix - not that it will take that
long, but there are some other things higher up the todo list.]

Ross
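A rough illustration of why that assignment pattern hurts. The sketch
below contrasts filling a preallocated list with growing a list one
element at a time; the size n and the toy loop bodies are made up, and
the growing case only approximates the pairlist behaviour (repeated
copying) that Ross describes, since in the R of this thread even the
preallocated case had to walk a pairlist to reach element i.

  ## Illustrative contrast (made-up size n): filling a preallocated list
  ## versus growing a list one element at a time.  In the pairlist-based
  ## lists discussed above, even ans[[i]] <- ... had to cdr down i cells,
  ## which is what made apply's inner loop quadratic in the result length.
  n <- 5000

  system.time({
    ans <- vector("list", n)        # preallocated generic vector
    for (i in 1:n) ans[[i]] <- i
  })

  system.time({
    ans2 <- NULL                    # grown one element at a time,
    for (i in 1:n)                  # copying the accumulated list each step
      ans2 <- c(ans2, list(i))
  })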