Bob Sandefur
2000-Nov-01 14:00 UTC
[R] Performance note: Preallocating helps? and two questions
Hi,

In R 1.1 on Windows 2000, with length(AU) of 35833,

  AUcap30<-0
  for(i in 1:length(AU))AUcap30[i]<-min(30,AU[i])

took over an hour on a Pentium II 300 MHz (I Esc'ed before it finished), but

  AUcap30<-AU
  for(i in 1:length(AU))AUcap30[i]<-min(30,AU[i])

is very quick (a few seconds).

Is this performance difference common in R (i.e. is Linux the same way)?

Are there other tricks to speed up R (in Windows), besides a faster processor and more memory?

Thanks,

Bob Sandefur
Principal Geostatistician
Pincock Allen & Holt
International Minerals Consultants
274 Union Suite 200
Lakewood CO 80228
USA
303 914-4467 v
303 987-8907 f
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Kaspar Pflugshaupt
2000-Nov-01 14:45 UTC
[R] Performance note: Preallocating helps? and two questions
> AUcap30<-0
> for(i in 1:length(AU))AUcap30[i]<-min(30,AU[i])
> took over an hour on pentium II 300 mhertz (I esc'ed before it finished)
> but
> AUcap30<-AU
> for(i in 1:length(AU))AUcap30[i]<-min(30,AU[i])
> is very quick (a few seconds)

> Is this performance difference common in r (ie is linux the same way)?

The crucial thing is that in your first example, AUcap30 is set to 0 (one number), whereas in the second one, it's initialized as a vector of the same size as AU. So, in the first one, R has to grow the variable for each iteration of the loop, and in the second, it just fills in an already existing vector.

I've noticed similar speed differences for this constellation on Linux, or on S-Plus (Win or Unix). It really pays to initialize objects in the size you're going to need.

Cheers

Kaspar Pflugshaupt

--
Kaspar Pflugshaupt
Geobotanisches Institut
Zuerichbergstr. 38
CH-8044 Zuerich
Tel. ++41 1 632 43 19
Fax ++41 1 632 12 15
mailto:pflugshaupt at geobot.umnw.ethz.ch
privat:pflugshaupt at mails.ch
http://www.geobot.umnw.ethz.ch
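For anyone who wants to reproduce the difference, here is a minimal sketch. The data are made up (the original AU vector is not available), and the function names cap_grow and cap_prealloc are hypothetical, chosen only for this illustration. Timings depend heavily on the R version and the machine; later R versions appear to handle growing vectors far better than R 1.1 did, so the gap may turn out small.

```r
# Made-up stand-in for the poster's data: 35833 non-negative values.
set.seed(1)
AU <- runif(35833, min = 0, max = 100)

# Variant 1: start from a single number and let R stretch the vector.
cap_grow <- function(x) {
  out <- 0
  for (i in seq_along(x)) out[i] <- min(30, x[i])
  out
}

# Variant 2: allocate the full length first, then fill it in.
cap_prealloc <- function(x) {
  out <- numeric(length(x))
  for (i in seq_along(x)) out[i] <- min(30, x[i])
  out
}

t_grow <- system.time(res_grow <- cap_grow(AU))
t_prealloc <- system.time(res_prealloc <- cap_prealloc(AU))

# Both variants compute the same values; only the allocation pattern differs.
print(identical(res_grow, res_prealloc))

# Elapsed seconds for each variant (machine and version dependent).
print(c(grow = t_grow[["elapsed"]], prealloc = t_prealloc[["elapsed"]]))
```

The two functions differ only in how the result vector is initialised, so any timing difference can be attributed to that one line.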
Peter Dalgaard BSA
2000-Nov-01 14:47 UTC
[R] Performance note: Preallocating helps? and two questions
"Bob Sandefur" <rls at pincock.com> writes:

> hi-
> in r 1.1 on windows 2000
> with length(AU) of 35833
> AUcap30<-0
> for(i in 1:length(AU))AUcap30[i]<-min(30,AU[i])
> took over an hour on pentium II 300 mhertz (I esc'ed before it finished)
> but
> AUcap30<-AU
> for(i in 1:length(AU))AUcap30[i]<-min(30,AU[i])
> is very quick (a few seconds)
>
> Is this performance difference common in r (ie is linux the same way)?

Yes. The problem with the first version is that vectors are allocated just long enough. Assigning past the end of one will stretch it, but that involves allocating a new longer vector, copying the data into the first N elements and then assigning the new value to the N+1st element. So you have 35833 allocations of size 1,2,3,4,5,....,35833 in the above code, and that is going to take some time.

The canonical form for a loop calculating a vector element by element would be

  x<-numeric(N)
  for (i in 1:N)   # often better: "i in seq(length=N)"
     x[i]<-...

> Are there other tricks to speed up R (in windows) (besides a faster processor and more memory)?

Several... One of the more effective ones is vectorisation: Try

  AUcap30 <- pmin(30,AU)

--
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
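The two suggestions above can be sketched side by side with stand-in data. The vector AU here is made up, and seq_len(N) is used as an equivalent of seq(length=N); that assumes a later R version than 1.1, since seq_len was not available then as far as I know.

```r
# Made-up stand-in data; some values exceed the cap of 30.
AU <- c(12.5, 48, 30, 7.25, 91)
N <- length(AU)

# Canonical loop: allocate the result once, then fill it element by element.
x <- numeric(N)
for (i in seq_len(N)) {
  x[i] <- min(30, AU[i])
}

# Vectorised form: pmin() takes the element-wise minimum, recycling the 30.
AUcap30 <- pmin(30, AU)

# The loop and the vectorised call give the same result.
print(identical(x, AUcap30))

# Why seq_len(N) is often safer than 1:N: with N = 0 the loop body should
# not run at all, but 1:0 counts down and yields the two values 1 and 0.
print(length(1:0))
print(length(seq_len(0)))

# Total number of element slots allocated when a vector is grown one
# element at a time up to length 35833 (sizes 1, 2, ..., 35833).
print(sum(as.numeric(seq_len(35833))))
```

The last line only counts slots under the simple "reallocate on every assignment" model described above; it is not a measurement of what any particular R version does.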
Angelo Canty
2000-Nov-01 15:53 UTC
[R] Performance note: Preallocating helps? and two questions
I'm not sure why the first method works so badly although it is clearly not something you should do. You are trying to use indices of 1:35833 on a vector of length 1. With the second method the vector that you are using is at least of the correct length.

On my Sun Sparc the first method takes 3 minutes and the second takes 9 seconds. Interestingly on S-plus 3.4 for Unix, the two methods take about the same time (12 seconds each).

My personal feeling is that the first method should result in an error message (or at least a warning).

In any case there is no need for a loop here since

  AUcap30 <- pmin(AU, 30)

does what you want in about half a second.

Angelo

Bob Sandefur wrote:
>
> hi-
> in r 1.1 on windows 2000
> with length(AU) of 35833
> AUcap30<-0
> for(i in 1:length(AU))AUcap30[i]<-min(30,AU[i])
> took over an hour on pentium II 300 mhertz (I esc'ed before it finished)
> but
> AUcap30<-AU
> for(i in 1:length(AU))AUcap30[i]<-min(30,AU[i])
> is very quick (a few seconds)
>
> Is this performance difference common in r (ie is linux the same way)?
>
> Are there other tricks to speed up R (in windows) (besides a faster processor and more memory)?

--
Angelo J. Canty
Dept of Mathematics and Statistics
Concordia University
Montreal, Quebec.
Tel : +1-514-848-3244
Fax : +1-514-848-4511
Email : canty at discrete.concordia.ca
Emmanuel Paradis
2000-Nov-01 16:40 UTC
[R] Performance note: Preallocating helps? and two questions
At 06:00 01/11/00 -0800, you wrote:
>hi-
> in r 1.1 on windows 2000
> with length(AU) of 35833
> AUcap30<-0
> for(i in 1:length(AU))AUcap30[i]<-min(30,AU[i])
>took over an hour on pentium II 300 mhertz (I esc'ed before it finished)
>but
>AUcap30<-AU
> for(i in 1:length(AU))AUcap30[i]<-min(30,AU[i])
>is very quick (a few seconds)
>
>Is this performance difference common in r (ie is linux the same way)?

I do not think this is a problem with memory. In the first case, you have length(AUcap30)==1, thus your loop is nonsense. I would have expected an error message here like "subscript out of range for AUcap30" (is this a bug? it seems R is trapped in the loop). In the second case, you have:

  > length(AU)==length(AUcap30)
  [1] TRUE

and things are fine.

Emmanuel Paradis
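On the expected "subscript out of range" error: R does not treat assignment past the end of a vector as an error. It extends the vector and fills any gap with NA, which is why the first loop runs at all. A small sketch with a made-up vector x:

```r
x <- 0          # a vector of length 1
x[2] <- 5       # assigning one past the end stretches it to length 2
print(length(x))

x[5] <- 9       # skipping ahead pads positions 3 and 4 with NA
print(x)

# Reading past the end is not an error either; it returns NA.
print(x[10])
```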
Emmanuel Paradis
2000-Nov-01 16:45 UTC
[R] Performance note: Preallocating helps? and two questions
Ignore my reply as I have now received Peter's and Kaspar's judicious ones.

EP