I have noticed dramatic differences in the run-time of one of my functions
depending on whether or not R was restarted. Immediately after restarting
the R GUI, exec time = 2.8 min. If I then repeat the execution of the
function in the same R session, exec time = 7.1 min. Removing all objects
via rm(list = ls(all = TRUE)) and forcing garbage collection with
gc(reset = TRUE) helps, but only slightly (exec time = 5.0 min).

Any thoughts on why this happens?

I realize that this is somewhat of a generic question given that I haven't
provided the source code for the function. However, the function is very
involved, so presenting it might violate the posting guidelines. The
function creates and recreates large list structures and calls numerical
functions on elements of these lists. The list structures can be nested
several levels deep, e.g. a list of lists.

Are there particular aspects of R function programming to watch out for
that can create this sort of problem?

I'm running R v2.2.1 on Windows.

thanks!

peter r

................................
Peter E. Rossi
Joseph T. and Bernice S. Lewis Professor of Marketing and Statistics
Editor, Quantitative Marketing and Economics
Rm 353, Graduate School of Business, U of Chicago
5807 S. Woodlawn Ave, Chicago IL 60637, USA
Tel: (773) 702-7513 | Fax: (773) 834-2081
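A minimal way to make the comparison above concrete is to time both runs in
one session and then look at the garbage-collector statistics; "myfun" below
is only a stand-in for the real (unposted) function:

    gc(reset = TRUE)                    # clear the "max used" statistics
    t1 <- system.time(out1 <- myfun())  # first run after restart
    t2 <- system.time(out2 <- myfun())  # repeat in the same session
    rbind(first = t1, second = t2)      # compare user/system/elapsed times
    gc()                                # memory in use and GC trigger levels
    gc.time()                           # cumulative time spent in GC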
Rossi, Peter E. wrote:

> Immediately after restart of R GUI, exec time = 2.8 min. If I then
> repeat the execution of the function in the same R session, exec time
> = 7.1 min. Removing all objects via rm(list = ls(all = TRUE)) and
> forcing garbage collection with gc(reset = TRUE) helps, but only
> slightly (exec time = 5.0 min).
>
> Any thoughts on why this happens?

Quite probably due to memory fragmentation. Smaller objects and more RAM
might help a little bit.

Uwe Ligges
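If fragmentation is the culprit, one habit worth checking in list-heavy code
is whether large lists are grown element by element; pre-allocating them
avoids much of the repeated copying. A generic illustration, not taken from
the poster's actual function:

    ## Growing a list incrementally copies it on every step
    slow <- list()
    for (i in 1:10000) slow <- c(slow, list(rnorm(10)))

    ## Pre-allocating the full length and filling in place avoids this
    fast <- vector("list", 10000)
    for (i in 1:10000) fast[[i]] <- rnorm(10)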
On Mon, 16 Jan 2006 16:02, Rossi, Peter E. wrote:

> of the function in the same R session, exec time = 7.1 min. Removing all
> objects via rm(list = ls(all = TRUE)) and forcing garbage collection with
> gc(reset = TRUE) helps, but only slightly (exec time = 5.0 min).

Hi,

I am currently making heavy use of a script of 250+ lines that shows the
opposite behaviour:

3 min 26 s for the first run
2 min 57 s for the second run, with several objects already in memory.

My script computes Bayesian probabilities over 11 variables for 1033
observations, uses no special library, performs several disk writes to an
NFS-exported network directory, and cleans up all explicitly created
objects.

BTW, I currently use R 2.2.1 and it seems much faster than version 1.9,
which I used before.

--
Alexandre Santos Aguiar, MD
independent consultant for health research
R Botucatu, 591 cj 81 - 04037-005 São Paulo - SP - Brazil
tel +55-11-9320-2046 | fax +55-11-5549-8760
www.spsconsultoria.com
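The explicit clean-up described above usually amounts to removing
intermediate objects by name once their results have been written out; a
rough sketch (the object name, helper function, and file path here are
invented for illustration):

    post <- compute_posterior(dat)            # hypothetical intermediate result
    write.csv(post, "/mnt/nfs/results.csv")   # hypothetical NFS path
    rm(post)                                  # drop the object by name
    invisible(gc())                           # return the freed memory promptly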
On Mon, 16 Jan 2006, Alexandre Santos Aguiar wrote:

> I am currently making heavy use of a script of 250+ lines that shows the
> opposite behaviour:
>
> 3 min 26 s for the first run
> 2 min 57 s for the second run, with several objects already in memory.

This sounds like one of those cases where automatic tuning of the garbage
collector is helping. Setting the startup parameters (see ?Memory) to
values similar to what gc() reports at the end of the run will probably
help. You can get further information from gc.time().

Most often, performance deteriorates during a session, as Uwe Ligges points
out (at least on a 32-bit system such as he uses).

> BTW, I currently use R 2.2.1 and it seems much faster than version 1.9,
> which I used before.

Not surprising, as we continually do performance tuning.

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel: +44 1865 272861 (self)
1 South Parks Road,                    +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax: +44 1865 272595
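To make the suggestion concrete, one might proceed roughly as follows. The
figures on the command line below are invented; substitute whatever gc()
reports for your own run, and see ?Memory for the details of these options:

    ## At the end of a representative run, note the "gc trigger" column:
    gc()

    ## Then start the next session with comparable amounts pre-allocated,
    ## e.g. (hypothetical figures) from the command line:
    ##   R --min-nsize=1500k --min-vsize=64M
    ## or via the environment variables R_NSIZE and R_VSIZE (see ?Memory),
    ## and check afterwards how much time is still spent in collections:
    gc.time()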