Mike Williamson
2010-Jul-16 23:24 UTC
[Rd] garbage collection & memory leaks in 'R', it seems...
Hello developers,

I noticed that if I am running 'R' and type "rm(list=objects())" and "gc()", 'R' will still be consuming (a lot) more memory than when I then close 'R' and re-open it. In my ignorance, I'm presuming this is something in 'R' where it doesn't really do a great job of garbage collection... at least not nearly as well as Windows or unix can do garbage collection.

Am I right? If so, is there any better way to "clean up" the memory that 'R' is using? I have a script that runs a fairly large job, and I cannot keep it going on its own in a convenient way because of these remnants of garbage that pile up and eventually leave so little memory remaining that the script crashes.

I have attached a screen shot of the "system performance" screen on the 64-bit Windows machine (with 64-bit 'R') that I am using. I labeled portions of it before closing 'R' and after re-opening it (loading up the same saved image) to show the large difference in memory being consumed.

Below are the details of my version of 'R'. Thanks for any help!!

Thanks,
Mike

R version: 2.11.1, 64-bit Windows build (the machine I'm using has 20 GB of memory, so that's my memory limit since it's the 64-bit version)

packages: see below...
Package         Version
brew            1.0-3
colorspace      1.0-1
corpcor         1.5.6
digest          0.4.2
fdrtool         1.2.6
GeneCycle       1.1.1
ggplot2         0.8.7
longitudinal    1.1.5
MASS            7.3-6
plyr            0.1.9
proto           0.3-8
qvalue          1.22.0
R.methodsS3     1.2.0
R.oo            1.7.3
R.utils         1.4.3
randomForest    4.5-35
RColorBrewer    1.0-2
reshape         0.8.3
RODBC           1.3-1
sos             1.3-0
base            2.11.1
boot            1.2-42
class           7.3-2
cluster         1.12.3
codetools       0.2-2
datasets        2.11.1
foreign         0.8-40
graphics        2.11.1
grDevices       2.11.1
grid            2.11.1
KernSmooth      2.23-3
lattice         0.18-8
MASS            7.3-6
Matrix          0.999375-39
methods         2.11.1
mgcv            1.6-2
nlme            3.1-96
nnet            7.3-1
rpart           3.1-46
spatial         7.3-2
splines         2.11.1
stats           2.11.1
stats4          2.11.1
survival        2.35-8
tcltk           2.11.1
tools           2.11.1
utils           2.11.1

"Telescopes and bathyscaphes and sonar probes of Scottish lakes, Tacoma Narrows bridge collapse explained with abstract phase-space maps, Some x-ray slides, a music score, Minard's Napoleonic war: The most exciting frontier is charting what's already here." -- xkcd

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Capture.PNG
Type: image/png
Size: 89304 bytes
URL: <https://stat.ethz.ch/pipermail/r-devel/attachments/20100716/37ffea0c/attachment.png>
Peter Dalgaard
2010-Jul-17 09:11 UTC
[Rd] garbage collection & memory leaks in 'R', it seems...
Mike Williamson wrote:
> Hello developers,
>
> I noticed that if I am running 'R', type "rm(list=objects())" and
> "gc()", 'R' will still be consuming (a lot) more memory than when I then
> close 'R' and re-open it. In my ignorance, I'm presuming this is something
> in 'R' where it doesn't really do a great job of garbage collection... at
> least not nearly as well as Windows or unix can do garbage collection.
> Am I right? If so, is there any better way to "clean up" the memory
> that 'R' is using? I have a script that runs a fairly large job, and I
> cannot keep it going on its own in a convenient way because of these
> remnants of garbage that pile up and eventually leave so little memory
> remaining that the script crashes.

In a word, no, R is not particularly bad at GC. The internal gc() does a rather good job of finding unused objects, as you can see from its returned report. Whether that memory is returned to the OS is a matter of the C-level services (malloc/free) that R's allocation routines use. As far as I recall, Windows free() just never returns memory to the OS. In general, whether it can be done at all depends on which part of the "heap" you have freed, since you have to free from the end of it. (I.e., having a tiny object sitting at the end of the heap will force the entire range to be kept in memory.)

R itself will allocate from freed-up areas of the heap as long as it can find a space that is big enough. However, there is always a tendency for memory to fragment, so that you eventually have a pattern of many small objects with not-quite-big-enough holes between them. These issues affect most languages that do significant amounts of object allocation and destruction.

You should not really compare it to OS-level memory management, because that's a different kettle of fish.
In particular, user programs like R rely on having all objects mapped into a single linear address space, whereas the OS "just" needs to create a set of per-process virtual address spaces and has hardware help to do so.

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd.mes at cbs.dk    Priv: PDalgd at gmail.com