I am reading a flat text file, 138 MB in size, into R with a combination of scan() (to get the header) and read.table(). After converting the text time stamps to POSIXct and the integer codes to factors, I combine everything into one data frame and release the old structures holding the data using rm().

Strangely, rm() does not appear to reduce the memory in use; I checked with memory.size(). Worse still, the amount of memory required grows. When I save an image, the .RData file is only 23 MB, yet at some point in the program, after having done nothing particularly demanding (two- and three-way frequency tables and some lattice graphs), the amount of memory in use is over 1 GB.

Not yet a problem, but it will become one. This is with R 2.10.0 on Windows Vista.

Does anybody know how to release memory, as rm(dat) does not appear to do this properly?

Regards,
Alex van der Spek
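[A minimal sketch of the workflow described above, for concreteness. The file name, column names, and time-stamp format here are illustrative assumptions, not taken from the post:

  ## Read the header line, then the data (names/format are assumptions)
  hdr <- scan("data.txt", what = character(), nlines = 1, quiet = TRUE)
  raw <- read.table("data.txt", skip = 1, col.names = hdr,
                    stringsAsFactors = FALSE)

  ## Convert text time stamps and integer codes, then build one frame
  dat <- data.frame(
    time = as.POSIXct(raw$time, format = "%Y-%m-%d %H:%M:%S"),
    code = factor(raw$code),
    value = raw$value
  )

  rm(raw)        # drops the binding, but the memory is not returned yet
  memory.size()  # Windows only: may still report the old, larger figure
]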
On Wed, 5 May 2010, Alex van der Spek wrote:

> [original question quoted in full; see above]

Rather, you do not appear to understand 'properly'.

First, you need to garbage-collect to find out how much memory is available for re-use. R does that internally as needed, but you can force it with gc().

Second, there is simply no reason for R not to use 'over 1 GB' if it is available (and it was). Using lots of memory is faster, but the garbage collector will clean up when needed. The likely bottleneck for you is not the amount of memory used but fragmentation of the limited address space on 32-bit Windows. See the documentation ....

Third, the .RData file is (by default) compressed.

And fourth, 'releasing memory' usually means giving it back to the OS. That is an implementation detail, and the C runtime memory managers on many builds of R either never do so or do so tardily. This is again not an issue unless your system is short of virtual memory, and given how cheap disc space is, there is no reason for it to be.

-- 
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, University of Oxford
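[To make the first point concrete, here is a minimal sketch; memory.size() is Windows-only, and the object and its size are illustrative assumptions:

  dat <- data.frame(x = rnorm(1e6))  # allocate something sizeable
  memory.size()                      # current allocation in MB

  rm(dat)         # removes the binding; the space is not reclaimed yet
  memory.size()   # often unchanged at this point

  gc()            # force a collection; prints a table of used/trigger sizes
  memory.size()            # now reflects the freed space
  memory.size(max = TRUE)  # maximum memory obtained from the OS so far

Note that memory.size(max = TRUE) typically does not drop even after gc(), which illustrates the fourth point: memory already obtained from the OS is not necessarily handed back.]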
Thank you all,

No offense meant. I like R tremendously, but I admit I am only a beginner. I did not know about gc(); that explains my confusion about rm() not doing what I expected it to do.

I suspected that .RData was a compressed file; thanks for the confirmation. As for Windows, unfortunately the choice of system is not up to me.

Alex van der Spek
Dear Alex,

Has manual garbage collection had any effect?

Sincerely,
KeithC.

-----Original Message-----
From: Alex van der Spek [mailto:doorz at xs4all.nl]
Sent: Wednesday, May 05, 2010 3:48 AM
To: r-help at r-project.org
Subject: [R] Memory issue

> [original question quoted in full; see above]