Peter Waltman
2007-Aug-18 06:49 UTC
[R] Suspected memory leak with R v.2.5.x and large matrices with dimnames set
Hi -

Admittedly, this may not be the most sophisticated memory profiling performed, but when using unix's top command, I'm noticing a notable memory leak when using R with a large matrix that has dimnames set. To allow people to reproduce the problem I'm seeing, I've added a small (< 50 lines) code snippet at the end of this email.

I'm seeing this problem on both a MacOS box using R v.2.5.1 and a Unix box (x86_64) running R v.2.5.0. The output from sessionInfo() for both machines is below.

What I'm seeing is that when I create a 20k x 2k matrix that does not have any dimnames set, if I call a function (the f() function below) that makes a couple of local copies of subsets of the matrix and then returns the result of some statistical massaging, R works mostly fine (more on this below).

However, if I set the dimnames (currently commented out in the code snippet below), and then call from the R command interpreter:

    res <- sapply( 1:10, function(i) { cat(i, "\n"); f() } )
    gc()
    rm( list=ls() )
    gc()

unix's top command reports that R has a memory footprint of roughly 2 gig (1.2 on the MacOS box), although R's gc() command reports for this 'empty' instance of R:

    > gc()
               used (Mb) gc trigger  (Mb)  max used   (Mb)
    Ncells   236823 12.7     467875  25.0    467875   25.0
    Vcells   120446  1.0  109363282 834.4 155806232 1188.8
    >

As I said, if the matrix does not have the dimnames set, the same procedure will produce the same output from R's gc() command, though unix's top command reports that R's memory footprint is actually >270 meg. Not sure if that's just a basal level of R's memory needs.

I see this on both OS's I'm using and both versions of R (v.2.5.x).
If I'm doing something wrong in my code below which is causing this issue, please let me know, but it's fairly vanilla code so I'm not sure.

Thanks,

Peter Waltman

SessionInfo output:

Mac:

    > sessionInfo()
    R version 2.5.1 (2007-06-27)
    powerpc-apple-darwin8.9.1

    locale:
    C

    attached base packages:
    [1] "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"
    [7] "base"
    >

Unix:

    > sessionInfo()
    R version 2.5.0 (2007-04-23)
    x86_64-unknown-linux-gnu

    locale:
    LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

    attached base packages:
    [1] "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"
    [7] "base"
    >

test.R:

    f<-function() {
      # pick 750 random columns, then average 15 random rows over them
      my.cols <- sample( ncol( val ), 750 )
      my.r <- val[ sample( nrow( val ), 15 ), my.cols ]
      avg.rows <- apply( my.r, 2, mean, na.rm=TRUE )
      rm( my.r )
      gc()

      # deviation of every row from those column averages
      my.r.all <- val[ , my.cols ]
      devs <- apply( my.r.all, 1, "-", avg.rows )
      rm( my.r.all )
      gc()

      apply( devs, 2, var, na.rm=TRUE )
    }

    val<-matrix( rnorm( (20000*2000) ), 20000, 2000 )#, dimnames= list( paste( "AT2G", 1:20000,sep="" ), paste( "AT2Gcol", 1:2000,sep="" ) ) )
    gc()
    #res <- sapply(1:10, function(i) f()) # --- works fine if dimnames aren't set
    # rm( list=ls() )
    #gc()
Seth Falcon
2007-Aug-18 19:21 UTC
[Rd] [R] Suspected memory leak with R v.2.5.x and large matrices with dimnames set
Hi Peter,

Peter Waltman <waltman at cs.nyu.edu> writes:
> Admittedly, this may not be the most sophisticated memory profiling
> performed, but when using unix's top command, I'm noticing a notable
> memory leak when using R with a large matrix that has dimnames
> set.

I'm not sure I understand what you are reporting. One thing to keep in mind is that how memory released by R is handled is OS dependent, and one will often observe that after R frees some memory, the OS does not report that amount as now free.

Is what you are observing preventing you from getting things done, or just a concern that there is a leak that needs fixing?

It is worth noting that the internal handling of character vectors has changed in R-devel, so IMO testing there would make sense before pursuing this further. I suspect your results will be different.

+ seth

--
Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
BioC: http://bioconductor.org/
Blog: http://userprimary.net/user/
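
The gap between what gc() reports and what the OS reports can be logged side by side from inside R. The following is only a minimal sketch, not code from either poster. It assumes a Unix-like system with a POSIX `ps` on the path. The helper names `rss_mb`, `r_heap_mb`, and `snapshot` are made up for illustration, and the matrix is scaled down from the 20k x 2k case so the script runs quickly.

```r
# Hypothetical helper: resident set size of this R process in Mb, as the
# OS sees it. Assumes a POSIX `ps`; returns NA if the call fails.
rss_mb <- function() {
  out <- tryCatch(
    system2("ps", c("-o", "rss=", "-p", Sys.getpid()),
            stdout = TRUE, stderr = FALSE),
    error   = function(e) character(0),
    warning = function(w) character(0)
  )
  kb <- suppressWarnings(as.numeric(trimws(out[1])))
  if (is.na(kb)) NA_real_ else kb / 1024
}

# Hypothetical helper: Mb currently in use according to R's own
# collector (Ncells + Vcells, second column of gc()'s result).
r_heap_mb <- function() sum(gc()[, 2])

# Record both views of memory under a label.
snapshot <- function(label) {
  data.frame(step = label,
             r_heap_mb = r_heap_mb(),
             os_rss_mb = rss_mb(),
             stringsAsFactors = FALSE)
}

mem_log <- snapshot("start")

# Scaled-down stand-in for the 20000 x 2000 matrix in the report.
val <- matrix(rnorm(2000 * 200), 2000, 200)
mem_log <- rbind(mem_log, snapshot("after allocation"))

rm(val)
mem_log <- rbind(mem_log, snapshot("after rm + gc"))

print(mem_log)
```

If `r_heap_mb` drops back after the `rm()` while `os_rss_mb` stays high, that matches the OS-dependent behaviour described above rather than demonstrating a leak inside R. Rerunning the same log around the `sapply()` call from test.R, with and without dimnames, would show which of the two numbers actually grows.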