Hello,

I am experiencing a very noticeable memory leak when using large lists of
large data. The code below creates a list of data frames (each converted from
the same matrix), yet the memory is not freed after removing the list and
performing a garbage collection. Is there something I can do to prevent this
memory leak? Any help would be greatly appreciated.

By the way, if you execute the code, please run it in a new R session.

# Start of code ===============================================================

# Function that returns memory being used
MemoryUsed <- function(){
  pid <- Sys.getpid()
  system(paste0("top -n 1 -p ", pid, " -b"), intern = TRUE)[c(7,8)]
}

# Initial memory (VIRT memory equals about 400,000 KiB on my machine)
MemoryUsed()

# Create a large list of large data, remove it, and perform garbage collection
ncols <- 100
nrows <- 10000
mat <- matrix(seq(nrows * ncols), nrow = nrows, ncol = ncols)
ls <- lapply(1:1000, function(x) as.data.frame(mat))
rm(list = setdiff(ls(), 'MemoryUsed'))
invisible(gc())

# Final memory (now, VIRT memory equals about 4,600,000 KiB on my machine)
MemoryUsed()

# End of code =================================================================

My session info is:

R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.3 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
     LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
     LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_3.4.4 tools_3.4.4    yaml_2.1.19
This looks like a case of FAQ 7.42:
https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-is-R-apparently-not-releasing-memory_003f

On Mon, Jul 16, 2018 at 2:32 PM, Daniel Raduta <datudar at gmail.com> wrote:
> Hello,
>
> I am experiencing a very noticeable memory leak when using large lists of
> large data.

--
Joshua Ulrich | about.me/joshuaulrich
FOSS Trading | www.fosstrading.com
R/Finance 2018 | www.rinfinance.com

______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
On 07/17/2018 12:56 PM, Joshua Ulrich wrote:
> This looks like a case of FAQ 7.42:
> https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-is-R-apparently-not-releasing-memory_003f

Yes. A true memory leak in R would mean that repeated execution of the same
code (e.g. creation and deletion of the list) would start failing due to
running out of memory at some (high) iteration number. Also, running
gc(verbose=TRUE, full=TRUE) before and after the code snippet suggests there
is no leak. A little memory may be added initially (e.g. by byte-compilation
or by internal caching at different levels), which is why I say "high
iteration number" above, but eventually the memory used by cons cells and
vectors should become constant, and it does on my system.

Tomas
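
The check described above (repeat the create-and-delete cycle and watch
whether the memory R itself reports as used keeps growing) could be sketched
roughly as follows. This is an illustration, not code from the thread: the
helper names used_mb and one_cycle are hypothetical, the data is scaled down
so the loop runs quickly, and plain gc() is used instead of gc(full=TRUE)
because the full argument is assumed to be unavailable in R 3.4.x.

```r
# Memory (in Mb) that R's collector reports as in use after a collection:
# the "(Mb)" column next to "used", summed over the Ncells and Vcells rows.
# Hypothetical helper, not part of the original post.
used_mb <- function() {
  sum(gc()[, 2])
}

# One create-and-delete cycle, scaled down from the original example.
# The list becomes unreachable as soon as the function returns.
one_cycle <- function(n_frames = 20, nrows = 1000, ncols = 100) {
  mat <- matrix(seq(nrows * ncols), nrow = nrows, ncol = ncols)
  frames <- lapply(seq_len(n_frames), function(i) as.data.frame(mat))
  invisible(NULL)
}

n_iter <- 30
usage <- numeric(n_iter)
for (i in seq_len(n_iter)) {
  one_cycle()
  usage[i] <- used_mb()
}

# Skip the first few iterations (byte-compilation, caches) and compare
# an early reading with the last one.
growth_mb <- usage[n_iter] - usage[5]
print(round(usage, 1))
cat("Growth between iteration 5 and", n_iter, ":", round(growth_mb, 2), "Mb\n")
```

If the readings level off after the first few iterations, R has released the
objects internally, even though a tool such as top may still show a large
VIRT value for the process, which is the situation FAQ 7.42 describes.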