Is there any way to detect which calls are consuming memory?

I run a program whose global variables take up about 50 megabytes of memory, but when I monitor the program's progress it seems to be allocating 150 megabytes, with peaks of up to 2 gigabytes.

I know that the global variables aren't "copied" many times by the routines, but I suspect something weird must be happening.

Alberto Monteiro

PS: the lines below count the memory allocated to all global variables; they could probably be adapted to track local variables:

y <- ls(pat="")                          # get the names of all variables
z <- rep(0, length(y))                   # create a vector of sizes
for (i in 1:length(y)) z[i] <- object.size(get(y[i]))
                                         # loop: get the size (in bytes) of each variable
# BTW, is there any way to vectorize the above loop?
xix <- sort.int(z, index.return = TRUE)  # sort the sizes
y <- y[xix$ix]                           # apply the sort order to the names
z <- z[xix$ix]                           # apply the sort order to the sizes
y <- c(y, "total")                       # add a total row
z <- c(z, sum(z))                        # sum them all
cbind(y, z)                              # ugly way to list them
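[As an aside on the "BTW" above: the size-collecting loop can be vectorized with sapply(). This is a sketch, not part of the original post; variable names mirror the snippet above.]

```r
# Collect the size (in bytes) of every global variable, no explicit loop
y <- ls(envir = globalenv())
z <- sapply(y, function(nm) as.numeric(object.size(get(nm, envir = globalenv()))))

# Sort by size and append a total, as in the original snippet
ord <- order(z)
data.frame(variable = c(y[ord], "total"),
           bytes    = c(z[ord], sum(z)))
```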
On 29/12/2014 1:52 PM, ALBERTO VIEIRA FERREIRA MONTEIRO wrote:
> Is there any way to detect which calls are consuming memory?

The Rprofmem() function can do this, but you need to build R with memory profiling enabled (the --enable-memory-profiling configure option). Rprof() does a more limited version of the same thing if run with memory.profiling = TRUE.

Duncan Murdoch
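[A minimal sketch of the Rprof() approach Duncan describes; the output file name is arbitrary, and the replicate() call just stands in for the poster's allocation-heavy program.]

```r
# Profile both time and memory while the code of interest runs
Rprof("mem.out", memory.profiling = TRUE)
x <- replicate(50, rnorm(1e5))   # some allocation-heavy work
Rprof(NULL)                      # stop profiling

# Report memory use alongside the usual timings, broken down by call
summaryRprof("mem.out", memory = "both")
</code>
```

The memory columns in the summary point at which calls triggered the most allocation, which is usually enough to locate the offending routine.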
You might find the advice at http://adv-r.had.co.nz/memory.html helpful.

Hadley

On Tue, Dec 30, 2014 at 7:52 AM, ALBERTO VIEIRA FERREIRA MONTEIRO <albmont at centroin.com.br> wrote:
> Is there any way to detect which calls are consuming memory?

--
http://had.co.nz/
On Mon, Dec 29, 2014 at 10:52 AM, ALBERTO VIEIRA FERREIRA MONTEIRO <albmont at centroin.com.br> wrote:
> Is there any way to detect which calls are consuming memory?

Duncan already suggested Rprofmem(). For a neat interface to that, see also the lineprof package.

Common memory hogs are cbind(), rbind() and other ways of incrementally building up objects. These can often be avoided by pre-allocating the final object up front and populating it as you go.

Another source of unnecessary memory duplication is coercion of data types, e.g. allocating an integer matrix but populating it with doubles. A related mistake is to use matrix(nrow, ncol) to allocate a matrix that will hold numeric values. That is actually doing matrix(NA, nrow, ncol), which creates a *logical* matrix, which will be coerced (involving copying and a large memory allocation) as soon as it gets populated with a numeric value.
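[A small illustration of the pre-allocation advice, assuming a loop that produces one row per iteration; the dimensions and rnorm() filler are made up for the example.]

```r
n <- 1e4; p <- 5

# Memory hog: grows the result one row at a time, copying it on every iteration
grow <- function() {
  out <- NULL
  for (i in 1:n) out <- rbind(out, rnorm(p))
  out
}

# Better: allocate the full numeric result once, then fill it in place
prealloc <- function() {
  out <- matrix(NA_real_, nrow = n, ncol = p)
  for (i in 1:n) out[i, ] <- rnorm(p)
  out
}
```

The rbind() version does roughly quadratic copying as the object grows; the pre-allocated version touches each row once.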
One should have used matrix(NA_real_, nrow, ncol) here.

For listing objects, their sizes and more, you can use ll() in the R.oo package, which returns a data.frame, e.g.

> example(iris)
> a <- 1:1e6
> R.oo::ll()
  member data.class dimension objectSize
1      a     numeric   1000000    4000040
2   dni3        list         3        600
3     ii  data.frame  c(150,5)       7088
4   iris  data.frame  c(150,5)       7088

> R.oo::ll(sortBy="objectSize")
  member data.class dimension objectSize
2   dni3        list         3        600
3     ii  data.frame  c(150,5)       7088
4   iris  data.frame  c(150,5)       7088
1      a     numeric   1000000    4000040

> tbl <- R.oo::ll()
> tbl <- tbl[order(tbl$objectSize, decreasing=TRUE),]
> tbl
  member data.class dimension objectSize
1      a     numeric   1000000    4000040
3     ii  data.frame  c(150,5)       7088
4   iris  data.frame  c(150,5)       7088
5   objs  data.frame    c(4,4)       2760
2   dni3        list         3        600

> sum(tbl$objectSize)
[1] 4017576

/Henrik
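[The logical-vs-numeric allocation trap Henrik describes can be seen directly in base R; this is a sketch, with the 1000x1000 dimensions chosen arbitrarily.]

```r
m1 <- matrix(nrow = 1000, ncol = 1000)           # matrix(NA, ...): a *logical* matrix
m2 <- matrix(NA_real_, nrow = 1000, ncol = 1000) # numeric (double) from the start

is.logical(m1)   # TRUE: every cell is logical NA
is.double(m2)    # TRUE

m1[1, 1] <- 3.14 # forces coercion to double: the whole matrix is copied
is.double(m1)    # TRUE, but only after a full copy and reallocation
```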
Thanks to Duncan, Hadley and Henrik.

Duncan, I used Rprof and could pinpoint the critical routine that was causing the memory blow-up.

Henrik, you got it right: the culprit was a big matrix of integers, but where some of its fields are filled with -Inf and Inf. This matrix is global, it's used only once, it does not consume too much memory, and it should be harmless, but...

Hadley, your link on memory allocation and management helped to identify the problem. I did a very stupid thing: I added some debugging code in the critical routine that duplicated the matrix at each iteration of a loop... So that big matrix with integers and Infs and -Infs was being copied several times, killing memory needlessly.

Thanks for all the help. "I got 99 problems but you won't be one"

Alberto Monteiro
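[For future reference, duplications like the one described above can also be spotted with base R's tracemem(), which prints a message each time the traced object is copied. This is a generic sketch, not the poster's actual code.]

```r
big <- matrix(0, nrow = 1000, ncol = 1000)
tracemem(big)        # start reporting copies of 'big'

# Modifying a function argument triggers R's copy-on-modify,
# so each call like this prints a tracemem[...] line
f <- function(m) { m[1, 1] <- 1; m }
invisible(f(big))

untracemem(big)      # stop tracing
```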