Hervé Pagès
2015-Sep-29 21:42 UTC
[Rd] making object.size() more meaningful on environments?
Hi,

Currently object.size() is not very useful on environments, as it always returns 56 bytes no matter how big the environment is:

    env1 <- new.env()
    object.size(env1)  # 56 bytes

    env2 <- new.env(hash=TRUE, size=75000000L)
    object.size(env2)  # 56 bytes

    env3 <- list2env(list(a=runif(25000000), L=LETTERS))
    object.size(env3)  # 56 bytes

This makes it pretty useless on reference class instances and other objects that use environments internally for caching or other purposes.

What about changing this to make it return something more meaningful?

Cheers,
H.

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone: (206) 667-5791
Fax: (206) 667-1319
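A minimal sketch of what a more meaningful number could look like, assuming one simply sums object.size() over the bindings of an environment. The helper name env_content_size is hypothetical and not part of base R. It does not recurse into nested environments, and an object referenced from several environments would be counted once per environment:

```r
# Hypothetical helper: naive size (in bytes) of the objects bound in `env`.
# Caveats: get() forces promises and evaluates active bindings, nested
# environments still count as 56 bytes each, and shared objects are not
# de-duplicated.
env_content_size <- function(env) {
  stopifnot(is.environment(env))
  nms <- ls(env, all.names = TRUE)
  sizes <- vapply(
    nms,
    function(nm) as.numeric(object.size(get(nm, envir = env, inherits = FALSE))),
    numeric(1)
  )
  sum(sizes)
}

env3 <- list2env(list(a = runif(1e6), L = LETTERS))
object.size(env3)        # size of the environment object itself
env_content_size(env3)   # sum of the sizes of `a` and `L`
```

Summing per-binding sizes is the simplest possible policy; anything that accounts for sharing needs to track which objects have already been visited.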
Gabriel Becker
2015-Sep-29 21:51 UTC
[Rd] making object.size() more meaningful on environments?
Herve,

The problem then would be that for A a refClass whose fields take up N bytes (in the sense that you mean), if we do

    B <- A

A and B would look like they BOTH take up N bytes, for a total of 2N, whereas AFAIK R would only be using ~ N + 2*56 bytes, right?

~G

--
Gabriel Becker, PhD
Computational Biologist
Bioinformatics and Computational Biology
Genentech, Inc.
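The sharing described here can be seen with a plain environment standing in for a reference class instance (an assumption made to keep the example free of class definitions). Assignment copies the reference, not the contents:

```r
# A plain environment used as a stand-in for a reference class instance.
A <- new.env()
A$x <- runif(1e6)

# Assignment binds a second name to the same environment; nothing is copied.
B <- A
identical(A, B)

# A binding added through B is visible through A, confirming that both
# names refer to one and the same environment.
B$y <- 1
exists("y", envir = A, inherits = FALSE)
```

Any size function that follows the environment's contents would therefore report the same N bytes for both A and B.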
Hadley Wickham
2015-Sep-29 22:02 UTC
[Rd] making object.size() more meaningful on environments?
You might like to try pryr::object_size():

``` r
library(pryr)

env1 <- new.env()
object_size(env1)
#> 328 B

env2 <- new.env(hash = TRUE, size = 75000000L)
object_size(env2)
#> 600 MB

env3 <- list2env(list(a = runif(2.5e+07), L = LETTERS))
object_size(env3)
#> 200 MB
```

It handles the issue that Gabe mentions:

``` r
a <- list2env(list(a = runif(1e+06)))
object_size(a)
#> 8 MB

b <- new.env()
b$a <- a
b$b <- runif(1e+06)
object_size(b)
#> 16 MB

object_size(a, b)
#> 16 MB
```

You just have to remember that object_size(a) + object_size(b) >= object_size(a, b), since memory shared between a and b is counted only once in the combined call.

Hadley

--
http://had.co.nz/
Hervé Pagès
2015-Sep-29 22:18 UTC
[Rd] making object.size() more meaningful on environments?
Hi Gabe,

On 09/29/2015 02:51 PM, Gabriel Becker wrote:
> Herve,
>
> The problem then would be that for A a refClass whose fields take up N
> bytes (in the sense that you mean), if we do
>
> B <- A
>
> A and B would look like they BOTH take up N bytes, for a total of 2N,
> whereas AFAIK R would only be using ~ N + 2*56 bytes, right?

Yes, but that's still a *much* better situation than the current one, in my opinion. More generally speaking, counting shared memory for each object (or process) that uses it is a common, sensible, and accepted approach. No need to look far: a character vector is just a collection of pointers to strings that are shared through the global CHARSXP cache, and AFAIK object.size() takes those strings into account.

H.
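The character vector comparison can be sketched as follows, assuming two vectors that are built independently but hold the same string, so that their single element refers to the same entry in the CHARSXP cache. The variable names are illustrative only:

```r
# Two independently created vectors holding the same 1000-character string.
# R caches strings globally, so both elements refer to one cached string.
x <- paste(rep("x", 1000), collapse = "")
y <- paste(rep("x", 1000), collapse = "")

# Each vector is reported with the string payload included, even though
# the payload exists only once in memory.
object.size(x)
object.size(y)

# For comparison, a vector holding a one-character string is much smaller.
object.size("x")
```

This is the same per-object accounting of shared memory that the proposal would extend to environments.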