Henrik Bengtsson
2006-Apr-04 13:40 UTC
[Rd] Return function from function with minimal environment
Hi,
this relates to the question "How to set a former environment?" asked
yesterday. What is the best way to return a function with a
minimal environment from a function? Here is a dummy example:
foo <- function(huge) {
  scale <- mean(huge)
  function(x) { scale * x }
}
fcn <- foo(1:10e5)
The problem with this approach is that the environment of 'fcn' holds
not only 'scale' but also the memory-consuming object 'huge', i.e.
env <- environment(fcn)
ll(envir=env) # ll() from R.oo
# member data.class dimension object.size
# 1 huge numeric 1000000 4000028
# 2 scale numeric 1 36
save(env, file="temp.RData")
file.info("temp.RData")$size
# [1] 2007624
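For readers without R.oo installed, roughly the same inspection can be sketched with base R only. This is an illustration and not part of the original post; the names 'vars' and 'sizes' are made up here, and the reported sizes will differ between R versions and platforms.

```r
# Same closure as in the example above
foo <- function(huge) {
  scale <- mean(huge)
  function(x) { scale * x }
}

fcn <- foo(1:10e5)
env <- environment(fcn)

# Names bound in the closure's environment (ls() returns them sorted)
vars <- ls(envir = env)

# Approximate size in bytes of each object, as reported by object.size()
sizes <- sapply(vars, function(nm) {
  as.numeric(object.size(get(nm, envir = env)))
})

print(vars)
print(sizes)
```

Both 'huge' and 'scale' show up in 'vars', which is the point of the example: the returned function keeps the whole call frame of foo() alive.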
I generate quite a few of these and my 'huge' objects are of order
100 MB, and I want to keep memory usage as well as file sizes to a
minimum. What I do now is remove the variable from the local
environment of 'foo' before returning, i.e.
foo2 <- function(huge) {
  scale <- mean(huge)
  rm(huge)
  function(x) { scale * x }
}
fcn <- foo2(1:10e5)
env <- environment(fcn)
ll(envir=env)
# member data.class dimension object.size
# 1 scale numeric 1 36
save(env, file="temp.RData")
file.info("temp.RData")$size
# [1] 156
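When a function has many locals, one possible variant of the rm() approach is to name what should be kept and remove everything else. This is only a sketch: 'foo2b', 'tmp' and 'keep' are hypothetical names introduced for illustration, and ls() with its defaults does not list names that start with a dot.

```r
foo2b <- function(huge) {
  scale <- mean(huge)
  tmp <- huge * 2                  # stand-in for other large intermediates

  # Names to keep; everything else in this call frame is removed,
  # including 'keep' itself (the rm() argument is evaluated first)
  keep <- "scale"
  rm(list = setdiff(ls(), keep))

  function(x) { scale * x }
}

fcn <- foo2b(1:10e5)
remaining <- ls(envir = environment(fcn))
print(remaining)
```

The whitelist has to be maintained by hand, but it is usually shorter than the list of things to delete.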
Since my "foo" functions are complicated and contain many local
variables, it becomes tedious to identify and remove all of them, so
instead I try:
foo3 <- function(huge) {
  scale <- mean(huge)
  env <- new.env()
  assign("scale", scale, envir=env)
  bar <- function(x) { scale * x }
  environment(bar) <- env
  bar
}
fcn <- foo3(1:10e5)
But,
env <- environment(fcn)
save(env, file="temp.RData")
file.info("temp.RData")$size
# [1] 2007720
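A likely explanation for the large file, sketched here on the assumption that nothing else is involved: new.env() defaults to parent = parent.frame(), which inside foo3() is the call frame of foo3() itself. The new environment contains only 'scale', but its enclosure still holds 'huge', and save() serializes enclosing environments along with 'env'. The names 'own', 'encl' and 'in_parent' are introduced for illustration only.

```r
foo3 <- function(huge) {
  scale <- mean(huge)
  env <- new.env()                 # parent defaults to foo3's call frame
  assign("scale", scale, envir = env)
  bar <- function(x) { scale * x }
  environment(bar) <- env
  bar
}

fcn <- foo3(1:10e5)
env <- environment(fcn)

own <- ls(envir = env)             # what 'env' itself contains
encl <- parent.env(env)            # the enclosure of 'env'
in_parent <- ls(envir = encl)      # what the enclosure contains

print(own)
print(in_parent)
```

So 'huge' is no longer in the function's own environment, but it is still one step up the chain.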
When I try to set the parent environment of 'env' to emptyenv(), it
does not work, e.g.
fcn(2)
# Error in fcn(2) : attempt to apply non-function
but with the new.env(parent=baseenv()) it works fine. The "base"
environment has the empty environment as a parent. So, I try to do
the same myself, i.e. new.env(parent=new.env(parent=emptyenv())), but
once again I get
fcn(2)
# Error in fcn(2) : attempt to apply non-function
Apparently, I do not understand enough here. Please, enlighten me. In
the meantime I stick with foo2().
Best,
Henrik
Thomas Lumley
2006-Apr-04 14:29 UTC
[Rd] Return function from function with minimal environment
On Tue, 4 Apr 2006, Henrik Bengtsson wrote:
> When I try to set the parent environment of 'env' to emptyenv(), it
> does not work, e.g.
>
> fcn(2)
> # Error in fcn(2) : attempt to apply non-function
>
> but with the new.env(parent=baseenv()) it works fine. The "base"
> environment has the empty environment as a parent. So, I try to do
> the same myself, i.e. new.env(parent=new.env(parent=emptyenv())), but
> once again I get

I don't think you want to remove baseenv() from the environment. If
you do, no functions from baseenv will be visible inside fcn. These
include "{" and "*", which are necessary for your function. I think
the error message comes from being unable to find "{".

Also, there is no memory use from having baseenv in the environment,
since all the objects in baseenv are always present.

-thomas

Thomas Lumley Assoc. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle
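The advice about keeping baseenv() can be sketched as a variant of foo3() whose new environment has baseenv() as its parent. This is an illustration under that assumption, not code from the thread; 'foo4', 'result', 'visible' and 'parent_is_base' are hypothetical names.

```r
foo4 <- function(huge) {
  scale <- mean(huge)

  # Small environment holding only what the returned function needs;
  # baseenv() as parent keeps "{" and "*" visible
  env <- new.env(parent = baseenv())
  assign("scale", scale, envir = env)

  bar <- function(x) { scale * x }
  environment(bar) <- env
  bar
}

fcn <- foo4(1:10e5)
result <- fcn(2)
visible <- ls(envir = environment(fcn))
parent_is_base <- identical(parent.env(environment(fcn)), baseenv())

print(result)
print(visible)
```

One trade-off to keep in mind: with this enclosure, functions from packages other than base (for example stats) are not found by name inside the returned function, so they would need to be called as pkg::fun or copied into 'env'.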
Roger D. Peng
2006-Apr-04 14:38 UTC
[Rd] Return function from function with minimal environment
In R 2.3.0-to-be, I think you can do
foo <- function(huge) {
  scale <- mean(huge)
  g <- function(x) { scale * x }
  environment(g) <- emptyenv()
  g
}
-roger
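A quick way to try this suggestion is to call the returned function inside tryCatch(). The sketch below only wraps the code above; 'msg' is a hypothetical name. Given Thomas Lumley's remark that "{" and "*" live in baseenv(), the call should be expected to fail in current R versions, because neither those functions nor 'scale' are reachable from emptyenv().

```r
foo <- function(huge) {
  scale <- mean(huge)
  g <- function(x) { scale * x }
  environment(g) <- emptyenv()     # nothing at all is visible from here
  g
}

g <- foo(1:10)

# 'msg' is the error text if the call fails, or a number if it succeeds
msg <- tryCatch(g(2), error = function(e) conditionMessage(e))
print(msg)
```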
--
Roger D. Peng | http://www.biostat.jhsph.edu/~rpeng/