Return function from function with minimal environment

Tue, Apr 4, 2006 8:10 AM

On 4/4/06, Henrik Bengtsson <hb at maths.lth.se> wrote:

On 4/4/06, Thomas Lumley <tlumley at u.washington.edu> wrote:

On Tue, 4 Apr 2006, Henrik Bengtsson wrote:

Hi,

this relates to the question "How to set a former environment?" asked
yesterday.  What is the best way to return a function with a
minimal environment from a function? Here is a dummy example:

foo <- function(huge) {
 scale <- mean(huge)
 function(x) { scale * x }
}

fcn <- foo(1:10e5)

The problem with this approach is that the environment of 'fcn' does
not only hold 'scale' but also the memory consuming object 'huge',
i.e.

env <- environment(fcn)
ll(envir=env)  # ll() from R.oo
#   member data.class dimension object.size
# 1   huge    numeric   1000000     4000028
# 2  scale    numeric         1          36

save(env, file="temp.RData")
file.info("temp.RData")$size
# [1] 2007624

I generate quite a few of these and my 'huge' objects are of order
100Mb, and I want to keep memory usage as well as file sizes to a
minimum.  What I do now is to remove the variable from the local
environment of 'foo' before returning, i.e.

foo2 <- function(huge) {
 scale <- mean(huge)
 rm(huge)
 function(x) { scale * x }
}

fcn <- foo2(1:10e5)
env <- environment(fcn)
ll(envir=env)
#   member data.class dimension object.size
# 1  scale    numeric         1          36

save(env, file="temp.RData")
file.info("temp.RData")$size
# [1] 156

Since my "foo" functions are complicated and contain many local
variables, it becomes tedious to identify and remove all of them, so
instead I try:

foo3 <- function(huge) {
 scale <- mean(huge);
 env <- new.env();
 assign("scale", scale, envir=env);
 bar <- function(x) { scale * x };
 environment(bar) <- env;
 bar;
}

fcn <- foo3(1:10e5)

But,

env <- environment(fcn)
save(env, file="temp.RData");
file.info("temp.RData")$size
# [1] 2007720

When I try to set the parent environment of 'env' to emptyenv(), it
does not work, e.g.

fcn(2)
# Error in fcn(2) : attempt to apply non-function

but with the new.env(parent=baseenv()) it works fine. The "base"
environment has the empty environment as a parent.  So, I try to do
the same myself, i.e. new.env(parent=new.env(parent=emptyenv())), but
once again I get

I don't think you want to remove baseenv() from the environment. If you
do, no functions from baseenv will be visible inside fcn. These include
"{" and "*", which are necessary for your function. I think the error
message comes from being unable to find "{".
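
A minimal sketch of this point, assuming a stock R session: "{" and "*" are ordinary functions that live in the base environment, so they can be found from baseenv() but not from emptyenv(). The variable names below are only for illustration.

```r
# "{" and "*" are functions, looked up by name like any other function.
brace_is_function <- is.function(`{`)
star_is_primitive <- is.primitive(`*`)

# They are visible from the base environment ...
brace_in_base <- exists("{", envir = baseenv())
star_in_base  <- exists("*", envir = baseenv())

# ... but not from the empty environment, which has no bindings and no parent.
brace_in_empty <- exists("{", envir = emptyenv())
star_in_empty  <- exists("*", envir = emptyenv())
```

A closure whose environment chain ends at emptyenv() without passing through baseenv() therefore cannot evaluate a body that uses either of them.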

Thank you, this makes sense. Modifying Roger Peng's example
illustrates what you say:

foo <- function(huge) {
       scale <- mean(huge)
       g <- function(x) x
       environment(g) <- emptyenv()
       g
}

fcn <- foo(1:10e5)
fcn(2)
# [1] 2

But as soon as you add "something" to g(), it can no longer be found:

foo <- function(huge) {
       scale <- mean(huge)
       g <- function(x) { x }
       environment(g) <- emptyenv()
       g
}

fcn <- foo(1:10e5)
fcn(2)
# Error in fcn(2) : attempt to apply non-function

...and I did not know that "{" and "(" are primitive functions.  Interesting.

I conclude that 'env <- new.env(parent=baseenv())' is better than
'env <- new.env()' in my case.
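
Putting that conclusion together with foo3 above might look like the following sketch. The name foo4 is hypothetical; the only change from foo3 is the parent= argument, so the returned function's environment holds 'scale' alone and its parent is the base environment rather than the calling frame of foo4.

```r
foo4 <- function(huge) {
  scale <- mean(huge)
  # Fresh environment whose parent is base, not the frame that holds 'huge'.
  env <- new.env(parent = baseenv())
  assign("scale", scale, envir = env)
  bar <- function(x) { scale * x }
  # Re-home the closure so it no longer references foo4's local frame.
  environment(bar) <- env
  bar
}

fcn <- foo4(1:10e5)
ls(environment(fcn))   # only "scale"; 'huge' is not reachable from here
fcn(2)                 # "{" and "*" are found via baseenv()
```

One consequence of this design is that only base functions are visible inside the returned function; anything from another package would have to be called with an explicit prefix such as stats::sd().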

Is there any reason to use

   env <- new.env(parent=baseenv())

instead of just

   env <- baseenv() ?

The extra environment being created seems to serve no purpose.
