How to test existence of an environment and how to remove it (from within functions)?
Dear Duncan,
Thanks a lot for your help.
I tried to adapt your example to my MWE, but the subsequent calls of
main() are 'too fast' now: each new call of main() should also 'reset'
the environment (as a different x is generated then). That is why I
tried to remove the environment .my_environ from within main():
## Auxiliary function with caching
aux <- local({
    .my_environ <- new.env(hash = FALSE, parent = emptyenv()) # define the environment
    function(x) {
        ## Setting up the environment and caching
        if(exists("cached.obj", envir = .my_environ)) { # look-up (in case the object already exists)
            x.cols <- get("cached.obj", .my_environ)
        } else { # time-consuming part (+ cache)
            x.cols <- split(x, col(x))
            Sys.sleep(1)
            assign("cached.obj", x.cols, envir = .my_environ)
        }
        ## Do something with the result from above (here: pick out two randomly
        ## chosen columns)
        x.cols[sample(1:1000, size = 2)]
    }
})
## Main function
main <- function() {
    x <- matrix(rnorm(100*1000), ncol = 1000)
    res <- replicate(5, aux(x))
    rm(.my_environ) # TODO: Trying to remove the environment
    res
}
## Testing
set.seed(271)
system.time(main()) # => ~ 1s since the cached object is found
system.time(main()) # => ~ 0s (instead of ~ 1s)
system.time(main()) # => ~ 0s (instead of ~ 1s)
Do you know a solution for this?
Background information:
This is indeed a problem from a package which draws many (sub)plots
within a single plot. Each single (sub)plot needs to access the data
for plotting but does not know about the other (sub)plots... I thought
this might be interesting in general for caching results.
Thanks & cheers,
Marius
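For the "reset on every call of main()" behaviour asked about above, one possible approach (a sketch, not necessarily what the package ended up using) is to let main() build a fresh aux() each time, so that every call of main() starts with an empty cache. The names make_aux, cache and n.computed are hypothetical and only serve the illustration; Sys.sleep(1) is left out so the example runs quickly.

```r
## Hypothetical factory: every call returns a new aux() with its own cache
make_aux <- function() {
    cache <- NULL      # lives in the environment of this call of make_aux()
    n.computed <- 0    # illustration only: counts the expensive computations
    function(x) {
        if (is.null(cache)) {            # expensive part, done once per aux()
            cache <<- split(x, col(x))
            n.computed <<- n.computed + 1
        }
        ## Pick out two randomly chosen columns
        cache[sample(seq_along(cache), size = 2)]
    }
}

main <- function() {
    x <- matrix(rnorm(100 * 1000), ncol = 1000)
    aux <- make_aux()  # fresh cache for this call of main()
    replicate(5, aux(x))
}

set.seed(271)
res <- main()  # the cache is filled once, then reused four times
```

Since the cache belongs to the closure created inside main(), nothing has to be removed explicitly: once main() returns, no reference to that aux() (or its cache) remains and it can be garbage-collected.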
On Mon, Aug 29, 2016 at 7:59 PM, Duncan Murdoch
<murdoch.duncan at gmail.com> wrote:
On 29/08/2016 1:36 PM, Marius Hofert wrote:
Hi, I have a function main() which calls another function aux() many times. aux() mostly does the same operations based on an object and thus I would like it to compute and store this object for each call from main() only once. Below are two versions of an MWE. The first one computes the right result (but is merely there for showing what I would like to have; well, apart from the environment .my_environ still floating around after main() is called). It works with an environment .my_environ in which the computed object is stored. The second MWE tries to set up the environment inside aux(), but neither the check of existence in aux() nor the removal of the whole environment in main() works (see 'TODO' below). How can this be achieved?
If you create aux in a local() call, it can have persistent storage,
because local() creates an environment to hold it. For example,
aux <- local({
    persistent <- NULL
    function(x) {
        if (!is.null(persistent))
            message("Previous arg was ", persistent)
        persistent <<- x
    }
})
Note that the assignment uses <<- to work in the local-created
environment rather than purely locally within the evaluation frame of
the call. You need to create the variable "persistent" there, or the
assignment would go to the global environment, which is bad.
This gives
> aux(1)
> aux(2)
Previous arg was 1
> aux(3)
Previous arg was 2

Duncan Murdoch
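As a small extension of the example above: the environment created by local() can be reached from outside through environment(aux), which gives one way to reset the persistent storage without touching aux() itself. This is only a sketch; reset_aux is a hypothetical helper name that does not appear in the thread.

```r
aux <- local({
    persistent <- NULL
    function(x) {
        if (!is.null(persistent))
            message("Previous arg was ", persistent)
        persistent <<- x
    }
})

## Hypothetical helper: clear the storage held in aux()'s enclosing environment
reset_aux <- function()
    assign("persistent", NULL, envir = environment(aux))

aux(1)
aux(2)       # message: Previous arg was 1
reset_aux()
aux(3)       # no message, since the storage was reset
```

A function such as main() could call reset_aux() before (or after) its loop over aux(), so that each run starts from empty storage.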
Cheers,
Marius
### Version 1: Setting up the environment in .GlobalEnv ########################
.my_environ <- new.env(hash = FALSE, parent = emptyenv()) # define the environment
## Auxiliary function with caching
aux <- function(x) {
    ## Setting up the environment and caching
    if(exists("cached.obj", envir = .my_environ)) { # look-up (in case the object already exists)
        x.cols <- get("cached.obj", .my_environ)
    } else { # time-consuming part (+ cache)
        x.cols <- split(x, col(x))
        Sys.sleep(1)
        assign("cached.obj", x.cols, envir = .my_environ)
    }
    ## Do something with the result from above (here: pick out two randomly
    ## chosen columns)
    x.cols[sample(1:1000, size = 2)]
}
## Main function
main <- function() {
    x <- matrix(rnorm(100*1000), ncol = 1000)
    res <- replicate(5, aux(x))
    rm(cached.obj, envir = .my_environ) # only removing the *object* (but not the environment)
    res
}
## Testing
set.seed(271)
system.time(main()) # => ~ 1s since the cached object is found
### Version 2: Trying to set up the environment inside aux() ###################
## Auxiliary function with caching
aux <- function(x) {
    ## Setting up the environment and caching
    if(!exists(".my_environ", mode = "environmnent")) # TODO: How to check the existence of the environment? This is always TRUE...
        .my_environ <- new.env(hash = FALSE, parent = emptyenv()) # define the environment
    if(exists("cached.obj", envir = .my_environ)) { # look-up (in case the object already exists)
        x.cols <- get("cached.obj", .my_environ)
    } else { # time-consuming part (+ cache)
        x.cols <- split(x, col(x))
        Sys.sleep(1)
        assign("cached.obj", x.cols, envir = .my_environ)
    }
    ## Do something with the result from above (here: pick out two randomly
    ## chosen columns)
    x.cols[sample(1:1000, size = 2)]
}
## Main function
main <- function() {
    x <- matrix(rnorm(100*1000), ncol = 1000)
    res <- replicate(5, aux(x))
    rm(.my_environ) # TODO: How to properly remove the environment?
    res
}
## Testing
set.seed(271)
system.time(main()) # => ~ 5s since (the cached object in) environment .my_environ is not found
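Regarding the two TODOs in Version 2: a variable created inside aux() lives only in the frame of that one call, so it is gone by the next call, and rm(.my_environ) inside main() only looks in main()'s own frame (rm() uses the calling environment and inherits = FALSE by default). A minimal sketch of testing for and removing an environment stored in a known place could look as follows. The names has_env, drop_env and holder are hypothetical, and the mode is spelled "environment" here (Version 2 has "environmnent").

```r
## Hypothetical helpers; 'where' is the environment in which the cache
## environment is expected to be stored under the name 'name'
has_env <- function(name, where)
    exists(name, envir = where, mode = "environment", inherits = FALSE)

drop_env <- function(name, where) {
    if (has_env(name, where))
        rm(list = name, envir = where)  # remove by name from 'where'
    invisible(NULL)
}

holder <- new.env()  # stands in for wherever the cache environment is kept

before <- has_env(".my_environ", holder)
assign(".my_environ", new.env(hash = FALSE, parent = emptyenv()),
       envir = holder)
during <- has_env(".my_environ", holder)
drop_env(".my_environ", holder)
after <- has_env(".my_environ", holder)

c(before, during, after)
```

Passing the location explicitly (envir = where together with inherits = FALSE) avoids depending on which frame the check or the removal happens to be called from.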