Large file size while persisting rpart model to disk

One correction below, and a suggested alternative approach.
On 2/4/2009 9:31 AM, Terry Therneau wrote:
This description is not right: what mfun holds on to is not its caller's 
environment but the environment where mfun was created.  So it applies to 
nested functions (as you said), but the caller is irrelevant.
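
A small sketch of that distinction, using made-up names (make_fun, caller, big, other) that are not from the original code: the function returned by make_fun carries the variables of make_fun, and nothing from whichever function later calls it.

```r
# Hypothetical illustration; none of these names come from the thread.
make_fun <- function() {
  big <- numeric(1e5)            # local to the function that creates the closure
  function() length(big)
}

caller <- function(f) {
  other <- "local to the caller" # never becomes part of f's environment
  ls(environment(f))             # lists variables of the creating function
}

mfun <- make_fun()
vars <- caller(mfun)
print(vars)
```

Here vars should contain "big" and not "other".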
I'm not sure what you mean by "chain" here, but the real issue is that 
all the variables in the function that creates mfun will be kept as long 
as mfun exists.
So here printfun captures all the local variables in pspline, even if it 
doesn't need them.
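
To see how this affects what gets written to disk, here is a rough sketch that compares serialized sizes. The builders keep_all and drop_x are hypothetical stand-ins, not the pspline code, and serialize(obj, NULL) is used only as a proxy for what save() would write before compression.

```r
# Hypothetical builders: both return a function that only needs y.
keep_all <- function(n) {
  x <- numeric(n)
  y <- numeric(n)
  function() sum(y)              # x stays in the captured environment
}

drop_x <- function(n) {
  x <- numeric(n)
  y <- numeric(n)
  f <- function() sum(y)
  rm(x)                          # x is removed before the closure is returned
  f
}

# Length in bytes of the serialized closure, including its environment
size_all  <- length(serialize(keep_all(1e5), NULL))
size_drop <- length(serialize(drop_x(1e5), NULL))
print(c(size_all = size_all, size_drop = size_drop))
```

The difference between the two sizes should be roughly the 800,000 bytes taken by the unused vector x.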
Another approach is simply to rm() the variables that aren't needed 
before returning a function.  For example, this function has locals x 
and y, but only needs y for the returned function to work:

 > fnbuilder <- function(n) {
+    x <- numeric(n)
+    y <- numeric(n)
+    noneedforx <- function() sum(y)
+    rm(x)
+    return(noneedforx)
+ }
 > f <- fnbuilder(10000)
 > f()
[1] 0

To see what actually got carried along with f, use ls():

 > ls(environment(f))
[1] "n"          "noneedforx" "y"

So we've picked up the arg n, and our local copy of noneedforx, but we 
did manage to get rid of x.  (The local copy costs almost nothing:  R 
will just have another reference to the same object as f refers to.  The 
arg could have been rm'd too, if it was big enough to matter.)
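
As a sketch of that last point, a variant (called fnbuilder2 here purely for illustration) that removes the argument as well:

```r
# Hypothetical variant of fnbuilder that also drops the argument n.
fnbuilder2 <- function(n) {
  x <- numeric(n)
  y <- numeric(n)
  noneedforx <- function() sum(y)
  rm(x, n)                       # remove the unused local and the argument
  return(noneedforx)
}

f2 <- fnbuilder2(10000)
kept <- ls(environment(f2))      # what is still carried along with f2
print(kept)
```

Only noneedforx and y should remain in the environment, and f2() still works because y is kept.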

Duncan Murdoch