-----Original Message-----
From: r-devel-bounces@stat.math.ethz.ch
[mailto:r-devel-bounces@stat.math.ethz.ch] On Behalf Of Luke Tierney
Sent: 2 June 2003 17:10
To: John Chambers
Cc: r-devel@stat.math.ethz.ch; Laurent Gautier
Subject: Re: [Rd] 'methods' and environments.
On Mon, 2 Jun 2003, John Chambers wrote, quoting Laurent Gautier:
Hi,
I have quite some trouble with the package methods.
Environments in R are a convenient way to emulate pointers
(and to avoid copies of large objects, or of large collections
of objects). So far so good, but the package methods is
becoming more and more problematic to work with. Up to version
R-1.7.0, slots that were environments were still references to
an environment, but I discovered in a recent R-patched that
this is no longer the case: environments as slots are now
copied (increasing the memory consumption more than threefold
in my case).
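To make the reference semantics concrete, here is a minimal
sketch (the names e, f, and big are purely illustrative):

e <- new.env()
e$big <- matrix(0, 1000, 1000)  # ~8 MB, stored once inside the environment
f <- e                          # no copy: f is just another reference
f$big[1, 1] <- 42               # modify through one reference ...
e$big[1, 1]                     # ... and the other sees it: 42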
The (excessive) duplication is now enforced, since environments
are copied too, as the simple example below demonstrates:
m <- matrix(0, 600^2, 50)
## RSS of the R process is about 150 MB
rm(m)
gc()
          used (Mb) gc trigger  (Mb)
Ncells 364813  9.8     667722  17.9
Vcells  85605  0.7   14858185 113.4
## RSS is now about 15 MB
library(methods)
setClass("A", representation(a="matrix"))
a <- new("A", a=matrix(0, 600^2, 50))
## The RSS will peak at about 705 MB !!!!!!
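One way to watch the duplication directly is tracemem(), which
prints a message every time the marked object is copied
(this assumes an R build with memory profiling enabled, as the
CRAN binaries have; the class "B" and names below are purely
illustrative):

library(methods)
setClass("B", representation(a="matrix"))
m2 <- matrix(0, 100, 100)
tracemem(m2)           # mark m2 so each duplication is reported
b <- new("B", a=m2)    # any tracemem output here is a copy made
                       # on the way into the slot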
Are there any plans to make "methods" usable with large
datasets?
The memory growth seems real, but its connection to
"environments as slots" is unclear.
The only recent change that sounds relevant is the modification
to ensure that methods are evaluated in an environment that
respects the lexical scope of the method's definition. That
does create a new environment for each call to a generic
function, but it has nothing to do with slots being
environments. That change went in (just) prior to 1.7.0.
It's possible there is some sort of "memory leak" or extra copying
there, but I'm not familiar enough with the details of that code to
say for sure.
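To illustrate what "respects the lexical scope of the method's
definition" means, here is a small sketch (the generic describe
and the variable hidden are made up for the example):

library(methods)
setGeneric("describe", function(x) standardGeneric("describe"))
local({
  hidden <- "found via the definition environment"
  setMethod("describe", "numeric", function(x) hidden)
})
describe(1)  # returns hidden's value: each call evaluates the
             # body in a fresh environment whose parent is the
             # method's definition environment, so lexically
             # visible variables are found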
Notice that the following workaround has no bad effects on
memory (suggesting that the extra environment created in
evaluating new() may in fact be relevant):
R> setClass("A", representation(a="matrix"))
[1] "A"
R> aa <- matrix(600^2, 50)
R> a1 <- new("A")
R> a1@a <- aa
R> gc()
          used (Mb) gc trigger (Mb)
Ncells 370247  9.9     531268 14.2
Vcells  87522  0.7     786432  6.0
You have managed to store Laurent's 140 MB matrix in less than
1 MB! :-) (Note that matrix(600^2, 50) builds a 50 x 1 matrix
filled with the value 360000, not the intended 600^2-by-50
matrix of zeros.) If you use matrix(0, 600^2, 50) you get
essentially the same memory pattern as Laurent did.
The general solution for dealing with large objects is likely
to involve some extensions to R to allow "reference" objects,
i.e. objects for which the programmer is responsible for any
copying. Environments themselves are not quite adequate for
this, since different "references" to the same environment
cannot have different attributes. Wrapping them in lists is the
easiest way to deal with this.
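A minimal sketch of that list-wrapping idiom (all names
illustrative): the two wrappers share one environment, so state
is shared, while each wrapper, being an ordinary list, can
carry its own attributes:

shared <- new.env()
ref1 <- list(env=shared)
ref2 <- list(env=shared)        # both wrappers point at the same environment
attr(ref1, "role") <- "writer"  # attributes sit on the lists (value semantics),
attr(ref2, "role") <- "reader"  # so the two references can differ
ref1$env$payload <- 1:5
ref2$env$payload                # prints 1:5; the underlying state is shared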