Size of a refClass instance
Yes, I agree. How does one conceptually achieve polymorphic behavior without instantiating 100,000s of instances? Perhaps one way around this is to represent the data in an efficient R way -- i.e. a data.frame -- and create a set of re-usable singleton instances of different node types. To perform some polymorphic operation on a node, a singleton gets assigned to a node in the tree. But behavior such as node$parent() or node$child(1) will require a small pool of these singletons. Doable, I think.

PS. FWIW, I found another strike against the "massive tree of refClass instances": save(). save() appears to unnecessarily expand/duplicate refClass structures. Write time becomes prohibitive, and loading the data structure again results in far greater memory usage.
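The singleton idea above could be sketched roughly as follows. This is only an illustration under assumptions: the `nodes` table, the class names `BaseNode`, `LeafNode` and `BranchNode`, and the helpers `describe_node()` and `score_node()` are all hypothetical, not code from this thread. It also departs slightly from the description above: the row index is passed to each method instead of binding a singleton to a node, so no pool of singletons is needed.

```r
library(methods)

# All node data lives in one data.frame; 'type' selects the behaviour.
# (Toy data, invented for this sketch.)
nodes <- data.frame(
  parent = c(NA, 1L, 1L, 2L, 2L),
  type   = c("branch", "branch", "leaf", "leaf", "leaf"),
  value  = c(0, 10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# Hypothetical handler classes. Methods take the row index 'i', so each
# handler is stateless and a single instance per type can be shared.
BaseNode <- setRefClass("BaseNode",
  methods = list(
    describe = function(i) paste0("node ", i),
    score    = function(i) nodes$value[i]
  )
)

# Leaves override describe() and inherit score().
LeafNode <- setRefClass("LeafNode", contains = "BaseNode",
  methods = list(
    describe = function(i) paste0("leaf ", i, " (value ", nodes$value[i], ")")
  )
)

# Branches override score() to sum their direct children.
BranchNode <- setRefClass("BranchNode", contains = "BaseNode",
  methods = list(
    score = function(i) sum(nodes$value[which(nodes$parent == i)])
  )
)

# One instance per node type, however many rows 'nodes' has.
handlers <- list(leaf = LeafNode$new(), branch = BranchNode$new())

# Polymorphic dispatch: look up the handler for the node's type.
describe_node <- function(i) handlers[[nodes$type[i]]]$describe(i)
score_node    <- function(i) handlers[[nodes$type[i]]]$score(i)

describe_node(3)  # "leaf 3 (value 20)"
score_node(2)     # 70, the sum of the values of children 4 and 5
```

Navigation such as parent or child lookup would then return row indices rather than objects, which is one possible way to avoid holding several live instances at once.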
On May 3, 2013, at 9:47 AM, Jeff Newmiller wrote:
Interesting conclusion. Alternatively, that representation of your object model may not be computationally effective. This discrepancy may be less exaggerated in C++, but you may still find that large numbers of objects are less efficient in their use of memory or cpu time than vector processing even there. I would read the point of Martin's response as "Don't confuse your mental model of the solution with its implementation".
David Kulp <dkulp at fiksu.com> wrote:
Good tip. Thanks Morgan. I agree that a different structure might (necessarily) be in order. I wanted to create a tree where nodes in a tree were of different derived sub-classes -- possibly holding more data and behaving polymorphically. OO programming seemed ideal for this: lots of small things with specialized behavior -- but this isn't R's strength.

On May 2, 2013, at 4:57 PM, Martin Morgan wrote:
On 05/01/2013 11:20 AM, David Kulp wrote:
I'm using refClass for a complex multi-directional tree structure with possibly 100,000s of nodes. The refClass design is very impressive and I'd love to use it, but I've found that refClass instances are very large and creation time is slow. For example, below is a RefClass and a normal S4 class. The RefClass requires about 4KB per instance vs 500B for the S4 class -- based on adding the Ncells and Vcells of used memory reported by gc(). And instantiation is more than twice as slow for a RefClass. (R 2.14.2) Anyone have thoughts on this and whether there's any hope for improving resources on either front?
Hi David -- not necessarily helpful, but creating a few large objects is always better than creating many small ones in R, so perhaps re-conceptualize your data structure? As a rough analogy, instead of constructing a graph as a large number of 'Node' instances each pointing to one another, a graph could be represented as a data.frame containing columns of 'from' and 'to' indexes (neighbour-edge list, a few large objects) or as an adjacency matrix. One would also implement creation and update of the few large objects in an R-friendly (vectorized) way.
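As a rough illustration of the 'from'/'to' representation described above, the following sketch stores a small tree as an edge list and answers structural queries with vectorized operations. The data and the helper names `children()` and `parent_of()` are made up for the example.

```r
# Edge list: one row per parent -> child link (toy data, not from the thread).
edges <- data.frame(
  from = c(1L, 1L, 2L, 2L, 3L),
  to   = c(2L, 3L, 4L, 5L, 6L)
)

# Children of a single node.
children <- function(node) edges$to[edges$from == node]

# Parents of many nodes at once via match(); NA marks a node with no parent.
parent_of <- function(nodes) edges$from[match(nodes, edges$to)]

# Number of children for every node id, computed in one pass.
n_nodes <- max(edges$from, edges$to)
child_count <- tabulate(edges$from, nbins = n_nodes)

# The same structure as an adjacency matrix, filled by matrix indexing.
adj <- matrix(0L, n_nodes, n_nodes)
adj[cbind(edges$from, edges$to)] <- 1L

children(2L)               # 4 5
parent_of(c(1L, 4L, 6L))   # NA 2 3
child_count                # 2 2 1 0 0 0
```

Here the whole tree is two integer vectors, so adding or querying thousands of nodes touches a few large objects instead of thousands of small ones.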
Perhaps there are existing packages that already model the data you're interested in? If your multi-directional tree can be represented as a graph, then perhaps the graph package (http://bioconductor.org/packages/release/bioc/html/graph.html), including facilities in the Boost graph library (RBGL, on the Bioconductor web site, too), or the igraph package can be put to use.
Martin
I wonder what others are doing. I've been thinking about lightweight alternative implementations, but nothing particularly elegant has come to mind, yet!

Thanks!
# Reference class version
simple <- setRefClass('simple', fields = list(a = "character", b = "numeric"))
gc()
system.time(simple.list <- lapply(1:100000, function(i) {
  simple$new(a = 'foo', b = i)
}))
gc()

# Plain S4 version
setClass('simple2', representation(a = "character", b = "numeric"))
setMethod("initialize", "simple2", function(.Object, a, b) {
  .Object@a <- a
  .Object@b <- b
  .Object
})
gc()
system.time(simple2.list <- lapply(1:100000, function(i) {
  new('simple2', a = 'foo', b = i)
}))
gc()
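The per-instance figures quoted above come from comparing gc() output before and after creating the objects. A minimal sketch of that bookkeeping might look like the following. The helpers `used_mb()` and `bytes_per_instance()` and the class names `SimpleRef` and `SimpleS4` are hypothetical, and the numbers will vary with R version and platform, so no expected values are shown.

```r
library(methods)

# Megabytes currently in use: column 2 of gc() holds the "(Mb)" figure
# for used Ncells and Vcells.
used_mb <- function() sum(gc()[, 2])

# Rough bytes per object created by 'make', averaged over n objects.
# gc() reports to 0.1 Mb, so n should be large enough to swamp rounding.
bytes_per_instance <- function(make, n = 5000L) {
  before <- used_mb()
  objs <- lapply(seq_len(n), make)
  after <- used_mb()
  # 'objs' is still referenced here, so it was alive when 'after' was taken.
  (after - before) * 1024^2 / length(objs)
}

# Class names chosen for this sketch; fields mirror the example above.
Simple <- setRefClass("SimpleRef", fields = list(a = "character", b = "numeric"))
setClass("SimpleS4", representation(a = "character", b = "numeric"))

ref_bytes <- bytes_per_instance(function(i) Simple$new(a = "foo", b = i))
s4_bytes  <- bytes_per_instance(function(i) new("SimpleS4", a = "foo", b = i))

cat("refClass:", round(ref_bytes), "bytes/instance\n")
cat("S4:      ", round(s4_bytes), "bytes/instance\n")
```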
______________________________________________ R-help at r-project.org
mailing
list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the
posting
guide http://www.R-project.org/posting-guide.html and provide
commented,
minimal, self-contained, reproducible code.
-- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793