Plans to improve reference classes?

7 messages · Michael Lawrence, Gábor Csárdi, Hadley Wickham +2 more

#
(Moved to R-devel)

Niek,

Would you please provide the details on this test case, including your
benchmarks, and what you are trying to achieve at the high-level?

Thanks,
Michael
On Wed, Jun 17, 2015 at 4:55 AM, Niek Bouman <niek.bouman at keygene.com> wrote:
#
On Mon, Jun 22, 2015 at 9:49 AM, Michael Lawrence
<lawrence.michael at gene.com> wrote:
[...]
You can also see http://rpubs.com/wch/17459
Linked from https://github.com/wch/R6/ and also from CRAN, actually:
http://cran.rstudio.com/web/packages/R6/README.html
[...]
R6 is backed by RStudio, for example their Shiny framework uses R6
classes, so I would not be too worried about this.

Gabor

[...]
#
Apart from speed, the most important advantage of R6 over ref classes
is that it's easy to subclass a class defined in package A in package
B. This is currently difficult with ref classes because of the way
they do scoping. (And I think it's difficult to fix without fundamentally
changing how ref classes work.)

Hadley

On Mon, Jun 22, 2015 at 8:49 AM, Michael Lawrence
<lawrence.michael at gene.com> wrote:

  
    
#
A couple of requests:

1) Is there any example or writeup on the difficulties of extending
reference classes across packages? Just so I can fully understand the
issues.

2) In what sorts of situations does the performance of reference
classes cause problems? Sure, it's an order of magnitude slower than
constructing a simple environment, but those timings are in
microseconds, so one would need a thousand objects before it started
to be noticeable. Some motivating use cases would help.
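One way to put numbers on the construction cost mentioned above is a rough timing loop like the sketch below. The class name `PointSketch` and the object count are made up for illustration, and the results will vary by machine and R version, so no timings are quoted here.

```r
library(methods)

# Hypothetical class used only for this timing sketch.
PointSketch <- setRefClass("PointSketch", fields = list(x = "numeric"))

n_objects <- 1000

# Time the construction of n_objects reference class instances.
t_rc <- system.time(
  for (i in seq_len(n_objects)) PointSketch$new(x = 1)
)

# Time the construction of the same number of plain environments.
t_env <- system.time(
  for (i in seq_len(n_objects)) {
    e <- new.env()
    e$x <- 1
  }
)

# Elapsed seconds for each approach.
elapsed <- c(refclass = t_rc[["elapsed"]], environment = t_env[["elapsed"]])
print(elapsed)
```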

Thanks,
Michael
On Mon, Jun 22, 2015 at 7:06 AM, Hadley Wickham <h.wickham at gmail.com> wrote:
#
Here's a simple example:

library(scales)
library(methods)

MyRange <- setRefClass("MyRange", contains = "DiscreteRange")
a_range <- MyRange()
a_range$train(1:10)
# Error in a_range$train(1:10) : could not find function "train_discrete"

where train_discrete() is a non-exported function of the scales
package called by the train() method of DiscreteRange.

There are also some notes about portable vs. non-portable R6 classes
at http://cran.r-project.org/web/packages/R6/vignettes/Portable.html

It's a bit of a pathological case, but the switch from RefClasses to
R6 made a noticeable performance improvement in shiny. It's hard to
quantify the impact on an app, but the impact on the underlying
reactive implementation was quite profound: http://rpubs.com/wch/27260
vs  http://rpubs.com/wch/27264

R6 also includes a vignette with detailed benchmarking:
http://cran.r-project.org/web/packages/R6/vignettes/Performance.html

I've added Winston to the thread since he's the expert.

Hadley
#
I understand Hadley's point; it's a consequence of the modification of the environment of the ref. class methods.
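The effect can be imitated with plain environments. This is only a simplified model, not the actual ref class implementation: `pkg_a`, `pkg_b`, `helper_internal` and `train` are hypothetical stand-ins for the defining package, the subclassing package, a non-exported helper, and a method.

```r
# Stand-in for package A's namespace, holding a non-exported helper.
pkg_a <- new.env()
pkg_a$helper_internal <- function(x) x * 2

# A method written in package A; it finds the helper through its environment.
train <- function(x) helper_internal(x)
environment(train) <- pkg_a
ok <- train(1)

# Stand-in for package B, which cannot see package A's internals.
pkg_b <- new.env(parent = baseenv())

# Model of an object created from a subclass defined in package B:
# the method's environment is replaced, so lookup no longer passes
# through package A.
obj <- new.env(parent = pkg_b)
train_copy <- train
environment(train_copy) <- obj

# The call now fails because helper_internal cannot be found.
result <- tryCatch(train_copy(1), error = function(e) e)
failed <- inherits(result, "error")
print(failed)
```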

Good point, but it seems we can make that an option (there are advantages to it in code quality and ease of writing, when it works).

Let's discuss possibilities off-list until things are a bit clearer.

John
On Jun 23, 2015, at 8:06 AM, Hadley Wickham <h.wickham at gmail.com> wrote:

            
#
I can provide a little background on why particular choices were made for
R6. Generally speaking, speed is a primary consideration in making
decisions about the design of R6. The basic structure of R6 classes is
actually not so different from reference classes: an R6 object is an
environment. But many aspects of R6 objects are simpler.

R6 does support clean cross-package inheritance. The key design feature
that allows this is that methods have one environment that they are bound
in (this is where they can be found), and another environment that they are
enclosed in (roughly, this is where they run). The enclosing environment
points back to the binding environment with a binding named `self`. Methods
must access other members with `self$`, as in `self$foo`. I've found that
this requirement results in clearer code, because it's always clear when
you're accessing something that's part of the object.
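The two-environment layout can be sketched with base R alone. This is an illustration of the idea and not R6's actual code; `pkg_ns`, `double_it` and `bump` are hypothetical names.

```r
# Hypothetical stand-in for the namespace of the package defining the class.
pkg_ns <- new.env()
pkg_ns$double_it <- function(x) x * 2  # plays the role of a non-exported helper

# Binding environment: the object itself, where members are found.
obj <- new.env()
obj$value <- 1

# Enclosing environment: where methods run. Its parent is the defining
# package's namespace, and `self` points back to the binding environment.
encl <- new.env(parent = pkg_ns)
encl$self <- obj

# The method reaches members through `self$` and finds the package helper
# through the parent of its enclosing environment.
bump <- function() {
  self$value <- double_it(self$value)
  invisible(self)
}
environment(bump) <- encl
obj$bump <- bump

obj$bump()
print(obj$value)
```

Because the method's enclosure still leads to the defining package, the helper is found no matter which package creates or subclasses the object.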

When a class inherits from another class, the enclosing environment also
contains a binding named `super`, which points to an environment containing
methods from the superclass. These methods also have their own enclosing
environment, with a `self` that points back to the object's binding
environment.
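A minimal base R sketch of the `super` arrangement follows. It imitates the structure only and is not taken from R6; all names are hypothetical.

```r
# Binding environment: the object.
obj <- new.env()
obj$name <- "obj"

# Superclass method, with its own enclosing environment whose `self`
# points back to the same object.
super_encl <- new.env()
super_encl$self <- obj
greet_parent <- function() paste("hello from", self$name)
environment(greet_parent) <- super_encl

# Environment holding the superclass methods.
super_env <- new.env()
super_env$greet <- greet_parent

# Subclass method: its enclosing environment holds both `self` and `super`.
encl <- new.env()
encl$self <- obj
encl$super <- super_env
greet_child <- function() paste(super$greet(), "(via child)")
environment(greet_child) <- encl
obj$greet <- greet_child

greeting <- obj$greet()
print(greeting)
```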

I know this might be hard to picture from the description; I have some
diagrams drawn up which might help. See pages 1 and 4 from this document:
  https://github.com/wch/R6/blob/master/doc_extra/R6.pdf
(The other pages show other features, like private members, and
non-portable R6 objects, which don't support clean cross-package
inheritance, and have a different structure.)


Regarding performance, R6 is fast relative to ref classes because it
doesn't do type checking for fields, and doesn't make use of S4. (There may
be other reasons as well, but I don't know the internals of ref classes
well enough to say much about it.) Accessing a member of an R6 object is
literally just accessing a binding in an environment, and that's a very
fast operation in R.
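A rough way to see the difference in member access is to time repeated field updates on a ref class object and on a plain environment. The class name `CounterSketch` and the iteration count are made up for illustration, and results depend on machine and R version, so none are quoted.

```r
library(methods)

# Hypothetical ref class with one typed field.
CounterSketch <- setRefClass("CounterSketch", fields = list(n = "numeric"))
rc <- CounterSketch$new(n = 0)

# Plain environment with an equivalent binding.
env <- new.env()
env$n <- 0

iters <- 10000

# Repeated read-and-assign on the ref class field.
t_rc <- system.time(for (i in seq_len(iters)) rc$n <- rc$n + 1)

# The same operation on an environment binding.
t_env <- system.time(for (i in seq_len(iters)) env$n <- env$n + 1)

access_elapsed <- c(refclass = t_rc[["elapsed"]],
                    environment = t_env[["elapsed"]])
print(access_elapsed)
```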

-Winston



On Tue, Jun 23, 2015 at 10:06 AM, Hadley Wickham <h.wickham at gmail.com>
wrote: