On 17/11/2014, 4:23 PM, Murat Tasan wrote:
Yeah, my biggest stumbling-point while starting to write S3 classes was the some-default-methods-preserve class, and some-default-methods-don't-preserve class dichotomy. But I'm not sure it's so "easy" to figure this out without more documentation... (though my experience is n = 1, and I might be particularly slow).
What I meant is that you can just try it. If you think your users will want to subset your object, then you can try it yourself, and you'll see that you need to write a `[` method.

Duncan Murdoch
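As a rough illustration of the "just try it" advice, here is a minimal sketch. The class name MyClass is only a placeholder, and the `[` method follows the common save-the-class, call NextMethod(), restore-the-class pattern; it is one way to do this, not necessarily what either poster had in mind.

```r
## Placeholder class used only for illustration.
MyClass <- function(x) structure(x, class = "MyClass")

obj <- MyClass(c(3, 1, 2))

## Try it: with no `[` method defined, default subsetting drops the class.
before <- class(obj[1:2])

## A `[` method that subsets via the default and then restores the class.
`[.MyClass` <- function(x, ...) {
  cl <- oldClass(x)          # remember the class attribute
  out <- NextMethod("[")     # default subsetting (drops the class)
  class(out) <- cl           # put the class back
  out
}

after <- class(obj[1:2])
print(before)
print(after)
```

Comparing `before` and `after` shows whether the class survived subsetting in each case.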
The most common motivating example for S3 classes (that I've seen) is overriding plot(). I imagine many people would want to take a base structure (e.g. a simple vector) and 'class-ify' it solely for the purposes of encapsulating domain-specific plotting commands:

    MyClass <- function(x) structure(x, class = "MyClass")
    plot.MyClass <- function(x, ...) {
      ## large complicated plotting function here
    }

Those examples, however, basically never mention the need to then override/implement many other common methods: `c`, `[`, `unique`, `as.list`, `as.data.frame`, etc. I believe this is a _huge_ tripping point for newcomers to R programming (even if they are not newcomers to programming more generally).

In my own experience, I had to work backwards by finding methods that dropped my class, then examine the source for those methods, find the underlying calls in those methods that dropped the class, and continue on down the (rabbit hole) call stack... this is hardly ideal for any programmer, I think, experienced or novice.

In the end, I completely understand your point (e.g. with the sorted numbers example), and I don't know how to resolve the issue, save perhaps for more explicit warnings when introducing S3 programming?

My own solution, by the way, is to define a single ancestor class whose methods either (i) error immediately if some assumptions fail, or (ii) dispatch to the default method and then restore the class attributes of the return object. Most of my 'useful' classes inherit from this 'dummy' ancestor class, just to save a lot of re-writing dispatch code.

An example of where I error out immediately is something like `c`, where I'll check to make sure all args are of the same class type... if they aren't, I could use R's coercion rules, but I've opted for the 'type-safe' approach of not mixing variables when dealing with my own custom classes. An example of where I opt for preserving class is `[`.
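The ancestor-class approach described above might look roughly like the following. This is only a sketch under assumptions: the names "Preserving" and "Temperature" are invented for illustration, and the original post does not show the actual implementation.

```r
## Hypothetical ancestor class "Preserving": methods either fail fast
## or call the default and then restore the class attribute.

## `[`: dispatch to the default, then restore the class.
`[.Preserving` <- function(x, ...) {
  cl <- oldClass(x)
  out <- NextMethod("[")
  class(out) <- cl
  out
}

## c(): error immediately unless every argument has the same class.
c.Preserving <- function(...) {
  args <- list(...)
  cl <- oldClass(args[[1L]])
  same <- vapply(args, function(a) identical(oldClass(a), cl), logical(1))
  if (!all(same)) {
    stop("all arguments to c() must have class ", paste(cl, collapse = "/"))
  }
  out <- do.call(c, lapply(args, unclass))  # combine the bare vectors
  class(out) <- cl
  out
}

## A hypothetical 'useful' class that inherits from the ancestor.
Temperature <- function(x) {
  structure(x, class = c("Temperature", "Preserving"))
}

temps <- Temperature(c(20, 25))
subset_class <- class(temps[1])             # class kept by `[`
combined <- c(temps, temps)                 # same class: allowed
mixed <- try(c(temps, 1), silent = TRUE)    # mixed classes: error
```

Because c() dispatches on its first argument, this check only fires when the first argument inherits from the ancestor class; c(1, temps) would still go through the default method.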
If I write a class where subsetting doesn't make sense, I'll have to write a fail-fast implementation of `[` for that specific class. The whole thing seems... inelegant (for lack of a better word), which is what prompted my post in the first place.

Cheers, and thanks for the discussion and points... they're definitely helpful in guiding development.

-murat

On Mon, Nov 17, 2014 at 9:19 AM, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
On 17/11/2014 10:41 AM, Hadley Wickham wrote:
Generally the idea is that the class should be stripped because R has no way of knowing if the new object, for example unique(obj), still has the necessary properties to be considered to be of the same class as obj. Only the author of the class knows that. S4 would help a bit here, but only structurally (it could detect when the object couldn't possibly be of the right class), not semantically.
There are two possible ways that S3 methods could handle subclasses:

* preserve by default (would also have to preserve all attributes)
* drop by default

If you could rely on either system consistently, I think you could write correct code. It's very hard when the defaults vary.

(In other words, I agree with everything you said, except I think if the default was to preserve you could still write correct code.)
I don't see how default preserving could work. For example, I might define a "SortedNumbers" class, which is a vector of numbers in non-decreasing order. I could define min() and max() methods for it which would be really fast, because they only need to look at the first or last elements. But a rev() method wouldn't make sense, so I wouldn't define one of those. If the rev() default method left the class as "SortedNumbers", then my min() and max() calculations would end up broken.

So maybe I should have defined a rev() method that just stops with an error. But classes don't own methods, so I'd have no way of knowing that someone else defined a new generic (e.g. shuffle()) that broke things. I don't see any way around this within the S3 system.

In fact, some default methods do preserve the class, for example the replacement method `[<-`. I could take a SortedNumbers vector of the numbers 1:10, and set element 1 to 11, and end up breaking min() and max(). This is a problem with the current design.

Probably we should do a better job of documenting which methods preserve the class and which ones don't. (For example, `[` doesn't preserve the class, even though it would be fine to do so in this example.) But there are a lot of things to do, and this is one thing that is pretty easy to figure out without documentation, so I'd say it's a low priority.

Duncan Murdoch
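The SortedNumbers scenario above can be sketched as follows. The constructor and method bodies are assumptions made for illustration (the post only describes them in words), and the min()/max() methods look only at their first argument to keep the sketch short.

```r
## Illustrative constructor: sorts once, then trusts the invariant.
SortedNumbers <- function(x) structure(sort(x), class = "SortedNumbers")

## Fast min/max that rely on the vector being in non-decreasing order.
## Simplification: only the first argument is considered.
min.SortedNumbers <- function(..., na.rm = FALSE) {
  x <- unclass(..1)
  x[1L]
}
max.SortedNumbers <- function(..., na.rm = FALSE) {
  x <- unclass(..1)
  x[length(x)]
}

s <- SortedNumbers(1:10)
max_before <- max(s)

## The default `[<-` keeps the class, so the invariant silently breaks.
s[1] <- 11L
class_after_assign <- class(s)
max_after <- max(s)              # fast method, now wrong
true_max <- max(unclass(s))      # default method on the bare vector

## The default `[` drops the class instead.
class_after_subset <- class(s[2:3])
```

After the assignment the object still claims to be a "SortedNumbers", so `max_after` and `true_max` disagree, which is the breakage described above.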