
S4 accessors

8 messages · Seth Falcon, John Chambers, Henrik Bengtsson +1 more

#
Ross Boylan <ross at biostat.ucsf.edu> writes:
No, I accidentally responded privately and I believe I already resent
my reply to the list.  Sorry about that.  I've cc'd the list for this response.
You are not, but someone else might be: suppose you release your code
and I would like to extend it.  I am stuck until you decide to make
generics.
You have to create the generic first if it doesn't already exist:

   setGeneric("foo", function(x) standardGeneric("foo"))
Which might be a further argument not to have the distinction in the
first place ;-)
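
A minimal sketch of the generic-first requirement, assuming a hypothetical class "A" with one numeric slot; the name "noSuchGeneric" is made up only to show the failure when neither a generic nor an ordinary function of that name exists:

```r
library(methods)

# Hypothetical class with a single numeric slot
setClass("A", representation(value = "numeric"))

# With no generic (and no function) of that name, setMethod() signals an error
res <- try(setMethod("noSuchGeneric", "A", function(x) x@value), silent = TRUE)
failed <- inherits(res, "try-error")

# Create the generic first, then attach the method
setGeneric("foo", function(x) standardGeneric("foo"))
setMethod("foo", "A", function(x) x@value)

a <- new("A", value = 42)
foo(a)
```

Once the generic exists, anyone can add methods for their own classes without touching the original code.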

To me, simple accessors are best documented with the class.  If I have
an instance, I will read help on it and find out what I can do with
it.
Yes you need an alias for the _generic_ function.  You can either add
the alias to the class man page where one of its methods is documented
or you can have separate man pages for the generics.  This is
painful.  S4 documentation, in general, is rather difficult and IMO
this is in part a consequence of the more general (read more powerful)
generic function based system.

IOW, I think these are good questions.  They are ones that I struggle
with and do not know of any truly satisfying answers.

Best,

+ seth
#
On Tue, 2006-09-26 at 10:43 -0700, Seth Falcon wrote:
This may be easier to do concretely.
I have an S4 class A.
I have defined a function foo that only operates on that class.
You make a class B that extends A.
You wish to give foo a different implementation for B.

Does anything prevent you from doing 
setMethod("foo", "B", function(x) blah blah)
(which is the same thing I do when I make a subclass)?
This turns my original foo into the catchall method.

Of course, foo is not appropriate for random objects, but that was true
even when it was a regular function.
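
The scenario above can be sketched as follows. The slot name and the function bodies are hypothetical, and the sketch calls setGeneric("foo") explicitly so that the promotion of the existing function to the default method is visible:

```r
library(methods)

setClass("A", representation(value = "numeric"))
setClass("B", contains = "A")

# An ordinary function written with class A in mind
foo <- function(x) paste("A-style result:", x@value)

# Promote the existing function to a generic; its body becomes the default method
setGeneric("foo")

# A different implementation for the subclass
setMethod("foo", "B", function(x) paste("B-style result:", x@value * 2))

resA <- foo(new("A", value = 1))  # falls through to the original foo
resB <- foo(new("B", value = 1))  # uses the B-specific method
```

The default method is still reached for any object that has no more specific method, which is the "catchall" behaviour described above.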
I wonder if it might be worth changing setMethod so that it does this by
default when no existing function exists. Personally, that would fit the
style I'm using better.
As my message indicates, I too am struggling with an appropriate
documentation style for S4 classes and methods.  Since "Writing R
Extensions" has said "Structure of and special markup for documenting S4
classes and methods are still under development." for as long as I can

remember, perhaps I'm not the only one.

Some of the problem may reflect the tension between conventional OO and
functional languages, since R remains the latter even under S4.  I'm not
sure if it's the tools or my approach that is making things awkward; it
could be both!

Ross
#
John Chambers <jmc at r-project.org> writes:
If foo is a generic and the only method defined is for class Bar, then
the statement seems meaningful enough?
How?  A given accessor function has the purpose of returning the
expected data "contained" in an instance.  It provides an abstract
interface that decouples the structure of the class from the data it
needs to provide to users.

The anomaly is, IMO, a much larger challenge with generic function
based systems.  When the same name for a generic is used in different
packages, you end up with a masking problem.  This scenario is
unavoidable in general, but particularly likely for accessors.  As S4
becomes more prevalent, I suspect that '<pkg>::foo' is going to become
a required idiom for interactive use (other options are available for
package code).
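
A small illustration of the '<pkg>::foo' idiom, under the assumption that a global definition of sd stands in for a second package that reuses the name (no second package is actually loaded here):

```r
# Stand-in for another package's definition: this global `sd` masks stats::sd
sd <- function(x) "masked version"

plain     <- sd(c(1, 2, 3))         # finds the masking definition first
qualified <- stats::sd(c(1, 2, 3))  # the namespace-qualified call is unambiguous

rm(sd)  # remove the stand-in so stats::sd is visible again
```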

+ seth
#
On 9/27/06, John Chambers <jmc at r-project.org> wrote:
In the Object class system of the R.oo package I have for years worked
successfully with what I call virtual fields.  I find them really
useful and convenient to work with.

These work as follows: if there is a get<Field>(object) function,
it is called whenever object$<field> is evaluated.  If there is no such
function, the internal field '<field>' is accessed (from the environment
where all fields live).  Similarly, object$<field> <- value checks
for set<Field>(object, value), which is called if available. [I work
with environments/references so my set functions don't really have to
be replacement functions, but there is nothing preventing them from
being such.]

There are several advantages to doing it this way.  You can protect
fields behind a set function, for instance to prevent assignment of
negative values:

  circle$radius <- -5
  Error: Negative radius: -5

You can also provide redundant fields in your API, e.g.

  circle$radius <- 5
  print(circle$diameter)
  circle$area <- 4
  print(circle$radius)

and so on. How the circle is represented internally does not matter
and may change over time. With such a design you don't have to worry
as a software developer; the API is stable.  I think this schema
carries over perfectly to S4 and '@'.
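
One way this scheme might look in S4 is sketched below. This is not R.oo code: the class, the accessor functions, and the helper accessorName are all hypothetical, and the sketch dispatches on '$' rather than '@' because S4 methods can be set on '$'. S4 objects also have value semantics, so the set functions return a modified copy instead of updating a reference.

```r
library(methods)

# Hypothetical class; the dot prefix marks the slot as "private" by convention
setClass("Circle", representation(.radius = "numeric"))

# Hypothetical accessors following the get<Field>/set<Field> naming scheme
getRadius   <- function(object) object@.radius
getDiameter <- function(object) 2 * object@.radius
getArea     <- function(object) pi * object@.radius^2
setRadius <- function(object, value) {
  if (value < 0) stop("Negative radius: ", value)
  object@.radius <- value
  object
}
setArea <- function(object, value) setRadius(object, sqrt(value / pi))

# Build e.g. "getRadius" from "get" and "radius"
accessorName <- function(prefix, field) {
  paste0(prefix, toupper(substring(field, 1, 1)), substring(field, 2))
}

# object$field: call get<Field>() if it exists, else read the slot directly
setMethod("$", "Circle", function(x, name) {
  getter <- accessorName("get", name)
  if (exists(getter, mode = "function")) get(getter, mode = "function")(x)
  else slot(x, name)
})

# object$field <- value: call set<Field>() if it exists, else assign the slot
setReplaceMethod("$", "Circle", function(x, name, value) {
  setter <- accessorName("set", name)
  if (exists(setter, mode = "function")) {
    return(get(setter, mode = "function")(x, value))
  }
  slot(x, name) <- value
  x
})

circle <- new("Circle", .radius = 1)
circle$radius <- 5
circle$diameter   # 10
circle$area <- 4
circle$radius     # sqrt(4 / pi)
```

Client code only ever sees radius, diameter and area; which of them is actually stored can change without affecting it.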

FYI: I used the above naming convention because I started doing this
long before the '_' operator was redefined.

Comment: If you don't want the user to access a slot/field directly, I
recommend naming the slot with a period prefix, e.g. '.radius'.  This
at least gives the user a chance to understand your design, although
it does not prevent them from misusing it.  The period prefix is also
"standard" for "private" objects, cf. ls(all.names=FALSE/TRUE).

/Henrik
#
I'm trying to understand what the underlying issues are here--with the
immediate goal of how that affects my design and documentation
decisions.
On Wed, Sep 27, 2006 at 02:08:34PM -0400, John Chambers wrote:
The sense of "meaningful" here is hard for me to pin down, even with
the subsequent discussion.

I think the import is more than formal: R is not strongly typed, so
you can hand any argument to any function and the language will not
complain.
It's true that clashing uses of the same name may lead to confusion,
but that need not imply that functions must be applicable to all
objects.  Many functions only make sense in particular contexts, and
sometimes those contexts are quite narrow.

One of the usual motivations for an OO approach is precisely to limit
the amount of global space taken up by, for example, functions that
operate on the class (global in both the syntactic sense and the
inside-your-brain sense).  Understanding a traditional OO system, at
least for me, is fundamentally oriented to understanding the objects
first, with the operations on them as auxiliaries.  As you point out,
this is just different from the orientation of a functional language,
which starts with the functions.
I don't see why get_flag differs from flag; if "flag" lends itself to
multiple interpretations or meanings, wouldn't "get_flag" have the
same problem?

Or are you referring to the fact that "flag" sounds as if it's a verb
or action?  That's a significant ambiguity, but there's nothing about
it that is specific to a functional approach.
If this is a claim that every function should make sense for every
object, it's asking too much.  If it's not, I don't really see how a
function can avoid having a purpose.  The purpose of accessor
functions is to get or set the state of the object.
Aside from the fact that I don't see why get_flag is so different from
flag, the syntactic sugar argument has another problem.  The usually
conceived purpose of accessors is to hide from the client the
internals of the object.  To take an example that's pretty close to
one of my classes, I want startTime, endTime, and duration.
Internally, the object only needs to hold 2 of these quantities to get
the 3rd, but I don't want the client code to be aware of which choice
I made.  In particular, I don't want the client code to change from
duration to get_duration if I switch to a representation that stores
the duration as a slot.
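
A sketch of that example, with hypothetical class and slot names (SpanA, SpanB, start, end) standing in for the real ones. Both representations answer the same three accessors, so the client function describe() does not need to know which two quantities are stored:

```r
library(methods)

setGeneric("startTime", function(x) standardGeneric("startTime"))
setGeneric("endTime",   function(x) standardGeneric("endTime"))
setGeneric("duration",  function(x) standardGeneric("duration"))

# Representation 1: store start and end, derive the duration
setClass("SpanA", representation(start = "numeric", end = "numeric"))
setMethod("startTime", "SpanA", function(x) x@start)
setMethod("endTime",   "SpanA", function(x) x@end)
setMethod("duration",  "SpanA", function(x) x@end - x@start)

# Representation 2: store start and duration, derive the end
setClass("SpanB", representation(start = "numeric", duration = "numeric"))
setMethod("startTime", "SpanB", function(x) x@start)
setMethod("endTime",   "SpanB", function(x) x@start + x@duration)
setMethod("duration",  "SpanB", function(x) x@duration)

# Client code: identical for both representations
describe <- function(obj) {
  c(start = startTime(obj), end = endTime(obj), duration = duration(obj))
}

fromA <- describe(new("SpanA", start = 2, end = 5))
fromB <- describe(new("SpanB", start = 2, duration = 3))
```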

Ross