
What does is() mean?

Roger Koenker wrote:
Well, it depends.  Sometimes (in dispatching methods, e.g.) you don't
want to ask too deeply about what the object REALLY is, but would just
like to (excuse the expression in the context) "get on with it", given a
basic assertion.

That's fundamentally what is() does:  it simply checks the class
inheritance structure for class(x).
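
To make the point concrete, here is a minimal sketch of is() following the inheritance structure and nothing more. The classes "Base" and "Derived" are hypothetical, made up for illustration only.

```r
library(methods)

# Hypothetical classes, for illustration only.
setClass("Base", representation(x = "numeric"))
setClass("Derived", representation = representation(label = "character"),
         contains = "Base")

d <- new("Derived", x = 1:3, label = "demo")

r_direct    <- is(d, "Derived")    # the object's own class
r_inherited <- is(d, "Base")       # a class it extends
r_unrelated <- is(d, "character")  # a class it does not extend

# is() consults only the class definition; it does not look at slot contents,
# so the answer is unchanged after a questionable slot assignment.
d@label <- character(0)
r_after <- is(d, "Derived")
```

Under these assumptions the first two tests and the last should be TRUE and the third FALSE; whether the slot contents make sense is a separate question.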

The function defined to poke more deeply into the issue is
validObject():

	validObject(x)

checks, as far as possible, whether x is a valid object of its class.
Some checks are built in (slots of the right classes, etc.).  Others, if
needed, can be incorporated in the class definition (the validity=
argument) or added via setValidity().  Validity checking uses
inheritance, so validity methods for contained classes will be applied
as well.
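
A small sketch of both routes, plus the inheritance of validity methods. The classes "Interval" and "NamedInterval" and their slots are hypothetical, chosen only for illustration.

```r
library(methods)

# Hypothetical classes, for illustration only.
# Route 1: validity supplied in the class definition.
setClass("Interval", representation(lo = "numeric", hi = "numeric"),
         validity = function(object) {
           if (length(object@lo) == 1 && length(object@hi) == 1 &&
               object@lo <= object@hi) TRUE
           else "Expected single values with lo <= hi"
         })

setClass("NamedInterval", representation = representation(name = "character"),
         contains = "Interval")

# Route 2: validity attached afterwards with setValidity().
setValidity("NamedInterval", function(object) {
  if (length(object@name) == 1) TRUE
  else "Expected a single string as the \"name\" slot"
})

ni <- new("NamedInterval", lo = 0, hi = 1, name = "unit")
validObject(ni)   # no error: both checks pass

# Break the condition owned by the contained class, then re-check.
ni@lo <- 2
res <- tryCatch(validObject(ni), error = function(e) conditionMessage(e))
```

Here res should hold an error message mentioning "lo <= hi", i.e. the check defined for "Interval" was applied to the "NamedInterval" object.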

Well, about the best there is, for a single attack.  See comment to your
point 3.

I believe some of the packages in BioConductor use initialize methods. 
Others??

Indeed.  The "obvious" strategy is:  if the class has a validity method,
direct or inherited, then initialize() should invoke it.  The default
initialize method does not (in either R or S-Plus).

Should that be changed?  Logically, I would say yes:  if the class
designer specified a validity method, it should not be possible to use
new() to create invalid objects.  But there is an efficiency penalty.

At the moment, you need to build your own initialize() method.  That's
not all bad--in the process you may also make the arguments to
initialize() reasonable names instead of ..., or otherwise get beyond
the notion of just supplying slot names in calls to new().

Here's a simple example (which I'll add to the initialize
documentation).  The validity method requires a single string for the
"id" slot.

setClass("a", representation(x = "numeric", id = "character"),
         validity = function(object)
           if(length(object@id) == 1) TRUE else
           "Expected a single string as the \"id\" slot")

and the initialize method calls validObject.

setMethod("initialize", "a", function(.Object, x = numeric(), id = "<>") {
  .Object@x <- x
  .Object@id <- id
  validObject(.Object)
  .Object
})

With that definition, you get a check with new():

R> new("a", x=1:10, id=character())
Error in validObject(.Object) : Invalid "a" object: Expected a single
string as the "id" slot

[A couple of details for those interested.  The default values in the
initialize() method above are important.  Otherwise, simple calls such
as new("a") will fail.

Also, R (but not currently S-Plus) has a function callNextMethod() that
looks good for writing initialize methods.  It often is, but there is a
current bug that requires you to supply all the arguments to
callNextMethod in this case, contrary to the documentation.  With luck,
the bug will be fixed.  Meanwhile, the way to use callNextMethod in
initialize() is like this:

R> setMethod("initialize", "a", function(.Object, ...) {
+ x <- callNextMethod(.Object, ...)
+ validObject(x)
+ x})

]

Well, as someone might have said, it depends what you mean by "new".

The above mechanism pretty much ensures that objects will be valid when
created.

But it doesn't prevent some code from doing:
  x@id <- character(0)

So, one might define methods for "@<-" that include validity checking. 
Sometimes, though (e.g., when checking that two slots have the same
length), it takes some care not to create invalid objects temporarily: 
the discipline of always creating the objects through a call to new()
will usually work, but once again with a slight efficiency penalty.
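
As a rough illustration of the gap, and of one way to narrow it without touching "@<-" itself, here is a sketch that routes changes through a checked replacement function. The generic "id<-" is hypothetical (it is not part of the methods package), and class "a" is repeated, without the initialize method, only so the sketch runs on its own.

```r
library(methods)

# Class "a" as defined earlier in this message.
setClass("a", representation(x = "numeric", id = "character"),
         validity = function(object) {
           if (length(object@id) == 1) TRUE
           else "Expected a single string as the \"id\" slot"
         })

obj <- new("a", x = 1:10, id = "first")

# Direct slot assignment does not run the validity method ...
obj@id <- character(0)
# ... so the problem only shows up when validObject() is called explicitly.
msg <- tryCatch(validObject(obj), error = function(e) conditionMessage(e))

# Hypothetical checked setter: assign, validate, then return the object.
setGeneric("id<-", function(object, value) standardGeneric("id<-"))
setReplaceMethod("id", "a", function(object, value) {
  object@id <- value
  validObject(object)
  object
})

ok <- new("a", x = 1:10, id = "first")
id(ok) <- "second"   # accepted
rejected <- tryCatch({ id(ok) <- character(0); FALSE },
                     error = function(e) TRUE)
```

The setter costs one validObject() call per assignment, which is the efficiency penalty mentioned above; code that still uses @ directly is, of course, not covered.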

The whole area of valid objects is one that all of us interested folks
should discuss.

It would be nice to have some more "real" examples.

John