Problems initializing an extended S4 class

2 messages · Martin Morgan, Jim Regetz

Original

Sat, Mar 1, 2008 5:49 AM #

Hi Jim --

I think your problems have to do with the way 'initialize' and
'validObject' work together. At some point in it's code, validObject
converts an instance 'b' of a derived class (e.g., B) into that of the
base class (e.g., A). It does this, by calling a = new("A"), and then
copying the relevant slots from the instance of b to new a. So...

...the initialize method for A gets invoked twice, once when B is
being created originally, and once when B is being checked for
validity and is being converted to A.

...and this is because of how A is being constructed from B -- with a
called to new("A"), with no additional arguments.

Practically, this means that your classes have to be valid when
created with new("A"). This imposes constraints on the prototype,and
on the initialize method, and in the end I've found myself writing
'constructors' that process user-friendly arguments into a format that
can be passed to 'new' as slots, e.g.,

A <- function(x=numeric(0), ...) {
    x <- ifelse(log(x)<0, 1, x)
    new("A", x=x, ...)
}

In some ways I think this is the right thing to do anyway (a layer of
abstraction between the user and implementation), even if the
motivation might not be so pure.

Martin


Jim Regetz <regetz at nceas.ucsb.edu> writes:

Hi all,

I am having trouble extending S4 classes in cases where I'm using both 
validity and initialize methods. I've read as much S4 information as I 
can find, but I've yet to have that "a-ha" moment.

In my application, I am using validity methods to guard against 
inappropriate input data that my code has no way of handling, and 
initialize methods to detect and deal with fixable problems. As a toy 
example, consider classes A and B and associated methods as defined 
below. I use a validity method for A to complain about negative values, 
and an initialize method to "correct" small input values. B should 
simply extend A by adding an extra slot. The example is contrived, but 
it illustrates a key behavior that I also encounter in my real code:

setClass("A", representation(x="numeric"))
setClass("B", representation(y="character"), contains="A")

setValidity("A", function(object) {
     message("start validate A")
     retval <- NULL
     if (any(object at x<0))
         retval <- c(retval, "x must be positive")
     message("done validate A")
     if(is.null(retval)) return(TRUE) else return(retval)
})

setMethod("initialize", "A", function(.Object, ...) {
     message("start init A")
     .Object <- callNextMethod()
     x <- .Object at x
     .Object at x <- ifelse(log(x)<0, 1, x)
     message("done init A")
     .Object
})

setMethod("initialize", "B", function(.Object, ...) {
     message("start init B")
     callNextMethod()
})


# Creating an instance of A works just as I would expect

a <- new("A", x=c(0.5, 2))

start init A
start validate A
done validate A
done init A

An object of class ?A?
Slot "x":
[1] 1 2

# But subsequently creating a derived B object fails!

new("B", a, y="foo")

start init B
start init A
start validate A
start init A
Error in checkSlotAssignment(object, name, value) :
   assignment of an object of class "logical" is not valid for slot "x"
in an object of class "A"; is(value, "numeric") is not TRUE

The two things I haven't quite figured out are:

1. Why is initialize invoked *twice* for A during instantiation of B?

2. The second time initialize is invoked for A, it appears .Object at x is 
only bound to its implicit prototype value of numeric(0). Why? Of 
course, this leads to an error because the ifelse expression 
subsequently evaluates to logical(0) rather than a numeric vector as 
intended. Again, this is a contrived example, but a very real problem in 
my code.

I suppose I could define a prototype for A that I know won't break my 
initialize method, but that seems inelegant and hard to maintain. Is 
there a better way to code this so that I can reliably instantiate B 
using a valid A object? Hopefully I've just got something wrong in the 
formals for my initialize methods or in my use of getNextMethod(), but 
I've had no luck trying some alternatives -- and ultimately I'd prefer 
to better understand the underlying behavior rather than stumble onto 
something that merely appears to work.

I'd be grateful for any suggestions...

Thanks,
Jim

------------------------------
James Regetz, Ph.D.
Scientific Programmer/Analyst
National Center for Ecological Analysis & Synthesis
735 State St, Suite 300
Santa Barbara, CA 93101

______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793

1 day later

Jim Regetz

Mon, Mar 3, 2008 2:51 AM #

Hi Martin,

Many thanks for the reply, that really helps a lot. Some follow-up 
observations below.

Martin Morgan wrote:

With your comments as context, I stepped more carefully through relevant 
source code in the methods package. As you indicated, my trouble arises 
in the 'validObject' function. In particular, it's when 'as' is called 
to coerce the nascent B object into its superclass A. [For reference, 
it's in line 456 of the SClasses.R in the R-2.6.2 source tarball.]

This led me to try defining a simple B -> A coercion method:
setAs("B", "A", function(from, to) new(to, x=from at x))

Indeed, after doing this, I can successfully create an instance of B.

It still seems peculiar that the initialize and validity methods are 
both called twice for A when B is instantiated -- particularly given 
that the 'a' object itself was already initialized and checked for 
validity. There may well be a good reason for all this, but I still 
wonder if I'm doing something inefficiently.

In any case, although I haven't tested it, I'm optimistic that defining 
'coerce' methods in this fashion will also solve the problem in my real 
code without my needing to redesign the initialize and validity methods.

Agreed, and I actually do have such user-level constructors. However, 
precisely in the interests of abstraction, I'd prefer not to expose the 
innards of my initialize methods in these functions. I admit this is 
partly on purity grounds, but more practical issues are (1) I may have 
several variants of the user-level constructors and don't want to repeat 
the common initialization code in each one, and (2) the operations I 
have in my initialize methods are only sensible for valid objects (and 
thus not meaningfully executed before 'new' is called). Hopefully this 
is all in the true spirit of the S4 object system...

Thanks again for the clear and helpful reply!

Cheers,
Jim

Martin


Jim Regetz <regetz at nceas.ucsb.edu> writes:

Hi all,

I am having trouble extending S4 classes in cases where I'm using both 
validity and initialize methods. I've read as much S4 information as I 
can find, but I've yet to have that "a-ha" moment.

In my application, I am using validity methods to guard against 
inappropriate input data that my code has no way of handling, and 
initialize methods to detect and deal with fixable problems. As a toy 
example, consider classes A and B and associated methods as defined 
below. I use a validity method for A to complain about negative values, 
and an initialize method to "correct" small input values. B should 
simply extend A by adding an extra slot. The example is contrived, but 
it illustrates a key behavior that I also encounter in my real code:

setClass("A", representation(x="numeric"))
setClass("B", representation(y="character"), contains="A")

setValidity("A", function(object) {
     message("start validate A")
     retval <- NULL
     if (any(object at x<0))
         retval <- c(retval, "x must be positive")
     message("done validate A")
     if(is.null(retval)) return(TRUE) else return(retval)
})

setMethod("initialize", "A", function(.Object, ...) {
     message("start init A")
     .Object <- callNextMethod()
     x <- .Object at x
     .Object at x <- ifelse(log(x)<0, 1, x)
     message("done init A")
     .Object
})

setMethod("initialize", "B", function(.Object, ...) {
     message("start init B")
     callNextMethod()
})


# Creating an instance of A works just as I would expect

a <- new("A", x=c(0.5, 2))

start init A
start validate A
done validate A
done init A

An object of class ?A?
Slot "x":
[1] 1 2

# But subsequently creating a derived B object fails!

new("B", a, y="foo")

start init B
start init A
start validate A
start init A
Error in checkSlotAssignment(object, name, value) :
   assignment of an object of class "logical" is not valid for slot "x"
in an object of class "A"; is(value, "numeric") is not TRUE

The two things I haven't quite figured out are:

1. Why is initialize invoked *twice* for A during instantiation of B?

2. The second time initialize is invoked for A, it appears .Object at x is 
only bound to its implicit prototype value of numeric(0). Why? Of 
course, this leads to an error because the ifelse expression 
subsequently evaluates to logical(0) rather than a numeric vector as 
intended. Again, this is a contrived example, but a very real problem in 
my code.

I suppose I could define a prototype for A that I know won't break my 
initialize method, but that seems inelegant and hard to maintain. Is 
there a better way to code this so that I can reliably instantiate B 
using a valid A object? Hopefully I've just got something wrong in the 
formals for my initialize methods or in my use of getNextMethod(), but 
I've had no luck trying some alternatives -- and ultimately I'd prefer 
to better understand the underlying behavior rather than stumble onto 
something that merely appears to work.

I'd be grateful for any suggestions...

Thanks,
Jim

------------------------------
James Regetz, Ph.D.
Scientific Programmer/Analyst
National Center for Ecological Analysis & Synthesis
735 State St, Suite 300
Santa Barbara, CA 93101

______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel