inheritance in S4

16 messages · Christophe Genolini, Martin Morgan

#
Hi Christophe -- 

I don't know whether there's a particularly elegant way. This works

setClass("A", representation(x="numeric"))
setClass("B", representation(y="numeric"))
setClass("C", contains=c("A", "B"))

setMethod("show", "A", function(object) cat("A\n"))
setMethod("show", "B", function(object) cat("B\n"))
setMethod("show", "C", function(object) {
    callGeneric(as(object, "A"))
    callGeneric(as(object, "B"))
    cat("C\n")
})
> new("C")
A
B
C

but obviously involves the developer in making explicit decisions
about method dispatch when there is multiple inheritance.

Martin

2 days later
#
Thanks Martin

Well, it works, except that 'as' seems not to like the 'initialize' method: 
the following code (the same as yours, with an initialize method added 
for A, B and C) does not run. It seems that as(c,"A") does not work 
if we define an initialize method for A...

--- 8< --------------
setClass("A", representation(x="numeric"))
setMethod("initialize","A",function(.Object,value){.Object@x <- value;return(.Object)})
a <- new("A",4)

setClass("B", representation(y="numeric"))
setMethod("initialize","B",function(.Object,value){.Object@y <- value;return(.Object)})
b <- new("B",5)

setClass("C", contains=c("A", "B"))
setMethod("initialize","C",function(.Object,valueA, valueB){
    .Object@x <- valueA
    .Object@y <- valueB
    return(.Object)
})
c <- new("C",valueA=10,valueB=12)

setMethod("show", "A", function(object) cat("A\n"))
setMethod("show", "B", function(object) cat("B\n"))
setMethod("show", "C", function(object) {
    callGeneric(as(object, "A"))
    callGeneric(as(object, "B"))
    cat("C\n")
})
c
--- 8< --------------------

Is there something wrong with the use of 'as' between a class and its 
parent class?

Christophe
#
Hi Christophe --

This is a variant of the problem that Jim Regetz is having in a thread
in R-devel. Here's where the trouble is
> as(c, "A")
Error in .local(.Object, ...) : 
  argument "value" is missing, with no default

By default, 'as(c, "A")' will create a new instance of its second argument
using new("A"), and then fill the slots of A with appropriate values
from "C". We can see that creating a new "A" without any additional
arguments causes the same error:
> new("A")
Error in .local(.Object, ...) : 
  argument "value" is missing, with no default

Jim has gone down the path of creating coercion methods ('setAs') for
his classes. A different solution is to ensure that 'new' works with
no additional arguments (typically requiring that a prototype, if
present, produces valid objects). So for instance

setMethod("initialize","A",function(.Object, value=numeric(0)){
    .Object@x <- value
    return(.Object)
})

and then
A

I find it easier to keep track of prototype and initialize methods,
rather than setAs, so I use a solution like the above.  But a couple
of other quick points. I would have written

setMethod("initialize", "A",
          function(.Object, ..., xValue=numeric(0)){
              callNextMethod(.Object, ..., x=xValue)
          })

Why? This allows the built-in object creation methods to create
.Object, so there's less code for me to maintain (even if it's just
the slot assignment .Object@x <- value here). Importantly, when I
create a derived class, the derived class does not have to know in
detail about what the initialize method for "A" does, e.g.,

setMethod("initialize","B",
          function(.Object, ..., yValue=numeric(0)){
              callNextMethod(.Object, ..., y=yValue)
          })

Here 'initialize' for B just deals with its own slots, and doesn't have
to worry about what to do with A's slots. Also .Object@x <- value
makes a copy of .Object, which can be expensive if .Object is
large. There is some hope that the default method (eventually reached
by callNextMethod) does things relatively efficiently in terms of
copies. Note that each initialize method only deals with its own
slots. And finally, the position of 'xValue' and 'yValue' after '...'
means that the argument has to be named, e.g., new("B", yValue=12). This
seems a little awkward at first, but seems like a best practice when creating
objects with complicated inheritance -- not quite so much need to
follow the method dispatch / argument assignment rules through a
complicated inheritance hierarchy.
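
A minimal sketch of the pattern just described, for anyone who wants to try it. It assumes that "B" is declared to contain "A" (the discussion treats B as a derived class, but the earlier code defines the two classes separately), so the class definitions here are illustrative only.

```r
library(methods)

# Assumption: "B" extends "A"; the thread's earlier code defines them separately.
setClass("A", representation(x = "numeric"))
setClass("B", contains = "A", representation(y = "numeric"))

# Each initialize method handles only its own slot and forwards everything
# else through '...' to the next method.
setMethod("initialize", "A",
          function(.Object, ..., xValue = numeric(0)) {
              callNextMethod(.Object, ..., x = xValue)
          })
setMethod("initialize", "B",
          function(.Object, ..., yValue = numeric(0)) {
              callNextMethod(.Object, ..., y = yValue)
          })

b0 <- new("B")                          # no arguments: still works
b1 <- new("B", xValue = 1, yValue = 2)  # arguments after '...' must be named
```

Because both methods have defaults for their own arguments, new("A") and new("B") succeed without arguments, which is what 'as' relies on.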

And finally, in Jim's thread I mention using a constructor. So in
practice for a case like the above I would not define any initialize
methods, and instead write

B <- function(xValue=numeric(0), yValue=numeric(0)) {
    new("B", x=xValue, y=yValue)
}

All my slot coercion is in the constructor. The user can figure out
from the signature of the constructor what the appropriate arguments
and their types are, and does not have to know about the details of
the class definition. I can catch common errors and provide
user-friendly messages, rather than getting cryptic messages from the
internals of S4.
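
A rough sketch of such a constructor with one user-friendly check. It again assumes that "B" contains "A"; the particular check and its message are only an illustration, not a recommendation of what to validate.

```r
library(methods)

# Assumption: "B" extends "A"; no initialize methods are defined at all.
setClass("A", representation(x = "numeric"))
setClass("B", contains = "A", representation(y = "numeric"))

# Constructor: checks and coerces its arguments, then lets the default
# initialize method fill the slots by name.
B <- function(xValue = numeric(0), yValue = numeric(0)) {
    if (!is.numeric(xValue) || !is.numeric(yValue))
        stop("'xValue' and 'yValue' must be numeric vectors")
    new("B", x = as.numeric(xValue), y = as.numeric(yValue))
}

b <- B(xValue = 1:3, yValue = 12)   # integer input is coerced to double
msg <- tryCatch(B(xValue = "a"), error = function(e) conditionMessage(e))
```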

Hope that helps.

Martin

#
Hi Martin, thanks for your answer.
I am not that familiar with S3... With our way of writing this 
method, 'initialize' for 'A' will call the next method for A, which is 
'initialize' for 'numeric', is that right?
I agree with you. But I do not like the use of ..., since it lets us make 
many mistakes, as in:

print(3.5165,digitts=2)

There is a typo in digitts, but since print accepts ..., R does not 
detect the mistake.
So I agree about the importance of naming arguments; I always do it, but 
without 'forcing' it.
Interesting. Will you still define a validity method, or not even that?


Christophe
----------------------------------------------------------------
This message was sent by IMP, courtesy of Université Paris 10 Nanterre
#
cgenolin at u-paris10.fr writes:
from below,
The 'next' method refers to arguments in the signature of the generic,
i.e., the 'next' method relevant to .Object. For "A", the next method
is the default "initialize" method, which will use all named arguments
to fill the slots of .Object (and perhaps call validObject, among
other things). You can see the code with
S3 is not relevant here.
If this were "initialize", and you had provided an invalid named
argument, the default method would have failed because there is no
slot of that name.
For a simple case like this there is no extra validity to check beyond
that ensured by S4 (e.g., slots of the correct type). If there were
constraints, e.g., only positive values, or length one vectors, then
I would define a validity function.
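
A small sketch of both points, using a hypothetical class name ("PositiveA") and a hypothetical constraint: a misspelled slot name is rejected by the default initialize method, and a value that breaks the constraint is rejected by the validity function.

```r
library(methods)

# Hypothetical class: x, if non-empty, must contain only positive values.
setClass("PositiveA", representation(x = "numeric"),
         validity = function(object) {
             if (any(object@x <= 0)) {
                 "x must contain only positive values"
             } else {
                 TRUE
             }
         })

ok <- new("PositiveA", x = c(1, 2))

# 'xx' is not a slot, so the default initialize method signals an error.
typo <- tryCatch(new("PositiveA", xx = 1),
                 error = function(e) conditionMessage(e))

# -1 violates the constraint, so validObject signals an error.
bad <- tryCatch(new("PositiveA", x = -1),
                error = function(e) conditionMessage(e))
```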

Martin

  
    
1 day later
#
Well well well...

To summarize: let us assume that A is a class (slot x) and C is a class 
containing A (A plus slot y).
- as(c,"A") calls new("A"). So new("A") HAS 
TO work; you cannot decide to forbid empty objects (unless you define 
setAs("C","A")?)
- In addition, any test that you would like to put in initialize or 
validity has to first check whether some fields are empty (because 
'if(object@x >0)' will fail if x=numeric(0))
- When you call new("C"), neither new("A") nor initialize("A") is 
called. (!!!!)

So, all the security tests you write in initialize "A", you have to 
rewrite them in "C"?

I start to understand why not that many people use S4...

Christophe
#
cgenolin at u-paris10.fr writes:
You're partly misunderstanding...
new("A") has to work (return a valid object), but you can still create
arbitrary objects with a prototype (provided the prototype is
consistent with your validity function) and / or with initialize
methods (which do not have any named arguments, other than .Object,
without default values). Creating valid objects seems like a good
idea!
The validity and initialize methods have to be written in a way
consistent with valid objects -- if you want your object to contain
zero-length vectors, then the validity test has to accommodate that
(e.g., if (length(object@x)==0 || all(object@x>0)) {}). If your
initialize method is supposed to return a valid object, then you'd
better construct one! An initialize method on "A" that expected an
argument xValue whose default was 0 might be written as

setMethod("initialize", "A", function(.Object, ..., xValue=0) {
    callNextMethod(.Object, ..., x=xValue)
})

For this simple case you could create the object with a suitable
prototype and not have an initialize method at all

setClass("A", representation(x="numeric"),
    prototype=prototype(x=0))
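
A quick way to check this, as a sketch: the derived class "C" with a slot y is assumed here for illustration, following the summary earlier in the thread.

```r
library(methods)

# "A" gets its default value from the prototype; no initialize method.
setClass("A", representation(x = "numeric"),
         prototype = prototype(x = 0))

# Assumed derived class, as in the summary above (A plus slot y).
setClass("C", contains = "A", representation(y = "numeric"))

a  <- new("A")          # x comes from the prototype
c1 <- new("C", y = 2)   # the derived class inherits the prototype value of x
```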
new("A") is not called (why would it be?). If you have an initialize
method on "A" then it will be called. The problem you experienced
before was that your initialize method for "A" REQUIRED additional
arguments.
That is not correct; you do not have to duplicate code to construct an
object of class C.

  
    
#
Hi Martin

I was not so much speaking about what we can do, but more about what 
we can't. We can't decide that an object will 'never be empty'; we have to 
allow empty objects, otherwise new("A") will not work, and that will be 
problematic.

So, at this point, I see:
  - it is necessary to allow the creation of empty objects (otherwise, we 
are in trouble with 'as')
  - so 'new' will have to create objects that are either empty or valid.
  - 'initialize' can check whether an object is valid, but first has to 
check whether the object is non-empty
  - I create an object Ccc containing an object Aaa: if Ccc calls 
the constructor of Aaa, then the initialize of Aaa is called. If Ccc 
sets the Aaa slots one by one, the initialize of Aaa is not called.
  - In addition, I did not speak about the setter, but 'setSlotX<-' 
does not call initialize. So we have to check the validity of 
the object again after any 'setSlot'.

Consequently, to program as safely as possible:
  - to check whether an object is valid or not, I am thinking of defining 
a method checkCoherance
  - 'initialize' will just set each slot to either empty or some value, 
then call checkCoherance
  - 'setSlotX<-' will change slot X, then call checkCoherance
  - If I create a C object, its checkCoherance method will call 
checkCoherance(as(object,"A"))
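
As an aside, the same plan can be sketched with the built-in validity mechanism instead of a separate checkCoherance generic. This is a different technique from the one used in this message: the validity function and the call to validObject in the setter are assumptions made for illustration.

```r
library(methods)

# Variant of "A" where the check lives in a validity function.
# An empty x passes, because any(numeric(0) < 0) is FALSE.
setClass("A", representation(x = "numeric"),
         validity = function(object) {
             if (any(object@x < 0)) "x should be positive" else TRUE
         })

setGeneric("setX<-", function(object, value) standardGeneric("setX<-"))
setReplaceMethod("setX", "A", function(object, value) {
    object@x <- value
    validObject(object)   # re-check after every change to the slot
    object
})

a <- new("A", x = 1)
setX(a) <- 4
# An invalid value raises an error, so 'a' keeps its previous value.
res <- tryCatch({ setX(a) <- -4; "accepted" },
                error = function(e) "rejected")
```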

Here is an example (still the same silly example with a single slot. 
Obviously, it is of no interest for such a small object; it is just to 
keep things simple)

--- 8< ------------------
setGeneric("checkCoherance",function(object){standardGeneric("checkCoherance")})

setClass("A", representation(x="numeric"))
setMethod("checkCoherance","A",
    function(object){
        cat("*** checkCoherance A ***\n")
        if(length(object@x)>0){
            if(object@x<0){stop("Object incoherance : x should be positive")}else{}
        }else{}
        return(TRUE)
    }
)

setMethod("initialize","A",
    function(.Object,value=numeric(0)){
        cat("*** initialize A ***\n")
        .Object@x <- value
        checkCoherance(.Object)
        return(.Object)
    }
)

setGeneric("setX<-",function(object,value){standardGeneric("setX<-")})
setReplaceMethod("setX","A",
  function(object,value){
    object@x <- value
    checkCoherance(object)
    return(object)
  }
)
 
a1 <- new("A")
a2 <- new("A",value=4)
try(a3 <- new("A",value=-4))

A <- function(val=numeric(0)){
  new("A",value=val)
}
A()
A(val=4)
A(val=-4)

setClass("C", representation(y="numeric"),contains="A")
setMethod("checkCoherance","C",
    function(object){
        cat("*** checkCoherance C ***\n")
        if(length(object@y)>0){
            if(object@y>0){stop("Object incoherance : y should be negative")}else{}
        }else{}
        checkCoherance(as(object,"A"))
        return(TRUE)
    }
)
setMethod("initialize","C",
    function(.Object,valueX=numeric(0), valueY=numeric(0)){
        cat("*** initialize C ***\n")
        .Object@x <- valueX
        .Object@y <- valueY
        checkCoherance(.Object)
        return(.Object)
    }
)

c <- new("C",valueX=10,valueY=12)
new("C")

C <- function(valX=numeric(0),valY=numeric(0)){
   new("C",valueX=valX,valueY=valY)
}
C()
C(valX=5)
try(C(valX=-5))
C(valY=-3)
c1 <- C(valX=4,valY=-6)
as(c1,"A")

--- 8< ------------------
Lol. I guess so, and unfortunately, it is not over...
I do not understand the difference with

setClass("A", representation(x="numeric"),
    prototype(x=0)
)
Only if you call the A constructor, right? Not if you set the slots that 
C has inherited from A one by one, right?

Christophe
18 days later
#
Hi Martin

I am rereading all the mail we exchanged with new eyes, because of all 
the things I learned in the past few weeks. That is very interesting, and 
some new questions occur...

***********************************
Earlier, you spoke about callGeneric:

setClass("A", representation(x="numeric"))
setClass("C", contains=c("A"))

setMethod("show", "A", function(object) cat("A\n"))
setMethod("show", "C", function(object) {
   callGeneric(as(object, "A"))
   cat("C\n")
})

new("C")

Consider the following definition (which you more or less taught me with 
your remarks yesterday...):

setMethod("show", "C", function(object) {
   callNextMethod()
   cat("C\n")
})

In this case, is there any difference between the former and the latter?
Which one would you use?

(I get that in a more complicated case, for example if
setClass("C", contains=c("A","B")), it might be more complicated to use 
the latter, right?)




*************************
This works :

setMethod("initialize","B",
          function(.Object,..., yValue){
              callNextMethod(.Object, ..., y=yValue)
              return(.Object)
          })
new("B",yValue=3)

but this does not :

setMethod("initialize","B",
          function(.Object, yValue){
              callNextMethod(.Object, y=yValue)
              return(.Object)
          })
new("B",yValue=3)

Why ?
Is there any help page about ... ?


**************************
showMethods gives the list of all the methods. Is there a way to see all 
the methods for a specific signature IN THE ORDER they will be called by 
callNextMethod?
If ANY <- D <- E, a method that would give:

Function "initialize":
.Object = "E"
.Object = "D"
.Object = "ANY"

Thanks for your help
And happy easter eggs !

Christophe


#
cgenolin at u-paris10.fr wrote:
callNextMethod is the right thing to do for this case. callGeneric is 
useful in a very specific case -- when dispatching from within a 
so-called 'group generic' function. But this is an advanced topic.
The right thing to do in this case is to sit down with the rules of 
method dispatch, and figure out what the 'next' method will be. A common 
alternative paradigm is to have a plain function (not visible to the 
user, e.g., not exported in a package name space) that several different 
methods all invoke, after mapping their arguments appropriately. The 
methods provide a kind of structured interface to the function, making 
sure arguments are of the appropriate type, etc. The function does the 
work, confident that the arguments are appropriate.
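
A minimal sketch of that paradigm for the multiple-inheritance "show" example from the start of the thread. The helper names .showA and .showB are hypothetical; in a package they would simply not be exported.

```r
library(methods)

setClass("A", representation(x = "numeric"))
setClass("B", representation(y = "numeric"))
setClass("C", contains = c("A", "B"))

# Plain internal functions (hypothetical names) that do the actual work.
.showA <- function(x) cat(sprintf("A: x has length %d\n", length(x)))
.showB <- function(y) cat(sprintf("B: y has length %d\n", length(y)))

# The methods only map their arguments onto the helpers, so the method
# for "C" does not depend on how dispatch orders "A" and "B".
setMethod("show", "A", function(object) .showA(object@x))
setMethod("show", "B", function(object) .showB(object@y))
setMethod("show", "C", function(object) {
    .showA(object@x)
    .showB(object@y)
    cat("C\n")
})

out <- capture.output(show(new("C", x = c(1, 2), y = 3)))
```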
Both 'work' in the sense that an object is returned (by the way, no need 
to use 'return' explicitly). And actually the examples on some of the 
man pages do not include '...', so this is really my opinion rather than 
the 'right' way to do things.

In an object-oriented sense, initialize,B-method should really just deal 
with its own slots; it shouldn't have to 'know' about either classes 
that it extends (A) or classes that extend it. And it shouldn't do work 
that inherited methods (i.e., initialize,ANY-method) do. In the second 
form above, without the ..., there is no way for the initialize,A-method 
to see arguments that might be relevant to it (e.g., values to be used 
to initialize its slots). So initialize,B-method would have to do all 
the work of initializing A. This is not good design.
There is, but I have never been able to figure it out in detail or to 
feel confident that I was using functions that were meant to be used for 
this purpose by the user (as opposed to by the methods package).

John Chambers posted recently to the R-devel mailing list about changes 
to the internal representation of methods and classes.

https://stat.ethz.ch/pipermail/r-devel/2008-March/048729.html

I have not explored the new functions Dr. Chambers mentions; to use them 
requires the 'devel' version of R, not 2.6.2. Any questions they 
generate should definitely be addressed to the R-devel mailing list.
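
For the simple case in the question (single inheritance, dispatch on one argument), a rough approximation can be pieced together from extends() and existsMethod(). This is only a sketch under those assumptions: it is not the set of functions referred to in the R-devel post, and it does not implement the full dispatch rules.

```r
library(methods)

# The hierarchy from the question: ANY <- D <- E.
setClass("D", representation(d = "numeric"))
setClass("E", contains = "D", representation(e = "numeric"))
setMethod("initialize", "D", function(.Object, ...) callNextMethod(.Object, ...))
setMethod("initialize", "E", function(.Object, ...) callNextMethod(.Object, ...))

# Classes that "E" extends, nearest first, with "ANY" appended for the default.
lineage <- c(extends("E"), "ANY")

# Keep only the classes with an initialize method defined directly on them.
defined <- Filter(function(cl) existsMethod("initialize", cl), lineage)
defined
```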

Best,

Martin
#
Ok, when I will be older :-)
Well yes, but the second one does return an object without assigning 
the value 3; that is not really working...
I get your point and I agree: I am developing B, you are developing A, 
I do not want to know what is in A, so B should not initialize its 'A 
part'.

On the other hand, I do not like the "...". "..." can be anything; 
there is no control at all, no type checking.
I would prefer to initialize B by giving the values for its own slots AND 
an object of class A. So I send you the values for A, you send me an object 
'aaa' of class A, then I initialize B with some values and aaa. This way, 
B keeps its role but does not transmit anything to A without controlling 
it.

Best,

Christophe

#
cgenolin at u-paris10.fr wrote:
I wasn't paying enough attention to your code. callNextMethod is like 
any R function -- it will not change .Object 'in place', but makes a 
copy, modifies the copy, and returns the copy. So either

setMethod("initialize","B",
          function(.Object,..., yValue){
              .Object <- callNextMethod(.Object, ..., y=yValue)
              return(.Object)
          })

or more compactly

setMethod("initialize","B",
          function(.Object,..., yValue){
              callNextMethod(.Object, ..., y=yValue)
          })

The code example is incomplete, so I don't really know why one version 
assigned y=3 for you and the other did not; for me, neither version did 
the assignment.
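
A small self-contained sketch of the difference, using two hypothetical classes "B1" and "B2" that differ only in whether the result of callNextMethod is kept. A default is given to yValue here so that new() also works without arguments.

```r
library(methods)

# B1: the result of callNextMethod is discarded, so y is never set.
setClass("B1", representation(y = "numeric"))
setMethod("initialize", "B1",
          function(.Object, ..., yValue = numeric(0)) {
              callNextMethod(.Object, ..., y = yValue)
              .Object   # the unmodified object is returned
          })

# B2: the value of callNextMethod is the value of the method.
setClass("B2", representation(y = "numeric"))
setMethod("initialize", "B2",
          function(.Object, ..., yValue = numeric(0)) {
              callNextMethod(.Object, ..., y = yValue)
          })

lost <- new("B1", yValue = 3)@y   # numeric(0)
kept <- new("B2", yValue = 3)@y   # 3
```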

Martin

  
    
#
I probably added the return in the mail without imagining it would change things.

My question was more about the use of ... versus the absence of ...
You answered me by correcting my bug. So I can use callNextMethod with or 
without ... :

setClass("B",representation(y="numeric"))
setMethod("initialize","B",
          function(.Object,..., yValue){
              return(callNextMethod(.Object, ..., y=yValue))
          })

new("B",yValue=3)                   #1
try(new("B",yValueee=3))            #2
try(new("B",yValue=3,yValueee=3))   #3

setMethod("initialize","B",
          function(.Object, yValue){
              return(callNextMethod(.Object, y=yValue))
          })
new("B",yValue=3)                    #4
try(new("B",yValueee=3))             #5
try(new("B",yValue=3,yValueee=3))    #6

I understand that 1 and 4 work. I understand that 2 and 5 do not work, 
since yValue is missing.
I understand that 6 does not work, since yValueee is not a valid argument.
But I would expect 3 to work, since it gets a value for yValue and 
yValueee can be one of the ...

It does not...

Christophe

#
cgenolin at u-paris10.fr wrote:
The first challenge is to understand how ... works (maybe from 'An 
Introduction to R', section 10.4?)

 > f <- function(...) names(list(...))

 > f(x=1)
[1] "x"
 > f(x=1, y=2)
[1] "x" "y"

Probably ok so far. Now

 > g <- function(..., x) f(..., z=x) # 'g' transmits 'x' as 'z'
 > g(x=1)
[1] "z"
 > g(y=1,x=2)
[1] "y" "z"

or

 > h <- function(..., x) f(...) # 'h' consumes 'x'
 > h()
NULL
 > h(x=1)
NULL
 > h(y=1)
[1] "y"
 > h(y=1,x=1)
[1] "y"

For instance, g passes to f an argument list consisting of (what g 
received as) ... and z. f expects ..., and z ends up as part of that 
list.

What happens with initialize? The default method is described on 
?initialize, where the signature is

initialize(.Object, ...)

with

      ...: Data to include in the new object.  Named arguments
           correspond to slots in the class definition.
[snip]

in #3, your callNextMethod results in the default method seeing ... 
containing an argument yValueee=3. Since yValueee does not match a slot, 
R responds with an error

  > Error in .nextMethod(.Object, ..., y = yValue) :
   invalid names for slots of class "B": yValueee

With the ..., a class extending B can rely on the default method to fill 
in slots with provided arguments.

setClass("C", representation=representation(c="numeric"), contains="B")
new("C", yValue=1, c=2) # works with ... in B's initialize method

Without ..., a class extending B would have to write an initialize 
method that fills in slots (and checks validity) itself, repeating the 
work already implemented in initialize,ANY-method.
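
A sketch of the failing case, using hypothetical classes "B2" and "C2" so as not to clash with the definitions above; yValue is given a default here so that the classes can always be instantiated without arguments.

```r
library(methods)

# "B2": like "B", but its initialize method has no '...'.
setClass("B2", representation(y = "numeric"))
setMethod("initialize", "B2",
          function(.Object, yValue = numeric(0)) {
              callNextMethod(.Object, y = yValue)
          })

# "C2" extends "B2" and adds a slot, but defines no initialize method.
setClass("C2", representation(c = "numeric"), contains = "B2")

# The inherited method has nowhere to put 'c', so construction fails.
res <- tryCatch({ new("C2", yValue = 1, c = 2); "created" },
                error = function(e) "error")

# Without the extra slot argument the inherited method still works.
c2 <- new("C2", yValue = 1)
```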

  
    
2 days later
#
Sorry to come back to callNextMethod; I am still not very confident about it.

Consider the following (there is a lot of code, but it is very simple, 
with almost nothing but some cat calls):

------------------
setClass("A",representation(a="numeric"))
setValidity("A",function(object){cat(" ***** Valid A *****\n");TRUE})
setMethod("initialize","A",function(.Object){
    cat("****** Init  A ******\n")
    .Object <- callNextMethod()
    return(.Object)
})

setClass("B",representation(b="numeric"),contains="A")
setValidity("B",function(object){cat("   *** Valid B ***\n");TRUE})
setMethod("initialize","B",function(.Object){
    cat("  **** Init  B ****\n")
    .Object <- callNextMethod()
    return(.Object)
})
new("B",a=3,b=2)
######## Result ########
#   **** Init  B ****
# ****** Init  A ******
#  ***** Valid A *****
#    *** Valid B ***
# An object of class "B"
# Slot "b":
# [1] 2
#
# Slot "a":
# [1] 3

--------------------

new("B") will go through
- initialize B, which will call the next method, that is:
- initialize A, which will call the next method, that is:
- initialize ANY, which calls validObject A.
This would be perfect... But there is also a call to validObject B. 
Where does it come from?

This is annoying because:
I completely agree with that. But if the author of A changes his code:

---------------------
setClass("A",representation(a="numeric"))
setValidity("A",function(object){cat(" ***** Valid A *****\n");TRUE})
setMethod("initialize","A",function(.Object){
    cat("****** Init  A ******\n")
    .Object@a <- 10
    return(.Object)
})

setClass("B",representation(b="numeric"),contains="A")
setValidity("B",function(object){cat("   *** Valid B ***\n");TRUE})
setMethod("initialize","B",function(.Object){
    cat("  **** Init  B ****\n")
    .Object <- callNextMethod()
    return(.Object)
})
new("B",a=3,b=2)
######## Result ########
#   **** Init  B ****
# ****** Init  A ******
# An object of class "B"
# Slot "b":
# numeric(0)
#
# Slot "a":
# [1] 10

---------------------

Then validObject of B is no longer called, and B is no longer correctly set...
So if A is changed by its author, the behaviour of B changes as well...

Annoying, isn't it?

But I agree with the
So maybe something like:
---------------------
setClass("A",representation(a="numeric"))
setValidity("A",function(object){cat(" ***** Valid A *****\n");TRUE})
setMethod("initialize","A",function(.Object){
    cat("****** Init  A ******\n")
#    .Object@a <- 10
    .Object <- callNextMethod()
    return(.Object)
})

setClass("B",representation(b="numeric"),contains="A")
setValidity("B",function(object){cat("   *** Valid B ***\n");TRUE})
setMethod("initialize","B",function(.Object,a,b){
    cat("  **** Init  B ****\n")
    as(.Object,"A") <- new("A",a=a)
    .Object@b <- b
    return(.Object)
})
new("B",a=3,b=2)
######## Result ########
#   **** Init  B ****
# ****** Init  A ******
# An object of class "B"
# Slot "b":
# [1] 2
#
# Slot "a":
# [1] 10

---------------------
The call to validObject of B is no longer dependent on the A code.

Christophe
#
cgenolin at u-paris10.fr wrote:
You're creating a "B" object, so validObject is called on B. This 
probably makes you wonder why, then, validObject is being called on A. 
And the answer to this is that the validity method of 'B' is responsible 
only for the unique aspects of the object that relate to B. validObject 
A has to be called so that the parts of B that are inherited from A can 
be checked.
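
A sketch that records the order in which the validity functions run when a "B" object is created with the default initialize method. The recording environment 'seen' is a hypothetical helper used only to observe the calls.

```r
library(methods)

# Hypothetical helper: an environment that logs which validity functions ran.
seen <- new.env()
assign("order", character(0), envir = seen)
note <- function(cl) assign("order", c(get("order", envir = seen), cl), envir = seen)

setClass("A", representation(a = "numeric"),
         validity = function(object) { note("A"); TRUE })
setClass("B", representation(b = "numeric"), contains = "A",
         validity = function(object) { note("B"); TRUE })

assign("order", character(0), envir = seen)   # reset just before creating the object
obj <- new("B", a = 3, b = 2)
get("order", envir = seen)   # the superclass check runs first, then B's own
```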
Yes this would be a bad thing for the author of A to do (in my opinion).
Yes, but in any object oriented system you're relying on inherited 
methods to fulfill a contract. If they change the contract (e.g., no 
longer guaranteeing that slots will be populated and validity checked), 
then downstream classes have to change.
Here you're implementing part of the functionality of the default method 
(populating slots) so this code duplication is not very good practice 
(in my opinion).
You're free to do what you like, of course. This replicates 
functionality of the default method (slot assignment) and does it in an 
inefficient way (making unnecessary copies of .Object; this matters when 
real-world objects are large). There is no validity checking, and to 
ensure that you'd have to add to your paradigm that all initialize 
methods call validObject. Because of the way validObject is implemented, 
you'll end up evaluating it multiple times for each construction of B. 
Lack of ... in the argument list means that derived classes must use 
your convention for object initialization, so you've replaced one 
(semi-established) convention with another. A close reading of 
initialize shows that the contract is more complicated than what we've 
talked about, with unnamed arguments used to initialize classes that the 
object extends and with a kind of copy-constructor functionality. You'll 
have to modify your paradigm further to accommodate these, or change the 
contract of your initialize method relative to those documented for S4. 
Again, you can adopt these conventions if you find them useful.

I know the above classes are just examples. But it's worth pointing out 
that the basic operation of initializing classes from named slots 
actually requires NO initialize method for the class -- this is all 
performed by the default method. As I've gained experience, I've 
actually found that my real classes tend NOT to have initialize methods, 
or to have initialize methods that are much simpler than they were at an 
earlier point in my understanding. It's letting the existing software do 
the work for you.
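
As a sketch of what the default method already provides for the example classes, with no initialize methods defined at all:

```r
library(methods)

setClass("A", representation(a = "numeric"))
setClass("B", representation(b = "numeric"), contains = "A")

# Named arguments fill the slots, including the ones inherited from "A".
b1 <- new("B", a = 3, b = 2)

# An unnamed argument that is an object of a superclass supplies the
# inherited part of the new object.
b2 <- new("B", new("A", a = 3), b = 2)
```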

Martin