unary class union of an S3 class
On 03/19/2016 06:35 AM, Michael Lawrence wrote:
On Sat, Mar 19, 2016 at 4:29 AM, Hervé Pagès <hpages at fredhutch.org> wrote:
On 03/19/2016 01:22 AM, Michael Lawrence wrote:
On Sat, Mar 19, 2016 at 12:10 AM, Hervé Pagès <hpages at fredhutch.org> wrote:
On 03/18/2016 03:28 PM, Michael Lawrence wrote:
On Fri, Mar 18, 2016 at 2:53 PM, Hervé Pagès <hpages at fredhutch.org> wrote:
Hi,
Short story
-----------
setClassUnion("ArrayLike", "array")
showClass("ArrayLike") # no slot
setClass("MyArrayLikeConcreteSubclass",
    contains="ArrayLike",
    representation(stuff="ANY")
)
showClass("MyArrayLikeConcreteSubclass")  # 2 slots!!
That doesn't seem right.
Long story
----------
S4 provides at least 3 ways to create a little class hierarchy like this:
       FooLike ............. virtual class with no slot
        ^   ^
        |   |
      foo   anotherfoo ..... 2 concrete subclasses
(1) The "standard" way: define FooLike first, then foo and anotherfoo as subclasses of FooLike:
setClass("FooLike")
setClass("foo",
    contains="FooLike",
    representation(stuff="ANY")
)
setClass("anotherfoo",
    contains="FooLike",
    representation(stuff="ANY")
)
showClass("FooLike")  # displays foo and anotherfoo as known subclasses
x1 <- new("foo")
is(x1, "foo") # TRUE
is(x1, "FooLike") # TRUE
is(x1, "anotherfoo") # FALSE
x2 <- new("anotherfoo")
is(x2, "anotherfoo") # TRUE
is(x2, "FooLike") # TRUE
is(x2, "foo") # FALSE
Everything works as expected.
(2) Using a class union: define foo and anotherfoo first, then FooLike as the union of foo and anotherfoo:
setClass("foo", representation(stuff="ANY"))
setClass("anotherfoo", representation(stuff="ANY"))
setClassUnion("FooLike", c("foo", "anotherfoo"))
showClass("FooLike")  # displays foo and anotherfoo as known subclasses
(3) Using a *unary* class union: define foo first, then FooLike as the (unary) union of foo, then anotherfoo as a subclass of FooLike:
setClass("foo", representation(stuff="ANY"))
setClassUnion("FooLike", "foo")
showClass("FooLike")  # displays foo as the only known subclass
setClass("anotherfoo",
    contains="FooLike",
    representation(stuff="ANY")
)
showClass("FooLike")  # now displays foo and anotherfoo as known subclasses
The 3 ways lead to the same hierarchy. However, the 3rd way is interesting because it allows one to define the FooLike virtual class as the parent of an existing foo class that s/he doesn't control.
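As a quick check of way (3), the sketch below rebuilds the hierarchy and queries it with extends() and isVirtualClass() from the methods package. It only uses calls that already appear in this thread plus those two query functions, and it assumes a fresh R session in which foo, anotherfoo and FooLike are not already defined.

```r
library(methods)

# Way (3): foo exists first, FooLike is created afterwards as its unary union.
setClass("foo", representation(stuff = "ANY"))
setClassUnion("FooLike", "foo")

# anotherfoo then joins the hierarchy by containing the union.
setClass("anotherfoo",
    contains = "FooLike",
    representation(stuff = "ANY")
)

# Query the resulting relationships.
foo_is_foolike        <- extends("foo", "FooLike")
anotherfoo_is_foolike <- extends("anotherfoo", "FooLike")
foo_is_anotherfoo     <- extends("foo", "anotherfoo")
foolike_is_virtual    <- isVirtualClass("FooLike")

print(c(foo_is_foolike, anotherfoo_is_foolike, foo_is_anotherfoo, foolike_is_virtual))
```

The two concrete classes are siblings under FooLike, and neither extends the other.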
Why not use setIs() for this?
> setClass("ArrayLike")
> setIs("array", "ArrayLike")
Error in setIs("array", "ArrayLike") :
  class “array” is sealed; new superclasses can not be defined, except by 'setClassUnion'
How do you define a virtual class as the parent of an existing class with setIs?
You can only do that with setClassUnion(). But the new classes should use setIs() to inherit from the union. So it's:
setClassUnion("ArrayLike", "array")
setClass("MyArrayLike")
setIs("MyArrayLike", "ArrayLike")
Everything then behaves as expected. I don't think it makes much sense to "contain" a class union.
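A minimal sketch of the setIs() route, for anyone who wants to try it. Unlike the three lines above, MyArrayLike is given a "stuff" slot here (an assumption made only so that the class is concrete and can be instantiated). The slot names and the is.array() result are printed rather than asserted, since they may depend on the R version.

```r
library(methods)

# The union has to come from setClassUnion() because "array" is sealed.
setClassUnion("ArrayLike", "array")

# Assumption for illustration: a concrete class with one slot,
# so that new() can create an instance.
setClass("MyArrayLike", representation(stuff = "ANY"))

# Establish the is() relationship without containing the union.
setIs("MyArrayLike", "ArrayLike")

a <- new("MyArrayLike")
is_arraylike <- is(a, "ArrayLike")
slots <- slotNames("MyArrayLike")

print(is_arraylike)
print(slots)         # inspect whether anything besides "stuff" shows up
print(is.array(a))   # inspect whether base R treats the object as an array
```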
Why is that? A class union is just a virtual class with no slot that is the parent of the classes that are in the union. All the classes in the union contain their parent. What's interesting is that this union is actually open to new members: when I later define a new class that contains the class union, I'm just adding a new member to the union.
Rather, you just want to establish the inheritance relationship.
Isn't that what I'm doing when I define a new class that contains the class union?
Containing does two things: establishes the is() relationship and adds slots to the class.
I understand that. But in that case, since a class union has no slots,
one would expect that using setIs() is equivalent to containing.
These slots consist of the slots of the contained class, and as a special case the "array" class and other native types confer a data part that comes from the prototype of the class. The "array" class has a double vector with a dim attribute as its prototype. That is all well understood. What is surprising is that "ArrayLike" has the same prototype as "array". That happens via setIs(doComplete=TRUE), called by setClassUnion(). When a class gains its first non-virtual child, the parent assumes the prototype of its child. I'm not sure why, but the logic is very explicit and I've come to just accept it as a "feature".
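One way to look at this directly is to pull the prototype out of the class definition. The sketch below is only an inspection aid: it reads the prototype slot of the object returned by getClass() and prints what it finds. No expected output is given, because what the prototype looks like may differ between R versions.

```r
library(methods)

setClassUnion("ArrayLike", "array")

# The class definition is itself an S4 object with a "prototype" slot.
union_def <- getClass("ArrayLike")
proto <- union_def@prototype

# Inspect what the union inherited, if anything, from its first member.
print(class(proto))
print(attributes(proto))

# The direction of the relationship is unaffected by the prototype.
array_extends_union <- extends("array", "ArrayLike")
union_is_virtual <- isVirtualClass("ArrayLike")
print(c(array_extends_union, union_is_virtual))
```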
Never noticed that. Thanks for clarifying. So with this "feature":
- setClassUnion("A", c("B", "C")) is not the same as setClassUnion("A", c("C", "B"))
- if 2 packages define concrete subclasses of a virtual class defined in a 3rd package, the prototype of the virtual class will depend on the order in which the packages are loaded
- using setIs("MyArrayLike", "ArrayLike") is not equivalent to containing (even though ArrayLike has no slots)
- containing adds an undesirable .Data slot
- containing breaks is.array() but not is( , "array")
Seems pretty harmful to me. Would be good to understand the rationale behind this feature. In particular it's not clear to me why a virtual class with no slot would need to have a prototype at all (i.e. other than NULL).
I ran into this some months ago when defining my own ArrayLike when working on a very similar package to the one you are developing ;)
After giving it more thought I realized that I can do without the ArrayLike class. That will keep the class hierarchy in HDF5Array to the strict minimum.
Yea I've come to realize that declaring virtual classes that indicate
whether an object behaves like a base type is overkill.
If it was just to indicate this, it would definitely be overkill.
But it's convenient to be able to define methods at the level of
ArrayLike. A typical use case is when a method for a subclass
coerces to "array" and delegates to the method for "array":
x <- as.array(x)
callGeneric()
If you don't have the ArrayLike class, you have to define the same
method over and over for each ArrayLike subclass.
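A self-contained sketch of that delegation pattern, written with stand-in names so it does not depend on the class union behavior discussed in this thread. ArrayLikeDemo, WrappedArray, toArray() and total() are all hypothetical and exist only for this illustration; toArray() plays the role of as.array() above.

```r
library(methods)

# Hypothetical virtual parent and one concrete subclass wrapping an array.
setClass("ArrayLikeDemo", representation("VIRTUAL"))
setClass("WrappedArray",
    contains = "ArrayLikeDemo",
    representation(payload = "array")
)

# Hypothetical coercion generic standing in for as.array().
setGeneric("toArray", function(x) standardGeneric("toArray"))
setMethod("toArray", "WrappedArray", function(x) x@payload)

# Hypothetical generic with a real implementation for "array" only.
setGeneric("total", function(x) standardGeneric("total"))
setMethod("total", "array", function(x) sum(x))

# One method at the level of the virtual class: coerce, then re-dispatch.
setMethod("total", "ArrayLikeDemo", function(x) {
    x <- toArray(x)
    callGeneric()   # calls total() again with x now being an ordinary array
})

w <- new("WrappedArray", payload = array(1:24, dim = c(2, 3, 4)))
result <- total(w)
print(result)
```

Any further subclass of ArrayLikeDemo then only needs a toArray() method to pick up total() and every other generic written this way.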
It's also useful to be able to use ArrayLike as slots in other classes.
Anyway, it turns out that I don't need any of these features in
HDF5Array, at least for now.
It usually suffices to say that the object satisfies the basic contract of an array, list, vector, etc. It would be nice to have something like a Java interface for specifying such contracts.
I'd rather fix what we already have ;-) H.
Thanks for the feedback,
H.
For example, to define an ArrayLike class:
setClassUnion("ArrayLike", "array")
showClass("ArrayLike")  # displays array as a known subclass
Note that ArrayLike is virtual with no slots (analogous to a Java interface), which is what is expected.
setClass("MyArrayLikeConcreteSubclass",
    contains="ArrayLike",
    representation(stuff="ANY")
)
showClass("MyArrayLikeConcreteSubclass")  # shows 2 slots!!
What is the .Data slot doing here? I would expect to see that slot if MyArrayLikeConcreteSubclass was extending array but this is not the case here.
a <- new("MyArrayLikeConcreteSubclass")
is(a, "MyArrayLikeConcreteSubclass")  # TRUE  --> ok
is(a, "ArrayLike")                    # TRUE  --> ok
is(a, "array")                        # FALSE --> ok
But:
is.array(a)  # TRUE --> not ok!
Is is.array() confused by the presence of the .Data slot?
It looks like the unary union somehow equates ArrayLike and array
Clearly the unary union makes ArrayLike a parent of array, as it should be. This can be confirmed by extends():
> extends("array", "ArrayLike")
[1] TRUE
> extends("ArrayLike", "array")
[1] FALSE
The results for is(a, "ArrayLike") (TRUE) and is(a, "array") (FALSE) on a MyArrayLikeConcreteSubclass instance are consistent with this.
So the little 3-class hierarchy I end up with in the above example is exactly as expected:
         ArrayLike
          ^     ^
          |     |
      array     MyArrayLikeConcreteSubclass
What is not expected is that MyArrayLikeConcreteSubclass has a .Data slot and that is.array(a) returns TRUE on a MyArrayLikeConcreteSubclass object.
H.
and thus makes ArrayLike confer a dim attribute (and thus is.array(a) returns TRUE). Since S4 objects cannot have attributes that are not slots, it must do this via a data part, thus the .Data slot.
I can fix it by defining an "is.array" method for MyArrayLikeConcreteSubclass objects:
setMethod("is.array", "MyArrayLikeConcreteSubclass",
    function(x) FALSE
)
However, it feels that I shouldn't have to do this.
Is the presence of the .Data slot in MyArrayLikeConcreteSubclass objects an unintended feature?
Thanks,
H.
> sessionInfo()
R Under development (unstable) (2016-01-07 r69884)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.4 LTS
locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fredhutch.org
Phone: (206) 667-5791
Fax: (206) 667-1319
______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel