[Bioc-devel] Problem with generic methods name conflicts

12 messages · Herb Holyst, Kasper Daniel Hansen, Vincent Carey +4 more

#
Herb,

First I want to point out that this is an excellent case related to
the recent discussion I tried to initiate about clashes of generics.

Second I will note (as you have already noticed) that this is a
recent addition to Biobase, seemingly coming from the DESeq crowd, and
that this particular generic is not used at all in Biobase; the package
just contains the generic.

The best solution to this is of course to agree with the other
package(s) about the signature of the generic.  You are using
  object, ...
which to me seems very nice. Biobase is using
  cds, normalized=TRUE
To me it seems that "object" works better as the function argument of
a generic, and that you are right in adding the "...".  Also,
"normalized=TRUE" seems (to me) to be something that really belongs in
the individual method (is this really relevant for all signatures?),
so my suggestion would be to adopt your generic in Biobase.  It
should be easy for DESeq to change the signature in that (and possibly
other?) packages.
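
A minimal sketch of the arrangement suggested above, assuming a generic with the signature "object, ..." and a method that supplies its own "normalized" argument. The class ToyCountSet, its slots, and the default normalized = FALSE are hypothetical and chosen only for illustration; this is not the actual Biobase or DESeq code.

```r
library(methods)

# Hypothetical class standing in for a package's count container.
setClass("ToyCountSet",
         representation(raw = "matrix", sizeFactors = "numeric"))

# Generic with the minimal signature: dispatch on 'object' only.
setGeneric("counts", function(object, ...) standardGeneric("counts"))

# The method adds 'normalized' itself; this is allowed because the
# generic has '...' in its formals.
setMethod("counts", "ToyCountSet",
          function(object, normalized = FALSE, ...) {
              if (normalized) {
                  # Divide each column by its (hypothetical) size factor.
                  sweep(object@raw, 2, object@sizeFactors, "/")
              } else {
                  object@raw
              }
          })

cds <- new("ToyCountSet",
           raw = matrix(c(10, 20, 30, 40), nrow = 2),
           sizeFactors = c(1, 2))
counts(cds)                     # raw counts
counts(cds, normalized = TRUE)  # method-specific argument
```

Other packages can attach methods with different extra arguments to the same generic, since nothing specific to one method lives in the generic's signature.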

But since I am not really involved in any of the packages, I am not
sure my opinion counts for much.

Kasper
On Thu, Oct 13, 2011 at 11:04 AM, Herb Holyst <holyst at mail.med.upenn.edu> wrote:
#
On Thu, Oct 13, 2011 at 11:14 AM, Kasper Daniel Hansen
<kasperdanielhansen at gmail.com> wrote:
I agree with this comment about parameter names in generics.
I would hope that this could happen, and that some review of generic
programming and policies could be taken up in Manchester.
#
Dear Kasper and Herb
On 2011-10-13 17:14, Kasper Daniel Hansen wrote:
I fully agree. Please change the definition in Biobase to
"counts( object, ...)" and I'll change DESeq/DEXSeq accordingly. That 
the specialized argument 'normalized' turned up in Biobase was an oversight.

To get back to the general discussion: It is an unfortunate property of 
CLOS and S4 that a generic function needs to be defined before one can 
define methods; other languages demonstrate that this requirement is not 
needed. (Compare with overloading in C++; the reason we have it in 
CLOS/S4 is, as far as I understand it, solely that this construction 
allows methods to be added without modifying the core language.)

Hence, it seems reasonable to consider the definition of a generic as a 
mere formality without semantic context. This is especially true if 
multiple dispatch is not needed, because then, the standard signature 
"object, ..." will always do the job. Hence, we could make it a policy 
that whenever two packages both wish to define methods for a generic f, 
we simply define one in Biobase (or maybe, in BiocGenerics) with always 
the same standard signature "object, ...".

   Simon
#
On Thu, Oct 13, 2011 at 11:36 AM, Simon Anders <anders at embl.de> wrote:
"object" is here just a parameter name, not a signature.  i believe
this issue is independent of the
concept of multiple dispatch.
#
Dear Vince
What I meant is this: If you consider it unlikely that any package 
author who may want to define methods with a certain name wants to 
dispatch on anything else than just the type of the first argument, it 
is sufficient to specify only one parameter (and then "...") in the 
generic. This limits any method to signatures with only a single type, 
but this will be fine, usually. And to stay general, we should give this 
one parameter a standard name, "object". Then, defining this generic is 
a "mere formality", and we may have a standing policy that such a 
generic is simply added to a central place (Biobase or BiocGenerics) 
whenever requested, without need for discussion.

   Simon
#
On 10/13/2011 08:36 AM, Simon Anders wrote:
AllGenerics now defines the generics as

   function(object, ...)

and the replacement generics as

   function(object, ..., value)
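
As a rough illustration of these two signatures (not the actual AllGenerics code), a package could attach an accessor and a replacement method as sketched below. The class ToyCountSet and its slot are hypothetical.

```r
library(methods)

# Hypothetical container class, used only for this illustration.
setClass("ToyCountSet", representation(raw = "matrix"))

# Accessor and replacement generics with the signatures quoted above.
setGeneric("counts", function(object, ...) standardGeneric("counts"))
setGeneric("counts<-",
           function(object, ..., value) standardGeneric("counts<-"))

setMethod("counts", "ToyCountSet", function(object, ...) object@raw)

# setReplaceMethod("counts", ...) defines a method for "counts<-".
setReplaceMethod("counts", "ToyCountSet",
                 function(object, ..., value) {
                     object@raw <- value
                     validObject(object)
                     object
                 })

x <- new("ToyCountSet", raw = matrix(0, nrow = 2, ncol = 2))
counts(x) <- matrix(1:4, nrow = 2)
counts(x)
```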

Martin

#
this seems reasonable to me.  i still vote for keeping the generics in
Biobase until benefits of breaking it off are clarified.
On Thu, Oct 13, 2011 at 12:09 PM, Simon Anders <anders at embl.de> wrote:
#
On 10/13/2011 08:04 AM, Herb Holyst wrote:
I think that this construct pre-dates name spaces; it's weird (to me) to 
define a method on a generic that one doesn't know about, e.g., maybe 
'counts' is documented to return a vector of European noblemen. It also 
implies that 'counts' is found on the user search path rather than in 
the package name space, so sometimes the intended counts is found, 
sometimes not.

These days I think that one would importFrom(Biobase, counts) and then 
setMethod on the known generic (if Biobase has an appropriate counts 
generic) or create a new generic (which I think would not normally be 
the case).

A secondary question that quickly comes up is how to make the user aware 
of the generic 'counts'.

On the one hand (I think this is the usual solution) it might be 
appropriate to (in the DESCRIPTION file)

   Depends: Biobase
   Imports: Biobase

and (in the NAMESPACE file)

   importFrom(Biobase, counts)
   exportMethods(counts)

which arranges for the counts generic from Biobase to be available to 
you for attaching methods, and to the user via the standard search path, 
and for the methods defined in your package to be associated with that 
generic.

On a second hand one might think that, while your package has a use for 
some-of-Biobase, the user does not have a use for all-of-Biobase so

   Imports: Biobase

and

   importFrom(Biobase, counts)
   export(counts)

which allows you to add your methods to Biobase::counts, then passes 
Biobase::counts through to be visible to the user. This seems awkward; 
it requires that you document the 'counts' generic (you're the one 
making it available to the user) even though the generic is in Biobase. 
The motivation for not attaching Biobase to the search path is probably 
part of the motivation for a BiocGenerics package -- too much clutter for 
the user.

On the third hand one might forgo Biobase entirely (neither Depends: nor 
Imports:), define a generic counts in your own package and export that. 
Let the user disambiguate if they happen to load Biobase in addition to 
your package. For disparate data types and questions this might be 
appropriate, but for data types shared by different packages it 
unfortunately leads to a multiplication of classes and a confusion of 
interfaces.

On the fourth hand, perhaps your package has a use for some of Biobase 
but the user does not. So Imports: and importFrom, with no exports.

There are likely to be other hands.

Martin

#
Hi developers,
On 11-10-13 08:36 AM, Simon Anders wrote:
That makes a lot of sense to me. That would solve the current conflict
with the updateObject() generic defined in Biobase and IRanges (both
packages are too big to depend on each other). And I would be happy
to move other setGeneric statements currently in IRanges/GenomicRanges
/Biostrings to BiocGenerics. Some of those explicit generic definitions
just correspond to stuff defined in base R (e.g. cbind, rbind, pmin,
pmax, eval etc...), and BiocGenerics sounds like a better place to have
them.

Cheers,
H.