[Bioc-devel] R6 class v.s. S4 class

On 10/19/2017 09:24 PM, Charles Plessy wrote:
The Bioconductor convention would use S4 objects with CamelCase 
constructors.

   geneClient = BioThingsGeneClient()  ## or just GeneClient()

I agree with enabling the use of the pipe, and think the generic + methods 
should have a signature where the first argument is the client rather than 
the pattern against which the query occurs. There is to some extent an 
argument for name-mangling in the generic (other knowledgeable people 
disagree) so that one is free to implement contracts unique to the 
package in question, and avoid conflicts with other generics with 
identical names in different packages (AnnotationDbi::select() / 
dplyr::select()).

   setGeneric(
     "btQuery",
     function(x, query, ...) standardGeneric("btQuery")
   )

   setMethod(
     "btQuery", "GeneClient",
     function(x, query, ...)
   {
     ## implementation
   })

   btQuery(geneClient, "CDK2")  ## maybe btquery(...)
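
For concreteness, here is a minimal self-contained sketch of the pattern 
above. The 'endpoint' slot, its default value, and the placeholder method 
body are assumptions made up for illustration; no real BioThings API is 
called. The native pipe needs R >= 4.1 (magrittr's %>% would work the 
same way).

```r
library(methods)

## Hypothetical class; the 'endpoint' slot is an illustrative assumption
.GeneClient <- setClass("GeneClient", slots = c(endpoint = "character"))

## CamelCase constructor, following the convention described above
GeneClient <- function(endpoint = "gene") {
    .GeneClient(endpoint = endpoint)
}

## Name-mangled generic; the client is the first argument
setGeneric(
    "btQuery",
    function(x, query, ...) standardGeneric("btQuery")
)

setMethod(
    "btQuery", "GeneClient",
    function(x, query, ...)
{
    ## placeholder implementation: no network access, just echo the inputs
    paste(x@endpoint, query, sep = ":")
})

geneClient <- GeneClient()

## Client-first signature lets the call read naturally in a pipe
res <- geneClient |> btQuery("CDK2")
```

Because the client comes first, the piped call and btQuery(geneClient, 
"CDK2") are the same call.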

Yes, one could use BioThings::query() or 
semanticallyInformativeAlternativeToQuery(), but these seem cumbersome 
to me, and the first at least has rough edges (that of course should be 
fixed...), e.g.,

   > methods(AnnotationHub::query)
   Error in .S3methods(generic.function, class, parent.frame()) :
     no function 'AnnotationHub::query' is visible

I think Michael is arguing for something like plain old functions (and 
the original examples and problems of multiplying methods seemed somehow 
to be plain old functions rather than S4 generics and methods?):

   geneQuery <- function(x, query) ...

A downside is that one cannot discover programmatically what one can do 
with a GeneClient object (if it were a method, one could ask for 
methods(class=class(geneClient))); as a developer one also needs to 
validate the incoming argument, which requires a certain but not 
insurmountable discipline.
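
As a rough sketch of what that discipline might look like, the plain 
function below checks its arguments by hand. The class definition, the 
'endpoint' slot, and the placeholder body are assumptions for 
illustration only.

```r
library(methods)

## Hypothetical class, defined here only so the sketch is self-contained
setClass("GeneClient", slots = c(endpoint = "character"))

## Plain-old-function approach: there is no dispatch, so the function
## itself has to check that it was handed the right kind of object
geneQuery <- function(x, query) {
    stopifnot(
        is(x, "GeneClient"),
        is.character(query), !anyNA(query)
    )
    paste(x@endpoint, query, sep = ":")    # placeholder implementation
}

geneClient <- new("GeneClient", endpoint = "gene")
ok <- geneQuery(geneClient, "CDK2")

## The wrong kind of first argument fails early instead of somewhere
## deep inside the implementation
bad <- tryCatch(
    geneQuery("not a client", "CDK2"),
    error = function(e) "error"
)
```

With an S4 method the class check comes for free from dispatch; here it 
is one stopifnot() that is easy to forget.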

Michael didn't mention it, but these slides of his are relevant:

https://bioconductor.org/help/course-materials/2017/BioC2017/DDay/BOF/usability.pdf

One other lesson from the annotation world is to think carefully about 
the structure of the return, in particular about 1:1 versus 1:many 
mappings between a vector-valued 'pattern=' and the results. While it's 
tempting to return, say, a character vector or named list, these days one 
probably wants to take the lessons of tidy data and return a 
data.frame-like object where the first column is the query and the 
second and subsequent columns are the result of the query. (A DataFrame() 
would do, but maybe that's not 'necessary'; there is nothing wrong with a 
tibble, but a data.table is not likely necessary or particularly advised 
because of the novel syntax and reference semantics.) One wants to pay 
particular attention to dealing with 1:0 and 1:many mappings in ways that 
do not confuse users; some use cases (e.g., adding annotations to the 
rowData() of a SummarizedExperiment) are really facilitated by a 1:1 
mapping between query and response.
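
To make the 1:0 / 1:many point concrete, here is a small base-R sketch. 
The lookup table and its 'term' values are made up for illustration and 
stand in for whatever a real service would return; the same shapes could 
be built with DataFrame() or a tibble.

```r
## Hypothetical annotation table: TP53 has two hits, NOTAGENE has none
annotation <- data.frame(
    symbol = c("CDK2", "TP53", "TP53"),
    term = c("term1", "term2", "term3"),
    stringsAsFactors = FALSE
)

query <- c("CDK2", "TP53", "NOTAGENE")

## Long form: the first column is the query. A 1:many mapping repeats
## the query, and a 1:0 mapping is kept as a row with NA rather than
## being silently dropped. Note that merge() sorts rows by the key.
long <- merge(
    data.frame(query = query, stringsAsFactors = FALSE),
    annotation, by.x = "query", by.y = "symbol", all.x = TRUE
)

## 1:1 form: exactly one row per query, in query order, with multiple
## hits held in a list column and no-hit queries marked NA
hits <- split(annotation$term, annotation$symbol)[query]
oneToOne <- data.frame(query = query, stringsAsFactors = FALSE)
oneToOne$term <- unname(lapply(
    hits,
    function(h) if (is.null(h)) NA_character_ else h
))
```

The 1:1 form has the same length and order as the query, which is what 
makes it straightforward to attach to something row-aligned.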

Martin