[Bioc-devel] Federation of sub-packages acceptable for Bioconductor?

On 04/16/2013 05:59 AM, Ulrich Bodenhofer wrote:
Hi Ulrich -- sounds like an interesting project, and of course it's great to 
make highly performant and relevant code available to a wide audience!

The Bioconductor policy is that packages are available across platforms. Really 
this benefits the end user (no need to discover why it worked on my Mac but not 
on the computer cluster) and the developer (only one branch of code to 
maintain). There are exceptions in Bioconductor, but invariably these cause 
problems for users, developers, and Bioconductor, and weaken the appeal of a 
high-level language touting reproducible research -- it is a _mistake_ to plan 
to implement something that will be so unsatisfactory!

I would instead recommend the difficult path of identifying and implementing the 
cross-platform core of your ideas and ambitions; this in itself can be a 
rewarding software development activity.

Bioconductor policy is that packages have vignettes.

Having complicated dependencies requires considerable discipline on the part of 
developers (maybe you have control over this...) and users (but not this!). It 
also makes use of the package difficult, as your consideration of documentation 
implies -- the user will easily get lost in the already confusing R help system. 
I would instead identify strategies to organize your code within a single 
package. I would also encourage an evolutionary design, where if the full 
ambition of your project is realized and adopted by a broad or deep community of 
users, perhaps in the future the single package evolves to several packages; 
this approach is simplified when the software works uniformly across platforms.

I think also that R and Bioconductor users are, speaking broadly, different from 
general purpose programming language users; they have well-defined use cases 
(differential expression in RNA-seq; copy number variation in DNA-seq; 
annotation of regions of interest; machine learning for exploratory analysis; 
...) and are looking for a 'package' that fulfils their use case, rather than 
for algorithms that can in principle be stitched together to form a solution. 
This encourages the packaging of solutions rather than of algorithms. This can 
sometimes be less than optimal, e.g., when it means Import'ing a single function 
from a much larger package. Perhaps as a corollary, my opinion is that code should be 
organized into packages in which vignettes make sense.

Hope that helps, and look forward to your contributions!

Martin