
[Bioc-devel] Invitation to the bioC developers Meeting in Seattle Mon 15 Aug

I wish to propose an agenda topic:

- Is Bioconductor's primary aim to provide a focused repository of 
packages, attracting software that implements cutting-edge 
bioinformatics research from as many quality labs around the world as 
possible? Or is it to produce a set of packages implementing more or less a 
single integrated application?

- If the primary aim is that of a repository, would it be worthwhile to 
spin off a much smaller set of packages with a smaller developer team to 
try to develop towards an integrated application?

Background. Although Bioconductor might have some characteristics of both a 
repository and an integrated application, one of these two paradigms needs 
to take precedence, I think. To make an analogy, is it the aim of 
Bioconductor to be the software analog of a research journal, or is it to 
be the software analog of a monograph?

Under the first model, the development of different packages providing 
different approaches to the same problem, i.e., competing with one another, 
is to be expected and even encouraged. The aim is to promote a stimulating 
environment for the development and dissemination of new techniques. The 
obvious downside, however, is that a research journal presents a very steep 
learning curve for non-statistical users. A research journal can provide 
occasional review articles for a wider audience.

Under the second model, it is not reasonable to expect every lab in the 
world to participate. Instead, one needs to select a smaller team of close 
collaborators. Co-authors on research monographs are normally collaborators 
who are also co-authors on associated research papers. Also, it is not 
realistic to expect a monograph to keep up with the pace of a research 
journal in terms of development of new techniques. So this model will move 
more slowly and be less inclusive, but will be easier to present as an 
integrated solution to a non-specialist audience.

I think that one could view R itself, meaning the set of packages in the 
default distribution, as being an example of the second model. This seems 
to me appropriate, considering that the statistical methodology 
implemented by the standard distribution of R is reasonably 
well-established, mostly part of the canonical core of the statistical 
discipline. On the other hand, the research problems being addressed by 
Bioconductor, almost without exception, do not yet have generally accepted 
solutions. On the contrary, the race is very much on to explore what is 
possible and what is best. This situation makes the contrast between a 
research journal and a monograph unusually marked, with the latter at risk 
of becoming dated unusually quickly.

Best regards
Gordon
At 03:44 AM 4/08/2005, Wolfgang Huber wrote: