[Bioc-devel] Dependencies in Bioconductor dockers
On 29 August 2015 01:19, Martin Morgan wrote:
On 08/28/2015 02:51 PM, Dan Tenenbaum wrote:
----- Original Message -----
From: "Laurent Gatto" <lg390 at cam.ac.uk>
To: "Dan Tenenbaum" <dtenenba at fredhutch.org>
Cc: "Kasper Daniel Hansen" <kasperdanielhansen at gmail.com>, "bioC-devel" <bioc-devel at stat.math.ethz.ch>, "Laurent Gatto" <lg390 at cam.ac.uk>
Sent: Friday, August 28, 2015 2:28:29 PM
Subject: Re: [Bioc-devel] Dependencies in Bioconductor dockers

On 28 August 2015 20:42, Dan Tenenbaum wrote:
----- Original Message -----
From: "Kasper Daniel Hansen" <kasperdanielhansen at gmail.com>
To: "Laurent Gatto" <lg390 at cam.ac.uk>
Cc: "bioC-devel" <bioc-devel at stat.math.ethz.ch>
Sent: Wednesday, August 26, 2015 2:36:08 PM
Subject: Re: [Bioc-devel] Dependencies in Bioconductor dockers

This might be especially nice if we use the docker containers for R CMD check.
In this case, you would be checking your own package, right, so the docker image cannot know in advance what the Suggests dependencies of your package are. [More below].
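To illustrate the point above: the Suggests list lives in each package's own DESCRIPTION file, so it only becomes known once the package source is at hand. The following is a minimal sketch, not anything the Bioconductor images actually run. It assumes the DESCRIPTION file follows the usual "Field: value" layout with indented continuation lines, and the function name `parse_dcf_field` is hypothetical.

```python
import re


def parse_dcf_field(description_text, field):
    """Return the package names listed in one field (e.g. 'Suggests').

    Assumes 'Field: value' lines, with continuation lines starting
    with a space or tab. Version constraints such as '(>= 0.9)' are dropped.
    """
    value_parts = []
    capturing = False
    for line in description_text.splitlines():
        # Indented lines continue the field we are currently reading.
        if capturing and line[:1] in (" ", "\t"):
            value_parts.append(line.strip())
            continue
        capturing = False
        if line.startswith(field + ":"):
            capturing = True
            value_parts.append(line[len(field) + 1:].strip())

    names = []
    for entry in " ".join(value_parts).split(","):
        # Strip an optional version constraint in parentheses.
        name = re.sub(r"\(.*?\)", "", entry).strip()
        if name:
            names.append(name)
    return names


if __name__ == "__main__":
    # Made-up DESCRIPTION content, for illustration only.
    example = (
        "Package: examplepkg\n"
        "Imports: methods\n"
        "Suggests: knitr, testthat (>= 0.9),\n"
        "    BiocStyle\n"
        "License: GPL-2\n"
    )
    print(parse_dcf_field(example, "Suggests"))
```

A check script could run something like this against the package being checked and install only the missing packages, rather than relying on the image to carry them in advance.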
On Wed, Aug 26, 2015 at 10:56 PM, Laurent Gatto <lg390 at cam.ac.uk> wrote:
Dear all, As far as I can see, the Suggests dependencies of a package are not included in the docker containers. Would you consider adding these? It would be nice to be able to run all examples and vignette code of the packages available in a container.
Adding the Suggests dependencies of all packages installed on the image is going to make the image much bigger. This request comes soon after other requests to reduce the size of the images. We should probably have a wider discussion and decide exactly what type of docker images we want to have.
Use cases that have been mentioned are:
- an image for building/checking with travis (sounds similar to Kasper's request above). For this one in particular, small size is important as Travis has to build its environment from scratch every time, and loading large images takes too long.
- an image that has the Suggests dependencies of all installed packages installed.
We might want to pick a different way to decide what packages are installed on a given image. Currently we install all packages with a given biocView (Sequencing for example) and this leads to very large images (sequencing = ~7.5GB).
Thank you for these clarifications, Dan. If there is interest in having full/complete containers in addition to requiring light ones, would it make sense to distribute both? Would that be much overhead?
I think it definitely makes sense to distribute the light containers (and even then, I want to see how small a 'light' container is: one that contains R, LaTeX, and every system dependency that we know about). I am a little hesitant to make the existing bloated containers even bigger by adding all the Suggests dependencies. That's why I said we might want to revisit the way we decide what packages are on a given container. Right now we use biocViews (Microarray, Sequencing, Proteomics, FlowCytometry), but that results in huge containers containing many packages that people arguably don't use that much but just happen to have the correct biocView. Of course it does have the benefit of being a somewhat democratic method.
I don't really know what I'm talking about, but does it make sense to think of the docker images provided by Bioconductor as building blocks for more specialized containers? i.e., that it should not be 'hard' for a developer to make an image that is appropriate for their particular needs? It seems like there's value to some level of nimbleness provided by small container size.

I also wonder about LaTeX -- it seems like HTML vignettes are way better, and since docker images are forward-looking, maybe the images should be provisioned with the notion that they'll support HTML?

Maybe there could be a docker-factory script that would take the name of a base image and the path to a package repository, and create a derived image with the additional necessary dependencies?
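As a rough sketch of what such a docker-factory script might look like: the function below takes a base image name and a list of extra package names and renders the text of a derived Dockerfile. Everything here is an assumption for illustration. The name `render_dockerfile` and the image name `my-base-image` are hypothetical, and the install command assumes the base image ships R with the BiocManager package (images that use a different installer would need a different command).

```python
import re

# Conservative pattern for R package names: a letter, then letters, digits or dots.
_PKG_NAME = re.compile(r"[A-Za-z][A-Za-z0-9.]*")


def render_dockerfile(base_image, packages):
    """Render Dockerfile text for an image derived from `base_image`.

    `packages` is an iterable of extra R package names to install.
    Assumes BiocManager is available in the base image.
    """
    unique = sorted(set(packages))
    for name in unique:
        # Reject anything that does not look like a package name,
        # since the names are pasted into a shell command.
        if not _PKG_NAME.fullmatch(name):
            raise ValueError("invalid package name: {!r}".format(name))

    lines = ["FROM {}".format(base_image)]
    if unique:
        quoted = ", ".join('"{}"'.format(name) for name in unique)
        install = "BiocManager::install(c({}), ask = FALSE, update = FALSE)".format(quoted)
        lines.append("RUN R -e '{}'".format(install))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical base image and package list.
    print(render_dockerfile("my-base-image", ["knitr", "BiocStyle", "knitr"]))
```

The package list could come from the Suggests, Imports and Depends fields of the packages in the repository path, minus whatever the base image already provides. Keeping the generator separate from the base images would let the base images stay small.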
That sounds like a great idea. It would still be nice if Bioconductor kept the topic-specific containers (flow, microarrays, proteomics, sequencing).
Laurent
Best wishes, Laurent
_______________________________________________ Bioc-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel
Laurent Gatto | @lgatt0 http://cpu.sysbiol.cam.ac.uk/