----- Original Message -----
From: "Laurent Gautier" <lgautier at gmail.com>
To: "Martin Morgan" <mtmorgan at fredhutch.org>
Cc: bioc-devel at r-project.org, "Dan Tenenbaum" <dtenenba at fredhutch.org>
Sent: Monday, November 10, 2014 9:57:00 AM
Subject: Re: [Bioc-devel] PPA with built bioconductor packages (for
continuous integration)
They would work in the context of a well-defined system such as the VMs
used by popular continuous integration providers (Travis or Drone,
for example).
Then it would be as easy as having the binaries built as artifacts by
continuous integration and made available to other continuous
integration processes.
This sounds to me like a pretty good use case for docker/rocker. We just
need to define what packages should be installed on a given image; I don't
think we want the images to be too big (unlike the AMI). The images could
be rebuilt daily. So you'd still need to download the diffs from the
previous image but I imagine this would take less time than building those
packages from source.
Dan
On Nov 10, 2014 6:19 PM, "Martin Morgan" < mtmorgan at fredhutch.org >
wrote:
On 11/09/2014 11:06 AM, Dan Tenenbaum wrote:
----- Original Message -----
From: "Martin Morgan" < mtmorgan at fredhutch.org >
To: "Laurent Gautier" < lgautier at gmail.com >,
bioc-devel at r-project.org
Sent: Sunday, November 9, 2014 8:26:48 AM
Subject: Re: [Bioc-devel] PPA with built bioconductor packages (for
continuous integration)
On 11/09/2014 07:23 AM, Laurent Gautier wrote:
Hi,
Continuous integration is a convenient way to automate some of the steps
necessary to ensure quality software.
Popular ways to do it create a vanilla virtual machine (VM) with a Linux
distribution, and scripts prepare the VM with the 3rd-party dependencies
required by the software. For example, the popular CI system Travis for
github creates by default a VM running ubuntu, and dependencies can be
installed with `apt-get install`.
When developing software that requires CRAN/bioconductor, the latest R is
available precompiled but the R packages must be downloaded and installed
from source.
This can take a relatively long time. On a recent project over 80% of the
time is spent downloading/installing the R/BioC packages. The remainder is
building the code and running the unit tests.
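
A minimal sketch of the kind of CI setup step described above, assuming an
Ubuntu VM with sudo available. "Biobase" is only a placeholder dependency,
the BiocManager installer is an assumption (it is not mentioned in this
thread), and the `run` helper and `DRY_RUN` variable are hypothetical
names. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of a CI setup step: system packages via apt-get, then R packages
# compiled from source. Assumptions: Ubuntu VM, sudo available, and
# "Biobase" as a placeholder Bioconductor dependency.

# Hypothetical helper: print the command when DRY_RUN=1 (the default),
# execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

ci_setup() {
    run sudo apt-get update
    run sudo apt-get install -y r-base
    # The next two steps download and compile packages from source on
    # Linux; this is the part reported above as dominating the run time.
    run Rscript -e 'install.packages("BiocManager", repos = "https://cloud.r-project.org")'
    run Rscript -e 'BiocManager::install("Biobase", ask = FALSE)'
}

ci_setup
```

Setting `DRY_RUN=0` would execute the commands instead of printing them.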
Having a Personal Package Archive (PPA) with bioconductor packages already
compiled would both speed up the process and make the use of continuous
integration by projects relying on bioconductor packages easier.
Is this something others would like to have, and is this something that
bioconductor would see as part of its mission to provide / help provide
quality software, and be able to host?
It would be interesting to catalogue objectives (e.g., development vs.
reproducibility) and available alternatives (e.g., PPA, docker / Rocker,
AMI, existing or possible cloud services [such as the Bioc 'single package
builder' used to build and check new package submissions, or travis
itself], the Becker repository management scheme Michael and Gabe
mention, ...);
Just to add to the mix of options, it's possible to run
R CMD INSTALL --build on a source tarball on Linux and it will create
a 'binary' version that is already compiled.
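
As a rough sketch of that option: the tarball name below is hypothetical,
and the `run` helper and `DRY_RUN` variable are illustrative names, not
part of R. By default the script only prints the command it would run.

```shell
#!/bin/sh
# Sketch: turn a package source tarball into a precompiled ("binary")
# tarball with R CMD INSTALL --build.

# Hypothetical helper: print the command when DRY_RUN=1 (the default),
# execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_binary() {
    # $1: path to a package source tarball (hypothetical name below)
    run R CMD INSTALL --build "$1"
}

build_binary "mypkg_1.0.0.tar.gz"
```

With `DRY_RUN=0` this installs the package and writes a tarball with a
platform-specific suffix in its file name, which can then be installed on
a matching system with `R CMD INSTALL`.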
These binaries are in general not portable, either within or between
distributions, e.g., because the user has a different version of a
system dependency than the one the binary was built against.
Martin
The problem with this is (AFAIK) there is no corresponding package
type that can be used with install.packages();
otherwise the simplest solution would be to add a CRAN-style repos
containing these "binaries". Maybe R could be patched to allow this?
But it's possible that the requirements for Linux "binaries" could
vary depending on many things: cpu type (intel or solaris, or...),
architecture (i386, x64), presence/absence of BLAS/LAPACK, etc etc
etc. This suggests that a vm or container-based approach might be
better.
Dan
if there is a clear path forward satisfying some plurality of users
without too many technical obstacles then it might fall within the Bioc
purview; my initial sense is that there is not a consensus on use cases or
viable implementations, but I can be convinced otherwise...
In terms of Tim's post, getting your colleague to use a PPA / existing
alternative (e.g., the Bioc AMI,
http://bioconductor.org/help/bioconductor-cloud-ami/ which comes with
Rstudio server installed...) is not likely to be easier / faster than
getting them to download / install relevant R / Bioc packages. One
interesting possibility is a 'hosted' bioconductor with sufficient
computational resources on the back-end and Rstudio server on the front
end; this is not impossible to imagine seeking funding for.
Martin
Best,
Laurent