
[Bioc-devel] GenomicFeatures and/or TxDb.Hsapiens.UCSC.hg19.knownGene issue: missing tibble

EDIT:  I found a general solution! (workaround?) I had written a
response, but I had an idea, tested it and a few hours later I'm
finishing this email. It does work... although not exactly as I
intended it to.



---

Thanks Martin for looking into this =)

I'll respond to your question about making things complicated for myself.


## General scenario

The general scenario is this: I have a package, say `newpkg`, that has
dependencies (imports, suggests and/or depends) on Bioconductor
packages. I want to test that `newpkg` passes R CMD build, R CMD check
and BiocCheck. To do so, all the dependencies of `newpkg` need to be
available, including the "suggests" ones.

If `newpkg` was already available from Bioconductor, I could install
it using BiocManager::install("newpkg"). But that's not necessarily
the case.**

One could install the dependencies for `newpkg` manually, using
BiocManager::install(), remotes, and/or install.packages(). But then,
you need to adapt the code again for `newerpkg`, `oldpkg`, etc.

Currently, Charlotte and I are getting the list of packages that
`newpkg` depends on, either through remotes::install_deps() or by
calling remotes::dev_package_deps(dependencies = TRUE) directly (the
first calls the second:
https://github.com/r-lib/remotes/blob/5b3da5f852772159cc2a580888e10536a43e9199/R/install.R#L193),
and then installing them through either remotes or BiocManager. This
is failing for both of us, though in theory (as far as I know) either
should work. Is this something that could be fixed? I don't know.++


## GitHub Actions

Ultimately in my case, I'm trying to build a GitHub Actions workflow
that will work for any package with Bioconductor dependencies. I'm
nearly there; it's just this last issue with the source-only BioC
packages (annotation, experiment, workflow). I've been working on this
since last week, and in the process I discovered some issues with my
own packages that were masked on the Bioconductor build machines.
Many other packages are already installed on the Bioconductor build
machines and on my laptop, so I hadn't noticed some missing "suggests"
dependencies in some of my packages. For example:
https://github.com/leekgroup/recount/commit/f3bdb77d789f1a8364f16b71bd344fd23ecbfda5.


## Some possibilities to explore

Maybe what we need is some other code that processes the DESCRIPTION
file of `newpkg` and extracts the packages explicitly mentioned in it,
minus the base packages (say that's 10 packages). We would then pass
only those direct dependencies to BiocManager::install(), instead of
the full list of packages in the DESCRIPTION plus their own
dependencies (what you get from remotes::dev_package_deps(), say 100
packages). However, I suspect that it won't work either: I'm expecting
(maybe incorrectly) that BiocManager::install() figures out the right
order in which to install either the short or the long list of
packages, and this is currently failing for the long list.
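
As a minimal sketch of that first idea, the direct dependencies could
be pulled out of a DESCRIPTION file with base R alone. The helper name
`direct_deps()` and its arguments are hypothetical (they are not part
of remotes or BiocManager), and this only builds the short list; it
does not address the installation-order question raised above.

```r
## Hypothetical helper (not part of remotes or BiocManager): list the
## packages named directly in a DESCRIPTION file, minus base R packages.
direct_deps <- function(desc_file = "DESCRIPTION",
                        fields = c("Depends", "Imports", "LinkingTo", "Suggests")) {
    ## read.dcf() returns a one-row matrix; absent fields are NA
    desc <- read.dcf(desc_file, fields = fields)
    ## Split every present field on commas
    entries <- unlist(strsplit(desc[!is.na(desc)], ","))
    ## Drop version requirements such as "(>= 3.5.0)" and stray whitespace
    pkgs <- trimws(gsub("\\([^)]*\\)", "", entries))
    pkgs <- unique(pkgs[nzchar(pkgs)])
    ## "R" itself and the base packages never need installing
    base_pkgs <- rownames(installed.packages(priority = "base"))
    setdiff(pkgs, c("R", base_pkgs))
}

## Possible usage from the package's top-level directory (not run here):
## BiocManager::install(direct_deps())
```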

Another option might involve figuring out, from the full list of
dependencies (remotes::dev_package_deps(dependencies = TRUE)), which
ones are available only as source (maybe those available only through
the BioCann, BioCexp and BioCworkflows repos listed by
BiocManager::repositories()), installing those first, and then
installing the remaining packages from the BioCsoft and CRAN
repositories. Maybe something like:

## This doesn't work since BiocManager::install() doesn't allow using
## the `repos` argument
deps <- remotes::dev_package_deps(dependencies = TRUE)
BiocManager::install(
    deps$package[deps$diff != 0],
    repos = BiocManager::repositories()[c('BioCann', 'BioCexp', 'BioCworkflows')]
)
BiocManager::install(deps$package[deps$diff != 0])

## This also doesn't work since all CRAN deps are missing at this point
remotes::install_deps(
    repos = BiocManager::repositories()[c('BioCann', 'BioCexp', 'BioCworkflows')]
)
remotes::install_deps()


But the above led me to a solution at
https://github.com/leekgroup/derfinderPlot/blob/8695cbee49a01d1d297042232a1593e6c94f1b41/.github/workflows/check-bioc.yml#L139-L165.
That is, install packages in waves: first the CRAN ones, then the BioC
source-only ones, then the BioC software ones. Doing the installation
in this order worked for several of my packages (as many as I could
test tonight).


message(paste('****', Sys.time(), 'installing BiocManager ****'))
remotes::install_cran("BiocManager")

message(paste('****', Sys.time(), 'installing CRAN dependencies ****'))
remotes::install_deps(repos = BiocManager::repositories()['CRAN'])

message(paste('****', Sys.time(), 'installing BioC source-only dependencies ****'))
remotes::install_deps(
    repos = BiocManager::repositories()[c('BioCann', 'BioCexp', 'BioCworkflows')]
)

message(paste('****', Sys.time(), 'installing remaining BioC dependencies ****'))
deps <- remotes::dev_package_deps(
    dependencies = TRUE,
    repos = BiocManager::repositories()
)
BiocManager::install(deps$package[deps$diff != 0])


I added those messages so I could find these steps in the logs more
easily. This works on Bioconductor's devel docker, macOS and Windows
using R 4.0 and BioC 3.11.
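
The wave ordering can also be sketched as a pure planning step that
only sorts package names by the repository offering them, without
installing anything. `plan_waves()` is a hypothetical helper, `avail`
is assumed to look like the matrix returned by available.packages(),
`repos` is assumed to be a named character vector such as the one from
BiocManager::repositories(), and the URLs in the toy data below are
illustrative only.

```r
## Hypothetical helper: group packages into install waves (CRAN, then
## BioC source-only, then BioC software) by the repository providing them.
plan_waves <- function(pkgs, avail, repos) {
    ## For each package, find the name of the repo whose URL is a prefix
    ## of the "Repository" entry reported for that package
    repo_of <- vapply(pkgs, function(p) {
        if (!p %in% rownames(avail)) return(NA_character_)
        hit <- names(repos)[startsWith(avail[p, "Repository"], repos)]
        if (length(hit)) hit[1] else NA_character_
    }, character(1))
    source_only <- c("BioCann", "BioCexp", "BioCworkflows")
    list(
        cran = pkgs[repo_of %in% "CRAN"],
        bioc_source_only = pkgs[repo_of %in% source_only],
        bioc_software = pkgs[repo_of %in% "BioCsoft"],
        unknown = pkgs[is.na(repo_of)]
    )
}

## Toy data standing in for BiocManager::repositories() and
## available.packages(repos = repos); no network access is needed
repos <- c(
    BioCsoft = "https://bioconductor.org/packages/3.11/bioc",
    BioCann = "https://bioconductor.org/packages/3.11/data/annotation",
    CRAN = "https://cloud.r-project.org"
)
avail <- matrix(
    c(paste0(repos[["BioCsoft"]], "/src/contrib"),
      paste0(repos[["BioCann"]], "/src/contrib"),
      paste0(repos[["CRAN"]], "/src/contrib")),
    ncol = 1,
    dimnames = list(
        c("GenomicFeatures", "TxDb.Hsapiens.UCSC.hg19.knownGene", "tibble"),
        "Repository"
    )
)
waves <- plan_waves(rownames(avail), avail, repos)
str(waves)
```

Each element of `waves` could then be handed to the installer of your
choice in that order; whether that alone avoids the ordering failure
described in this email is untested.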

Here are the links to one log file (Windows):

1. BiocManager:
https://github.com/leekgroup/derfinderPlot/runs/621120165?check_suite_focus=true#step:12:40
2. CRAN deps: https://github.com/leekgroup/derfinderPlot/runs/621120165?check_suite_focus=true#step:12:43
(though hm... it does install many BioC ones, not sure why)
3. The BioC source-only deps:
https://github.com/leekgroup/derfinderPlot/runs/621120165?check_suite_focus=true#step:12:1219
(hm... doesn't install anything)
4. BioC remaining deps:
https://github.com/leekgroup/derfinderPlot/runs/621120165?check_suite_focus=true#step:12:1222
This is where TxDb.Hsapiens.UCSC.hg19.knownGene gets installed;
GenomeInfoDbData and tibble are available for GenomicFeatures at this
point, so no errors pop up. This step also installs a few other CRAN
deps; I'm not sure why they weren't installed earlier.


Best,
Leo

** Even if it were, you might not want to install `newpkg` from
Bioconductor/CRAN, since you likely want to test the very latest
version of `newpkg`. Otherwise you risk a false negative: everything
seems to work, but your code is really just checking the latest
released version (bioc-release or bioc-devel for BioC packages)
instead of your development version.

++ Maybe it could be fixed by adding an explicit dependency in
GenomicFeatures on both GenomeInfoDbData and tibble, though I'm not
sure. But it seems like fixing the order in which packages are
installed is the more general problem.
On Sun, Apr 26, 2020 at 5:53 PM Martin Morgan <mtmorgan.bioc at gmail.com> wrote: