[Bioc-devel] GenomicFeatures and/or TxDb.Hsapiens.UCSC.hg19.knownGene issue: missing tibble
Hi,

Thanks for the discussion and the insight into possible solutions. I currently have the same problem with setting up a GitHub Action for RNAmodR. It fails because RNAmodR requires RNAmodR.Data, which suggests GenomicRanges, which then leads to GenomeInfoDb and GenomeInfoDbData.

What struck me was that GenomeInfoDbData is installed as a source package after RNAmodR.Data, which is basically the same situation Leonardo describes for the TxDb packages. So why is GenomeInfoDbData, which does not have any dependencies at all (except R), not installed first?

I tried to fix it by adding GenomeInfoDbData to the Depends of RNAmodR. This solved the problem on macOS, but on Windows the tibble problem remains (https://github.com/FelixErnst/RNAmodR/runs/623569724). This makes sense, because tibble is currently available as a binary for macOS, but not for Windows. Adding tibble will probably solve this as well, but that cannot be a permanent solution, can it?

I also have a question regarding the inner workings of BiocManager::install(). I used the following command to install dependencies:

BiocManager::install(remotes::dev_package_deps(dependencies = TRUE, repos = c(BiocManager::repositories(), getOption('repos')))$package)

Is the order in which the packages are given important?

To state a hypothesis: in both cases, GenomeInfoDbData and tibble, the affected packages are source packages that are required by a binary package, which in turn is required by a source package. Maybe this bridge through a binary package is not picked up when sorting the source packages into an install order. Does this sound reasonable, or is there something I haven't thought about?

Thanks for any advice.

Best regards,
Felix

-----Original Message-----
From: Bioc-devel <bioc-devel-bounces at r-project.org> On Behalf Of Leonardo Collado Torres
Sent: Monday, 27 April 2020 18:07
To: Charlotte Soneson <charlottesoneson at gmail.com>
Cc: Bioc-devel <bioc-devel at r-project.org>
Subject: Re: [Bioc-devel] GenomicFeatures and/or TxDb.Hsapiens.UCSC.hg19.knownGene issue: missing tibble

Hi,

I also ran more tests, which make me think that the issue was with the list of dependencies we were asking `remotes` to install.

First, regarding the second to last email from Charlotte, the step-wise installation I did mostly using remotes was not ideal. I found a complicated scenario in another package that I contribute to where BSgenome.Hsapiens.UCSC.hg19 had to be downloaded twice (it failed the first time). That was `brainflowprobes` on Windows at https://github.com/LieberInstitute/brainflowprobes/runs/621460015?check_suite_focus=true#step:12:1142 and https://github.com/LieberInstitute/brainflowprobes/runs/621460015?check_suite_focus=true#step:12:1410. Downloading such a big package (or any package) twice is really wasteful. So we can discard that path.

Secondly, I also found out about remotes::local_package_deps(), which Charlotte just mentioned, prompted by Martin's question. As suggested by Martin, I'm now trying to use BiocManager::install() only, since it knows how to resolve Bioc's dependency tree. Thus my current GHA workflow uses BiocManager::install() with the "minimal" deps (the immediate dependencies). I still use remotes::dev_package_deps() to find which packages need to be updated in order to enable the caching functionality later on. I do this installation twice, just as a backup. Then I do a third BiocManager::install() call with any outdated packages across the full dependencies.
That's what https://github.com/leekgroup/derfinderPlot/blob/673608493488ae488ccb66e77e6deae5dabe69e0/.github/workflows/check-bioc.yml#L367-L393 does, and here's the relevant code:

message(paste('****', Sys.time(), 'installing BiocManager ****'))
remotes::install_cran("BiocManager")

## Pass #1 at installing dependencies
message(paste('****', Sys.time(), 'pass number 1 at installing dependencies: local dependencies ****'))
local_deps <- remotes::local_package_deps(dependencies = TRUE)
deps <- remotes::dev_package_deps(dependencies = TRUE, repos = BiocManager::repositories())
BiocManager::install(local_deps[local_deps %in% deps$package[deps$diff != 0]])

## Pass #2 at installing dependencies
message(paste('****', Sys.time(), 'pass number 2 at installing dependencies: local dependencies again ****'))
deps <- remotes::dev_package_deps(dependencies = TRUE, repos = BiocManager::repositories())
BiocManager::install(local_deps[local_deps %in% deps$package[deps$diff != 0]])

## Pass #3 at installing dependencies
message(paste('****', Sys.time(), 'pass number 3 at installing dependencies: any remaining dependencies ****'))
deps <- remotes::dev_package_deps(dependencies = TRUE, repos = BiocManager::repositories())
BiocManager::install(deps$package[deps$diff != 0])

For all 3 OS (Bioconductor devel docker, macOS, Windows) this works for `derfinderPlot`: https://github.com/leekgroup/derfinderPlot/actions/runs/89153451. Actually, in all 3, "pass #2" did nothing. Only in the docker one did pass 3 do something (it updated `pkgbuild`, which is not a direct dependency of `derfinderPlot`).

If you like this, given that `BiocManager` already suggests `remotes`, I could add a PR.
Something like (with all the arguments and all that):

bioc_dev_package_deps <- function() {
    local_deps <- remotes::local_package_deps(dependencies = TRUE)
    deps <- remotes::dev_package_deps(dependencies = TRUE, repos = BiocManager::repositories())
    BiocManager::install(local_deps[local_deps %in% deps$package[deps$diff != 0]])
}

Charlotte, in your small example, I saw that https://github.com/csoneson/testinstall/commit/a5d7f473cad8fbaa4c7df8672dbdcf1994c0dd38 worked. Maybe it would still work with BiocManager::install('TxDb.Hsapiens.UCSC.hg19.knownGene') only (the minimal "deps" you found with Martin's code).

As for remotes::install_bioc(), my understanding is that that function ends up using the git Bioconductor versions. Double checking right now, I see that the behavior depends on whether `git2r` is installed: https://github.com/r-lib/remotes/blob/master/R/install-bioc.R#L66.

Best,
Leo

PS If you update many packages at the same time with GHA, you can run into timeout problems :P I was just trying to update my GHA workflow on all my packages before the BioC 3.11 freeze.

PS2 Charlotte, I recommend that you set the GITHUB_PAT environment variable https://github.com/leekgroup/derfinderPlot/blob/673608493488ae488ccb66e77e6deae5dabe69e0/.github/workflows/check-bioc.yml#L100. Otherwise you depend on the one included in remotes and can run into rate limiting issues. Though I actually ran into some even with it :P
https://github.com/LieberInstitute/recountWorkflow/runs/622552889?check_suite_focus=true#step:13:15
https://github.com/LieberInstitute/recountWorkflow/runs/622552889?check_suite_focus=true#step:13:20
On Mon, Apr 27, 2020 at 11:29 AM Charlotte Soneson <charlottesoneson at gmail.com> wrote:
Hi again,

as for getting the immediate dependencies, this is what remotes does internally (it also includes recommended packages, though):

deps <- remotes::local_package_deps(pkgdir = ".", dependencies = TRUE)

> I guess I would further simplify the problem by eliminating from 'deps' the packages that do not contribute to the problem. So I wonder what a minimal 'deps' looks like? This would be much more helpful for understanding the problem than many 1000's of lines of output from CI.

For my small example package, deps (as created with your code below) was just TxDb.Hsapiens.UCSC.hg19.knownGene.

> So I conclude that the problem is actually IN BASE R, and that the fix is in the incredibly complicated logic of install.packages.

I'm going to agree with this :) Just to try to get a bit further, I went through the code of remotes, and in the end it comes down to calling install.packages() with a certain list of packages and a specified set of repos (e.g. https://github.com/r-lib/remotes/blob/5b3da5f852772159cc2a580888e10536a43e9199/R/install.R#L75). So I tried to just install some packages with install.packages(), in an empty library, to see what would happen. Experiments are here (not very easily accessible, I admit, but still) if someone is interested: https://github.com/csoneson/testinstall/actions

Long story short, this works fine:
install.packages("TxDb.Hsapiens.UCSC.hg19.knownGene", repos =
c(getOption('repos'), BiocManager::repositories()), type = "both",
dependencies = TRUE)
This doesn't (tries to install annotation packages before GenomeInfoDbData):
install.packages(c("TxDb.Hsapiens.UCSC.hg19.knownGene", "IRanges",
"GenomeInfoDb"), repos = c(getOption('repos'),
BiocManager::repositories()), type = "both", dependencies = TRUE)
With other combinations of packages, sometimes it works, sometimes not. At the same time, this works fine:
BiocManager::install(c("TxDb.Hsapiens.UCSC.hg19.knownGene", "IRanges",
"GenomeInfoDb"))
And one more observation which may or may not be related: locally, this fails for me:
remotes::install_bioc('TxDb.Hsapiens.UCSC.hg19.knownGene')
while this works:
remotes::install_bioc('SummarizedExperiment')
Charlotte
On 27 Apr 2020, at 12:11, Martin Morgan <mtmorgan.bioc at gmail.com> wrote:
Personally, I wouldn't trust remotes to get the Bioconductor repositories, and hence the dependency graph, correct. I say this mostly because of problems getting BiocManager to get the repositories right during each phase of the Bioconductor release cycle, not to diss the remotes package.
I'd grab the immediate dependencies ONLY of the new package from the
DESCRIPTION file (does remotes have a function to do this? I'd trust
it to do a better job than my hack, but I'd double check it)
dcf = read.dcf("DESCRIPTION", c("Depends", "Imports", "LinkingTo",
    "Enhances", "Suggests"))
deps = unlist(strsplit(dcf, ",[[:space:]]*"))
deps = sub(" .*", "", deps) # no version info
deps = setdiff(deps, c("R", NA, rownames(installed.packages(priority = "high"))))
I'd then do the builds with
BiocManager::install(deps)
If that failed and I wanted to 'peel back' a layer of responsibility
to get closer to a minimal reproducible example, I'd do
install.packages(deps, repos = BiocManager::repositories())
I believe that (maybe you can confirm?) this fails for your case.
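As a self-contained illustration of the DESCRIPTION-parsing step above, here is a minimal sketch that runs on a made-up DESCRIPTION written to a temporary file. The helper name `direct_deps` and the toy package `newpkg` are hypothetical and not part of remotes or BiocManager; the version stripping is also done slightly differently from the hack above (by removing the parenthesised part).

```r
# Hypothetical helper (not part of remotes or BiocManager): list the packages
# named directly in a DESCRIPTION file, without versions, R itself, or
# base/recommended packages.
direct_deps <- function(description = "DESCRIPTION",
                        fields = c("Depends", "Imports", "LinkingTo",
                                   "Enhances", "Suggests")) {
    dcf <- read.dcf(description, fields = fields)
    deps <- unlist(strsplit(dcf[!is.na(dcf)], ","))
    # drop "(>= x.y.z)" version requirements and surrounding whitespace
    deps <- trimws(sub("\\(.*\\)", "", deps))
    high <- rownames(installed.packages(priority = "high"))
    setdiff(deps[nzchar(deps)], c("R", high))
}

# Toy DESCRIPTION for a made-up package, written to a temporary file
desc <- tempfile()
writeLines(c(
    "Package: newpkg",
    "Version: 0.0.1",
    "Depends: R (>= 4.0), methods",
    "Imports: GenomicFeatures (>= 1.40.0),",
    "    IRanges",
    "Suggests: TxDb.Hsapiens.UCSC.hg19.knownGene, knitr"
), desc)

deps <- direct_deps(desc)
print(deps)
```

The resulting vector is what would then be handed to BiocManager::install(deps) in the step described above.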
BiocManager::repositories() is just a named character vector -- in
devel it is currently
dput(BiocManager::repositories())
c(BioCsoft = "https://bioconductor.org/packages/3.11/bioc",
BioCann = "https://bioconductor.org/packages/3.11/data/annotation",
BioCexp = "https://bioconductor.org/packages/3.11/data/experiment",
BioCworkflows = "https://bioconductor.org/packages/3.11/workflows",
CRAN = "https://cran.rstudio.com")
So I conclude that the problem is actually IN BASE R, and that the fix is in the incredibly complicated logic of install.packages.
I guess I would further simplify the problem by eliminating from 'deps' the packages that do not contribute to the problem. So I wonder what a minimal 'deps' looks like? This would be much more helpful for understanding the problem than many 1000's of lines of output from CI.
This might help to come up with a simple example to demonstrate Charlotte's conclusion that the source packages are installed in the wrong order. And that might also lead to a difference between parallel (options(Ncpus = 8), for example) versus serial (options(Ncpus = NULL)) builds...
Thanks for your exhaustive work on this!
Martin
On 4/27/20, 2:03 AM, "Leonardo Collado Torres" <lcolladotor at gmail.com> wrote:
EDIT: I found a general solution! (workaround?) I had written a
response, but I had an idea, tested it and a few hours later I'm
finishing this email. It does work... although not exactly as I
intended it to.
---
Thanks Martin for looking into this =)
I'll respond to your question about making things complicated for myself.
## General scenario
The general scenario is, I have a package say `newpkg`. `newpkg` has
dependencies (imports, suggests and/or depends) on Bioconductor
packages. I want to test that `newpkg` passes R CMD build, check &
BiocCheck. To do so, we need all the dependencies of `newpkg`
available including the "suggests" ones.
If `newpkg` was already available from Bioconductor, I could install
it using BiocManager::install("newpkg"). But that's not necessarily
the case.**
One could install the dependencies for `newpkg` manually, using
BiocManager::install(), remotes, and/or install.packages(). But then,
you need to adapt the code again for `newerpkg`, `oldpkg`, etc.
Currently, either through remotes::install_deps() or through
remotes::dev_package_deps(dependencies = TRUE) directly (the first
calls the second
https://github.com/r-lib/remotes/blob/5b3da5f852772159cc2a580888e10536a43e9199/R/install.R#L193)
Charlotte and I are getting the list of packages that `newpkg` depends
on, then either installing them through remotes or BiocManager. This
is failing for both of us, though in theory (as far as I know) either
should work. Is this something that could be fixed? I don't know.++
## GitHub Actions
Ultimately in my case, I'm trying to build a GitHub Actions workflow
that will work for any package with Bioconductor dependencies. I'm
nearly there, it's just this last issue about the source-only BioC
packages (annotation, experiment, workflow). I've been doing this
since last week and through this process I discovered some issues with
my own packages that were masked in the Bioconductor build machines.
Many other packages are already installed in the Bioconductor build
machines and on my laptop, so I hadn't noticed some missing "suggests"
dependencies on some of my packages. For example
https://github.com/leekgroup/recount/commit/f3bdb77d789f1a8364f16b71bd344fd23ecbfda5.
## Some possibilities to explore
Maybe what we need is some other code to process the DESCRIPTION file
of `newpkg`, extract the list of packages explicitly mentioned on
DESCRIPTION (removing those that are base packages, say it's 10
packages), then just install those direct dependencies (the 10
packages) instead of all the packages listed in the DESCRIPTION and
their dependencies (what you can get from remotes::dev_package_deps(),
say 100 packages) and pass this smaller list of direct dependencies to
BiocManager::install(). However, I suspect that it won't work either,
because again, I'm expecting (maybe incorrectly) that
BiocManager::install() figures out the right order in which to install
either the short or long list of packages and this is currently
failing for the long list.
Another option might involve figuring out from the full list of
dependencies (remotes::dev_package_deps(dependencies = TRUE) ), which
ones are available only through source (maybe those available only
through repos BioCann, BioCexp, BioCworkflows from
BiocManager::repositories() ) and install those first, then install
the remaining packages that exist in the BioCsoft and CRAN
repositories. Maybe something like:
## This doesn't work since BiocManager::install() doesn't allow using
## the `repos` argument
deps <- remotes::dev_package_deps(dependencies = TRUE)
BiocManager::install(deps$package[deps$diff != 0], repos =
BiocManager::repositories()[c('BioCann', 'BioCexp', 'BioCworkflows')]
)
BiocManager::install(deps$package[deps$diff != 0])
## This also doesn't work since all CRAN deps are missing at this point
remotes::install_deps( repos =
BiocManager::repositories()[c('BioCann', 'BioCexp', 'BioCworkflows')]
)
remotes::install_deps()
## But the above led me to a solution at
https://github.com/leekgroup/derfinderPlot/blob/8695cbee49a01d1d297042232a1593e6c94f1b41/.github/workflows/check-bioc.yml#L139-L165.
That is, install packages in waves: first the CRAN ones, then the BioC
source-only ones, then the BioC software ones. Doing the installation
in this order worked for several of my packages (as many as I could
test tonight).
message(paste('****', Sys.time(), 'installing BiocManager ****'))
remotes::install_cran("BiocManager")
message(paste('****', Sys.time(), 'installing CRAN dependencies ****'))
remotes::install_deps(repos = BiocManager::repositories()['CRAN'])
message(paste('****', Sys.time(), 'installing BioC source-only dependencies ****'))
remotes::install_deps(repos = BiocManager::repositories()[c('BioCann',
'BioCexp', 'BioCworkflows')])
message(paste('****', Sys.time(), 'installing remaining BioC dependencies ****'))
deps <- remotes::dev_package_deps(dependencies = TRUE, repos =
BiocManager::repositories())
BiocManager::install(deps$package[deps$diff != 0])
I added those messages so I could find these steps on the logs more
easily and it works for Bioconductor's devel docker, macOS and Windows
using R 4.0 and BioC 3.11.
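To make the "waves" idea concrete without touching the network, here is a minimal sketch that only classifies packages by the repository providing them; it installs nothing. The helpers `repo_of` and `install_waves` are hypothetical, the `repos` vector is copied from the BiocManager::repositories() output quoted elsewhere in this thread, and the toy `db` matrix is a hand-written stand-in for the result of available.packages().

```r
# Repository vector as quoted in this thread for BioC 3.11 (devel at the time)
repos <- c(
    BioCsoft = "https://bioconductor.org/packages/3.11/bioc",
    BioCann = "https://bioconductor.org/packages/3.11/data/annotation",
    BioCexp = "https://bioconductor.org/packages/3.11/data/experiment",
    BioCworkflows = "https://bioconductor.org/packages/3.11/workflows",
    CRAN = "https://cran.rstudio.com"
)

# Hypothetical helper: name of the repository that provides each package,
# judged by the prefix of the "Repository" column of `db`.
repo_of <- function(pkgs, db, repos) {
    url <- unname(db[match(pkgs, db[, "Package"]), "Repository"])
    hit <- vapply(url, function(u) {
        i <- which(startsWith(u, paste0(repos, "/")))
        if (length(i)) names(repos)[i[1]] else NA_character_
    }, character(1))
    setNames(hit, pkgs)
}

# Hypothetical helper: split packages into the three waves described above.
install_waves <- function(pkgs, db, repos) {
    src <- repo_of(pkgs, db, repos)
    list(
        cran = pkgs[src %in% "CRAN"],
        bioc_source_only = pkgs[src %in% c("BioCann", "BioCexp", "BioCworkflows")],
        bioc_software = pkgs[src %in% "BioCsoft"]
    )
}

# Toy stand-in for available.packages(repos = repos)
contrib <- function(name) paste0(repos[[name]], "/src/contrib")
db <- cbind(
    Package = c("tibble", "GenomeInfoDbData",
                "TxDb.Hsapiens.UCSC.hg19.knownGene", "GenomicFeatures"),
    Repository = c(contrib("CRAN"), contrib("BioCann"),
                   contrib("BioCann"), contrib("BioCsoft"))
)

waves <- install_waves(unname(db[, "Package"]), db, repos)
print(waves)
```

Note that this only sorts names into groups; as the log links below show, the actual install calls did not split as cleanly as the wave names suggest.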
Here are the links to one log file (Windows):
1. BiocManager:
https://github.com/leekgroup/derfinderPlot/runs/621120165?check_suite_focus=true#step:12:40
2. CRAN deps: https://github.com/leekgroup/derfinderPlot/runs/621120165?check_suite_focus=true#step:12:43
(though hm... it does install many BioC ones, not sure why)
3. The BioC source-only deps:
https://github.com/leekgroup/derfinderPlot/runs/621120165?check_suite_focus=true#step:12:1219
(hm... doesn't install anything)
4. BioC remaining deps:
https://github.com/leekgroup/derfinderPlot/runs/621120165?check_suite_focus=true#step:12:1222
This is where TxDb.Hsapiens.UCSC.hg19.knownGene gets installed;
GenomeInfoDbData and tibble are available for GenomicFeatures at this
point, so no errors pop up. This step also installs a few other CRAN
deps which I'm not sure why they didn't install before.
Best,
Leo
** Even if it was, you might not want to actually install the package
`newpkg` from Bioconductor/CRAN since you likely want to test the very
latest version of `newpkg` and avoid any false negative errors where
everything seems to work, but your code is really just checking the
latest release version (bioc-release or bioc-devel for BioC packages)
instead of your development version.
++ Maybe it could be fixed by adding an explicit dependency on
GenomicFeatures to both GenomeInfoDbData and tibble, though I'm not
sure. But it seems like fixing the order in which packages are
installed is the more general problem.
On Sun, Apr 26, 2020 at 5:53 PM Martin Morgan <mtmorgan.bioc at gmail.com> wrote:
I spent a bit of time not understanding why you were being so complicated -- BiocManager::install() finds all CRAN / Bioc dependencies, there's no need to use remotes at all and for debugging purposes it just seemed (still seems?) like you were making trouble for yourself.
But eventually... I created a fake CRAN-style repository
$ tree my_repo/
my_repo/
├── bin
│   └── macosx
│       └── contrib
│           └── 4.0
│               └── PACKAGES
└── src
    └── contrib
        └── PACKAGES
The plain-text PACKAGES file is an index of the packages that are
supposed to be available. So under the 'bin' tree I have
---
Package: foo
Version: 1.0.0
NeedsCompilation: true
Package: bar
Version: 1.0.0
Depends: foo
Package: baz
Version: 1.0.0
Depends: bar
---
baz depends on bar depends on foo, and binary versions are all at
1.0.0
Under the src tree I have
---
Package: foo
Version: 1.0.1
NeedsCompilation: true
Package: bar
Version: 1.0.0
Depends: foo
Package: baz
Version: 1.0.0
Depends: bar
---
with a more recent src for foo at version 1.0.1. I guess this is (almost) the situation with GenomeInfoDbData / tibble.
In an R session I have
available.packages(repos="file:///tmp/my_repo/")
    Package Version Priority Depends Imports LinkingTo Suggests Enhances
foo "foo"   "1.0.1" NA       NA      NA      NA        NA       NA
bar "bar"   "1.0.0" NA       "foo"   NA      NA        NA       NA
baz "baz"   "1.0.0" NA       "bar"   NA      NA        NA       NA
    License License_is_FOSS License_restricts_use OS_type Archs MD5sum
foo NA      NA              NA                    NA      NA    NA
bar NA      NA              NA                    NA      NA    NA
baz NA      NA              NA                    NA      NA    NA
    NeedsCompilation File Repository
foo "true"           NA   "file:///tmp/my_repo/src/contrib"
bar NA               NA   "file:///tmp/my_repo/src/contrib"
baz NA               NA   "file:///tmp/my_repo/src/contrib"
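For anyone wanting to reproduce the fake repository, here is a minimal sketch that writes the two PACKAGES index files into a temporary directory and queries the source index. It assumes the foo/bar/baz layout described above; the location under tempdir() and the helper `index()` are choices made for this example, and no package tarballs are created, so an actual install would still fail as described.

```r
# Build a throwaway CRAN-style repository index (PACKAGES files only, no
# package tarballs), mirroring the foo/bar/baz layout described above.
repo <- file.path(tempdir(), "my_repo")
src_dir <- file.path(repo, "src", "contrib")
bin_dir <- file.path(repo, "bin", "macosx", "contrib", "4.0")
dir.create(src_dir, recursive = TRUE, showWarnings = FALSE)
dir.create(bin_dir, recursive = TRUE, showWarnings = FALSE)

# One record per package; records are separated by a blank line.
index <- function(foo_version) c(
    "Package: foo", paste("Version:", foo_version), "NeedsCompilation: true", "",
    "Package: bar", "Version: 1.0.0", "Depends: foo", "",
    "Package: baz", "Version: 1.0.0", "Depends: bar"
)
writeLines(index("1.0.0"), file.path(bin_dir, "PACKAGES"))  # binary index
writeLines(index("1.0.1"), file.path(src_dir, "PACKAGES"))  # newer source foo

# file:// URL for the repository root (extra slash for Windows drive paths)
path <- normalizePath(repo, winslash = "/")
repo_url <- paste0(if (startsWith(path, "/")) "file://" else "file:///", path)

# Query the source index explicitly so the result does not depend on the
# platform's default package type.
ap <- available.packages(repos = repo_url, type = "source")
print(ap[, c("Package", "Version", "Depends")])
```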
I'll try to 'install' baz; it'll fail because there are no packages to install, but it's still informative...
install.packages("baz", repos = "file:///tmp/my_repo")
Installing package into '/Users/ma38727/Library/R/4.0/Bioc/3.11/library'
(as 'lib' is unspecified)
also installing the dependencies 'foo', 'bar'
  There is a binary version available but the source version is later:
    binary source needs_compilation
foo  1.0.0  1.0.1              TRUE

Do you want to install from sources the package which needs compilation? (Yes/no/cancel) yes
Warning in download.packages(pkgs, destdir = tmpd, available = available, :
  package 'bar' does not exist on the local repository
Warning in download.packages(pkgs, destdir = tmpd, available = available, :
  package 'baz' does not exist on the local repository
installing the source package 'foo'
Warning in download.packages(pkgs, destdir = tmpd, available = available, :
  package 'foo' does not exist on the local repository
Note the order of downloads -- binaries first, then source as you
found! (actually, this would 'work' because the binaries are installed
without any test load, but in more complicated situations...)
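One way such an ordering could go wrong, as hypothesised elsewhere in this thread, is that source packages get ordered among themselves while a dependency path running through a binary package is ignored. The sketch below only illustrates that idea on a toy graph; it is not a description of what install.packages() does internally. The edges and the source/binary labels are simplified assumptions (in reality GenomicFeatures reaches tibble only indirectly), and `install_order` is a hypothetical helper with no cycle detection.

```r
# Toy dependency graph based on packages named in this thread; the edges
# and the "type" labels are simplified assumptions for illustration.
deps <- list(
    TxDb.Hsapiens.UCSC.hg19.knownGene = "GenomicFeatures",
    GenomicFeatures = "tibble",   # indirect in reality
    tibble = character(0)
)
type <- c(TxDb.Hsapiens.UCSC.hg19.knownGene = "source",
          GenomicFeatures = "binary",
          tibble = "source")

# Depth-first topological sort: every package appears after its dependencies.
install_order <- function(deps) {
    ordered <- character(0)
    visit <- function(pkg) {
        if (pkg %in% ordered) return(invisible(NULL))
        for (d in deps[[pkg]]) visit(d)
        ordered <<- c(ordered, pkg)
    }
    for (pkg in names(deps)) visit(pkg)
    ordered
}

# Ordering over the full graph puts tibble before the TxDb package.
full <- install_order(deps)
print(full)

# Restricting the graph to source packages drops the edge that runs through
# the binary GenomicFeatures, so nothing forces tibble ahead of the TxDb package.
src <- names(type)[type == "source"]
src_only <- lapply(deps[src], intersect, src)
src_order <- install_order(src_only)
print(src_order)
```

In the restricted graph the TxDb package comes out first, which matches the failure pattern reported in this thread, though that alone does not prove this is the mechanism.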
On the other hand, if I answer 'no' to install the more recent source
packages I get
  There is a binary version available but the source version is later:
    binary source needs_compilation
foo  1.0.0  1.0.1              TRUE

Do you want to install from sources the package which needs compilation? (Yes/no/cancel) no
Warning in download.packages(pkgs, destdir = tmpd, available = available, :
  package 'foo' does not exist on the local repository
Warning in download.packages(pkgs, destdir = tmpd, available = available, :
  package 'bar' does not exist on the local repository
Warning in download.packages(pkgs, destdir = tmpd, available = available, :
  package 'baz' does not exist on the local repository
installing in the order required for dependencies.
If I remove baz from the source repository, I get a similar order of events, with an additional prompt about installing 'baz' from source.
I don't actually see, from the 'Binary packages' section of ?install.packages, how to get R to respond 'no' to the prompt to install the more recent source package foo, but still install the source-only package 'baz'...
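For non-interactive runs, base R documents two options in ?options that control this prompt: `install.packages.check.source` and `install.packages.compile.from.source`. The sketch below sets them temporarily around an expression. Whether this combination also gives the behaviour asked about here (skip the newer source of foo but still install a source-only baz) is not confirmed anywhere in this thread, so treat it as something to try rather than a fix; the wrapper name `without_source_prompt` is hypothetical.

```r
# Hypothetical wrapper: evaluate `expr` with the install.packages() source
# prompts answered up front, then restore the previous option values.
without_source_prompt <- function(expr) {
    old <- options(
        install.packages.check.source = "no",
        install.packages.compile.from.source = "never"
    )
    on.exit(options(old), add = TRUE)
    force(expr)   # `expr` is evaluated lazily, i.e. with the options in place
}

# Usage sketch; the install call is commented out because it needs a real
# repository:
# without_source_prompt(install.packages("baz", type = "both"))

before <- getOption("install.packages.compile.from.source")
inside <- without_source_prompt(getOption("install.packages.compile.from.source"))
after <- getOption("install.packages.compile.from.source")
print(inside)
```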
Of course this is transient, occurring when there are more recent sources than binaries; my own installation of TxDb on macOS found a binary tibble as current as the source, and went without problem.
Martin
On 4/26/20, 4:48 PM, "Leonardo Collado Torres" <lcolladotor at gmail.com> wrote:
Hi everyone,
Charlotte, thank you very much! I didn't know about that issue on
`remotes` and the fix attempts. Thank you for the info Martin!
However, I have to report that it doesn't seem like switching from
remotes::install_deps() to BiocManager::install() fixes the issue. I
updated my GitHub Actions workflow to obtain the list of dependencies
using remotes, but install them with BiocManager::install() instead of
remotes::install_deps(). You can see this at
https://github.com/leekgroup/derfinderPlot/blob/ea58939ac6bf13cae7d26951732914d96b5f7d07/.github/workflows/check-bioc.yml#L139-L149
although I include the relevant lines of code below:
## Locate the package dependencies
deps <- remotes::dev_package_deps(dependencies = TRUE)
## Install any that need to be updated using BiocManager to avoid
## the issues described at
## https://stat.ethz.ch/pipermail/bioc-devel/2020-April/016675.html
## https://github.com/r-lib/remotes/issues/296
remotes::install_cran("BiocManager")
BiocManager::install(deps$package[deps$diff != 0])
This still leads to TxDb.Hsapiens.UCSC.hg19.knownGene failing to
install because GenomeInfoDbData is not available on both macOS and
Windows (again, this doesn't fail on the Bioconductor devel docker).
Here's for example the error on Windows
https://github.com/leekgroup/derfinderPlot/runs/620055131?check_suite_focus=true#step:12:1077.
Immediately after, GenomeInfoDbData does get installed
https://github.com/leekgroup/derfinderPlot/runs/620055131?check_suite_focus=true#step:12:1100
and after it, tibble
https://github.com/leekgroup/derfinderPlot/runs/620055131?check_suite_focus=true#step:12:1174.
Likely this issue only happens on Windows and macOS because of the
availability of some packages in source form and others in binary
form, unlike only using source versions in the Bioconductor docker
run. However, maybe I need some other code to get all the
dependencies of a given package in a different order, though I was
hoping that BiocManager::install() would find the right order for me
as it seems to try to do so already.
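The `deps$diff != 0` filter used in the workflow snippets in this thread can be illustrated on a toy data frame. This is only a sketch: the data frame below is a hand-written stand-in for the result of remotes::dev_package_deps(), modelling just the `package` and `diff` columns, and the nonzero `diff` codes are placeholders meaning "not up to date".

```r
# Toy stand-in for the data frame returned by remotes::dev_package_deps();
# only the two columns used in the thread are modelled, and the nonzero
# `diff` codes here are placeholders for "not up to date".
deps <- data.frame(
    package = c("GenomicFeatures", "GenomeInfoDbData", "tibble", "pkgbuild"),
    diff = c(0L, -2L, -1L, -1L),
    stringsAsFactors = FALSE
)

# Stand-in for remotes::local_package_deps(): packages named in DESCRIPTION
local_deps <- c("GenomicFeatures", "GenomeInfoDbData")

# Everything in the full dependency tree that is not current
outdated <- deps$package[deps$diff != 0]

# Only the direct dependencies that are not current
outdated_direct <- local_deps[local_deps %in% outdated]

print(outdated)
print(outdated_direct)
```

Either vector is what would be passed to BiocManager::install(); note that the filter only selects packages and says nothing about the order in which they get installed.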
Charlotte linked to
https://github.com/r-lib/remotes/commit/88f302fe53864e4f27fc7b3897718fea9a8b1fa9.
So maybe there's still something else to try to fix in remotes and/or
BiocManager instead of the DESCRIPTION files of other packages like I
initially thought of in this thread and in
https://stat.ethz.ch/pipermail/bioc-devel/2020-April/016671.html.
Best,
Leo
On Sun, Apr 26, 2020 at 10:30 AM Martin Morgan <mtmorgan.bioc at gmail.com> wrote:
Thanks Charlotte for the detective work.
Annotation packages (TxDb, org, BSgenome, and GenomeInfoDbData, for instance) are distributed only as source. This was a decision made quite a while (years) ago, to save disk space (some of these packages are large, and hosting macOS and Windows binaries in addition to source would triple disk space requirements) and on the rationale that the packages do not have C-level source code, so users do not need RTools or XCode (etc.) to install from 'source'. So in this context, and in the face of a buggy remotes package and installation of Bioconductor packages through non-standard approaches (BiocManager::install() for CRAN and Bioconductor packages and their dependencies uses base R commands only), I guess the behavior you document is really an (ongoing?) bug in the remotes package?
Over the years the distribution of source-only annotation packages has caused problems, in particular when (usually Windows) users have temporary or library paths with spaces or non-ASCII characters. I believe that this upstream bug (in R's handling of Windows paths) has been fixed in the 4.0.0 release, but the details are quite complicated and I have not been able to follow the discussion fully.
Martin
From: Charlotte Soneson <charlottesoneson at gmail.com>
Date: Sunday, April 26, 2020 at 5:32 AM
To: Martin Morgan <mtmorgan.bioc at gmail.com>
Cc: Leonardo Collado Torres <lcolladotor at gmail.com>, Bioc-devel
<bioc-devel at r-project.org>
Subject: Re: [Bioc-devel] GenomicFeatures and/or
TxDb.Hsapiens.UCSC.hg19.knownGene issue: missing tibble
Hi Leo, Martin,
it looks like this is related to an issue with the remotes package: https://github.com/r-lib/remotes/issues/296. It gets the installation order wrong, and tries to install source packages before binaries. This can be a problem with GenomeInfoDbData (which I think doesn't have a binary, and which it looks like Leo is installing manually). The TxDb package also doesn't seem to be available as a binary package, and currently the source package for tibble is newer than the Windows binary.
According to the issue above, it should have been fixed in remotes v2.1.1 (https://github.com/r-lib/remotes/commit/88f302fe53864e4f27fc7b3897718fea9a8b1fa9). To try things out, I set up a minimal package with the only dependency being TxDb.Hsapiens.UCSC.hg19.knownGene (https://github.com/csoneson/testpkg), and checked it with GitHub Actions on macOS and Windows. It fails in both cases, since it's trying to install TxDb.Hsapiens.UCSC.hg19.knownGene first (e.g. https://github.com/csoneson/testpkg/runs/619407291?check_suite_focus=true#step:7:533). If I depend instead on GenomicFeatures, everything builds fine (here we have a binary). It is using remotes v2.1.1 though, so perhaps this needs to be investigated further.
Charlotte
On 25 Apr 2020, at 22:20, Martin Morgan <mtmorgan.bioc at gmail.com> wrote:
tibble is not a direct dependency of TxDb*.
db = available.packages(repos = BiocManager::repositories())
deps = tools::package_dependencies("TxDb.Hsapiens.UCSC.hg19.knownGene", db)
deps
$TxDb.Hsapiens.UCSC.hg19.knownGene
[1] "GenomicFeatures" "AnnotationDbi"
but it is an indirect dependency
deps =
tools::package_dependencies("TxDb.Hsapiens.UCSC.hg19.knownGene", db,
recursive=TRUE)
"tibble" %in% unlist(deps)
[1] TRUE
I did
deps1 =
tools::package_dependencies("TxDb.Hsapiens.UCSC.hg19.knownGene", db,
recursive=TRUE)
deps2 = tools::package_dependencies("tibble", db, recursive=TRUE,
reverse=TRUE)
intersect(unlist(deps1), unlist(deps2))
## [1] "GenomicFeatures" "biomaRt" "BiocFileCache" "dbplyr"
## [5] "dplyr"
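The same direct / recursive / reverse queries can be tried offline on a small hand-written database. This is a sketch under stated assumptions: the `db` matrix below is a toy stand-in for available.packages(), and its Depends/Imports entries are simplified guesses chosen to reproduce the shape of the result above, not the real DESCRIPTION fields of these packages.

```r
# Toy package database; rows mimic available.packages() but the dependency
# fields are simplified assumptions, not the packages' real metadata.
pkgs <- c("TxDb.Hsapiens.UCSC.hg19.knownGene", "GenomicFeatures",
          "AnnotationDbi", "biomaRt", "BiocFileCache", "dbplyr", "dplyr",
          "tibble")
db <- cbind(
    Package = pkgs,
    Depends = c("GenomicFeatures, AnnotationDbi", NA, NA, NA, NA, NA, NA, NA),
    Imports = c(NA, "biomaRt", NA, "BiocFileCache", "dbplyr, dplyr",
                "dplyr, tibble", "tibble", NA),
    LinkingTo = NA_character_
)
rownames(db) <- pkgs

txdb <- "TxDb.Hsapiens.UCSC.hg19.knownGene"

# Direct dependencies only: tibble is not among them
direct <- tools::package_dependencies(txdb, db)

# Full (recursive) dependency set: tibble shows up
all_deps <- tools::package_dependencies(txdb, db, recursive = TRUE)

# Everything that (recursively) depends on tibble
rev_tibble <- tools::package_dependencies("tibble", db, recursive = TRUE,
                                          reverse = TRUE)

# Packages sitting on a path from the TxDb package down to tibble
bridge <- intersect(unlist(all_deps), unlist(rev_tibble))
print(bridge)
```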
I believe R checks for immediate dependencies, found all for TxDb* and
GenomicFeatures available, and didn't check further. I speculate that
you removed tibble, or installed one of the packages in the above
list, without satisfying the dependencies for that package. Or perhaps
what the message is really trying to say is that it failed to load
tibble (because it was installed in a previous version of the R
toolchain?)
It would be interesting to debug this further on your system, to understand the problem for other users.
Martin
On 4/25/20, 2:48 PM, "Bioc-devel on behalf of Leonardo Collado Torres" <bioc-devel-bounces at r-project.org on behalf of lcolladotor at gmail.com> wrote:
Hi Bioc-devel,
I think that there's a potential issue with either GenomicFeatures,
TxDb.Hsapiens.UCSC.hg19.knownGene or an upstream package.
On a fresh R 4.0 Windows installation with BioC 3.11, I get the
following error message when installing
TxDb.Hsapiens.UCSC.hg19.knownGene as shown at
https://github.com/leekgroup/derfinderPlot/runs/618370463?check_suite_focus=true#step:13:1225.
2020-04-25T18:32:26.0765748Z * installing *source* package
'TxDb.Hsapiens.UCSC.hg19.knownGene' ...
2020-04-25T18:32:26.0769789Z ** using staged installation
2020-04-25T18:32:26.1001400Z ** R
2020-04-25T18:32:26.1044734Z ** inst
2020-04-25T18:32:26.2061605Z ** byte-compile and prepare package for
lazy loading
2020-04-25T18:32:30.7296724Z ##[error]Error: package or namespace load
failed for 'GenomicFeatures' in loadNamespace(i, c(lib.loc,
.libPaths()), versionCheck = vI[[i]]):
2020-04-25T18:32:30.7305615Z ERROR: lazy loading failed for package
'TxDb.Hsapiens.UCSC.hg19.knownGene'
2020-04-25T18:32:30.7306686Z * removing
'D:/a/_temp/Library/TxDb.Hsapiens.UCSC.hg19.knownGene'
2020-04-25T18:32:30.7307196Z there is no package called 'tibble'
2020-04-25T18:32:30.7310561Z ##[error]Error: package 'GenomicFeatures'
could not be loaded
2020-04-25T18:32:30.7311805Z Execution halted
From looking at the bioc-devel landing pages for both GenomicFeatures
and TxDb.Hsapiens.UCSC.hg19.knownGene, I see that tibble is not listed
as a dependency for either package.
Best,
Leo
_______________________________________________
Bioc-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel