
[Bioc-devel] GenomicFeatures and/or TxDb.Hsapiens.UCSC.hg19.knownGene issue: missing tibble

Hi,

Thanks for the discussion and the insight into possible solutions.

I currently have the same problem setting up a GitHub Action for RNAmodR. It fails because RNAmodR requires RNAmodR.Data, which suggests GenomicRanges, which in turn leads to GenomeInfoDb and GenomeInfoDbData. What struck me is that GenomeInfoDbData is installed as a source package after RNAmodR.Data, which is basically the same situation Leonardo describes for the TxDb packages.
So why is GenomeInfoDbData, which has no dependencies at all (except R), not installed first? I tried to fix it by adding GenomeInfoDbData to the Depends of RNAmodR. This solved the problem on macOS, but on Windows the tibble problem remains (https://github.com/FelixErnst/RNAmodR/runs/623569724). This makes sense, because tibble is currently available as a binary for macOS but not for Windows. Adding tibble as well will probably solve this, but that cannot be a permanent solution, can it?

I also have a question regarding the inner workings of BiocManager::install. I used the following command to install dependencies: BiocManager::install(remotes::dev_package_deps(dependencies = TRUE, repos = c(BiocManager::repositories(),getOption('repos')))$package). Is the order in which the packages are given important?

To state a hypothesis: in both cases (GenomeInfoDbData and tibble), the affected packages are source packages that are required by a binary package, which in turn is required by another source package. Maybe this bridge through a binary package is not picked up when the install order of the source packages is determined.
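
To make the hypothesis concrete, here is a toy sketch in base R. It is not how install.packages() or BiocManager::install() actually compute the order; the dependency edges and the binary/source labels are illustrative assumptions. It sorts only the source packages, once following dependencies through binary packages and once ignoring them:

```r
# Toy dependency graph; edges and binary/source labels are illustrative
# assumptions, not read from any real repository index.
deps <- list(
  RNAmodR.Data     = c("GenomicRanges"),
  GenomicRanges    = c("GenomeInfoDb"),
  GenomeInfoDb     = c("GenomeInfoDbData"),
  GenomeInfoDbData = character(0)
)
is_source <- c(RNAmodR.Data = TRUE, GenomicRanges = FALSE,
               GenomeInfoDb = FALSE, GenomeInfoDbData = TRUE)

# Source packages that `pkg` needs, optionally looking through binary packages.
source_deps <- function(pkg, through_binary) {
  out <- character(0)
  for (d in deps[[pkg]]) {
    if (is_source[[d]]) {
      out <- c(out, d)
    } else if (through_binary) {
      out <- c(out, source_deps(d, through_binary))
    }
  }
  unique(out)
}

# Depth-first topological sort restricted to the source packages.
install_order <- function(through_binary) {
  done <- character(0)
  visit <- function(pkg) {
    if (pkg %in% done) return(invisible(NULL))
    for (d in source_deps(pkg, through_binary)) visit(d)
    done <<- c(done, pkg)
  }
  for (pkg in names(is_source)[is_source]) visit(pkg)
  done
}

install_order(through_binary = TRUE)   # GenomeInfoDbData comes first
install_order(through_binary = FALSE)  # RNAmodR.Data comes first
```

If the binary bridge is ignored, the sort sees no relation between the two source packages and simply keeps the order in which they were given, which would match the behaviour described above.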

Does this sound reasonable, or is there something I haven't thought about? Thanks for any advice.

Best regards,
Felix

-----Original Message-----
From: Bioc-devel <bioc-devel-bounces at r-project.org> On Behalf Of Leonardo Collado Torres
Sent: Monday, April 27, 2020 18:07
To: Charlotte Soneson <charlottesoneson at gmail.com>
Cc: Bioc-devel <bioc-devel at r-project.org>
Subject: Re: [Bioc-devel] GenomicFeatures and/or TxDb.Hsapiens.UCSC.hg19.knownGene issue: missing tibble

Hi,

I also ran more tests, which makes me think that the issue was with the list of dependencies we were asking `remotes` to install.


First, regarding the second to last email from Charlotte, the step-wise installation I did mostly using remotes was not ideal. I found a complicated scenario in another package that I contribute to where BSgenome.Hsapiens.UCSC.hg19 had to be downloaded twice (it failed the first time). That was `brainflowprobes` on Windows at
https://github.com/LieberInstitute/brainflowprobes/runs/621460015?check_suite_focus=true#step:12:1142
and https://github.com/LieberInstitute/brainflowprobes/runs/621460015?check_suite_focus=true#step:12:1410.
Downloading such a big package (or any package) twice is really wasteful. So we can discard that path.


Secondly, I also found out about remotes::local_package_deps(), like Charlotte just mentioned, prompted by Martin's question. As suggested by Martin, I'm now trying to use BiocManager::install() only, since it knows how to resolve Bioc's dependency tree. Thus my current GHA workflow uses BiocManager::install() with the "minimal" deps (the immediate dependencies). I still use remotes::dev_package_deps() to find which packages need to be updated in order to enable the caching functionality later on. I do this installation twice, just as a backup. Then I do a third BiocManager::install() call with any outdated packages across the full dependencies. That's what
https://github.com/leekgroup/derfinderPlot/blob/673608493488ae488ccb66e77e6deae5dabe69e0/.github/workflows/check-bioc.yml#L367-L393
does and here's the relevant code:


message(paste('****', Sys.time(), 'installing BiocManager ****'))
remotes::install_cran("BiocManager")

## Pass #1 at installing dependencies
message(paste('****', Sys.time(), 'pass number 1 at installing dependencies: local dependencies ****'))
local_deps <- remotes::local_package_deps(dependencies = TRUE)
deps <- remotes::dev_package_deps(dependencies = TRUE, repos = BiocManager::repositories())
BiocManager::install(local_deps[local_deps %in% deps$package[deps$diff != 0]])

## Pass #2 at installing dependencies
message(paste('****', Sys.time(), 'pass number 2 at installing dependencies: local dependencies again ****'))
deps <- remotes::dev_package_deps(dependencies = TRUE, repos = BiocManager::repositories())
BiocManager::install(local_deps[local_deps %in% deps$package[deps$diff != 0]])

## Pass #3 at installing dependencies
message(paste('****', Sys.time(), 'pass number 3 at installing dependencies: any remaining dependencies ****'))
deps <- remotes::dev_package_deps(dependencies = TRUE, repos = BiocManager::repositories())
BiocManager::install(deps$package[deps$diff != 0])


For all 3 OS (Bioconductor devel docker, macOS, Windows) this works for `derfinderPlot`:
https://github.com/leekgroup/derfinderPlot/actions/runs/89153451.
Actually, in all 3, "pass #2" did nothing. Only in the docker one did pass 3 do something (it updated `pkgbuild` which is not a direct dependency of `derfinderPlot`).
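
The selection step used in each pass, deps$package[deps$diff != 0], can be tried without a network connection on a mock data frame. This sketch only assumes that remotes::dev_package_deps() returns a data frame with `package` and `diff` columns, as used in the code above, and that a `diff` of 0 means the installed version matches the available one. The package names, the `diff` values, and the helper name `needs_install` are made up for illustration:

```r
# Mock of the two columns used from remotes::dev_package_deps(); values are invented.
deps <- data.frame(
  package = c("GenomicRanges", "tibble", "pkgbuild", "ggplot2"),
  diff    = c(0L, -2L, -1L, 0L),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for remotes::local_package_deps(): the immediate dependencies.
local_deps <- c("GenomicRanges", "tibble", "ggplot2")

# Keep only the candidates whose installed version differs from the available one.
needs_install <- function(candidates, deps) {
  candidates[candidates %in% deps$package[deps$diff != 0]]
}

# Passes 1 and 2: only immediate dependencies that are missing or outdated.
pass12 <- needs_install(local_deps, deps)

# Pass 3: anything missing or outdated across the full dependency table.
pass3 <- deps$package[deps$diff != 0]

print(pass12)
print(pass3)
```

With this mock input, passes 1 and 2 would select only `tibble`, and pass 3 would additionally pick up `pkgbuild`, which mirrors what happened in the docker run.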



If you like this, given that `BiocManager` already suggests `remotes`, I could add a PR. Something like (with all the arguments and all
that):

bioc_dev_package_deps <- function() {
    local_deps <- remotes::local_package_deps(dependencies = TRUE)
    deps <- remotes::dev_package_deps(dependencies = TRUE, repos = BiocManager::repositories())
    BiocManager::install(local_deps[local_deps %in% deps$package[deps$diff != 0]])
}





Charlotte, in your small example, I saw that
https://github.com/csoneson/testinstall/commit/a5d7f473cad8fbaa4c7df8672dbdcf1994c0dd38
worked. Maybe it would still work with
BiocManager::install('TxDb.Hsapiens.UCSC.hg19.knownGene') only (the minimal "deps" you found with Martin's code).

As for remotes::install_bioc() my understanding is that that function ends up using the git Bioconductor versions. Double checking right now, I see that the behavior depends on whether `git2r` is installed https://github.com/r-lib/remotes/blob/master/R/install-bioc.R#L66.


Best,
Leo

PS If you update many packages at the same time with GHA, you can run into timeout problems :P I was just trying to update my GHA workflow on all my packages before the BioC 3.11 freeze.

PS2 Charlotte, I recommend that you set the GITHUB_PAT environment variable https://github.com/leekgroup/derfinderPlot/blob/673608493488ae488ccb66e77e6deae5dabe69e0/.github/workflows/check-bioc.yml#L100.
Otherwise you depend on the one included in remotes and can run into rate limiting issues. Though I actually ran into some even with it :P
https://github.com/LieberInstitute/recountWorkflow/runs/622552889?check_suite_focus=true#step:13:15
https://github.com/LieberInstitute/recountWorkflow/runs/622552889?check_suite_focus=true#step:13:20
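
As a small guard for the point in PS2, a workflow step could check from R whether the variable is set before any GitHub downloads start. This is only a sketch; the helper name `has_github_pat` is hypothetical and not part of remotes or BiocManager:

```r
# TRUE if the GITHUB_PAT environment variable is set to a non-empty value.
has_github_pat <- function() {
  nzchar(Sys.getenv("GITHUB_PAT"))
}

if (!has_github_pat()) {
  message("GITHUB_PAT is not set; GitHub requests may be rate limited.")
}
```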

On Mon, Apr 27, 2020 at 11:29 AM Charlotte Soneson <charlottesoneson at gmail.com> wrote:
_______________________________________________
Bioc-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
Message-ID: <AM0PR0502MB38744C75CEFF118DD5874C62D6AF0@AM0PR0502MB3874.eurprd05.prod.outlook.com>
In-Reply-To: <CAP8VEO4N=bx189Hi0LPZ+De-N65Vo-QrgGGhvm=RcPyLTfwdXQ@mail.gmail.com>