
[Bioc-devel] Too many dependencies / MultiAssayExperiment + rtracklayer

6 messages · Shraddha Pai, Michael Lawrence, 顾祖光 +1 more

#
Hello again,
I'm trying to simplify the dependencies for my package "netDx" to make it
easier to install. It currently has over 200(!) package dependencies, plus
some Unix libraries that need to be installed.

1. I ran pkgDepMetrics() from BiocPkgTools to find less-needed packages, and
the package with the most dependencies is MultiAssayExperiment (see the
table below). I'm using MAE to construct a container - is there a way to use
@importFrom calls to reduce MAE dependencies?

2. Another problem package is rtracklayer which requires Rhtslib, which
requires some unix libraries: zlib1g-dev libbz2-dev liblzma-dev. I'm not
sure which functionality in the package requires rtracklayer - how can I
tell? Is there a way to simplify / reduce these deps so the user doesn't
have to install all these unix packages?

3. Are there other "problem packages" you can see that I can remove? Let's
assume for now ggplot2 stays because people find it useful to have plotting
functions readily available.

Thanks very much in advance,
Shraddha
---
Package               ImportedAndUsed  Exported  Usage  DepOverlap  DepGainIfExcluded
igraph                              1       782   0.13        0.05                  0
ggplot2                             1       520   0.19        0.19                  0
pracma                              1       448   0.22        0.03                  0
plotrix                             1       160   0.62        0.03                  1
S4Vectors                           2       283   0.71        0.03                  0
grDevices                           1       112   0.89        0.01                  0
httr                                1        91    1.1        0.05                  0
scater                              1        85   1.18         0.4                  0
utils                               3       217   1.38        0.01                  0
GenomeInfoDb                        1        60   1.67        0.06                  0
stats                              12       449   2.67        0.01                  0
bigmemory                           1        35   2.86        0.03                  3
RCy3                               12       386   3.11        0.32                 18
BiocFileCache                       1        29   3.45        0.23                  3
glmnet                              1        24   4.17        0.07                  2
parallel                            2        33   6.06        0.01                  0
combinat                            1        13   7.69        0.01                  1
MultiAssayExperiment                4        46    8.7        0.22                  1
foreach                             2        23    8.7        0.02                  0
graphics                            8        87    9.2        0.01                  0
GenomicRanges                      15       106  14.15        0.08                  0
rappdirs                            1         7  14.29        0.01                  0
reshape2                            1         6  16.67        0.05                  0
RColorBrewer                        1         4     25        0.01                  0
netSmooth                           1         3  33.33        0.82                  3
Rtsne                               1         3  33.33        0.02                  0
doParallel                          1         2     50        0.03                  0
ROCR                                2         3  66.67        0.05                  4
clusterExperiment                  NA       122     NA        0.74                  0
IRanges                            NA       255     NA        0.04                  0


--

*Shraddha Pai, PhD*
Principal Investigator, OICR
Assistant Professor, Department of Molecular Biophysics, University of
Toronto
shraddhapai.com; @spaiglass on Twitter
https://pailab.oicr.on.ca



-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: netdx_pkgdeps.txt
URL: <https://stat.ethz.ch/pipermail/bioc-devel/attachments/20210920/6a1a1104/attachment.txt>
#
Hi Shraddha,
(indirectly) bringing in those system libraries. I would have expected
zlibbioc to cover the zlib dependency, and perhaps bz2 and lzma
support is optional. Perhaps a core member could comment on that.

In the past, I've used this package
https://github.com/Bioconductor/codetoolsBioC to identify missing
NAMESPACE imports. In theory, you could remove the rtracklayer import
and run functions in that package to identify the symbol-level
dependencies. The output is a bit noisy though.

Btw, using @importFrom only allows you to be selective of symbol-level
dependencies, not package-level.
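
A minimal sketch of why that is, using the base package "tools" purely as a stand-in (it is not one of the netDx dependencies): touching a single symbol with `::` still loads the entire namespace, and importing one symbol via @importFrom behaves the same way at the package level. The package, and everything it depends on, still has to be installed and loaded.

```r
# Stand-in example: "tools" plays the role of a heavy dependency.
# Access exactly one function from the package...
ext <- tools::file_ext("results.txt")

# ...and the whole namespace is loaded, not just that one symbol.
isNamespaceLoaded("tools")   # TRUE
```

So @importFrom keeps the package's own namespace tidy, but it does not shrink the list of packages a user must install.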

Michael
On Mon, Sep 20, 2021 at 11:37 AM Shraddha Pai <shraddha.pai at utoronto.ca> wrote:
#
An analysis with the pkgndep package (https://github.com/jokergoo/pkgndep)
shows that the three heaviest packages are RCy3, clusterExperiment and
netSmooth. If you can move these three packages to Suggests (where the
packages are loaded only when related functions are called), I think the
number of dependent packages will be reduced to 130-150, or maybe fewer.

library(pkgndep)
x = pkgndep("netDx")
plot(x)

The plot is here
https://github.com/jokergoo/ComplexHeatmap/files/7198274/test.pdf

Cheers,
Zuguang


On Mon, 20 Sept 2021 at 20:37, Shraddha Pai <shraddha.pai at utoronto.ca>
wrote:
#
Hi Zuguang,
Thanks for the tip on pkgndep - very helpful visualization. If I understand
correctly, rows in the matrix are sorted by number of dependency packages
loaded by a package required by netDx. Can see the long stripes for the
packages you mention.
Those packages are used by 1-2 minor functions for visualizing results so I
could indeed move them to Suggests.

Question: When moving packages from Depends to Suggests, should I also
remove the @import tag from the inline Roxygen2 documentation? And then, in
the function itself, should I qualify the functions used with the package
name? E.g. if I remove RCy3 from the Depends section, I would remove
@import RCy3 from the documentation of the corresponding function. Then,
within that function, wherever I use an RCy3 function I would prefix it with
RCy3:: (e.g. RCy3::commandsGET() instead of commandsGET()).

Is this correct?

Thanks,
Shraddha
On Mon, Sep 20, 2021 at 3:41 PM 顾祖光 <jokergoo at gmail.com> wrote:
#
Yes, I think so.
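
A minimal sketch of that pattern, for illustration only. The helper requireSuggested() and the wrapper sendCytoscapeCommand() are hypothetical names, not functions from netDx. The sketch assumes RCy3 is listed under Suggests in DESCRIPTION and that the function carries no @import RCy3 tag.

```r
# Hypothetical helper (not part of netDx): stop with a clear message when an
# optional package listed under Suggests is not installed.
requireSuggested <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Package '", pkg, "' is needed for this function; ",
         "please install it.", call. = FALSE)
  }
  invisible(TRUE)
}

# Hypothetical wrapper around an RCy3 call. RCy3 is only touched when this
# function is actually called, and every RCy3 function is qualified with
# RCy3:: instead of being imported into the package namespace.
sendCytoscapeCommand <- function(cmdString) {
  requireSuggested("RCy3")
  RCy3::commandsGET(cmdString)
}
```

With this layout, users who never call the Cytoscape-related functions do not need RCy3 or its dependencies installed.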

On Tue, 21 Sept 2021 at 15:53, Shraddha Pai <shraddha.pai at utoronto.ca>
wrote:
#
Hi Michael,
Thanks! Looks like the package trying to load 'rtracklayer' was 'TCGAutils'
(see the graph from Zuguang above, generated using pkgndep - it looks to be
quite useful). It turns out TCGAutils really wasn't necessary for my package,
so I just took it out and removed all associated dependencies - mercifully
an easier fix.
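
The same kind of question ("which of my direct imports drags in package X?") can be sketched in plain R. The dependency list below is a toy graph containing only the edges mentioned in this thread; it is not the real, complete dependency list of any of these packages. The function names reachable() and culprits() are made up for this illustration.

```r
# Toy dependency graph: ONLY the edges mentioned in this thread.
deps <- list(
  netDx                = c("TCGAutils", "MultiAssayExperiment", "ggplot2"),
  TCGAutils            = "rtracklayer",
  rtracklayer          = "Rhtslib",
  MultiAssayExperiment = character(0),
  ggplot2              = character(0),
  Rhtslib              = character(0)
)

# All packages reachable from `pkg`, found by breadth-first search.
reachable <- function(pkg, deps) {
  seen <- character(0)
  queue <- deps[[pkg]]
  while (length(queue) > 0) {
    nxt <- queue[1]
    queue <- queue[-1]
    if (nxt %in% seen) next
    seen <- c(seen, nxt)
    queue <- c(queue, deps[[nxt]])
  }
  seen
}

# Direct dependencies of `root` that lead, directly or transitively, to `target`.
culprits <- function(root, target, deps) {
  direct <- deps[[root]]
  hits <- vapply(direct,
                 function(p) p == target || target %in% reachable(p, deps),
                 logical(1))
  direct[hits]
}

culprits("netDx", "rtracklayer", deps)   # "TCGAutils"
```

For a real package, the `deps` list would have to be built from actual repository metadata (for example via tools::package_dependencies()) rather than written by hand.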

Thanks for your help,
Shraddha

On Mon, Sep 20, 2021 at 2:57 PM Michael Lawrence <lawrence.michael at gene.com>
wrote: