[Bioc-devel] Moving minfi classes definition to a lighter package
hi,

about a year ago we had a developer's forum session devoted to this subject; you might find useful the discussion we had, starting at minute 29, here:

https://www.youtube.com/watch?v=xsM4nN85cok

part of the result of that discussion is in section 7 of this vignette:

http://bioconductor.org/packages/release/bioc/vignettes/BiocPkgTools/inst/doc/BiocPkgTools.html#dependency-burden

which illustrates how to calculate some metrics on the dependency burden of a package using functionality we implemented in the package BiocPkgTools. in the case of minfi, this is the output:

    library(BiocPkgTools)

    depdf <- buildPkgDependencyDataFrame(repo=c("BioCsoft", "CRAN"),
                                         dependencies=c("Depends", "Imports"))
    minfidepmetrics <- pkgDepMetrics("minfi", depdf)
    minfidepmetrics
                         ImportedAndUsed Exported  Usage DepOverlap DepGainIfExcluded
    DelayedArray                       1      188   0.53       0.11                 0
    grDevices                          1      112   0.89       0.01                 0
    data.table                         1      100   1.00       0.01                 1
    MASS                               1       78   1.28       0.04                 0
    limma                              4      310   1.29       0.04                 0
    reshape                            1       67   1.49       0.03                 2
    nlme                               2      109   1.83       0.05                 1
    utils                              4      216   1.85       0.01                 0
    lattice                            3      144   2.08       0.05                 0
    BiocGenerics                       5      141   3.55       0.04                 0
    stats                             16      449   3.56       0.01                 0
    siggenes                           2       51   3.92       0.13                 3
    genefilter                         2       49   4.08       0.38                 3
    Biobase                            6      128   4.69       0.05                 0
    GenomeInfoDb                       3       60   5.00       0.09                 0
    preprocessCore                     2       39   5.13       0.02                 1
    GEOquery                           1       17   5.88       0.32                 4
    HDF5Array                          5       72   6.94       0.15                 4
    bumphunter                         1       14   7.14       0.76                25
    BiocParallel                       6       68   8.82       0.07                 0
    Biostrings                        23      240   9.58       0.11                 0
    graphics                           9       87  10.34       0.01                 0
    IRanges                           40      254  15.75       0.06                 0
    S4Vectors                         47      278  16.91       0.05                 0
    DelayedMatrixStats                14       74  18.92       0.14                 2
    GenomicRanges                     23      106  21.70       0.12                 0
    RColorBrewer                       1        4  25.00       0.01                 1
    SummarizedExperiment              23       82  28.05       0.19                 0
    illuminaio                         1        3  33.33       0.04                 2
    quadprog                           1        2  50.00       0.01                 1
    beanplot                           1        1 100.00       0.01                 1
    mclust                            NA      271     NA       0.04                 1
    nor1mix                           NA       38     NA       0.02                 1

so, with the exception of 'bumphunter', it doesn't look like the removal of a single dependency will give you much gain. it seems that minfi imports a single functionality from bumphunter:

    imp <- pkgDepImports("minfi")
    imp[imp$pkg %in% "bumphunter", ]
    # A tibble: 1 x 2
      pkg        fun
      <chr>      <chr>
    1 bumphunter bumphunter

you can explore the gain by excluding combinations of package dependencies with the function 'pkgCombDependencyGain()':

    pcd <- pkgCombDependencyGain("minfi", depdf, maxNbr=2L)
    dim(pcd)
    [1] 561   3
    head(pcd[order(pcd$DepGain, decreasing = TRUE), ])
                              Packages NbrExcl DepGain
    160           bumphunter, GEOquery       2      43
    175         bumphunter, genefilter       2      40
    98        BiocParallel, bumphunter       2      31
    161          bumphunter, HDF5Array       2      29
    165           bumphunter, siggenes       2      28
    157 bumphunter, DelayedMatrixStats       2      27

have fun with the dependency exploration game! :)

robert.
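For readers who want to see what these metrics measure, here is a minimal base-R sketch. It is not the BiocPkgTools implementation. The Usage values in the table above are consistent with 100 * ImportedAndUsed / Exported, which the first part reproduces for three rows. The second part assumes that the "dependency gain" is the number of packages that drop out of the dependency graph when one or more direct dependencies are removed. The package names pkgA to pkgF and the helpers reachable() and dep_gain() are hypothetical and exist only for this illustration.

```r
# Usage as a percentage, using three rows copied from the table above.
imported_and_used <- c(DelayedArray = 1, bumphunter = 1, beanplot = 1)
exported <- c(DelayedArray = 188, bumphunter = 14, beanplot = 1)
usage <- round(100 * imported_and_used / exported, 2)
usage  # 0.53, 7.14 and 100, matching the Usage column

# Toy dependency graph (hypothetical package names, not real data):
# each element lists the direct dependencies of the named package.
deps <- list(
  pkgA = c("pkgB", "pkgC"),
  pkgB = c("pkgD"),
  pkgC = c("pkgD", "pkgE", "pkgF"),
  pkgD = character(0),
  pkgE = character(0),
  pkgF = character(0)
)

# All packages reachable from a starting set, following dependencies.
reachable <- function(start, deps) {
  seen <- character(0)
  todo <- start
  while (length(todo) > 0) {
    p <- todo[1]
    todo <- todo[-1]
    if (!(p %in% seen)) {
      seen <- c(seen, p)
      todo <- c(todo, deps[[p]])
    }
  }
  seen
}

# Assumed definition of the gain: how many packages disappear from the
# dependency graph of `pkg` when the direct dependencies in `drop` are removed.
dep_gain <- function(pkg, drop, deps) {
  direct <- deps[[pkg]]
  length(reachable(direct, deps)) -
    length(reachable(setdiff(direct, drop), deps))
}

dep_gain("pkgA", "pkgB", deps)             # 1: pkgD is still needed by pkgC
dep_gain("pkgA", "pkgC", deps)             # 3: pkgC, pkgE and pkgF go away
dep_gain("pkgA", c("pkgB", "pkgC"), deps)  # 5: dropping both removes everything
```

In this toy graph pkgC plays the role that bumphunter plays for minfi: few calls are used from it, but it is the only route to several other packages, so removing it gives most of the gain.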
On 3/3/21 1:28 PM, Kasper Daniel Hansen wrote:
I am happy to engage in a discussion about this, although I'm not sure that I am ultimately interested in having two packages.

But first I would like to look at some dependency graphs. I am wondering what makes the dependency tree this big (and my tree is smaller than yours, but still big: library(minfi) gives me 16 attached packages and 89 loaded packages for the current release). This includes some part of the tidyverse which we don't really use much though (and which could probably get removed from the package with almost no work).

What's the current best tool for dependency graphs in Bioconductor? pkgDepTools?

Best,
Kasper

On Mon, Mar 1, 2021 at 6:24 PM Carlos Ruiz <carlos.ruiz at isglobal.org> wrote:
Dear Bioc developers,

I have been developing several packages to analyze DNA methylation. In all of them, I have used minfi's GenomicRatioSet class to manage DNA methylation data, in order to take advantage of the features of RangedSummarizedExperiment. Although I am very happy with the potential of the class, importing its definition from minfi forces me to add the package to Imports. As minfi has a high number of dependencies (129 in the current release), my packages end up having hundreds of dependencies too. This is particularly problematic because I do not use any of minfi's other functions.

I am wondering whether it would be possible to move minfi's classes (or at least GenomicRatioSet) to a lighter package, so that people developing packages on DNA methylation could rely on this class without having to import the whole minfi package and its dependencies.

Thank you very much,

--
Carlos Ruiz
_______________________________________________ Bioc-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel
Robert Castelo, PhD Associate Professor Dept. of Experimental and Health Sciences Universitat Pompeu Fabra (UPF) Barcelona Biomedical Research Park (PRBB) Dr Aiguader 88 E-08003 Barcelona, Spain telf: +34.933.160.514 fax: +34.933.160.550