Dear All,

I recently took over maintenance of the ‘fastseg’ package (http://bioconductor.org/packages/3.16/bioc/html/fastseg.html) and after fixing the issues reported by `R CMD check` I wanted to optimize the package's NAMESPACE file and the Depends/Imports given in the DESCRIPTION file. Replacing the blanket `import` of dependent packages with more fine-grained `importFrom` calls seems rather obvious. However, I was wondering if there are any reasons that speak against doing so?

Concerning the DESCRIPTION file, given that the used functions were already specified in the NAMESPACE, I was planning to edit the DESCRIPTION file and move the ‘GenomicRanges’ and ‘Biobase’ dependencies from Depends to Imports. In the package, the Biobase functions are used to query supported ExpressionSet objects, while GenomicRanges is used to support GRanges objects and to create the final output as a GRanges object. Is it legit to have GenomicRanges "only" as Imports, even if the main function's output is in GRanges format?

I want to keep the ‘Depends’ field as small as possible so that downstream packages are not forced to attach everything and mask other functions. Is this reasonable, or should I just import ‘GenomicRanges’ plus all required packages from the beginning and live with it? I hope there are some general guidelines to follow.

Best
Alex
[Bioc-devel] NAMESPACE best practices
5 messages · Alexander Blume, Hervé Pagès, Kasper Daniel Hansen
Hi Alex,
On 24/05/2022 03:56, Alexander Blume wrote:
Dear All, I recently took over maintenance of the ‘fastseg’ package (http://bioconductor.org/packages/3.16/bioc/html/fastseg.html) and after fixing the issues reported by `R CMD check` I wanted to optimize the package's NAMESPACE file and the Depends/Imports given in the DESCRIPTION file. Replacing the blanket `import` of dependent packages with more fine-grained `importFrom` calls seems rather obvious. However, I was wondering if there are any reasons that speak against doing so?
In my experience doing selective imports for core packages like methods,
BiocGenerics, S4Vectors, IRanges, and GenomicRanges, is almost never
worth it. It's just one more maintenance burden for virtually zero benefits.
However, the following 'R CMD check' NOTES:
    Namespace in Imports field not imported from: ‘stats’
and
    Consider adding
      importFrom("grDevices", "dev.cur", "dev.interactive", "dev.new")
reveal real problems that should be addressed.
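For illustration, a NAMESPACE fragment along these lines might look like the sketch below. It is not the actual fastseg NAMESPACE. The grDevices line is copied from the second NOTE; the stats line is an assumption, because the NOTE does not say which stats functions the package calls, so `pnorm` is only a placeholder.

```r
# NAMESPACE fragment (sketch, not the real fastseg file)

# Blanket imports for the core infrastructure packages
import(methods)
import(BiocGenerics)
import(S4Vectors)
import(IRanges)
import(GenomicRanges)

# Selective import suggested by the second NOTE
importFrom(grDevices, dev.cur, dev.interactive, dev.new)

# For the first NOTE: import whatever is really used from stats ...
importFrom(stats, pnorm)  # placeholder name, replace with the functions actually called
# ... or, if nothing from stats is used, remove stats from Imports in DESCRIPTION.
```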
Concerning the DESCRIPTION file, given that the used functions were already specified in the NAMESPACE, I was planning to edit the DESCRIPTION file and move the ‘GenomicRanges’ and ‘Biobase’ dependencies from Depends to Imports. In the package, the Biobase functions are used to query supported ExpressionSet objects, while GenomicRanges is used to support GRanges objects and to create the final output as a GRanges object. Is it legit to have GenomicRanges "only" as Imports, even if the main function's output is in GRanges format?
The consequence of moving GenomicRanges from Depends to Imports is that
the basic GRanges functionalities would no longer be available to your
users so it would feel like you're returning objects that "don't work".
Unfortunately I see many Bioconductor packages doing similar things e.g.
some packages return SummarizedExperiment derivatives but don't depend
on the SummarizedExperiment package (they only import it). As a
consequence basic things like assay() or colData() don't work on the object.
Here is a concrete example:
  library(AUCell)
  exprMatrix <- cbind(cell1=100*4:0, cell2=c(500, 0, 90, 0, 750))
  rownames(exprMatrix) <- sprintf("gene%02d", seq_len(nrow(exprMatrix)))
  rankings <- AUCell_buildRankings(exprMatrix, plotStats=FALSE,
                                   verbose=FALSE)  # a SummarizedExperiment derivative
  assay(rankings)
  # Error in assay(rankings) : could not find function "assay"
  colData(rankings)
  # Error in colData(rankings) : could not find function "colData"
  library(SummarizedExperiment)
  assay(rankings)
  #         cells
  # genes    cell1 cell2
  #   gene01     1     2
  #   gene02     2     4
  #   gene03     3     3
  #   gene04     4     5
  #   gene05     5     1
I want to keep the ‘Depends’ field as small as possible so that downstream packages are not forced to attach everything and mask other functions.
Keeping Depends as small as possible is definitely something to aim for, as long as your users can still "operate" on the objects that you expose to them. For example, your users should not need to guess which package to load before they can use the accessor functions defined for the object you returned to them.
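The difference between a namespace that is merely loaded (what Imports gives the user) and one that is attached (what Depends gives the user) can be sketched with base R alone. The example uses the `splines` package purely as a stand-in, on the assumption that it is installed with R but not attached by default; it does not involve GenomicRanges or fastseg.

```r
# 'splines' ships with R but is normally not on the search path,
# so it stands in here for a package such as GenomicRanges.
pkg <- "package:splines"
if (pkg %in% search()) detach(pkg, character.only = TRUE)

# Roughly what Imports means for the end user:
# the namespace gets loaded, but nothing is added to the search path.
loadNamespace("splines")
loaded_only <- c(loaded   = isNamespaceLoaded("splines"),
                 attached = pkg %in% search())

# Roughly what Depends means for the end user:
# the package is attached, so its exported functions can be called directly.
library(splines)
after_attach <- c(loaded   = isNamespaceLoaded("splines"),
                  attached = pkg %in% search())

print(rbind(loaded_only, after_attach))
```

In the first row the namespace is loaded but not attached, which is the situation in which calls such as assay() fail with "could not find function".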
Is this reasonable, or should I just import ‘GenomicRanges’ plus all required packages from the beginning and live with it? I hope there are some general guidelines to follow.
Definitely keep GenomicRanges in Depends. Cheers, H.
Best Alex
_______________________________________________ Bioc-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel
Hervé Pagès Bioconductor Core Team hpages.on.github at gmail.com
Hi Hervé, Thank you so much for your detailed response! This is really helpful advice. I will take care of the missing imports and leave the Depends field as is. You are right, in the end, the usability is most important. Best, Alex Sent from mobile. Hervé Pagès <hpages.on.github at gmail.com> wrote on Tue, 24 May 2022, 19:43:
I agree with Herve: packages that define objects that the user actually interacts with, should IMO be Depends. import vs importFrom depends a bit on which package and how many functions I use. There is a limit where I'm just like screw it, I'll get everything. codetoolsBioC has a useful function writeNamespace(). Best, Kasper On Wed, May 25, 2022 at 5:56 AM Alexander Blume <alex.gos90 at gmail.com> wrote:
1 day later
Dear Kasper,
Yes, I will keep the depends as is, since it was working fine before.
However, I guess I have to be a bit more selective with imports from the core packages, since there is a warning when I just load them using `import`:
W  checking whether package ‘fastseg’ can be installed (16.8s)
   Found the following significant warnings:
   Warning: replacing previous import ‘IRanges::median’ by ‘stats::median’ when loading ‘fastseg’
   Warning: replacing previous import ‘IRanges::quantile’ by ‘stats::quantile’ when loading ‘fastseg’
   Warning: replacing previous import ‘S4Vectors::sd’ by ‘stats::sd’ when loading ‘fastseg’
This warning is almost solved if I `importFrom` IRanges and S4Vectors functions as required, but leaves me with a new warning:
W  checking whether package ‘fastseg’ can be installed (18s)
   Found the following significant warnings:
   Warning: replacing previous import ‘BiocGenerics::sd’ by ‘stats::sd’ when loading ‘fastseg’
Now I wonder whether the sd function defined by BiocGenerics will fall back to stats::sd when a numeric vector is given,
so that I could drop the import of sd() from stats completely.
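The fallback mechanism in question can be sketched with the methods package alone. The sketch assumes that BiocGenerics creates its sd generic by calling setGeneric() on the existing stats function; that assumption is not confirmed in this thread, and BiocGenerics itself is not loaded here.

```r
library(methods)

# Promote the existing stats::sd to an S4 generic. With no explicit definition,
# setGeneric() keeps the original function as the default method.
setGeneric("sd", signature = "x")

x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# No method is registered for "numeric", so dispatch falls through to the
# default method, which is the original stats::sd.
same_result <- identical(sd(x), stats::sd(x))

print(isGeneric("sd"))
print(same_result)
```

If the BiocGenerics generic is built this way, importing only its sd and dropping the stats::sd import should give the same result for plain numeric vectors, but that is worth verifying against the installed BiocGenerics version.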
I saw some mentions of codetoolsBioC already on StackOverflow, but was not really able to fetch it somehow using svn.
Is there some magic command to download the repository?
Best
Alex