On Tue, Oct 28, 2014 at 2:29 PM, Hervé Pagès <hpages at fredhutch.org> wrote:
Hi,
On 10/28/2014 08:48 AM, Vincent Carey wrote:
On Tue, Oct 28, 2014 at 11:23 AM, Kasper Daniel Hansen <
kasperdanielhansen at gmail.com> wrote:
Well, first I want to make sure that there is not something special
regarding S4 methods and classes. I have a feeling that they are a
special case.
Second, while I agree with Jim's general opinion, it is a little bit
different when I have return objects which are defined in other packages.
If I don't depend on this other package, the user is hosed wrt. the
return object, unless I manually export all classes from this other
In what sense? If you return an instance of GRanges, certain things can
be done even if GenomicRanges is not attached.
Yes certain things maybe, but it's hard to predict which ones.
You can get values of slots, for example.
With the following little package
%vjcair> cat foo/NAMESPACE
importFrom(IRanges, IRanges)
importClassesFrom(GenomicRanges, GRanges)
importFrom(GenomicRanges, GRanges)
export(myfun)
%vjcair> cat foo/DESCRIPTION
Package: foo
Title: foo
Version: 0.0.0
Author: VJ Carey <stvjc at channing.harvard.edu>
Description:
Suggests:
Depends:
Imports: GenomicRanges
Maintainer: VJ Carey <stvjc at channing.harvard.edu>
License: Private
LazyLoad: yes
%vjcair> cat foo/R/*
myfun = function(seqnames="1", ranges=IRanges(1,2), ...)
    GRanges(seqnames=seqnames, ranges=ranges, ...)
The following works:
library(foo)
GRanges object with 1 range and 0 metadata columns:
      seqnames    ranges strand
         <Rle> <IRanges>  <Rle>
  [1]        1    [1, 2]      *
  -------
  seqinfo: 1 sequence from an unspecified genome; no seqlengths
So the show method works, even though I have not touched it. (I did not
expect it to work, in fact.)
Exactly. Let's call it luck ;-)
Additionally, I can get access to slots.
The end user should never try to access slots directly but use getters
and setters instead. And most getters and setters for GRanges objects
are defined and documented in the GenomicRanges package. Those that are
not are defined in packages that GenomicRanges depends on.
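
The point about accessors can be illustrated with a toy example that does
not involve GenomicRanges at all. The class `Interval` and the generic
`lower` below are hypothetical names invented for this sketch. The idea is
that code going through the getter depends only on the exported generic,
while code reading the slot depends on the internal representation.

```r
library(methods)

# Hypothetical toy class, not part of any package discussed in this thread.
setClass("Interval", representation(lo = "numeric", hi = "numeric"))

# Getter: the documented way to read the lower bound.
setGeneric("lower", function(x) standardGeneric("lower"))
setMethod("lower", "Interval", function(x) x@lo)

iv <- new("Interval", lo = 1, hi = 2)

via_slot   <- iv@lo      # tied to the internal slot name
via_getter <- lower(iv)  # tied only to the exported generic
```

Both calls return the same value here, but only the second would survive a
change to how `Interval` stores its bounds.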
But
ranges()
fails. If I, the user, want to use it, I need to arrange for that.
IMO if your package returns a GRanges object to the user, then the user
should be able to access the man page for GRanges objects with ?GRanges.
Oddly enough, that seems to be incorrect. I added a man page to foo that
has a \link[GenomicRanges]{GRanges-class}. I ran help.start and the cross
reference from my man page succeeds. Furthermore with the sessionInfo
below, ?GRanges succeeds at the CLI. I am not trying to defend the NOTE
but the principle of minimizing Depends declarations needs to be
considered critically, and I am just exploring the space.
?GRanges # it worked as usual in the tty
R version 3.1.1 (2014-07-10)
Platform: x86_64-apple-darwin13.1.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices datasets utils tools methods
[8] base
other attached packages:
[1] foo_0.0.0 rmarkdown_0.3.8 knitr_1.6
[4] weaver_1.31.0 codetools_0.2-9 digest_0.6.4
[7] BiocInstaller_1.16.0
loaded via a namespace (and not attached):
[1] BiocGenerics_0.11.5 evaluate_0.5.5 formatR_1.0
[4] GenomeInfoDb_1.1.26 GenomicRanges_1.17.48 htmltools_0.2.6
[7] IRanges_1.99.32 parallel_3.1.1 S4Vectors_0.2.8
[10] stats4_3.1.1 stringr_0.6.2 XVector_0.5.8
And that works only if the GenomicRanges package is attached. Attaching
GenomicRanges will also attach other packages that GenomicRanges depends
on where some GRanges accessors might be defined and documented (e.g.
metadata()).
In some cases you'll decide you want the user to have a full complement
of methods for your package to function meaningfully. For example, I am
considering using dplyr idioms to work with data structures in a package,
and it seems I should just depend on dplyr rather than pick out and
document which things I want to expose. But that may still be an
undesirable design.
package, like
importClassesFrom("GenomicRanges", "GRanges")
exportClasses("GRanges")
Surely that is not intended.
It is important that my package works without being attached to the
search path and I do this by carefully importing what I need, i.e. my code
does not require that my dependencies are attached to the search path.
But the end user will be hosed without it.
Yes s/he will. Fortunately when your package namespace gets loaded by
another package, then nothing gets attached to the search path, even if
your package depends (instead of imports) on other packages. So using
Depends instead of Imports for your own dependencies won't make any
difference in that respect, which is good.
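
The distinction between a loaded namespace and an attached package can be
checked in a plain R session. The sketch below uses stats4, which ships
with R and is normally not attached at startup, purely as a stand-in for
a dependency such as GenomicRanges; it only shows that loading a
namespace leaves the search path alone, not how Depends is processed.

```r
# Record whether stats4 is already on the search path (normally it is not).
was_attached <- "package:stats4" %in% search()

# Load the namespace without attaching it.
loadNamespace("stats4")

loaded   <- "stats4" %in% loadedNamespaces()  # namespace is now loaded
attached <- "package:stats4" %in% search()    # search path is unchanged

# Symbols in a loaded-but-not-attached namespace are reachable with `::`.
f <- stats4::mle
```

In that state, package code that imported `mle` would work, while a user
typing `mle` at the prompt would get an "object not found" error unless
they attach stats4 themselves.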
My impression is that the NOTE in R CMD check was written by someone who
did not anticipate large-scale use and re-use of classes and methods
across many packages.
That's my impression too.
Cheers,
H.
Best,
Kasper
On Tue, Oct 28, 2014 at 11:14 AM, James W. MacDonald <jmacdon at uw.edu>
wrote:
I agree with Vince. It's your job as a package developer to make
available to your package all the functions necessary for the package to
work. But I am not sure it is your job to load all the packages that your
end user might need.
Best,
Jim
On Tue, Oct 28, 2014 at 11:04 AM, Vincent Carey <
stvjc at channing.harvard.edu> wrote:
On Tue, Oct 28, 2014 at 10:19 AM, Kasper Daniel Hansen <
kasperdanielhansen at gmail.com> wrote:
What is the current best paradigm for using all the classes in
S4Vectors/GenomeInfoDb/GenomicRanges/IRanges?
I obviously import methods and classes from the relevant packages.
But shouldn't I depend on these packages as well? Since I basically want
the user to have this functionality at the command line? That is what I
do now.
I've wondered about this as well. It seems the principle is that the user
should take care of attaching additional packages when needed. It might
be appropriate to give a hint in the package startup message, if having
some other package attached would typically be of great utility.
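
A startup hint of this kind could be sketched as follows. This is only an
illustration, assuming a package like the toy `foo` from this thread; the
wording of the message and the choice to check the search path first are
assumptions, not something any package here is known to do. The hook
would typically live in a file such as R/zzz.R.

```r
# Hypothetical .onAttach hook for a package that returns GRanges objects
# but only imports GenomicRanges.
.onAttach <- function(libname, pkgname) {
  # Only hint when GenomicRanges is not already on the search path.
  if (!"package:GenomicRanges" %in% search()) {
    packageStartupMessage(
      "Objects returned by ", pkgname, " are GRanges instances; ",
      "consider library(GenomicRanges) for accessors and documentation."
    )
  }
}
```

Using `packageStartupMessage()` rather than `message()` lets users silence
the hint with `suppressPackageStartupMessages()`.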
Given your list above, I would think that depending on GenomicRanges
would often be sufficient, and IRanges/S4Vectors would not require
dependency assertion. I would think that GenomeInfoDb should be a
voluntary attachment for a session.
These are just my guesses -- I doubt there will be complete agreement,
but I have started to think very critically about using Depends, and I
think it is better when its use is minimized.
That of course leads to the R CMD check NOTE on depending on too many
packages.... I guess I should ignore that one.
Best,
Kasper