[Bioc-devel] exptData(SummarizedExperiment)

10 messages · Michael Lawrence, Hervé Pagès, Kasper Daniel Hansen +2 more

#
Hi Tim,

The SummarizedExperiment class is being replaced with the
RangedSummarizedExperiment class from the new SummarizedExperiment
package. This is a work-in-progress and the name and internal
representation of the RangedSummarizedExperiment class are not
finalized yet. The main goal for now is to move all the
SummarizedExperiment stuff from GenomicRanges to its own package.

Anyway, metadata() is the replacement for exptData() on
RangedSummarizedExperiment objects. It's on my list to add
an exptData method for backward compatibility.

Cheers,
H.
On 05/11/2015 04:37 PM, Tim Triche, Jr. wrote:

  
    
#
Splitting stuff into different packages is good for modularity, but tough
on the mind of the user. What about having some sort of "meta" package that
simply loads the core infrastructure packages? Named something simple like
"Genomics" or "GenomicsCore".
On Mon, May 11, 2015 at 5:10 PM, Hervé Pagès <hpages at fredhutch.org> wrote:

            

  
  
#
Hi Michael,
On 05/11/2015 05:35 PM, Michael Lawrence wrote:

I don't know if we need this. For example, in all the
SummarizedExperiment use cases I have run into, the end user generally
only needs to load the corresponding high-level package (DESeq2,
VariantAnnotation, minfi, GenomicAlignments, etc.), and that takes
care of loading all the low-level infrastructure packages.

H.

  
    
#
It's more general than SummarizedExperiment. I think people would
appreciate a simple way to load the core, without having to remember, for
example, that VCF reading is in VariantAnnotation.
On Mon, May 11, 2015 at 9:51 PM, Hervé Pagès <hpages at fredhutch.org> wrote:

            

  
  
#
SummarizedExperiment was just an example. I agree it can be a
little challenging for end users to know where to find a particular
functionality but I'm not sure about using "meta" packages to address
that. At least I feel we should probably avoid creating new "meta"
packages out of the blue, with arbitrary limits and possibly endless
discussions about what exactly goes in them. Also I don't think there
is a single "core" but rather several domain-specific cores.

What about using the existing workflow packages instead?
A workflow package (like the variants package here
http://bioconductor.org/help/workflows/variants/)
covers a specific domain and loading it should load the "core"
for that domain. Plus the user gets a great vignette as a bonus
to get started so it's not just an empty shell.

There are probably some shortcomings with workflow packages
that would need to be addressed before they can serve as
convenient "meta" packages. For example, they're treated too
differently from other BioC packages: they're not available
via biocLite() and don't show up under the biocViews tree here
http://bioconductor.org/packages/release/BiocViews.html.
None of that seems impossible to address, though...

H.
On 05/12/2015 03:22 PM, Michael Lawrence wrote:

  
    
#
I like the idea of having multiple, domain-specific cores. Those could also
serve as a vehicle for high-level documentation, including the workflows
but also more "cheat-sheet" and/or cookbook-style documentation. Rafa has
brought this up on the phone calls.
On Tue, May 12, 2015 at 4:10 PM, Hervé Pagès <hpages at fredhutch.org> wrote:

            

  
  
#
Agreed that the workflow vehicle should get more attention.  Do all
workflows correspond to packages?

On Tue, May 12, 2015 at 7:31 PM, Michael Lawrence <lawrence.michael at gene.com> wrote:

  
  
#
The original workflow idea was exactly that it should go beyond a single
package.

Having domain-specific cores might be controversial since we often have
multiple packages competing in the same domain.  To some extent the
GenomicRanges/Biostrings/etc. stack is a special case of this, where "everyone"
is using these packages.  To me, a domain-specific core sounds too much like an
official endorsement of a specific combination of packages.  With workflows, all one
is saying is that "you can use packages A, B, C to accomplish tasks 1, 2, 3";
there is no official endorsement of a "winner".  I think this is worth
thinking about: the project does benefit from multiple attempts at
achieving the same result and from the competition that creates.

Best,
Kasper

On Tue, May 12, 2015 at 11:26 PM, Vincent Carey <stvjc at channing.harvard.edu>
wrote:

  
  
#
I personally don't care as long as I have a cubbyhole for unstructured data
and the name of that cubbyhole doesn't change every few weeks.

Hence the patch that I sent about 12 hours after complaining :-)


Statistics is the grammar of science.
Karl Pearson <http://en.wikipedia.org/wiki/The_Grammar_of_Science>

On Wed, May 13, 2015 at 10:48 AM, Kasper Daniel Hansen <
kasperdanielhansen at gmail.com> wrote:

            

  
  
#
On Wed, May 13, 2015 at 1:48 PM, Kasper Daniel Hansen <
kasperdanielhansen at gmail.com> wrote:

            
My question here was not about workflow intent.  It was about the vehicle
for maintaining a workflow.  Some are created as documents only, and a
package is generated from the document; others are authored as packages
and the workflow document is a vignette of the enclosing package (I think).

Making a valid package is a bit more involved than making a computable
workflow document from which a package is autogenerated.

Fair point.  Competing workflows are also quite possible.  There's a
tension between simplicity for the user and complexity/cacophony induced by
allowing multiple solutions in the system.  I think the user gains more
from the complex system even if it is a bit harder to "use".