[Bioc-devel] A method for combining SummarizedExperiment objects

3 messages · Peter Hickey, Martin Morgan

#
I often find myself with multiple `SE` objects (I'm using `SE` as a
shorthand for the `SummarizedExperiment0` and `RangedSummarizedExperiment`
classes), each with different samples but possibly non-overlapping
features/ranges. Currently, it is difficult to combine these objects:
`rbind()` can only combine objects with the same samples but different
features/ranges and `cbind()` can only combine objects with the same
features/ranges but different samples. I think it would be useful to have a
"combine" method for `SE` objects that handles the situation where each
object has different samples but with possibly non-overlapping
features/ranges.
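
To make the intended semantics concrete, here is a minimal base R sketch on plain matrices rather than real `SE` objects. The helper name `combine_assays()` is hypothetical and is not taken from the gist; it assumes both inputs have row and column names and that no sample name is shared between them. Features missing from one input are filled with `NA`.

```r
# Hypothetical helper (not from the gist): combine two assay-like matrices
# with different samples (columns) and possibly different features (rows).
combine_assays <- function(x, y) {
  stopifnot(!is.null(rownames(x)), !is.null(rownames(y)),
            !is.null(colnames(x)), !is.null(colnames(y)),
            !anyDuplicated(c(colnames(x), colnames(y))))
  features <- union(rownames(x), rownames(y))
  samples <- c(colnames(x), colnames(y))
  # Start from an all-NA matrix covering every feature and every sample.
  out <- matrix(NA_real_, nrow = length(features), ncol = length(samples),
                dimnames = list(features, samples))
  # Fill in the block of values contributed by each input.
  out[rownames(x), colnames(x)] <- x
  out[rownames(y), colnames(y)] <- y
  out
}

x <- matrix(1:4, nrow = 2, dimnames = list(c("f1", "f2"), c("s1", "s2")))
y <- matrix(5:8, nrow = 2, dimnames = list(c("f2", "f3"), c("s3", "s4")))
z <- combine_assays(x, y)
```

Here `z` has the union of the features (`f1`, `f2`, `f3`) as rows and all four samples as columns, with `NA` wherever a sample did not measure a feature.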

I've written a first pass at a method to do this at
https://gist.github.com/PeteHaitch/8993b096cfa7ccd08c13.
Is this a method other people find themselves in need of and, if so, might
we add something like this to the SummarizedExperiment package? As noted in
the gist, there are a few things I'd like to address to make it more robust
and complete (probably some optimisations too).

Cheers,
Pete
#
Sorry, the URL may have been mangled. It's
https://gist.github.com/PeteHaitch/8993b096cfa7ccd08c13
On Thu, 15 Oct 2015 at 12:52 Peter Hickey <peter.hickey at gmail.com> wrote:
#
Hi Pete -- looks like a good idea. 

I think the generic could be adjusted to pass named (not x, y) args to methods, rather than trying (incorrectly) to combine them. I don't think the inefficiency of recursion is a particular concern, because it is not like hundreds (or even tens) of objects are typically being combined.

combine() takes the approach of implementing methods for each component -- so I guess that means methods for DataFrame, GRanges, GRangesList, and SimpleList (for the assays, which are matrices and so are already combine()-able).

Any interest in re-implementing your code along these lines (as methods on the building blocks)? Some guidance might come from selectMethod("combine", c("data.frame", "data.frame")).
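
As a rough illustration of the "methods on the building blocks" idea, the sketch below defines a hypothetical S4 generic, `combineParts()`, standing in for the real combine() generic, with one method per component type. The container is a plain named list, not a real SummarizedExperiment, and the component names are assumptions made only for this example.

```r
library(methods)

# Hypothetical generic standing in for combine(); not the real API.
setGeneric("combineParts", function(x, y, ...) standardGeneric("combineParts"))

# Building block: per-sample annotations, stacked by row.
setMethod("combineParts", c("data.frame", "data.frame"), function(x, y, ...) {
  stopifnot(identical(names(x), names(y)))
  rbind(x, y)
})

# Container: delegate to whichever method matches each named component.
setMethod("combineParts", c("list", "list"), function(x, y, ...) {
  stopifnot(identical(names(x), names(y)))
  Map(combineParts, x, y)
})

a <- list(colData = data.frame(group = "ctrl", row.names = "s1"))
b <- list(colData = data.frame(group = "trt", row.names = "s2"))
ab <- combineParts(a, b)
```

The container method holds no combining logic of its own, so supporting a new component type only requires adding a method for that type.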

FWIW -- 

stop(paste0()) is just stop(), which accepts multiple arguments and pastes them together without a separator. 

x@NAMES is names(), as in names(GRanges("chr1", IRanges(1, 10, names="A")))

?elementMetadata says "Alternatives to 'mcols' functions. Their [i.e., elementMetadata] use is discouraged."
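
A small base R check of the stop() point above: both calls raise an error with the same message, so the paste0() wrapper is redundant. The message text is made up for the example.

```r
# Capture the error message from each form instead of halting.
msg_paste0 <- tryCatch(
  stop(paste0("Cannot combine ", 2, " objects")),
  error = function(e) conditionMessage(e)
)
msg_plain <- tryCatch(
  stop("Cannot combine ", 2, " objects"),
  error = function(e) conditionMessage(e)
)
same_message <- identical(msg_paste0, msg_plain)
```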