[Bioc-devel] Call for collaborators/advice

Power's back, so continuing on:

The Bioconductor Hi-C infrastructure should probably be consolidated 
into packages with more clearly defined boundaries:

1) A package to define a base (virtual) "Interactions" class. This would 
basically have a constant "Vector" store with a "Hits" object specifying 
the pairwise interactions between elements in the constant store. One 
could also distinguish between "SelfInteractions" (constant store) and 
the more general "Interactions" (two stores, possibly of different 
types, e.g., genomic interval -> protein interactions). A variety of 
methods would be available here to do manipulations and such.

2) A package to define an "Interactions" subclass where the store holds 
genomic intervals, with basic methods to operate on such objects. Methods 
such as findOverlaps(), linkOverlaps() and boundingBox() would probably 
go here. @Luke, a binning method could also conceivably go here.

3) A package to define the "InteractionSet" and "ContactMatrix" classes. 
Basically just the "InteractionSet" package with the "GInteractions" 
class stripped out and moved into (2).

4) Additional packages for higher-level analysis, e.g., diffHic. These 
won't need much change beyond fiddling with the Imports.

So, (2) depends on (1), (3) depends on (2), and (4) depends on (3). (1) 
could either be S4Vectors itself, or we could take out the "Pairs" class 
from S4Vectors and put it into a separate package that provides data 
structures for interaction-esque thingies.
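
To make the layout in (1) a bit more concrete, here is a rough sketch of 
the idea. It is written in Python purely as language-neutral illustration; 
it is not S4 code and not the S4Vectors API. The class and field names 
("store1", "store2", "pairs") are assumptions of mine, and the indices are 
0-based here, unlike R. The point is only that the stores are held once and 
the interactions are just pairs of indices into them.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Hits:
    """Paired integer indices; a hypothetical stand-in for a Hits-like object."""
    first: Sequence[int]
    second: Sequence[int]

    def __post_init__(self):
        if len(self.first) != len(self.second):
            raise ValueError("first and second must have the same length")


@dataclass
class Interactions:
    """General case: two stores, possibly holding different element types."""
    store1: Sequence
    store2: Sequence
    hits: Hits

    def __post_init__(self):
        # Every index must point at an existing element of its own store.
        for idx, store in ((self.hits.first, self.store1),
                           (self.hits.second, self.store2)):
            if any(i < 0 or i >= len(store) for i in idx):
                raise IndexError("hit index out of range for its store")

    def pairs(self) -> List[tuple]:
        """Materialize each interaction as a pair of store elements."""
        return [(self.store1[i], self.store2[j])
                for i, j in zip(self.hits.first, self.hits.second)]


class SelfInteractions(Interactions):
    """Special case: both anchors index into the same constant store."""

    def __init__(self, store, hits):
        super().__init__(store, store, hits)


# Usage: three regions stored once, two interactions among them.
regions = ["chr1:1-100", "chr1:201-300", "chr2:1-50"]
si = SelfInteractions(regions, Hits([0, 0], [1, 2]))
self_pairs = si.pairs()

# Usage: the general case, region -> protein (made-up labels).
mixed = Interactions(regions, ["proteinA", "proteinB"], Hits([2], [1]))
mixed_pairs = mixed.pairs()
```

A subclass as in (2) would then only need to constrain what the stores 
may contain, while the index bookkeeping stays in the base class.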

@Liz, "GenomicInteractions" (the package) would be a natural home for 
the class/methods in (2). It would also resolve the confusion between 
the "GInteractions" class and "GenomicInteractions" (the class) by 
making these one thing. There are two obvious hurdles:

- I'm not familiar with the requirements for the class specialization in 
"GenomicInteractions", but anything really custom would not belong in (2).
- Any methods for specialized data analysis would need to go into 
another package for (4). I don't have a good definition of what counts as 
specialized, but if there's statistical inference, it shouldn't be in (2).

All of this is open for discussion, if people are interested and willing 
to volunteer. These changes will not make the next release anyway.

-A
On 22/03/2019 19:54, Aaron Lun wrote: