[Bioc-devel] RFC: IntervalTrees for GRanges objects

On 04/03/2013 10:28 AM, Kasper Daniel Hansen wrote:
A cheap (both computationally and philosophically) hack is along the lines of

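library(GenomicRanges)

gr <- GRanges(c("A", "B", "A"), IRanges(c(2000, 1000, 3000), width=100))

rng <- range(gr)                        # one spanning range per seqname
off0 <- width(rng)[-length(rng)] + 1L   # width (+1) of every spanning range except the last
## per-seqname shift that lays the seqnames end to end, starting at position 1
offset <- setNames(c(1L, cumsum(off0)) - start(rng), as.character(seqnames(rng)))
shift(ranges(gr), offset[as.character(seqnames(gr))])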

which shifts the ranges so that ranges on different seqnames cannot overlap, so
findOverlaps can operate on the IRanges directly. This is a trick I used in
selectMethod(disjoin, "CompressedIRangesList"). There's a risk of integer
overflow (when sum(width(rng) + 1L) > .Machine$integer.max), but this could be
handled (if it occurs!) by partitioning into .Machine$integer.max-sized ranges.
Unless IRanges were updated to support longer integers...
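
A minimal base-R sketch of the overflow check and of one possible reading of the partitioning idea: group seqnames greedily so that the summed spans in each group stay within .Machine$integer.max. The helpers would_overflow() and partition_spans() are hypothetical names, not part of IRanges or GenomicRanges, and the widths are made-up values standing in for width(rng).

```r
# Overflow test from the text, evaluated in double precision so that the
# sum itself cannot overflow (an integer sum would give NA with a warning).
would_overflow <- function(widths) {
  sum(as.numeric(widths) + 1) > .Machine$integer.max
}

# Hypothetical helper: assign each seqname to a partition so that the
# summed spans (width + 1) within a partition never exceed `limit`.
# A single span larger than `limit` still gets a partition of its own.
partition_spans <- function(widths, limit = .Machine$integer.max) {
  spans <- as.numeric(widths) + 1
  part <- integer(length(spans))
  current <- 1L
  used <- 0
  for (i in seq_along(spans)) {
    if (used > 0 && used + spans[i] > limit) {
      current <- current + 1L   # start a new partition
      used <- 0
    }
    used <- used + spans[i]
    part[i] <- current
  }
  setNames(part, names(widths))
}

# Made-up spanning widths for three seqnames
widths <- c(chrA = 1500000000L, chrB = 1000000000L, chrC = 400000000L)
would_overflow(widths)    # TRUE
partition_spans(widths)   # chrA in partition 1, chrB and chrC in partition 2
```

Under these assumptions, the offsets would then be computed separately within each partition, and findOverlaps would be run once per partition rather than once overall.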

Martin