fusion of overlapping intervals

On 11/05/2012 09:14 AM, Hermann Norpois wrote:
This data is very naturally handled by the "GRanges" class in Bioconductor's
GenomicRanges package

   source("http://bioconductor.org/biocLite.R")
   biocLite("GenomicRanges")
   library(GenomicRanges)

   gr = GRanges(rep(c("a", "b"), each=3),
                IRanges(c(5, 30, 49, 70, 100, 129),
                        c(10, 52, 101, 103, 130, 140)),
                strand="*")

and then

 > reduce(gr)
GRanges with 3 ranges and 0 metadata columns:
       seqnames    ranges strand
          <Rle> <IRanges>  <Rle>
   [1]        a [ 5,  10]      *
   [2]        a [30, 101]      *
   [3]        b [70, 140]      *
   ---
   seqlengths:
     a  b
    NA NA
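
For anyone without Bioconductor installed, the merging that reduce() does on this example can be approximated in base R. This is only a rough sketch: merge_intervals() is a hypothetical helper, not part of GenomicRanges, and it assumes closed intervals, ignores strand, and merges only ranges that actually overlap.

```r
# Hypothetical helper (not part of GenomicRanges): merge overlapping
# closed intervals separately within each group.
merge_intervals <- function(group, start, end) {
  pieces <- lapply(split(data.frame(start, end), group), function(d) {
    d <- d[order(d$start, d$end), ]
    # Largest end point seen among the rows before each row
    prev_max_end <- c(-Inf, head(cummax(d$end), -1))
    # A new merged interval begins where the start lies beyond every earlier end
    id <- cumsum(d$start > prev_max_end)
    data.frame(start = as.vector(tapply(d$start, id, min)),
               end   = as.vector(tapply(d$end, id, max)))
  })
  merged <- do.call(rbind, pieces)
  data.frame(group = rep(names(pieces), times = vapply(pieces, nrow, integer(1))),
             start = merged$start,
             end   = merged$end,
             stringsAsFactors = FALSE)
}

# Same intervals as in the GRanges example above
merged <- merge_intervals(rep(c("a", "b"), each = 3),
                          c(5, 30, 49, 70, 100, 129),
                          c(10, 52, 101, 103, 130, 140))
print(merged)
```

On the data above this gives the same three ranges as reduce(gr). As far as I know, reduce() with its default settings also merges ranges that merely abut (e.g., [1, 5] and [6, 10]), which this sketch leaves separate.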

There are vignettes

   vignette(package="GenomicRanges")

and additional training material, e.g.,

   http://bioconductor.org/help/course-materials/2012/CSC2012/

If you pursue this solution then please follow-up with questions on the 
Bioconductor mailing list

   http://bioconductor.org/help/mailing-list/

Martin