Skip to content
Prev 3594 / 21307 Next

[Bioc-devel] distanceToNearest Self Query

Hi Dario,

Thanks for catching this. Now fixed in GenomicRanges 1.9.61.

It was already the case for nearest,GenomicRanges,GenomicRanges that 
self-hits were found when both 'x' and 'subject' were present but not 
when only 'x' was provided. The problem in distanceToNearest was that 
when 'subject' was missing it was set equal to 'x' before calling 
nearest. This resulted in self-hits being found regardless of how the 
method was called.  distanctToNearst now behaves appropriately,.


gr <- GRanges("1", IRanges(c(1, 10, 1000), width=2), "+")
 > gr
GRanges with 3 ranges and 0 metadata columns:
       seqnames       ranges strand
<Rle> <IRanges> <Rle>
   [1]        1 [   1,    2]      +
   [2]        1 [  10,   11]      +
   [3]        1 [1000, 1001]      +
   ---
   seqlengths:
     1
    NA

When only 'x' is provided no self-hits are found,

 > distanceToNearest(gr)
DataFrame with 3 rows and 3 columns
   queryHits subjectHits  distance
<integer> <integer> <integer>
1         1           2         8
2         2           1         8
3         3           2       989

When both 'x' and 'subject' are provided self-hits are found,

 > distanceToNearest(gr, gr)
DataFrame with 3 rows and 3 columns
   queryHits subjectHits  distance
<integer> <integer> <integer>
1         1           1         0
2         2           2         0
3         3           3         0


Valerie
On 09/04/2012 11:00 PM, Dario Strbenac wrote: