
[Bioc-devel] viewMedians

9 messages · Hervé Pagès, Michael Lawrence, Peter Haverty

#
Hi Peter,

Seems like you have a pretty good implementation of the view* functions
in genoset. Nice work! And great to hear that there is so much room for
improvements to the implementation currently in IRanges. I'll try to
give this a shot soon but first I want to move Rle's to the S4Vectors
package.

Cheers,
H.
On 06/01/2014 07:58 PM, Peter Haverty wrote:

  
    
#
There is a lot going on with respect to the view* stuff, and yes, it's
not just about Rle's: the view* functions also need to work on atomic
vectors.
Right now min(), max(), sum(), and mean() all work on IntegerList,
NumericList, RleList, XIntegerViews and XDoubleViews but
implementations are disparate and share almost nothing. Comparing
for example the sum,XIntegerViews, sum,CompressedIntegerList, and
sum,SimpleIntegerList methods:

   library(XVector)
   set.seed(33)
   subject <- sample(50, 5400200, replace=TRUE)

   ## XIntegerViews:
   xiv <- successiveViews(as(subject, "XInteger"),
                          width=rep(200, 5400200/200))

   ## CompressedIntegerList:
   cil <- extractList(subject, ranges(xiv))

   ## SimpleIntegerList:
   sil <- IntegerList(unname(split(subject, togroup(ranges(xiv)))),
                      compress=FALSE)

Then:

   > system.time(res1 <- sum(xiv))
      user  system elapsed
     0.008   0.000   0.008

   > system.time(res2 <- sum(cil))
      user  system elapsed
     0.488   0.004   0.492

   > system.time(res3 <- sum(sil))
      user  system elapsed
     0.036   0.000   0.034

The 3 methods share zero code. sum,XIntegerViews is implemented in
C while sum,CompressedIntegerList and sum,SimpleIntegerList are
implemented in R. Just an example.
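
To illustrate what a shared code path could look like, here is a minimal
base-R sketch. It is an assumption, not the code used in IRanges or
XVector: viewSumsSketch() is a hypothetical helper that sums an atomic
vector over ranges given by start and width, using one cumulative sum.

```r
## Hypothetical helper, not part of IRanges or XVector: sum an atomic
## vector over ranges (start, width) using a single cumulative sum.
viewSumsSketch <- function(x, start, width) {
    stopifnot(length(start) == length(width), all(width >= 0))
    end <- start + width - 1L
    stopifnot(all(start >= 1L), all(end <= length(x)))
    ## as.numeric() avoids integer overflow in cumsum()
    cs <- c(0, cumsum(as.numeric(x)))
    ## sum over [start, end] is the difference of two prefix sums
    cs[end + 1L] - cs[start]
}

## Small version of the example above: 5 successive views of width 200
set.seed(33)
subject <- sample(50, 1000, replace=TRUE)
width <- rep(200L, 5)
start <- cumsum(c(1L, head(width, -1)))

res <- viewSumsSketch(subject, start, width)

## Reference result computed the split()-and-sum way
ref <- as.numeric(sapply(split(subject, rep(seq_along(width), width)), sum))
stopifnot(isTRUE(all.equal(res, ref)))
```

The same helper would accept any start/width pairs, so views and list
elements stored as one concatenated vector could in principle go through
it. Whether that is fast enough compared to a C implementation is not
something this sketch tries to answer.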

All this needs to be revisited. This is actually one of my goals for
BioC 3.0. viewMedians() on RleViews is just the tip of the iceberg.

H.
On 06/02/2014 01:24 PM, Michael Lawrence wrote: