
[Bioc-devel] GenomicAlignments: using asMates=TRUE and yieldSize with paired-end BAM files

Hi Mike,

This is fixed in Rsamtools 1.15.35.

The bug was in when the mate-pairing was performed relative to when the
'yieldSize' requirement was met. Thanks for sending the file and the
reproducible example.

The file has ~115 million records:

fl <- "wgEncodeCaltechRnaSeqGm12878R2x75Il200AlignsRep1V2.bam"

Processing the complete file with a yield size of 1e6 took ~18 GB and
25 minutes (Ubuntu server, 16 processors, 387 GB of RAM).

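library(Rsamtools)                          # BamFile()
library(GenomicAlignments)                  # summarizeOverlaps()
library(TxDb.Hsapiens.UCSC.hg19.knownGene)  # attaches GenomicFeatures, which provides exonsBy()
bf <- BamFile(fl, yieldSize=1000000, asMates=TRUE)
grl <- exonsBy(TxDb.Hsapiens.UCSC.hg19.knownGene, by="gene")
# Count paired-end reads per gene, ignoring strand
SO <- function(x)
     summarizeOverlaps(grl, x, ignore.strand=TRUE, singleEnd=FALSE)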
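
The message defines the SO() helper but does not show it being called. A minimal sketch of how it might be applied, assuming 'bf' and 'SO' exist as defined above and that the BAM file is available locally; the name 'se' and the inspection steps are illustrative and not part of the original report:

```r
# Hypothetical usage, not from the original message.
# Assumes 'bf' (the BamFile with yieldSize and asMates set) and 'SO' are
# defined as above; summarizeOverlaps() is expected to work through the
# file in chunks governed by 'yieldSize'.
se <- SO(bf)

# The result holds one row per gene in 'grl' and one column per BAM file.
counts <- assays(se)$counts
dim(counts)
head(counts)
```

Because 'yieldSize' is set on the BamFile itself, no explicit loop over chunks is written here; that is a design convenience of passing a BamFile rather than a file name.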

Thanks for reporting the bug.

Valerie

On 03/21/14 13:55, Michael Love wrote: