[Bioc-devel] Rsamtools applyPileups function not merging positions from multiple files if not identical

On 04/21/2014 02:33 PM, Jonathon Hill wrote:
I think your understanding is basically correct.

The function assumes that the BAM files are sorted by position (e.g., with
sortBam, although the sorting does not have to have been done by Rsamtools).

Executing a similar command gives me

 > str(r3[[1]])
List of 3
  $ seqnames: Named int 211195
   ..- attr(*, "names")= chr "chr20"
  $ pos     : int [1:211195] 60026 60027 60028 60029 60030 60031 60032 60033 60034 60035 ...
  $ seq     : int [1:5, 1:2, 1:211195] 0 0 0 0 0 0 0 0 0 0 ...
   ..- attr(*, "dimnames")=List of 3
   .. ..$ : chr [1:5] "A" "C" "G" "T" ...
   .. ..$ : chr [1:2] "normal_srx113635_sorted.bam" "tumor_srx036691_sorted.bam"
   .. ..$ : NULL

Do you get something similar, especially the identical seqnames, pos dimension, 
and third dimension of seq? 'pos' should apparently be unique; so

 > any(duplicated(r3[[1]][["pos"]]))
[1] FALSE

If there are duplicates, I wonder how many there are and where they occur:

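   pos = r3[[1]][["pos"]]
   table(table(pos))                      # how many positions occur 1, 2, ... times
   udpos = unique(pos[duplicated(pos)])   # distinct positions that are duplicated
   head(pos[pos %in% udpos], 20)          # the duplicated position values
   head(which(pos %in% udpos), 20)        # where they occur in 'pos'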
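
To show what these diagnostics report, here is a minimal sketch on made-up
data. The 'pos' vector below is synthetic (it does not come from any BAM file),
and it uses %in% with which() to locate the duplicated entries; treat it as an
illustration of the idea rather than output from applyPileups.

```r
# Synthetic positions, invented for illustration only:
# 60028 occurs twice and 60031 occurs three times.
pos <- c(60026L, 60027L, 60028L, 60028L, 60029L,
         60030L, 60031L, 60031L, 60031L, 60032L)

# Tabulate multiplicities: how many positions occur once, twice, ...
mult <- table(table(pos))
print(mult)

# Distinct positions that occur more than once
udpos <- unique(pos[duplicated(pos)])
print(udpos)

# Indices in 'pos' at which a duplicated position occurs
idx <- which(pos %in% udpos)
print(idx)

# Check that positions are in non-decreasing order (ties allowed)
print(!is.unsorted(pos))
```

If the indices of the duplicated positions are adjacent, as here, the repeated
entries sit next to each other in the result; widely scattered indices would
instead point to an ordering problem.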

If nothing is suggested by the above, can you make a subset of the BAM files 
available to me, e.g., the result of

   param = ScanBamParam(which=GRanges("chr20", IRanges(1, 1000000)))
   filterBam(fls[1], tempfile(), param=param)

Thanks,

Martin