[Bioc-devel] MRD measurements in Leukemic patients using NGS data in r

Oh hey, one last thing: if all you want is nucleotide counts per region of interest, just use pileup() in Rsamtools. For each BAM, pass it a ScanBamParam whose "which" field (see bamWhich()) holds a GRanges (genomic ranges) of your regions. It sounds awkward, but in practice it is super fast and will give you all the nucleotide- and read-level information you could want. One of my interns implemented this for mitochondrial variant calling in MTseeker when we got sick of using gmapR and being flagged for errors on non-Linux platforms. (We gutted the entire package recently and have new, insanely deep examples from Oxford Nanopore direct RNA sequencing and from large single-cell datasets; I need to add those and get the package back out of purgatory.)
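For concreteness, here is a minimal sketch of that pileup() call. It is not the MTseeker code. It assumes Rsamtools and GenomicRanges are installed, and it uses the small ex1.bam file that ships with Rsamtools as a stand-in for your data. The region coordinates and the PileupParam settings are placeholders I picked for illustration, not recommendations.

```r
library(Rsamtools)
library(GenomicRanges)

# Example BAM bundled with Rsamtools (its .bai index sits next to it);
# swap in your own indexed BAM file here.
bam <- system.file("extdata", "ex1.bam", package = "Rsamtools")

# Hypothetical region of interest; replace with your own target regions.
regions <- GRanges("seq1", IRanges(start = 1000, end = 1100))

# Restrict the scan to those regions via the `which` field.
sbp <- ScanBamParam(which = regions)

# max_depth defaults to 250, which is far too low for deep sequencing,
# so it is raised here. All values are illustrative assumptions.
pp <- PileupParam(max_depth = 100000,
                  min_base_quality = 20,
                  min_mapq = 20,
                  distinguish_strands = FALSE)

# One row per position and nucleotide, with a read count.
res <- pileup(bam, scanBamParam = sbp, pileupParam = pp)
head(res)

# Optional: reshape into a position-by-nucleotide count table.
counts <- xtabs(count ~ pos + nucleotide, data = res)
```

To run this over several BAMs, wrap the pileup() call in lapply() over the file paths and reuse the same ScanBamParam and PileupParam objects.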

That said, in the end you will want a LOT of validation material, so this is very much just a starting point. But still, it's your starting point, in R at least. And truthfully I much prefer R/Bioconductor idioms to (say) pysam or the like. htsnim is nice, but then you'll be implementing the ML bits from scratch, so I think your instincts to try R first are sensible.

Good luck! Even if you end up using this for something other than MRD, I think it will be a useful exercise.

--t