Hello,

Functions such as vmatchPattern and vmatchPDict naturally lend themselves to being parallelised. Could they be enhanced to accept a BiocParallelParam object? Or is there no significant performance difference between that and using them as-is inside a surrounding bplapply loop that repeatedly calls DNAString or DNAStringSet? (It is odd that vmatchPattern, when searching BSgenome objects, requires a DNAString for the pattern rather than a DNAStringSet.)

--------------------------------------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia
[Bioc-devel] String Matching in Parallel
2 messages · Dario Strbenac, Martin Morgan
1 day later
On 09/04/2016 10:00 PM, Dario Strbenac wrote:
> Hello, Functions such as vmatchPattern and vmatchPDict naturally lend themselves to being parallelised. Could they be enhanced to accept a BiocParallelParam object? Or, is there no significant performance difference using them as-is and having the bplapply loop surrounding them and repeatedly calling DNAString (it's odd that vmatchPattern - for searching BSgenome objects - requires a DNAString for the pattern, rather than a DNAStringSet) or DNAStringSet?
FWIW, until there is direct support for this, one probably wants to look at bpvec() rather than bplapply() to parallelize these calls. Also, one would _not_ want to parallelize over the argument that becomes the PDict, because that part of the calculation scales favorably with the number of elements.

Martin
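As a rough illustration of the suggestion above, here is a minimal sketch, not an official recipe. It assumes the subject is a DNAStringSet (a BSgenome subject would need different handling) and uses made-up toy sequences. bpvec() splits the subject into one chunk per worker and calls a vectorized function on each chunk, while the pattern set that becomes the PDict is kept whole. The helper names count_chunk and count_pdict_chunk are hypothetical, and the MulticoreParam backend is only one possible choice; pick whichever backend suits your platform.

```r
library(Biostrings)
library(BiocParallel)

# Toy data (made up for illustration only)
subject  <- DNAStringSet(c("ACGTACGT", "TTTTACGA", "ACGACGACG", "GGGGGGGG"))
pattern  <- DNAString("ACG")
patterns <- DNAStringSet(c("ACG", "TTT", "GGG"))  # constant width, as PDict expects by default

# Assumed backend: forked workers; substitute another BiocParallelParam as needed
param <- MulticoreParam(workers = 2)

# Single pattern: each worker counts matches in its chunk of the subject.
# The result has one element per subject sequence, so the default
# AGGREGATE = c reassembles the chunks in order.
count_chunk <- function(chunk, pattern) {
    Biostrings::vcountPattern(pattern, chunk)
}
counts_serial   <- vcountPattern(pattern, subject)
counts_parallel <- bpvec(subject, count_chunk, pattern = pattern,
                         BPPARAM = param)

# Many patterns: build the PDict once from the complete pattern set and
# split only the subject. colSums() collapses the pattern-by-sequence count
# matrix to one total per subject sequence, keeping the result vector-shaped.
pdict <- PDict(patterns)
count_pdict_chunk <- function(chunk, pdict) {
    colSums(Biostrings::vcountPDict(pdict, chunk))
}
totals_serial   <- colSums(vcountPDict(pdict, subject))
totals_parallel <- bpvec(subject, count_pdict_chunk, pdict = pdict,
                         BPPARAM = param)
```

With bplapply() over individual sequences, each call would handle a single element and pay the per-call overhead every time; chunking with bpvec() lets the already vectorized v* functions do the work within each chunk. Whether this gives a real speed-up depends on the size of the subject and is worth benchmarking on actual data.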