Message-ID: <MWHPR01MB27011E623C2D8F599DB2E808CDBE0@MWHPR01MB2701.prod.exchangelabs.com>
Date: 2016-11-17T00:00:11Z
From: Dario Strbenac
Subject: [Bioc-devel] vmatchPattern Returns Out of Bounds Indices
Hello,
When using vmatchPattern to find a pattern in a set of sequences, the resulting end index can be beyond the width of the corresponding sequence in the subject XStringSet. For example:
forwardPrimer <- "TCTTGTGGAAAGGACGAAACACCG"
> range(width(reads))
[1] 75 75
primerEnds <- vmatchPattern(forwardPrimer, reads, max.mismatch = 1)
> range(unlist(endIndex(primerEnds)))
[1] 23 76
This causes problems when using extractAt to obtain the sequences within each read. For example:
sequences = extractAt(reads, locations)
Error in .normarg_at2(at, x) :
some ranges in 'at' are off-limits with respect to their corresponding sequence
in 'x'
It's rare, but still a problem.
> table(unlist(endIndex(primerEnds)) > 75)
 FALSE   TRUE
366225      2
This happens with Biostrings 2.42.0.
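A possible workaround, sketched here under some assumptions: my understanding is that when max.mismatch is greater than 0, positions where the pattern overhangs the end of a read are counted as mismatches, so a match can extend past the read. The sketch below drops any match that is not entirely inside its read before calling extractAt. The toy reads and pattern, and the names hits, widths and locations, are made up for illustration and are not the data from this report.

```r
library(Biostrings)

# Hypothetical toy data standing in for the 75-nt reads in the report.
reads <- DNAStringSet(c("TTTTACGTAC", "GGGGGGGACG"))
pattern <- "ACGT"

hits <- vmatchPattern(pattern, reads, max.mismatch = 1)

# startIndex() and endIndex() return one integer vector per read
# (NULL when a read has no match).
starts <- startIndex(hits)
ends <- endIndex(hits)
widths <- width(reads)

# Keep only the matches that lie entirely within their read.
inBounds <- lapply(seq_along(reads), function(i) {
  s <- as.integer(starts[[i]])  # as.integer(NULL) gives integer(0)
  e <- as.integer(ends[[i]])
  keep <- s >= 1L & e <= widths[i]
  IRanges(start = s[keep], end = e[keep])
})
locations <- do.call(IRangesList, inBounds)

# With the overhanging matches removed, extractAt has no off-limits ranges.
sequences <- extractAt(reads, locations)
```

Whether dropping such matches or trimming them to the read boundary is preferable depends on the analysis; trimming would leave partial primer matches in the result.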
--------------------------------------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia