Message-ID: <MWHPR01MB27011E623C2D8F599DB2E808CDBE0@MWHPR01MB2701.prod.exchangelabs.com>
Date: 2016-11-17T00:00:11Z
From: Dario Strbenac
Subject: [Bioc-devel] vmatchPattern Returns Out of Bounds Indices

Hello,

When using vmatchPattern to find a pattern in a set of sequences, the end index of a match can exceed the width of the corresponding sequence in the subject XStringSet. For example:

> forwardPrimer <- "TCTTGTGGAAAGGACGAAACACCG"
> range(width(reads))
[1] 75 75
> primerEnds <- vmatchPattern(forwardPrimer, reads, max.mismatch = 1)
> range(unlist(endIndex(primerEnds)))
[1] 23 76

This causes problems if using extractAt to obtain the sequences within each read. For example:

> sequences = extractAt(reads, locations)
Error in .normarg_at2(at, x) :
  some ranges in 'at' are off-limits with respect to their corresponding sequence
  in 'x'

It's rare, but still a problem.

> table(unlist(endIndex(primerEnds)) > 75)

 FALSE   TRUE 
366225      2

This happens with Biostrings 2.42.0.
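
A possible workaround in the meantime is to drop the off-limits matches before calling extractAt. The following is only a minimal sketch: the two toy reads, the short primer and the helper inBounds are made up for illustration and are not my real data or part of Biostrings. It keeps the matches whose start and end both fall inside the read they belong to.

```r
library(Biostrings)

# Toy data standing in for the real reads (hypothetical sequences).
reads <- DNAStringSet(c("TTACGTACGA", "ACGTACGTAC"))
primer <- "GTACGA"

hits <- vmatchPattern(primer, reads, max.mismatch = 1)

# Hypothetical helper: keep only the matches lying entirely within a read.
# s and e are the start and end indices for one read (NULL if no match),
# w is the width of that read.
inBounds <- function(s, e, w) {
  keep <- s >= 1L & e <= w
  IRanges(start = s[keep], end = e[keep])
}

rangesPerRead <- Map(inBounds, startIndex(hits), endIndex(hits), width(reads))
locations <- IRangesList(rangesPerRead)

# Every remaining range is within its read, so this no longer fails.
sequences <- extractAt(reads, locations)
```

Whether discarding these matches is appropriate depends on the analysis; a primer that overhangs the end of a read may still be informative.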

--------------------------------------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia