On 10.06.2011 19:54, Martin Morgan wrote:
On 06/10/2011 08:01 AM, Christian Ruckert wrote:
Hi,
I have written a function to read Roche SFF (Standard Flowgram Format)
files into R. Now I want to store the contents in standard Bioconductor
structures (e.g. sequences as a DNAStringSet object). I have the quality
scores as a list of integer vectors, one list entry per sequence.
Each vector's length matches the length of its sequence, and its entries
are between 0 and 40, giving the base quality at each position.
Hi Christian
Maybe along the lines of
PhredQuality(sapply(qualitylist, function(x) rawToChar(as.raw(x + 33))))
This really speeds up things, thanks.
or via ShortRead::readQual / readFastaQual (you can use a character vector
for the path; no need to create a RochePath). You'll probably find
ShortReadQ useful for coordinating the sequences and qualities.
Martin
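
For readers following along, here is a minimal base-R sketch of the encoding step that the one-liner above relies on: each score is shifted by 33 and mapped to its ASCII character (the Phred+33 convention). The example data and the helper name encode_phred33 are made up for illustration and do not come from the thread; vapply() is used in place of sapply() only so that the result is guaranteed to be a character vector.

```r
# Hypothetical example data: one integer vector of quality scores per read,
# with values in 0..40 as described in the original post.
qualitylist <- list(c(40L, 38L, 12L, 0L), c(30L, 30L))

# Phred+33 encoding: add 33 to each score and convert the resulting
# byte values to a single string. (encode_phred33 is an illustrative name.)
encode_phred33 <- function(scores) rawToChar(as.raw(scores + 33L))

# vapply() enforces exactly one string per read.
qualstrings <- vapply(qualitylist, encode_phred33, character(1))

# Decoding reverses the step and recovers the integer scores.
decoded <- lapply(qualstrings, function(s) as.integer(charToRaw(s)) - 33L)
```

With Biostrings loaded, the resulting character vector can then be passed to PhredQuality() as in the suggestion above.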
I have successfully created a ShortReadQ object from my sequences:
class: ShortReadQ
length: 95551 reads; width: 77..1201 cycles
Is it reasonable to use ShortReadQ for sequences from Roche with their
differing lengths? It seems to work but the manual always states
"uniform-length short reads".