[Bioc-devel] phred qualities
On 06/27/2012 11:22 AM, Martin Morgan wrote:
On 06/27/2012 08:02 AM, Kasper Daniel Hansen wrote:
Phred qualities are usually presented as ASCII-encoded numbers with an offset of either 33 or 64. Some packages return these as a BStringSet. I can convert a character vector "charvec" to a list of integers using code like sapply(charvec, function(xx) as.integer(charToRaw(xx)) - 33L). Do we have fast(er) ways of doing this when charvec is really long and the strings do not necessarily have the same number of chars? I am thinking of implementing the sapply() above in C (directly vectorizing it), but surely someone has done something like that somewhere.
I think you get this with XStringSet, e.g., PhredQuality, with
x = PhredQuality(c("HH", "III"))
y = as.integer(unlist(x)) - 33L
z = relist(y, x)
or, for a simple list, split(y, rep(seq_along(x), elementLengths(x))). I have a recollection that there is something built-in...
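For comparison, the same "decode everything at once, then split" idea can be sketched in base R alone, without Biostrings. This is only a rough sketch: it assumes plain ASCII quality strings and the offset of 33 used above, and the names lens, scores, grp and quals are made up for illustration.

```r
# Base R only; 'charvec' and the offset of 33 follow the question above.
charvec <- c("HH", "III", "")

# Number of quality characters per string (bytes, since the input is ASCII).
lens <- nchar(charvec, type = "bytes")

# Decode all strings in a single pass instead of once per element.
scores <- as.integer(charToRaw(paste(charvec, collapse = ""))) - 33L

# Group index per score; explicit levels keep empty strings as integer(0).
grp <- factor(rep(seq_along(charvec), lens), levels = seq_along(charvec))

# One integer vector of Phred scores per input string.
quals <- unname(split(scores, grp))
```

The explicit factor levels are there so that an empty quality string still yields an empty element in the result rather than being dropped by split().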
Martin
Kasper
Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793