
Loop avoidance and logical subscripts

Thank you! The script is now adapted to Biostrings and it is really fast! For
example, it does:

   alph_sequence <- alphabetFrequency(data$sequence, baseOnly=TRUE)
   data$GCsequence <- rowSums(alph_sequence[,c("G", "C")]) /
       rowSums(alph_sequence)

in the G+C computation. It is also amazingly fast at substring extraction
(substring), reverse complementing (reverseComplement), palindrome
searches (findComplementedPalindromes) and so on.

Now my bottleneck is conventional string handling, because I have not yet
found how to convert a DNAStringSet to a character vector. At the moment I am
doing it with:

   dna <- character(0)
   for (i in seq_along(data$dnaset)) {
       # append each sequence, converted to a plain string
       dna <- c(dna, toString(data$dnaset[[i]]))
   }
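
For comparison, a loop-free version might coerce the whole set in one call.
This is only a minimal sketch, assuming Biostrings is installed and that the
object is a DNAStringSet; the three toy sequences are made up and `dnaset`
here stands in for data$dnaset:

```r
library(Biostrings)

# Toy data standing in for data$dnaset (hypothetical sequences).
dnaset <- DNAStringSet(c("ACGT", "GGCC", "TTAA"))

# Coerce the whole set to a character vector in one vectorized call,
# instead of growing a vector one element at a time in a loop.
dna <- as.character(dnaset)

# Drop any names the set may carry, so the result is a bare character vector.
dna <- unname(dna)
```

The result has one element per sequence, in the same order as the set, so it
should be usable wherever the loop output was used.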

Regards,

Retama