Message-ID: <BAY179-W424D25C87634BDEBAD7393C8710@phx.gbl>
Date: 2014-11-27T23:38:39Z
From: Rodrigo Bertollo de Alexandre
Subject: [Bioc-devel] Regarding the function: oligonucleotideFrequency for k-mers > 11 bps

I've found that it is almost impossible to work with k-mers as long as 13 using this function. This is mainly because the function doesn't build its list of k-mers from the sequence itself but from all possible combinations.
This is basically a bug: a sequence of 1000 bps contains at most L-k+1 = 988 13-mers, while the number of possible 13-mers is 4^k = 4^13 = 67108864. This means that the code is tallying at least 67107876 k-mers that do not occur in the sequence.
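To make the idea concrete, here is a minimal sketch of counting only the k-mers that actually occur in a sequence. It is written in Python purely for illustration and is not the R code from the Stack Overflow answer linked further down. The function name count_observed_kmers and the toy sequence are made up for this example.

```python
from collections import Counter


def count_observed_kmers(seq, k):
    """Count only the k-mers that actually occur in seq (hypothetical helper)."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # A sequence of length L has at most L - k + 1 overlapping windows,
    # so the table never grows beyond that, regardless of 4**k.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# Toy sequence of length 14: only 14 - 13 + 1 = 2 windows of width 13 exist.
seq = "ACGTACGTACGTAC"
counts = count_observed_kmers(seq, 13)
print(len(counts), "distinct 13-mers observed out of", 4 ** 13, "possible")
```

Memory use here scales with the number of windows in the sequence rather than with 4^k, at the cost of returning a sparse table instead of a fixed-length vector covering every possible k-mer.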
I'm wondering whether the package could be modified to address this issue...
I wrote my own function for this, which runs fine. However, having everything you need in a single package would be even better... (I posted my code on Stack Overflow: http://stackoverflow.com/a/27178731/4004499)
Sincerely,
Rodrigo