[Bioc-devel] A useful addition to MotifDb package?
Hi Steve, Paul,
On 10/09/2012 09:03 AM, Steve Lianoglou wrote:
Hi Paul, On Tue, Oct 9, 2012 at 11:29 AM, Paul Shannon <pshannon at fhcrc.org> wrote:
Hi Steve, Very timely, very helpful! Just yesterday I proposed to Martin, as a task for the coming sprint: 4) Add the new TF PWMs from ENCODE into MotifDb. I had not yet gotten as far as locating the data at EBI. Thanks! If you care to take a look, and perhaps comment, this Bioc workflow became visible yesterday, but has not yet been generally announced: http://www.bioconductor.org/help/workflows/gene-regulation-tfbs/
Interesting. I'll have to take a closer look at it later. I (really) quickly skimmed the first quarter of it -- here is a rather minor comment: Under the "Sequence Search" section, the numbers for "loosely" defining the promoter bounds are 1k-3k upstream and 100-300 downstream from the TSS. I think these numbers aren't too controversial if you're talking about yeast (which the workflow seems to be about), but it might not hurt to specify that these numbers may not be appropriate in all contexts -- as another point of reference, the paper I linked to uses 5k up/downstream from the TSS for "proximal regulatory regions" of genes.
So I wonder whether it would be better not to provide default values for the 'upstream' and 'downstream' arguments of the promoters() extractor. Whatever we do, getPromoterSeq() and promoters() should probably behave the same way (currently promoters() has default values of 2000 and 200, while getPromoterSeq() has no default values). Thanks, H.
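To make the trade-off concrete, here is a minimal sketch of a promoter-window helper that has no defaults, so the caller must state 'upstream' and 'downstream' explicitly. It is written in Python purely for illustration; it is not the promoters() or getPromoterSeq() implementation. The function name promoter_window, the Window type, and the 1-based inclusive coordinate convention are all assumptions made for this example.

```python
from typing import NamedTuple


class Window(NamedTuple):
    """A genomic interval; 1-based, inclusive coordinates (assumed convention)."""
    start: int
    end: int


def promoter_window(tss: int, strand: str, upstream: int, downstream: int) -> Window:
    """Hypothetical helper: interval around a TSS.

    'upstream' and 'downstream' deliberately have no default values, so the
    caller has to choose numbers suited to the organism and question.
    No clipping to chromosome bounds is done here.
    """
    if upstream < 0 or downstream < 0:
        raise ValueError("upstream and downstream must be non-negative")
    if strand == "+":
        # Upstream lies at smaller coordinates on the plus strand.
        return Window(tss - upstream, tss + downstream - 1)
    if strand == "-":
        # Upstream lies at larger coordinates on the minus strand.
        return Window(tss - downstream + 1, tss + upstream)
    raise ValueError("strand must be '+' or '-'")


# Yeast-like choice mentioned in the workflow discussion (2000 up, 200 down).
print(promoter_window(10000, "+", upstream=2000, downstream=200))
# Wider "proximal regulatory region" choice (5000 up and down).
print(promoter_window(10000, "-", upstream=5000, downstream=5000))
```

Under this convention the window always spans upstream + downstream positions regardless of strand, and forgetting either argument raises a TypeError instead of silently applying a yeast-sized window.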
... I will look at this more closely later, though -- it looks very helpful. Nice work! -steve
Hervé Pagès Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, M1-B514 P.O. Box 19024 Seattle, WA 98109-1024 E-mail: hpages at fhcrc.org Phone: (206) 667-5791 Fax: (206) 667-1319