[Bioc-devel] FeatureDb could be generalized
5 messages · Michael Lawrence, Marc Carlson
Hi Michael,

That is an interesting idea. I like the idea of having more data available via FeatureDb, and I especially like the idea of having useful transformations of the data it provides. But I am a little confused about one part of what you are suggesting. Is there a reason why we would want to add a bunch of stuff that basically re-implements what the database does, instead of just writing some simple methods to import these other kinds of files?

I can think of advantages to keeping the data container type consistent (provided that it is not proving burdensome), since SQL allows joins to be made across databases, so a collection of data that has all been stored in this way can be easily linked together as needed. But what is the advantage of making a bunch of classes and methods that will allow us to pretend that our BAM and VCF files are actually databases?

Also, what would the purpose of a SequenceDb object be? The name is generic enough that I am unable to guess what you have in mind.

Marc
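The cross-database join mentioned above can be illustrated with SQLite's `ATTACH DATABASE` statement. This is only a minimal sketch using Python's standard `sqlite3` module; the table names, column names, and rows are made up for illustration and are not the GenomicFeatures schema.

```python
import sqlite3

# Main in-memory database holding feature coordinates (hypothetical schema).
conn = sqlite3.connect(":memory:")
# Attach a second, separate in-memory database under the name "annot".
conn.execute("ATTACH DATABASE ':memory:' AS annot")

conn.execute(
    "CREATE TABLE main.feature "
    "(gene_id TEXT, chrom TEXT, feat_start INTEGER, feat_end INTEGER)"
)
conn.execute("CREATE TABLE annot.gene_symbol (gene_id TEXT, symbol TEXT)")

conn.executemany(
    "INSERT INTO main.feature VALUES (?, ?, ?, ?)",
    [("g1", "chr1", 100, 200), ("g2", "chr2", 300, 400)],
)
conn.executemany(
    "INSERT INTO annot.gene_symbol VALUES (?, ?)",
    [("g1", "ABC1"), ("g2", "XYZ2")],
)

# A single query joins tables that live in two different databases.
rows = conn.execute(
    "SELECT f.chrom, f.feat_start, f.feat_end, s.symbol "
    "FROM main.feature AS f "
    "JOIN annot.gene_symbol AS s ON f.gene_id = s.gene_id "
    "ORDER BY f.chrom"
).fetchall()
print(rows)
```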
On 05/12/2011 06:08 AM, Michael Lawrence wrote:
Hi guys,

I was just looking at the FeatureDb class in GenomicFeatures. I'm wondering if we couldn't abstract that from its SQLite implementation. There are many other sources of features, e.g., files like BAM, VCF and even BED. If these are indexed properly, we could make fast queries against them. So what we really need is a class, named something like FeatureDb, that returns a GRanges for a given 'which' (as a bare minimum).

I could also imagine having proxy FeatureDb objects that transform the data on the way, such as a FeatureDb that returns the coverage, using another FeatureDb as a source. Caching could be implemented as part of the base class.

I'm also wondering whether these should be reference classes. Then if some "parent" FeatureDb is modified, the downstream objects can be informed of the change.

And a SequenceDb would be nice, too.

I'll write up a prototype in the MutableRanges package (in the bioc repo), but I'll call it RangeDb to avoid conflicts for now.

Michael
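The design described above (a base class that answers a 'which' query, caching in the base class, a proxy that computes coverage from another source, and change notification to downstream objects) could look roughly like the following. This is a language-neutral sketch written in Python, not the MutableRanges prototype: every class and method name here (`Range`, `RangeDb`, `InMemoryRangeDb`, `CoverageDb`, `query`, `invalidate`) is hypothetical, and a plain list of ranges stands in for a GRanges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """A genomic interval; start and end are 1-based and inclusive."""
    seqname: str
    start: int
    end: int


class RangeDb:
    """Hypothetical base class: answers query(which) and caches results."""

    def __init__(self):
        self._cache = {}
        self._listeners = []

    def query(self, which):
        # Caching lives in the base class; subclasses only implement _fetch.
        if which not in self._cache:
            self._cache[which] = self._fetch(which)
        return self._cache[which]

    def _fetch(self, which):
        raise NotImplementedError

    def add_listener(self, listener):
        self._listeners.append(listener)

    def invalidate(self):
        # Drop cached results and inform downstream objects of the change.
        self._cache.clear()
        for listener in self._listeners:
            listener.invalidate()


class InMemoryRangeDb(RangeDb):
    """Stand-in for a real source such as an indexed BAM, VCF or BED file."""

    def __init__(self, ranges):
        super().__init__()
        self._ranges = list(ranges)

    def add(self, new_range):
        self._ranges.append(new_range)
        self.invalidate()

    def _fetch(self, which):
        # Return every stored range overlapping 'which' on the same sequence.
        return [
            r for r in self._ranges
            if r.seqname == which.seqname
            and r.start <= which.end
            and r.end >= which.start
        ]


class CoverageDb(RangeDb):
    """Proxy that returns per-position coverage computed from a source."""

    def __init__(self, source):
        super().__init__()
        self._source = source
        source.add_listener(self)

    def _fetch(self, which):
        depth = [0] * (which.end - which.start + 1)
        for r in self._source.query(which):
            lo = max(r.start, which.start)
            hi = min(r.end, which.end)
            for pos in range(lo, hi + 1):
                depth[pos - which.start] += 1
        return depth


reads = InMemoryRangeDb([Range("chr1", 1, 5), Range("chr1", 4, 8)])
coverage = CoverageDb(reads)
window = Range("chr1", 3, 6)

before = coverage.query(window)   # computed once, then served from the cache
reads.add(Range("chr1", 6, 6))    # modifying the parent invalidates the proxy
after = coverage.query(window)    # recomputed against the updated source
print(before, after)
```

Because the proxy registers itself with its source, a change to the "parent" clears the downstream cache without the caller having to know which objects depend on it.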
_______________________________________________ Bioc-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel