[Bioc-devel] SQLite databases

Seth Falcon · 2007-06-11T16:48:31Z

Hi,

Francois Pepin <fpepin at cs.mcgill.ca> writes:
> I would personally very much appreciate something of the sort and I know
> several other of my collaborators would also.
>
> My personal favorite was the idea of a by-species package that would
> behave just like the chip annotation. To use EntrezID instead of the
> probe ids and to have all the xxxGO, xxxENZYME, xxxSYMBOL, etc.

This, in particular, is on the way.  We plan to have <what>EG.db
packages for <what> = human, mouse, and rat.  These will replace the
<what>LLMappings packages, be SQLite-based, and look as much as
possible like the standard chip packages in terms of the maps provided
and interface.

Our plan is to have all BioC annotation data packages be SQLite-based.
There is a package in devel called AnnotationDbi and it implements an
interface for SQLite-based ann pkgs that allows them to be used just
like their environment-based cousins.  We are actively working on this
interface and making the set of SQLite-based packages complete.

In the process of creating these packages, we are creating a new
package building pipeline where we generate larger intermediate DBs
from which the individual annotation packages are generated.  At least
in principle, these are along the lines of a SQLite DB containing data
from the Entrez Gene ftp site.
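
As a rough illustration of the idea (not our actual pipeline), here is
a minimal Python sketch that carves a small per-organism DB out of a
larger intermediate one.  The table name `genes`, its columns, and the
sample rows are all made up for the example.

```python
import sqlite3

def make_intermediate_db():
    """Build a toy 'intermediate' DB holding rows for several organisms.

    The schema and rows are hypothetical, for illustration only.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE genes (entrez_id TEXT, symbol TEXT, organism TEXT)"
    )
    conn.executemany(
        "INSERT INTO genes VALUES (?, ?, ?)",
        [
            ("7157", "TP53", "human"),
            ("672", "BRCA1", "human"),
            ("22059", "Trp53", "mouse"),
        ],
    )
    conn.commit()
    return conn

def extract_organism_db(intermediate, organism):
    """Copy one organism's rows into a new, smaller in-memory DB."""
    target = sqlite3.connect(":memory:")
    target.execute(
        "CREATE TABLE genes (entrez_id TEXT PRIMARY KEY, symbol TEXT)"
    )
    rows = intermediate.execute(
        "SELECT entrez_id, symbol FROM genes WHERE organism = ?",
        (organism,),
    )
    # executemany accepts any iterable of rows, including a cursor.
    target.executemany("INSERT INTO genes VALUES (?, ?)", rows)
    target.commit()
    return target

intermediate = make_intermediate_db()
human_db = extract_organism_db(intermediate, "human")
```

In a real setting the target would be a file on disk rather than
":memory:", and the selection step would likely involve more than one
table.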

Whether these intermediate DBs will be of use to others isn't clear to
me, but when our process gels a bit more, we will be happy to share
what we have.  Generally, I think it will be useful to distribute
SQLite DB versions of public annotation data since this will support:

   - general SQL queries
   - cross-platform use
   - access from just about any programming language
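
To make the list above concrete, here is a small sketch of querying
such a DB from outside R, using Python's standard sqlite3 module.  The
table `gene_info`, its columns, and the sample rows are assumptions
for illustration; they are not the schema of any existing package.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory SQLite DB with a made-up annotation table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE gene_info ("
        " entrez_id TEXT PRIMARY KEY,"
        " symbol TEXT NOT NULL,"
        " organism TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO gene_info VALUES (?, ?, ?)",
        [
            ("7157", "TP53", "human"),
            ("672", "BRCA1", "human"),
            ("22059", "Trp53", "mouse"),
        ],
    )
    conn.commit()
    return conn

def symbols_for(conn, organism):
    """Return gene symbols for one organism, sorted alphabetically."""
    rows = conn.execute(
        "SELECT symbol FROM gene_info WHERE organism = ? ORDER BY symbol",
        (organism,),
    )
    return [symbol for (symbol,) in rows]

conn = build_demo_db()
human_symbols = symbols_for(conn, "human")
```

With a DB file distributed on disk, the only change would be passing
its path to sqlite3.connect() instead of ":memory:".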

But in terms of making things easily accessible to Bioconductor users,
simply making a SQLite DB file available is not, in general, going to
be enough.  If we want users to be able to access the data without
writing SQL, then we will need careful study of the DB schema and
interface classes that provide alternate query mechanisms.
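
As a sketch of what such an interface class might look like (in
Python here, and not the AnnotationDbi design), the class below wraps
a two-column table so that it can be used like a read-only dictionary.
The class name `SymbolMap`, the table `gene_symbol`, and the sample
rows are hypothetical.

```python
import sqlite3
from collections.abc import Mapping

class SymbolMap(Mapping):
    """Read-only, dict-like view of a hypothetical gene_symbol table."""

    def __init__(self, conn):
        self._conn = conn

    def __getitem__(self, entrez_id):
        row = self._conn.execute(
            "SELECT symbol FROM gene_symbol WHERE entrez_id = ?",
            (entrez_id,),
        ).fetchone()
        if row is None:
            raise KeyError(entrez_id)
        return row[0]

    def __iter__(self):
        cur = self._conn.execute(
            "SELECT entrez_id FROM gene_symbol ORDER BY entrez_id"
        )
        for (entrez_id,) in cur:
            yield entrez_id

    def __len__(self):
        return self._conn.execute(
            "SELECT COUNT(*) FROM gene_symbol"
        ).fetchone()[0]

# Toy data so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gene_symbol (entrez_id TEXT PRIMARY KEY, symbol TEXT)"
)
conn.executemany(
    "INSERT INTO gene_symbol VALUES (?, ?)",
    [("7157", "TP53"), ("672", "BRCA1")],
)
conn.commit()

symbol_map = SymbolMap(conn)
tp53 = symbol_map["7157"]      # lookup by key, no SQL visible to the user
missing = symbol_map.get("0")  # Mapping supplies get(), keys(), items(), "in"
```

The user never writes SQL; the schema knowledge lives entirely inside
the class, which is why the schema needs to be settled carefully first.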

Best,

+ seth

Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
http://bioconductor.org
