
[Bioc-devel] SQLite databases

Hi,

Francois Pepin <fpepin at cs.mcgill.ca> writes:
This, in particular, is on the way.  We plan to have <what>EG.db
packages for <what> = human, mouse, and rat.  These will replace the
<what>LLMappings packages, be SQLite-based, and look as much as
possible like the standard chip packages in terms of the maps provided
and interface.
Our plan is to have all BioC annotation data packages be SQLite-based.
There is a package in devel called AnnotationDbi and it implements an
interface for SQLite-based ann pkgs that allows them to be used just
like their environment-based cousins.  We are actively working on this
interface and making the set of SQLite-based packages complete.

In the process of creating these packages, we are creating a new
package building pipeline where we generate larger intermediate DBs
from which the individual annotation packages are generated.  At least
in principle, these are along the lines of a SQLite DB containing data
from the Entrez Gene ftp site.

Whether these intermediate DBs will be of use to others isn't clear to
me, but when our process gels a bit more, we will be happy to share
what we have.  Generally, I think it will be useful to distribute
SQLite DB versions of public annotation data since this will support:

   - general SQL queries
   - cross-platform use
   - access from just about any programming language
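
As a rough illustration of the points above, here is a minimal sketch in
Python using its standard sqlite3 module.  The schema (tables "genes" and
"go_terms", their columns, and the sample rows) is entirely made up for
the example and is not the schema of any Bioconductor package; a real
file would be opened by path rather than built in memory.

```python
import sqlite3

# Hypothetical schema and placeholder rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE genes (gene_id INTEGER PRIMARY KEY, symbol TEXT, organism TEXT);
    CREATE TABLE go_terms (gene_id INTEGER REFERENCES genes(gene_id), go_id TEXT);
    INSERT INTO genes VALUES (1, 'GENE_A', 'human'), (2, 'GENE_B', 'mouse');
    INSERT INTO go_terms VALUES (1, 'GO:0000001'), (1, 'GO:0000002'),
                                (2, 'GO:0000001');
""")

# A general SQL query: count annotation terms per gene for one organism.
rows = conn.execute(
    """
    SELECT g.symbol, COUNT(t.go_id) AS n_terms
    FROM genes AS g
    JOIN go_terms AS t ON t.gene_id = g.gene_id
    WHERE g.organism = ?
    GROUP BY g.symbol
    ORDER BY g.symbol
    """,
    ("human",),
).fetchall()

print(rows)
```

The same DB file could be read unchanged from R, Perl, C, and so on,
since each only needs a SQLite driver.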

But in terms of making things easily accessible to Bioconductor users,
simply making a SQLite DB file available is not, in general, going to
be enough.  If we want users to be able to access the data without
writing SQL, then we will need careful study of the DB schema and
interface classes that provide alternate query mechanisms.
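
To make the idea of an interface class concrete, here is a minimal
sketch, again in Python.  It is not how AnnotationDbi is implemented;
the class name SqliteMap, the table "probe_gene", and its columns are
hypothetical.  It only shows the general shape: a read-only, dict-like
object that issues the SQL on the user's behalf.

```python
import sqlite3
from collections.abc import Mapping


class SqliteMap(Mapping):
    """Read-only dict-like view of one key column and one value column.

    Hypothetical design for illustration.  Table and column names are
    interpolated into the SQL, so they are assumed to come from trusted
    package code, never from user input.
    """

    def __init__(self, conn, table, key_col, value_col):
        self._conn = conn
        self._table, self._key, self._value = table, key_col, value_col

    def __getitem__(self, key):
        rows = self._conn.execute(
            f"SELECT {self._value} FROM {self._table} "
            f"WHERE {self._key} = ? ORDER BY {self._value}",
            (key,),
        ).fetchall()
        if not rows:
            raise KeyError(key)
        return [r[0] for r in rows]

    def __iter__(self):
        cur = self._conn.execute(
            f"SELECT DISTINCT {self._key} FROM {self._table} "
            f"ORDER BY {self._key}"
        )
        return (r[0] for r in cur)

    def __len__(self):
        return self._conn.execute(
            f"SELECT COUNT(DISTINCT {self._key}) FROM {self._table}"
        ).fetchone()[0]


# Placeholder data standing in for a real annotation DB file.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE probe_gene (probe_id TEXT, gene_id INTEGER);
    INSERT INTO probe_gene VALUES ('p1', 10), ('p1', 11), ('p2', 20);
""")

probe2gene = SqliteMap(conn, "probe_gene", "probe_id", "gene_id")
print(probe2gene["p1"], list(probe2gene), len(probe2gene))
```

The user looks things up by key and never sees the SQL, which is the
kind of access the environment-based packages already give.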

Best,

+ seth