[Bioc-devel] Question about org.Dr.eg.db package

Hi Gennady,

That information should probably be cleaned up, and the BiMaps that point
to the location data removed. While the OrgDbs do contain position
information, it has been deprecated, as the warning shows when you query
a column such as CHR using select():
'select()' returned 1:1 mapping between keys and columns
  ENTREZID CHR
1    30037   5
Warning message:
In .deprecatedColsMessage() :
  Accessing gene location information via 'CHR','CHRLOC','CHRLOCEND' is
  deprecated. Please use a range based accessor like genes(), or select()
  with columns values like TXCHROM and TXSTART on a TxDb or OrganismDb
  object instead.
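
A minimal sketch of the replacement that the warning recommends. The package TxDb.Drerio.UCSC.danRer11.refGene is an assumption chosen for illustration (any zebrafish TxDb for the build you need would do), as is the assumption that its GENEID values are Entrez Gene IDs like the 30037 used above:

```r
# Assumed package: a UCSC-based zebrafish TxDb for the danRer11 build
library(TxDb.Drerio.UCSC.danRer11.refGene)
txdb <- TxDb.Drerio.UCSC.danRer11.refGene

# Range-based accessor: a GRanges of gene ranges, named by gene ID
gr <- genes(txdb)
gr[names(gr) %in% "30037"]

# select() on the TxDb, using the transcript-level position columns
# named in the warning (TXCHROM, TXSTART) plus TXEND
select(txdb, keys = "30037", keytype = "GENEID",
       columns = c("TXCHROM", "TXSTART", "TXEND"))
```

Note that select() here returns one row per transcript rather than one row per gene, so a gene with several transcripts gives several rows.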

The rationale being that the OrgDb packages are intended to contain
functional annotations, which are not based on any build, and instead are
current as of the construction of the OrgDb package. Since positional
information should be based on a genome release, those data have been
migrated to the TxDb and EnsDb packages, which are based on a given release.

Put a different way, the data in an OrgDb package are downloaded from NCBI
as of a particular date, and the positional data we get are whatever NCBI
provided on that date. This is obviously a problem for the positional
data, because what we get isn't necessarily build-specific. We get the TxDb
data from the UCSC Genome Browser, which is build-specific, so we can tell
end users exactly what build the data come from. Ideally the positional
data would already be defunct in the OrgDb packages, but that hasn't
happened yet.
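
To see the difference in provenance, one can compare the metadata of the two kinds of packages. This is only a sketch, and it again assumes TxDb.Drerio.UCSC.danRer11.refGene as the TxDb of interest:

```r
library(org.Dr.eg.db)
library(TxDb.Drerio.UCSC.danRer11.refGene)  # assumed package, for illustration

# OrgDb: a name/value table describing the package, including
# its data sources and when they were retrieved
orgdb_meta <- metadata(org.Dr.eg.db)
orgdb_meta

# TxDb: the genome build recorded for the sequences in the package
txdb_build <- unique(genome(TxDb.Drerio.UCSC.danRer11.refGene))
txdb_build
```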

Best,

Jim



On Thu, Aug 13, 2020 at 4:39 PM Margolin, Gennady (NIH/NICHD) [C] via
Bioc-devel <bioc-devel at r-project.org> wrote: