
[Bioc-devel] glitches with releaseName() and seqlevelsStyle()

Hi Robert,
On 9/2/20 11:12, Robert Castelo wrote:

I'm dropping the notion of "release name" for BSgenome objects in BioC 
3.12. For backward compatibility I just restored a temporary 
releaseName() method (in BSgenome 1.57.6) that does the following:

 > releaseName(BSgenome.Hsapiens.UCSC.hg19)
[1] NA
Warning message:
   Starting with Bioconductor 3.12, BSgenome objects no longer have a
   "release name". As a consequence of this change calling releaseName()
   on a BSgenome object now always returns NA and is deprecated.

seqlevelsStyle() returns more than one style when you end up with a mix
of styles. In the case of hg19, all the sequences get renamed from UCSC
to NCBI names **except** chrM:

   library(TxDb.Hsapiens.UCSC.hg19.knownGene)
   txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

   table(genome(txdb))
   # hg19
   #   93

   seqlevelsStyle(txdb) <- "NCBI"

   seqlevelsStyle(txdb)
   # [1] "NCBI" "UCSC"

   table(genome(txdb))
   # GRCh37.p13       hg19
   #         92          1

   genome(txdb)[genome(txdb) == "hg19"]
   #   chrM
   # "hg19"

chrM cannot and should not be renamed to MT because the MT sequence in 
GRCh37.p13 is NOT the same as the chrM sequence in hg19. So the new 
seqlevelsStyle() behavior fixes a long-standing bug.
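
For code that assumes a single genome per object, one way to spot the
leftover sequences is a small base-R helper like the one below. This is
only a sketch: mixed_genome_seqs() is a hypothetical name, not part of
GenomeInfoDb, and the toy vector stands in for the named character
vector that genome(txdb) returns.

```r
# Hypothetical helper (not part of GenomeInfoDb): given a named character
# vector of genome assignments, such as the one returned by genome(txdb),
# return the entries that differ from the most common genome.
mixed_genome_seqs <- function(genome_vec) {
  counts <- table(genome_vec)                   # NA entries are ignored
  majority <- names(counts)[which.max(counts)]  # most frequent genome
  genome_vec[!is.na(genome_vec) & genome_vec != majority]
}

# Toy stand-in for genome(txdb) after seqlevelsStyle(txdb) <- "NCBI";
# the sequence names are made up for illustration.
toy <- c("1" = "GRCh37.p13", "2" = "GRCh37.p13", "chrM" = "hg19")
names(mixed_genome_seqs(toy))
# [1] "chrM"
```

With the real object this would be called as
mixed_genome_seqs(genome(txdb)), and an empty result would mean all
sequences share one genome.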

See previous discussion on this list about the long dance between UCSC 
and NCBI about the mitochondrial sequence in hg19/GRCh37/GRCh37.p13:
https://stat.ethz.ch/pipermail/bioc-devel/2020-August/017086.html

Hope this helps,
H.