[Bioc-devel] BiomartGeneRegionTrack question

Valerie,
I took a somewhat closer look at this, and I think that a mapping between Ensembl genome version and UCSC genome identifiers is all that is needed from the Bioconductor side. I can figure out a way to identify the relevant Ensembl archive to load during Gviz package build. 
Biomart provides the version information of all datasets via the listDatasets() function in the form of a data.frame (stored here as ds):
head(ds)
                         dataset                                description
1         oanatinus_gene_ensembl     Ornithorhynchus anatinus genes (OANA5)
2        cporcellus_gene_ensembl            Cavia porcellus genes (cavPor3)
3        gaculeatus_gene_ensembl     Gasterosteus aculeatus genes (BROADS1)
4 itridecemlineatus_gene_ensembl Ictidomys tridecemlineatus genes (spetri2)
5         lafricana_gene_ensembl         Loxodonta africana genes (loxAfr3)
6        choffmanni_gene_ensembl        Choloepus hoffmanni genes (choHof1)
  version
1   OANA5
2 cavPor3
3 BROADS1
4 spetri2
5 loxAfr3
6 choHof1

As you can see, the species is encoded in the dataset column, though not in a standardized form. The genome or assembly version is stored in the version column. With that information and the table provided here (https://genome.ucsc.edu/FAQ/FAQreleases.html) it should be fairly straightforward to set up a manual mapping. If you do not want to go through all the old Biomart archives, you can get a complete listing from this table on the Ensembl web site: http://www.ensembl.org/info/website/archives/assembly.html
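To make the idea of a manual mapping concrete, here is a minimal sketch in base R. The ds rows are copied by hand from the output above rather than fetched, and ens2ucsc is a hypothetical hand-curated lookup table (its name, columns, and entries are my assumptions and would need checking against the UCSC releases table):

```r
# A few rows as printed by head(ds) above; in practice ds would come from
# listDatasets() on the chosen Ensembl mart.
ds <- data.frame(
  dataset = c("oanatinus_gene_ensembl", "cporcellus_gene_ensembl",
              "gaculeatus_gene_ensembl", "lafricana_gene_ensembl"),
  version = c("OANA5", "cavPor3", "BROADS1", "loxAfr3"),
  stringsAsFactors = FALSE
)

# Hypothetical hand-curated lookup: Ensembl assembly version -> UCSC genome.
# loxAfr3 is left out on purpose to show how gaps surface.
ens2ucsc <- data.frame(
  version = c("OANA5", "cavPor3", "BROADS1"),
  ucsc    = c("ornAna1", "cavPor3", "gasAcu1"),
  stringsAsFactors = FALSE
)

# Left join on the version column; datasets without an entry get NA.
mapped <- merge(ds, ens2ucsc, by = "version", all.x = TRUE)

# Datasets that still need a manual entry in the lookup table.
unmapped <- mapped$dataset[is.na(mapped$ucsc)]
```

Keeping the lookup as a plain data.frame means any Ensembl version missing from the table shows up as NA after the merge instead of being dropped silently.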
I have already done this exercise with the tables in the Gviz package, and could provide a current version with the relevant information. Mappings from UCSC genome to Ensembl versions do not have to be unique, since the latter are typically given down to the minor release, whereas UCSC only lists the major release. My understanding is that all minor releases are guaranteed to share the same chromosome coordinates and differ only by local patches, but you guys surely know more about all of this.
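The many-to-one relationship could be handled roughly as follows. This is only a sketch: it assumes that minor releases appear as a ".pN" suffix on the version string (as in GRCh38.p7), and the helper name ensemblToUcsc and the lookup vector major2ucsc are made up for illustration:

```r
# Hypothetical lookup keyed by major assembly name only.
major2ucsc <- c(GRCh37 = "hg19", GRCh38 = "hg38", GRCm38 = "mm10")

# Strip an assumed ".p<number>" patch suffix, then look up the major release.
# Versions without an entry come back as NA.
ensemblToUcsc <- function(version) {
  major <- sub("\\.p[0-9]+$", "", version)
  unname(major2ucsc[major])
}

res <- ensemblToUcsc(c("GRCh38.p5", "GRCh38.p7", "GRCh37.p13", "GRCm38", "OANA5"))
```

Here both GRCh38 patch releases resolve to the same UCSC genome, which is the non-unique mapping described above.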

Just let me know about the best way forward.
Florian
On 26/07/16 21:27, "Obenchain, Valerie" <Valerie.Obenchain at RoswellPark.org> wrote: