Hi Ilari,

org.Hs.eg.db is one of the packages included in Homo.sapiens and it's the origin of 'MAP'. This variable maps between Entrez Gene IDs and cytoband names, not genomic coordinates (as you've discovered). It includes bands and sub-bands provided by Entrez Gene, downloaded from here:

ftp://ftp.ncbi.nlm.nih.gov/gene/DATA

To see a full description of 'MAP':

library(org.Hs.eg.db)
?org.Hs.egMAP

We don't have an annotation package with cytoband coordinates, but you can download them using rtracklayer:

library(rtracklayer)
session <- browserSession()
genome(session) <- "hg19"
query <- ucscTableQuery(session, "cytoBandIdeo")
tbl <- getTable(query)
dim(tbl)
[1] 931 5
head(tbl)
  chrom chromStart chromEnd   name gieStain
1  chr1          0  2300000 p36.33     gneg
2  chr1    2300000  5400000 p36.32   gpos25
3  chr1    5400000  7200000 p36.31     gneg
4  chr1    7200000  9200000 p36.23   gpos25
5  chr1    9200000 12700000 p36.22     gneg
6  chr1   12700000 16200000 p36.21   gpos50
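
Once the table is in hand, mapping a position to its band is an interval lookup. The following is only a minimal sketch in base R: `band_at` is a hypothetical helper name (not part of rtracklayer or any annotation package), the data frame holds just the six rows printed above, and it assumes the usual UCSC convention that chromStart is 0-based and chromEnd is exclusive.

```r
# First six rows of the cytoBandIdeo table, copied from the head(tbl) output.
bands <- data.frame(
  chrom      = rep("chr1", 6),
  chromStart = c(0, 2300000, 5400000, 7200000, 9200000, 12700000),
  chromEnd   = c(2300000, 5400000, 7200000, 9200000, 12700000, 16200000),
  name       = c("p36.33", "p36.32", "p36.31", "p36.23", "p36.22", "p36.21"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the name of the band containing each 1-based
# position, or NA when no band in `bands` covers it.
# Assumption: UCSC coordinates (0-based start, exclusive end), so a 1-based
# position p lies in a band when chromStart < p <= chromEnd.
band_at <- function(chrom, pos, bands) {
  mapply(function(ch, p) {
    hit <- which(bands$chrom == ch & bands$chromStart < p & bands$chromEnd >= p)
    if (length(hit) == 1L) bands$name[hit] else NA_character_
  }, chrom, pos, USE.NAMES = FALSE)
}

# Look up two positions on chr1 and one on a chromosome absent from `bands`.
band_at(c("chr1", "chr1", "chr2"), c(1, 6000000, 100), bands)
```

Passing the full `tbl` in place of `bands` should work the same way, provided its columns are named as shown. For many positions, converting the table to a GRanges object and using findOverlaps from GenomicRanges is likely the more scalable route.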
Valerie
On 03/22/2014 10:12 AM, Ilari Scheinin wrote:
Hi, I would like to obtain the boundaries of cytogenetic bands for human (hg19) as I need to map arbitrary genomic positions to the band containing them. I figured these would be available via the Homo.sapiens annotation package, so I took a look at the available keytypes. MAP looked promising:
library(Homo.sapiens)
head(keys(Homo.sapiens, keytype="MAP"))
[1] "19q13.4"  "12p13.31" "8p22"     "14q32.1"  "3q25.1"   "2q35"

However, upon a closer look, these don't appear to be the actual bands themselves, but are instead the matching bands for some other level of data, as it contains entries such as "19q13-qter". (And there are 2,446 of these entries whereas there are 862 bands.)

A bit of searching returned two software packages that do contain this information: idiogram (data(Hs.cytoband)) and OmicCircos (data(UCSC.hg19.chr)). The first one seems to be from genome build hg17, but the second one has hg18 and hg19. However, using a software package instead of an annotation one to obtain this information seems wrong, and makes me worry whether it will be kept up-to-date in the future (cf. idiogram).

So, are the coordinates of the cytogenetic bands contained somewhere in the annotation packages?

Thanks,
Ilari
_______________________________________________ Bioc-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel
Valerie Obenchain
Program in Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B155
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: vobencha at fhcrc.org
Phone: (206) 667-3158
Fax: (206) 667-1319