Dear BioC team,

I think I found an error in TxDb.Hsapiens.UCSC.hg38.knownGene. I reported it at https://support.bioconductor.org/p/88232/ but did not get a reply. I believe it is a bug, so I decided to report it by email as well.

I am using the development version of TxDb.Hsapiens.UCSC.hg38.knownGene, because the release version was built in 2015 and differs substantially from the UCSC website.

Here is the R code that reproduces the bug:

    library(TxDb.Hsapiens.UCSC.hg38.knownGene)
    library(GenomicRanges)

    geneDb <- TxDb.Hsapiens.UCSC.hg38.knownGene

    # Gene-level range for CBS (Entrez ID 875)
    allGeneRange <- genes(geneDb)
    allGeneRange["875"]

    # Transcripts grouped by gene, then the CBS entry
    txs <- transcriptsBy(TxDb.Hsapiens.UCSC.hg38.knownGene)
    txs["875"]

The CBS gene (txs["875"]) has 25 transcripts, which come from two regions: chr21 [6444869, 6467509] and chr21 [43075107, 43076288].

1. The CBS gene ("875") should only be in chr21 [43075107, 43076288]. The region chr21 [6444869, 6467509] belongs to the CBSL gene ("102724560"). However, CBSL is not in the database, and its transcripts are recorded under CBS.

2. The gene range of CBS (allGeneRange["875"]) is chr21 [6444869, 43076943], which covers everything between 6444869 and 43076943. This is not correct, because these are two separate regions.

Thanks!
Shilin
[Bioc-devel] A bug in TxDb.Hsapiens.UCSC.hg38.knownGene?
1 message · zhao shilin