[Bioc-devel] RE : AnnotationDbi and select function
Hi guys, Thanks for your feedback. Indeed, I used GENEID because it is the key used in the txdb database.
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
columns(txdb)
 [1] "CDSID"      "CDSNAME"    "CDSCHROM"   "CDSSTRAND"  "CDSSTART"
 [6] "CDSEND"     "EXONID"     "EXONNAME"   "EXONCHROM"  "EXONSTRAND"
[11] "EXONSTART"  "EXONEND"    "GENEID"     "TXID"       "EXONRANK"
[16] "TXNAME"     "TXCHROM"    "TXSTRAND"   "TXSTART"    "TXEND"

I will move to ENTREZID, which is much faster! I'm glad it could help.
Nicolas
________________________________________
From: bioc-devel-bounces at r-project.org [bioc-devel-bounces at r-project.org] on behalf of Marc Carlson [mcarlson at fhcrc.org]
Sent: Wednesday, 12 March 2014 20:18
To: bioc-devel at r-project.org
Subject: Re: [Bioc-devel] AnnotationDbi and select function

Thanks Nicolas! That's a good bug. I will work on a fix.
The reason James's workaround functions here is that the number of databases it has to query is fewer by one. It is also faster for this reason. So when you say GENEID you mean the IDs used in the associated txdb database, which means that these have to be checked against that DB (and anything related to it extracted) and then merged with the results of the symbol information by joining on the foreign key for these two DBs. So that's actually much more complex than just extracting the same data from the org package alone, even though the end result (in this case) is the same. The bug is probably happening in the associated merge step.
Marc
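To make the difference between the two lookup routes concrete, here is a minimal base R sketch. It is only an illustration under stated assumptions: the two data frames are toy stand-ins for the txdb and org databases, and the names `txdb_genes`, `org_symbols`, `via_txdb` and `via_org` are hypothetical. It does not reproduce what OrganismDbi does internally.

```r
# Toy stand-ins for the two databases (hypothetical rows, for illustration only).
txdb_genes <- data.frame(GENEID = c("1", "10", "100"), stringsAsFactors = FALSE)
org_symbols <- data.frame(
  ENTREZID = c("1", "10", "100"),
  SYMBOL   = c("A1BG", "NAT2", "ADA"),
  stringsAsFactors = FALSE
)
keys <- c("1", "100")

# Route 1 (keytype "GENEID"): check the keys against the txdb table first,
# then merge with the symbol table by joining on the shared key.
tx_hits  <- txdb_genes[txdb_genes$GENEID %in% keys, , drop = FALSE]
via_txdb <- merge(tx_hits, org_symbols, by.x = "GENEID", by.y = "ENTREZID")

# Route 2 (keytype "ENTREZID"): a single lookup in the org table, no merge step.
via_org <- org_symbols[org_symbols$ENTREZID %in% keys, , drop = FALSE]

print(via_txdb)
print(via_org)
```

Both routes return the same symbols here, but the first one touches two tables and needs a join, which is the step where the bug is suspected.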
On 03/12/2014 10:06 AM, James W. MacDonald wrote:
Hi Nicolas,
On 3/12/2014 12:39 PM, Servant Nicolas wrote:
Dear all,
I get an error when using the select function from the AnnotationDbi
package.
I am trying to convert gene IDs into symbols, but for some reason
it fails.
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
isActiveSeq(txdb)[seqlevels(txdb)] <- FALSE
isActiveSeq(txdb)[c("chr16","chr1")] <- TRUE
geneGR <- exonsBy(txdb, "gene")
library(Homo.sapiens)
symbol <- select(Homo.sapiens, keys = names(geneGR), keytype =
"GENEID", columns = "SYMBOL")
Error in head(select(Homo.sapiens, keys = names(geneGR)[1:1001],
keytype = "GENEID", :
error in evaluating the argument 'x' in selecting a
method for function 'head': Error in res[,
.reverseColAbbreviations(x, cnames), drop = FALSE] :
length(geneGR)
[1] 3269
## The first 1K work
symbol <- select(Homo.sapiens, keys = names(geneGR)[1:1000], keytype = "GENEID", columns = "SYMBOL")
## The 1K+1 does not !
symbol <- select(Homo.sapiens, keys = names(geneGR)[1:1001], keytype = "GENEID", columns = "SYMBOL")
Error in res[, .reverseColAbbreviations(x, cnames), drop = FALSE] : incorrect number of dimensions

It looks like I cannot convert more than 1K elements. Is there any reason for that?
Thank you very much,
Nicolas
Not sure what 'GENEID' is in this context - it appears to be Entrez Gene. But anyway, if you use "ENTREZID" instead, it works fine:
symbol <- select(Homo.sapiens, names(geneGR), "SYMBOL", "ENTREZID")
symbol <- select(Homo.sapiens, names(geneGR), "GENEID", "ENTREZID")
Error in res[, .reverseColAbbreviations(x, cnames), drop = FALSE] : incorrect number of dimensions
symbol <- select(Homo.sapiens, names(geneGR)[1:1000], "GENEID",
"ENTREZID")
symbol <- select(Homo.sapiens, names(geneGR)[1:1001], "GENEID",
"ENTREZID")
Error in res[, .reverseColAbbreviations(x, cnames), drop = FALSE] : incorrect number of dimensions

Best,
Jim
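Since calls with up to 1000 keys succeed in the transcripts above, one possible stopgap is to query in batches of at most 1000 keys and bind the results. The following base R sketch is an assumption on my part, not something proposed in the thread: `select_in_batches` and `fake_lookup` are hypothetical names, `fake_lookup` is a stand-in for the real `select()` call, and it has not been verified that batching avoids the bug for every batch.

```r
# Split keys into batches of at most `size`, run `lookup` on each batch,
# and row-bind the per-batch data frames in the original key order.
# Assumes a non-empty character vector of keys.
select_in_batches <- function(keys, lookup, size = 1000L) {
  batch_id <- ceiling(seq_along(keys) / size)
  batches  <- split(keys, batch_id)
  results  <- lapply(batches, lookup)
  out <- do.call(rbind, results)
  rownames(out) <- NULL
  out
}

# Hypothetical stand-in for select(); returns one row per key.
fake_lookup <- function(k) {
  data.frame(GENEID = k, SYMBOL = paste0("SYM", k), stringsAsFactors = FALSE)
}

keys <- as.character(seq_len(3269))
res  <- select_in_batches(keys, fake_lookup)
print(dim(res))
```

In a real session the `lookup` argument could wrap the call from the original report, for example `function(k) select(Homo.sapiens, keys = k, keytype = "GENEID", columns = "SYMBOL")`. Using "ENTREZID" as the keytype, as suggested above, remains the simpler route.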
sessionInfo()
R Under development (unstable) (2014-03-05 r65125)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=fr_FR.UTF-8 LC_NUMERIC=C
[3] LC_TIME=fr_FR.UTF-8 LC_COLLATE=fr_FR.UTF-8
[5] LC_MONETARY=fr_FR.UTF-8 LC_MESSAGES=fr_FR.UTF-8
[7] LC_PAPER=fr_FR.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] Homo.sapiens_1.1.2
[2] org.Hs.eg.db_2.10.1
[3] GO.db_2.10.1
[4] RSQLite_0.11.4
[5] DBI_0.2-7
[6] OrganismDbi_1.5.3
[7] XVector_0.3.7
[8] TxDb.Hsapiens.UCSC.hg19.knownGene_2.10.1
[9] GenomicFeatures_1.15.9
[10] AnnotationDbi_1.25.14
[11] GenomeInfoDb_0.99.17
[12] Biobase_2.23.6
[13] GenomicRanges_1.15.32
[14] IRanges_1.21.32
[15] BiocGenerics_0.9.3
[16] RColorBrewer_1.0-5
[17] reshape2_1.2.2
[18] reshape_0.8.4
[19] plyr_1.8.1
[20] ggplot2_0.9.3.1
[21] Matrix_1.1-2-2
loaded via a namespace (and not attached):
[1] BatchJobs_1.2 BBmisc_1.5
[3] BiocParallel_0.5.16 biomaRt_2.19.3
[5] Biostrings_2.31.14 bitops_1.0-6
[7] brew_1.0-6 BSgenome_1.31.12
[9] codetools_0.2-8 colorspace_1.2-4
[11] dichromat_2.0-0 digest_0.6.4
[13] fail_1.2 foreach_1.4.1
[15] GenomicAlignments_0.99.29 graph_1.41.3
[17] grid_3.1.0 gtable_0.1.2
[19] iterators_1.0.6 labeling_0.2
[21] lattice_0.20-27 MASS_7.3-29
[23] munsell_0.4.2 proto_0.3-10
[25] RBGL_1.39.2 Rcpp_0.11.0
[27] RCurl_1.95-4.1 Rsamtools_1.15.32
[29] rtracklayer_1.23.15 scales_0.2.3
[31] sendmailR_1.1-2 stats4_3.1.0
[33] stringr_0.6.2 tools_3.1.0
[35] XML_3.98-1.1 zlibbioc_1.9.0
_______________________________________________ Bioc-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel