[Bioc-devel] AnnotationHubData Error: Access denied: 530
Hi Johannes, Sonali,
On 04/10/2015 09:40 AM, Arora, Sonali wrote:
Hi Rainer,

Just to be clear - what do you want to be available from AnnotationHub() in the end? Currently the GTF files from Ensembl are already present inside the AnnotationHub:

library(AnnotationHub)
ah = AnnotationHub()
gtf <- query(ah, "GTF")
gtf <- query(gtf, "Ensembl")
gtf[1]
gtf[[1]] # returned to you as GenomicRanges object.

- why not get the GTF files directly from AnnotationHub instead of getting them from the ftp site? Then you can make your EnsDb classes from these GRanges. It will also make your recipe faster because you will not have to download the file and parse the object.

A GRanges object is not the same as a GTF file and I guess Johannes
wants access to the GTF file. Are these GTF files available on
AnnotationHub?
@Johannes - Here is one alternative: You could take a different approach
and implement some equivalent of makeTxDbFromGRanges() for EnsDb
objects. So people could just do:
library(ensembldb)
ensdb <- makeEnsDbFromGRanges(gtf[[1]])
like they can do right now with makeTxDbFromGRanges():
library(GenomicFeatures)
txdb <- makeTxDbFromGRanges(gtf[[1]])
That way you don't need a recipe, and you don't need to add anything to
AnnotationHub at all.
@Sonali - These GRanges objects I get from AnnotationHub have no genome
information and their seqlevels are not sorted:
> seqinfo(gtf[[1]])
Seqinfo object with 22 sequences from an unspecified genome; no seqlengths:
  seqnames seqlengths isCircular genome
  X              <NA>       <NA>   <NA>
  9              <NA>       <NA>   <NA>
  8              <NA>       <NA>   <NA>
  7              <NA>       <NA>   <NA>
  6              <NA>       <NA>   <NA>
  ...             ...        ...    ...
  12             <NA>       <NA>   <NA>
  11             <NA>       <NA>   <NA>
  10             <NA>       <NA>   <NA>
  1              <NA>       <NA>   <NA>
  MT             <NA>       <NA>   <NA>
I know it's easy enough to sort the seqlevels with sortSeqlevels() but
what about having these things done by the recipe instead?
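As a rough illustration of what such a recipe step could look like, here is a minimal sketch on a toy GRanges object. The helper name tidySeqinfo() and the assembly label "toyAssembly" are made up for this example; they are not part of AnnotationHubData, and a real recipe would need to take the assembly name from the resource metadata.

```r
library(GenomicRanges)
library(GenomeInfoDb)   # provides sortSeqlevels()

# Toy stand-in for gtf[[1]]: seqlevels given in unsorted order, no genome set.
gr <- GRanges(
    seqnames = c("X", "9", "1", "MT"),
    ranges   = IRanges(start = c(100, 200, 300, 400), width = 50),
    seqinfo  = Seqinfo(c("X", "9", "1", "MT"))
)

# Hypothetical helper that a recipe could call before saving the object.
tidySeqinfo <- function(x, genomeName) {
    x <- sortSeqlevels(x)      # put seqlevels into "natural" chromosome order
    genome(x) <- genomeName    # label every sequence with the assembly name
    x
}

gr2 <- tidySeqinfo(gr, "toyAssembly")
seqinfo(gr2)
```

Doing this once in the recipe would save every user from repeating the same two calls after each download.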
Thanks,
H.
Thanks,
Sonali.

On 4/9/2015 11:14 PM, Rainer Johannes wrote:
Dear all,

I have added a recipe to the AnnotationHubData to provide EnsDb classes (from my ensembldb package) based on GTF files from Ensembl. Now, after adding the recipe to the AnnotationHubData package and installing it (following the vignettes from the AnnotationHub and AnnotationHubData) I called

updateResources(AnnotationHubRoot=getwd(), BiocVersion=biocVersion(), preparerClasses="EnsemblGtfToEnsDbPreparer", insert=FALSE, metadataOnly=TRUE)

and got the output:

Ailuropoda_melanoleuca.ailMel1.78.gtf.gz
Anas_platyrhynchos.BGI_duck_1.0.78.gtf.gz
Anolis_carolinensis.AnoCar2.0.78.gtf.gz
Astyanax_mexicanus.AstMex102.78.gtf.gz
Bos_taurus.UMD3.1.78.gtf.gz
Caenorhabditis_elegans.WBcel235.78.gtf.gz
Callithrix_jacchus.C_jacchus3.2.1.78.gtf.gz
Canis_familiaris.CanFam3.1.78.gtf.gz
Cavia_porcellus.cavPor3.78.gtf.gz
Chlorocebus_sabaeus.ChlSab1.1.78.gtf.gz
Choloepus_hoffmanni.choHof1.78.gtf.gz
Ciona_intestinalis.KH.78.gtf.gz
Ciona_savignyi.CSAV2.0.78.gtf.gz
Danio_rerio.Zv9.78.gtf.gz
Dasypus_novemcinctus.Dasnov3.0.78.gtf.gz
Dipodomys_ordii.dipOrd1.78.gtf.gz
Drosophila_melanogaster.BDGP5.78.gtf.gz
Error in function (type, msg, asError = TRUE) : Access denied: 530

I guess that must be related to the Ensembl ftp? Is anybody else experiencing this error?

cheers, jo

my session info:
sessionInfo()
R Under development (unstable) (2015-03-04 r67940)
Platform: x86_64-apple-darwin14.3.0/x86_64 (64-bit)
Running under: OS X 10.10.3 (Yosemite)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] AnnotationHubData_0.0.205 futile.logger_1.4
[3] AnnotationHub_1.99.81     GenomicRanges_1.19.52
[5] GenomeInfoDb_1.3.16       IRanges_2.1.43
[7] S4Vectors_0.5.22          BiocGenerics_0.13.11

loaded via a namespace (and not attached):
 [1] Rcpp_0.11.5                  BiocInstaller_1.17.7
 [3] XVector_0.7.4                futile.options_1.0.0
 [5] GenomicFeatures_1.19.37      bitops_1.0-6
 [7] tools_3.2.0                  zlibbioc_1.13.3
 [9] biomaRt_2.23.5               digest_0.6.8
[11] BSgenome_1.35.20             jsonlite_0.9.15
[13] RSQLite_1.0.0                shiny_0.11.1
[15] DBI_0.3.1                    rtracklayer_1.27.11
[17] httr_0.6.1                   stringr_0.6.2
[19] Biostrings_2.35.12           Biobase_2.27.3
[21] R6_2.0.1                     AnnotationDbi_1.29.21
[23] XML_3.98-1.1                 BiocParallel_1.1.24
[25] RJSONIO_1.3-0                ensembldb_0.99.15
[27] lambda.r_1.1.7               Rsamtools_1.19.50
[29] htmltools_0.2.6              GenomicAlignments_1.3.34
[31] AnnotationForge_1.9.7        mime_0.3
[33] interactiveDisplayBase_1.5.6 xtable_1.7-4
[35] httpuv_1.3.2                 RCurl_1.95-4.5
[37] VariantAnnotation_1.13.47
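For context on the error above: 530 is the FTP reply code for "Not logged in", so the server refused the login for that request. The thread does not establish why. If the refusal turns out to be transient (for instance a server-side limit on connections, which is only an assumption here), retrying the failing call with a pause is one possible workaround. The sketch below uses base R only; withRetry() is a hypothetical helper, not a function from AnnotationHubData or RCurl, and the flaky() function merely simulates a call that fails twice.

```r
# Hypothetical helper: call f() up to maxTries times, pausing between
# failed attempts with an exponentially growing delay (wait, 2*wait, ...).
withRetry <- function(f, maxTries = 3, wait = 1) {
    for (i in seq_len(maxTries)) {
        result <- tryCatch(f(), error = function(e) e)
        if (!inherits(result, "error"))
            return(result)
        message("attempt ", i, " failed: ", conditionMessage(result))
        if (i < maxTries)
            Sys.sleep(wait * 2^(i - 1))
    }
    stop(result)   # all attempts failed: re-signal the last error
}

# Stand-in for an FTP listing call that fails twice and then succeeds.
calls <- 0
flaky <- function() {
    calls <<- calls + 1
    if (calls < 3) stop("Access denied: 530")
    "listing"
}

res <- withRetry(flaky, maxTries = 5, wait = 0)
```

In a recipe, f would wrap whatever call lists or downloads the remote files. If the server rejects anonymous logins outright, retrying will not help and the last error is raised unchanged.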
_______________________________________________ Bioc-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fredhutch.org
Phone: (206) 667-5791
Fax: (206) 667-1319