
[Bioc-devel] Best practices to load data for vignette/tests

You could see if there is any existing data already in Bioconductor for use with your package.  That would be preferable.


http://bioconductor.org/packages/release/BiocViews.html#___Software


Searching for fastq, you could see what data ShortRead, seqTools, and FastqCleaner use.

Similarly, you could search for RNA-seq packages to see if any of their data is appropriate.
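As a rough sketch of how to check what example files an installed package ships: packages conventionally keep such files under inst/extdata, which system.file() can locate. The helper name find_example_files and the file-name pattern are my own, and whether any of these packages actually ships fastq files there is an assumption to verify on your system.

```r
# List files under each package's installed "extdata" directory whose
# names match `pattern`. Packages that are not installed are skipped.
# `find_example_files` is a hypothetical helper, not part of any package.
find_example_files <- function(pkgs, pattern = "\\.fastq(\\.gz)?$") {
  hits <- lapply(pkgs, function(pkg) {
    extdata <- system.file("extdata", package = pkg)
    if (!nzchar(extdata)) return(character(0))  # not installed, or no extdata
    list.files(extdata, pattern = pattern, recursive = TRUE, full.names = TRUE)
  })
  as.character(unlist(hits, use.names = FALSE))
}

# Full paths to any matching files; an empty vector if none are found.
find_example_files(c("ShortRead", "seqTools", "FastqCleaner"))
```

Any path returned this way can be used directly in a vignette or test, so nothing has to be downloaded at build time.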


There are also a number of experiment data packages that may provide the data format you are in need of.

http://bioconductor.org/packages/release/BiocViews.html#___ExperimentData

You could search here as well.


Lastly, Bioconductor has an ExperimentHub for storing large data files. You can search it interactively in R or through the web API interface here:

https://experimenthub.bioconductor.org/
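Searching the hub from R might look roughly like the following. This is a minimal sketch that assumes the ExperimentHub package is installed and a network connection is available. The wrapper name search_experiment_hub and the search term "fastq" are only illustrations.

```r
# Return the ExperimentHub records whose metadata match `term`.
# `search_experiment_hub` is a hypothetical wrapper around the real
# ExperimentHub() constructor and the query() generic from AnnotationHub.
search_experiment_hub <- function(term) {
  eh <- ExperimentHub::ExperimentHub()
  AnnotationHub::query(eh, term)
}

if (interactive()) {
  hits <- search_experiment_hub("fastq")  # "fastq" is only an example term
  print(hits$title)                       # titles of the matching records
  # hits[[1]] would download the first match (once) and load it
}
```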



If none of those locations provides data that is suitable for your package, you can submit your own data to the ExperimentHub.

http://bioconductor.org/packages/devel/bioc/vignettes/ExperimentHub/inst/doc/CreateAnExperimentHubPackage.html

You could also download the data directly, but this could be time consuming depending on internet connection and download speed. The Bioconductor hubs provide a caching mechanism, so a file is downloaded only once and its location on the system is remembered for later use.
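To illustrate the download-once idea, here is a simplified base R sketch. It is not how the hubs implement their cache. The function name cached_download, the "myPackage" cache name, and the example URL are all hypothetical, and tools::R_user_dir() requires R 4.0 or later.

```r
# Download `url` into `cache_dir` only if it is not already there, then
# return the local path. `fetch` can be swapped out so the logic can be
# exercised without network access.
cached_download <- function(url,
                            cache_dir = tools::R_user_dir("myPackage", "cache"),
                            fetch = utils::download.file) {
  dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
  dest <- file.path(cache_dir, basename(url))
  if (!file.exists(dest)) {
    # Download to a temporary name first so that an interrupted download
    # is never mistaken for a complete cached file.
    tmp <- tempfile(tmpdir = cache_dir)
    fetch(url, tmp, mode = "wb")
    file.rename(tmp, dest)
  }
  dest
}

# Hypothetical usage; the second call would reuse the cached copy:
# path <- cached_download("https://example.org/reads.fastq.gz")
```

In a real package it is probably better to rely on the hubs' own caching than to maintain something like this yourself.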


Cheers,




Lori Shepherd

Bioconductor Core Team

Roswell Park Cancer Institute

Department of Biostatistics & Bioinformatics

Elm & Carlton Streets

Buffalo, New York 14263