[Bioc-devel] Zipped Rdata files in windows binaries
Hi Adi,

The recommended size for software packages is less than 2 MB on disk:
http://wiki.fhcrc.org/bioc/Package_Guidelines#size-requirements

Data files in the range of 10 to 20 MB will need to be built into a separate data package and would typically go into our experimental data package repository:
http://bioconductor.org/packages/release/ExperimentData.html

A typical user of your software package would probably not need the large experiment data files for their application, and hence we would like to maintain them as separate packages. I can help you with creating the data package once you have the data files ready.

Nishant
Tarca, Adi wrote:
Dear Robert,

Thank you for your advice. I will then include all the data at build time as .RData files. The only issue I had was that I did not know whether there is a limit on the disk space these files may occupy. I am talking about roughly 10 MB, but it may double in future releases.

I was not too worried about people needing an internet connection when using my package in conjunction with a new organism for the first time, since it is the same thing as trying to use some affy functions on a chip for which you do not have the cdf (except that you download an additional package rather than a file).

Regards,
Adi

Adi Laurentiu Tarca, PhD
Assistant Professor (Research), Bioinformatics and Computational Biology Unit of the NIH Perinatology Research Branch,
Department of Computer Science & Center for Molecular Medicine and Genetics,
Wayne State University, 3990 John R., Office 4809, Detroit, Michigan 48201
Tel: 1-313-5775305
Cell: 1-313-4043116
http://bioinformaticsprb.med.wayne.edu/tarca/

-----Original Message-----
From: rgentlem at gmail.com [mailto:rgentlem at gmail.com] On Behalf Of Robert Gentleman
Sent: Friday, February 06, 2009 3:56 PM
To: Tarca, Adi
Cc: bioc-devel at stat.math.ethz.ch
Subject: Re: [Bioc-devel] Zipped Rdata files in windows binaries

Hi,

On Fri, Feb 6, 2009 at 11:05 AM, Tarca, Adi <atarca at med.wayne.edu> wrote:
Hi all,
I am writing an R package, and at a given point I need to load an Rdata file from the "data" folder of the installed package. If the file is not there, I try to download it from somewhere.
That does not sound like a good thing to do. The data folder is exclusively for data that is stored essentially at package build time and is not a place to put other files, or to use during a session to store objects. Objects there are platform independent and are accessed using the data command in R. Please don't try to modify this behavior.
If you want/need to have your own data storage type and want to control it, you should use a different folder. A common choice is inst/extdata. And then you are in control of everything.
Since lots of people use R in cases where they do not have access to the internet, the idea that they should download something for your package to work seems problematic. Why not just use one of the many platform independent formats and distribute the data on all platforms in the same way?
There are a number of examples in Bioconductor packages (e.g. simpleaffy or flowCore).
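The inst/extdata suggestion can be sketched roughly as follows. This is only an illustration: the helper name load_extdata and the file name in the usage note are hypothetical, and it assumes the package ships its .RData files under inst/extdata, which end up in the extdata folder of the installed package.

```r
# Hypothetical helper: load an .RData file shipped in inst/extdata of a package.
# Returns the environment holding the loaded objects, or stops if the file is absent.
load_extdata <- function(file, package, env = new.env()) {
  # system.file() returns "" when the file (or the package) cannot be found
  path <- system.file("extdata", file, package = package)
  if (!nzchar(path)) {
    stop("file '", file, "' not found in extdata of package '", package, "'")
  }
  load(path, envir = env)
  env
}
```

A call might look like load_extdata("hsaSPIA.RData", "SPIA"), where the file name is made up for illustration. Because the file is located through system.file(), the same code should work for source and binary installs alike.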
Best wishes
Robert
I used to do the following test to see if a file called "datload" is NOT there, in which case I need to download it:
if (!paste(datload, ".RData", sep = "") %in%
    dir(system.file("data", package = "SPIA"))) {
  # ...download the file from somewhere else
}
It works fine, except that the Windows binary package created by the Bioconductor scripts from my source puts all RData files in an Rdata.zip file. Is there a way to list the files in Rdata.zip to see if my file is in there?
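One possible way to look inside such an archive is unzip() from the utils package with list = TRUE, which returns a data frame describing the archive members without extracting them. The sketch below assumes the archive is named Rdata.zip and sits in the data folder of the installed package, as described above; the helper name list_zipped_data is hypothetical.

```r
# Hypothetical helper: list the members of data/Rdata.zip in an installed package.
# Returns character(0) when the package has no such archive.
list_zipped_data <- function(package, zip_name = "Rdata.zip") {
  # system.file() returns "" when the archive (or the package) is not found
  zip_path <- system.file("data", zip_name, package = package)
  if (!nzchar(zip_path)) {
    return(character(0))
  }
  # list = TRUE only reads the archive index; nothing is extracted
  utils::unzip(zip_path, list = TRUE)$Name
}
```

The original test could then be written as paste(datload, ".RData", sep = "") %in% list_zipped_data("SPIA"), although this still depends on how the binary package happens to be laid out.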
Alternatively, I tried to use the data() function to load the file (into a private environment) and, if it is not loaded, to download it. However, the data() function does not return an error, only a warning.
I tried to use:
ow <- options("warn")
options(warn = 2)  # turn warnings into errors
errs <- try(data(list = datload, envir = .myDataEnv), silent = TRUE)
options(ow)  # restore the previous warning level
if (inherits(errs, "try-error")) {
  # data() failed, so download the file from somewhere else
}
This works fine, except that a warning is still printed when the function returns.
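A variant that avoids changing the global warn option is to silence the warning locally and then check whether anything actually arrived in the target environment. This is a minimal sketch rather than a definitive solution: the helper name try_load_data is hypothetical, and it assumes the named package is installed (data() signals an error otherwise).

```r
# Hypothetical helper: try to load a data set into a fresh environment.
# Returns the environment on success, or NULL if data() found nothing.
try_load_data <- function(name, package) {
  env <- new.env()
  # data() only warns when the data set is missing; silence that warning here
  suppressWarnings(utils::data(list = name, package = package, envir = env))
  # an empty environment means nothing was loaded
  if (length(ls(env, all.names = TRUE)) == 0L) {
    return(NULL)
  }
  env
}
```

The caller can then branch on is.null(try_load_data(datload, "SPIA")) to decide whether a download is needed, and no warning should be printed when the function returns.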
Any ideas would be appreciated.
Thanks,
Adi Laurentiu Tarca
_______________________________________________ Bioc-devel at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel
-- Robert Gentleman, PhD Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, M2-B876 PO Box 19024 Seattle, Washington 98109-1024 206-667-7700 rgentlem at fhcrc.org