Data package questions
2 messages · Roebuck,Paul L, Claudia Beleites
Dear Paul,
I am in the process of creating a data package from an existing one. The current package has both raw data files and the associated RData objects created from them. Currently, the data subdirectory is 1.5 MB and extdata is 5.4 MB. Never having created a data package before, how is this best done? Should the data package contain only the raw data, or the RData objects too (tightly coupled)? If the latter, what (if anything) is added to the DESCRIPTION meta-information to denote the dependency? Should both packages suggest each other?
I have one data set that I ship as a few principal components and reconstruct with a function. The result is then assigned to the exported variable via delayedAssign. That way, you would only need to ship the raw data, but could still provide a ready-to-use R object as well. For my example data, I don't ship the raw data because it is too big (a 30 MB toy example is really impolite on CRAN); instead I provide it on R-Forge and give the link on the help page and in the vignette.

Claudia
Claudia Beleites
Spectroscopy/Imaging
Institute of Photonic Technology
Albert-Einstein-Str. 9
07745 Jena
Germany
email: claudia.beleites at ipht-jena.de
phone: +49 3641 206-133
fax: +49 2641 206-399
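The approach described in the reply (ship a few principal components, rebuild the data with a function, and bind the result lazily with delayedAssign) might look roughly like the sketch below. It is only an illustration under stated assumptions: the toy matrix, the names `shipped`, `reconstruct`, `toydata`, `pkg_env`, and `counter` are all hypothetical, and a plain environment stands in for a package namespace. In a real package, the `delayedAssign()` call would presumably sit at top level in a file under R/, with the variable exported in NAMESPACE. Note that reconstruction from a truncated set of components is lossy in general; the toy data here is deliberately low-rank so that three components recover it.

```r
# Hypothetical toy data standing in for a large raw data set (rank 3 by construction).
set.seed(1)
raw <- matrix(rnorm(200 * 3), nrow = 200) %*% matrix(rnorm(3 * 50), nrow = 3)

# Compress: keep only the first k principal components.
# This small list is what would be shipped instead of the raw data.
k <- 3
pca <- prcomp(raw, center = TRUE, scale. = FALSE)
shipped <- list(
  scores   = pca$x[, 1:k, drop = FALSE],
  loadings = pca$rotation[, 1:k, drop = FALSE],
  center   = pca$center
)

# Rebuild the data matrix from the shipped components:
# scores %*% t(loadings), then add the column means back.
reconstruct <- function(pcs) {
  sweep(pcs$scores %*% t(pcs$loadings), 2, pcs$center, "+")
}

# Counter used only to show that the reconstruction runs once.
counter <- new.env()
counter$n <- 0

# Stand-in for the package namespace (assumption for this sketch).
pkg_env <- new.env()

# Bind the variable lazily: nothing is computed until first access.
delayedAssign("toydata", {
  counter$n <- counter$n + 1
  reconstruct(shipped)
}, assign.env = pkg_env)

# First access forces the promise; later accesses reuse the cached value.
d1 <- dim(pkg_env$toydata)
d2 <- dim(pkg_env$toydata)
```

The design choice is that users see an ordinary R object, while the package only stores the compact representation and pays the reconstruction cost on first use.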