[R-pkg-devel] Questions about making a database package (Rpolyhedra)
Hi Ale,

I'd personally use a more specific solution like GitHub LFS (Large File Storage) for a versioned database. You should also check with CRAN itself, as they keep high standards for everything that's not a standard install. More specifically (from the CRAN policies):

Downloads of additional software or data as part of package installation or startup should only use secure download mechanisms (e.g., 'https' or 'ftps').

Personally I would store that information in a public database somewhere with a (minimal) API. This can then be extended without inflating the download and would allow people to install only a subset of what they need. That would also allow people to port your work to other languages by simply writing a wrapper around the DB API. It's not a necessity, but I thought it was worth mentioning as an option.

Cheers
Joris

On Wed, Jun 27, 2018 at 10:22 PM, alejandro baranek <alejandrobaranek at gmail.com> wrote:
For now, this is our situation: about 150 polyhedra are published, but more than 800 could be, and because of the package size we cannot publish all of them. This is not a problem on GitHub; it is a problem on CRAN, because of build time (we fixed the testing time with simple sampling techniques). I would like to hear more from experienced package developers about these issues, but we seem to have found a solution.

We decided to create another GitHub repo, RpolyhedraDB. When you install the package, it downloads the database, from the tag recorded in the package's data folder, into a directory under the user's home directory. So the package will be minimal for CRAN, will be RR, and will install the database on first use (on Travis or other QA/continuous integration services it will install it as well, of course). It will be possible to set up different DB sizes using the tags, in case we find that preferable for users.

Best, Ale.

2018-03-29 4:43 GMT-03:00 Berry Boessenkool <berryboessenkool at hotmail.com>:
I assume you cannot simply reduce the 150 to a few for demonstration purposes? I have seen people using DRAT packages on GitHub for data, but GitHub has size restrictions as well... No expert in this, but maybe this helps a little bit...

Berry

------------------------------
From: R-package-devel <r-package-devel-bounces at r-project.org> on behalf of alejandro baranek <alejandrobaranek at gmail.com>
Sent: Tuesday, March 27, 2018 19:26
To: r-package-devel at r-project.org
Subject: [R-pkg-devel] Questions about making a database package (Rpolyhedra)

Hello group:

We released Rpolyhedra V0.2 last month. It is able to scrape 800+ polyhedra definitions from public sources. At V0.2.4 we are publishing only 150 because of the time needed for scraping and testing all the polyhedra, and the resulting size of the package. The difference is a configuration in zzz.R that is very simple to change (anyone who wants to try it can build the package themselves). The source files of the polyhedra definitions alone are over 12 MB (we include them in the data folder so the package is self-sufficient).

But we have doubts about good practices for publishing a database package. We think the solution is to split the package into an internal Rpolyhedra-lib, open source but not on CRAN, and Rpolyhedra with a catalog which enables connecting to that repo for downloading scraped polyhedra on demand.

We have to think further about the way of connecting both repositories, but before touching any code, we want to hear from experienced package developers and the community in general about how to do this. Do you know of any package with behavior analogous to this one? We didn't find one.

Best, Ale.

--
alejandro baranek
@ken4rab <https://twitter.com/ken4rab>
qbotics <http://qbotics.tumblr.com/> | surferinvaders <http://surferinvaders.tumblr.com> | algebraic-soundscapes <http://imaginary.org/content/algebraic-soundscapes> | surfer-shuffle <http://imaginary.org/program/surfer-shuffle>
______________________________________________ R-package-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-package-devel
Joris Meys
Statistical consultant
Department of Data Analysis and Mathematical Modelling
Ghent University
Coupure Links 653, B-9000 Gent (Belgium)
tel: +32 (0)9 264 61 79

-----------
Biowiskundedagen 2017-2018
http://www.biowiskundedagen.ugent.be/

-------------------------------
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
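
The approach Ale describes in this thread (keep the CRAN package small and fetch the database from a tagged GitHub repository on first use) could look roughly like the R sketch below. It is only an illustration under stated assumptions, not Rpolyhedra's actual code. The function names (`db_url`, `db_dir`, `ensure_db`), the `OWNER` placeholder, the tag and file names, and the cache location are all hypothetical. The URL layout assumes files are served from raw.githubusercontent.com. The `https` check reflects the CRAN policy sentence Joris quotes.

```r
# All names here are hypothetical; this is a sketch, not Rpolyhedra's code.

# Build the HTTPS URL of a database file at a given tag of a GitHub repo.
# Assumes the raw.githubusercontent.com layout: owner/repo/ref/path.
db_url <- function(repo, tag, file) {
  sprintf("https://raw.githubusercontent.com/%s/%s/%s", repo, tag, file)
}

# Directory under the user's home where the database for one tag is cached.
# The ".R/rpolyhedra-db" location is an assumption made for illustration.
db_dir <- function(tag, home = path.expand("~")) {
  file.path(home, ".R", "rpolyhedra-db", tag)
}

# Download the file on first use only; later calls reuse the local copy.
# `fetch` is injectable so the logic can be exercised without network access.
ensure_db <- function(repo, tag, file, home = path.expand("~"),
                      fetch = utils::download.file) {
  url <- db_url(repo, tag, file)
  # Only secure download mechanisms, as the quoted CRAN policy requires.
  stopifnot(grepl("^https://", url))
  dest <- file.path(db_dir(tag, home), file)
  if (!file.exists(dest)) {
    dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)
    fetch(url, destfile = dest, mode = "wb")
  }
  dest
}
```

A call such as `ensure_db("OWNER/RpolyhedraDB", "v0.2.4", "polyhedra.RDS")` would download the file once and return the cached path afterwards. Because the tag is part of both the URL and the cache directory, different tags could carry databases of different sizes, as Ale suggests.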