Hi Ale,
I'd personally use a more specific solution like Git LFS (Large File
Storage) on GitHub for a versioned database. You should also check with
CRAN itself, as they keep high standards for everything that's not a
standard install. More specifically (from the CRAN policies):
Downloads of additional software or data as part of package installation
or startup should only use secure download mechanisms (e.g., 'https' or
'ftps').
Personally I would store that information in a public database somewhere
with a (minimal) API. This can then be extended without inflating the
download and would allow people to install only a subset of what they need.
That would also allow people to port your work to other languages by
simply writing a wrapper around the DB API. It's not a necessity, but I
thought it was worth mentioning as an option.
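To make the wrapper idea concrete, here is a minimal sketch in R of what a client for such an API could look like. The endpoint `https://api.example.org/polyhedra`, the one-resource-per-name layout, and both function names are assumptions made up for illustration; no such service exists yet.

```r
# Hypothetical client for a minimal polyhedra database API.
# The base URL and the URL layout are assumptions, not an existing service.

# Build the request URL for one polyhedron, refusing non-https endpoints.
polyhedron_url <- function(name,
                           base_url = "https://api.example.org/polyhedra") {
  stopifnot(startsWith(base_url, "https://"))
  # Percent-encode the name so spaces and reserved characters are safe.
  paste0(base_url, "/", utils::URLencode(name, reserved = TRUE))
}

# Fetch the raw response text for one polyhedron. Parsing is left out
# because the response format would depend on how the API is designed.
fetch_polyhedron <- function(name, ...) {
  readLines(polyhedron_url(name, ...), warn = FALSE)
}
```

With something like this, a user only downloads the definitions they ask for, and a client in another language needs nothing more than the same URL scheme.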
Cheers
Joris
On Wed, Jun 27, 2018 at 10:22 PM, alejandro baranek <
alejandrobaranek at gmail.com> wrote:
Right now the situation is this: about 150 polyhedra are published, but
more than 800 are available, and because of the package size we cannot
publish all of them.
It is not a problem on GitHub; it is a problem on CRAN, with the build
time (we fixed the testing time with simple sampling techniques). I would
like to hear more from experienced package developers about these issues,
but we seem to have found a solution.
We decided to make another GitHub repo, RpolyhedraDB. When you install the
package, it downloads the database, at the tag recorded in the package's
data folder, into a directory under the user's home directory. So the
package will be minimal for CRAN, will be RR, and will install the database
on first use (in the case of Travis or other QA/continuous integration, it
will of course install it). It will be possible to set up different database
sizes using the tags, in case we find that preferable for users.
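A rough sketch of the first-use download, to show the idea rather than the actual implementation. The repository owner `example-owner`, the `.R/Rpolyhedra` directory layout, and the helper names are all assumptions for illustration; the URL pattern is the one GitHub uses for source archives of a tag.

```r
# Hypothetical helpers; repository owner and directory layout are assumptions.

# Build the https URL of the GitHub source archive for one tag.
db_archive_url <- function(tag, repo = "example-owner/RpolyhedraDB") {
  sprintf("https://github.com/%s/archive/%s.tar.gz", repo, tag)
}

# Directory under the user's home where the database for `tag` is kept.
db_dir <- function(tag, base_dir = path.expand("~")) {
  file.path(base_dir, ".R", "Rpolyhedra", tag)
}

# Download and unpack the database only if it is not already installed.
# `fetch` can be swapped out, e.g. for testing without network access.
ensure_db <- function(tag, base_dir = path.expand("~"),
                      fetch = utils::download.file) {
  dest <- db_dir(tag, base_dir)
  if (dir.exists(dest) && length(list.files(dest)) > 0) {
    return(invisible(dest))  # already installed, nothing to do
  }
  url <- db_archive_url(tag)
  stopifnot(startsWith(url, "https://"))  # secure download only
  dir.create(dest, recursive = TRUE, showWarnings = FALSE)
  archive <- tempfile(fileext = ".tar.gz")
  fetch(url, archive, mode = "wb")
  utils::untar(archive, exdir = dest)
  invisible(dest)
}
```

Keeping one directory per tag means a package update that points to a new tag triggers a fresh download, while an unchanged tag costs nothing on later loads.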
Best, Ale.
2018-03-29 4:43 GMT-03:00 Berry Boessenkool <berryboessenkool at hotmail.com>:
I assume you cannot simply reduce the 150 to a few for demonstration
purposes?
I have seen people using drat packages on GitHub for data, but GitHub
has size limits as well...
No expert in this, but maybe this helps a little bit...
Berry
------------------------------
*From:* R-package-devel <r-package-devel-bounces at r-project.org> on
behalf of alejandro baranek <alejandrobaranek at gmail.com>
*Sent:* Tuesday, March 27, 2018 19:26
*To:* r-package-devel at r-project.org
*Subject:* [R-pkg-devel] Questions about making a database package
(Rpolyhedra)
Hello group:
We released Rpolyhedra V0.2 last month. It is able to scrape more than 800
definitions from public sources. At V0.2.4 we are publishing only 150
because of the time needed to scrape and test all the polyhedra, and the
resulting size of the package. The difference is a configuration in
very simple to change (anyone who wants to try it can build the package
themselves).
The source files of the polyhedra definitions alone are over 12 MB (we are
including them in the data folder for package self-sufficiency).
But we have doubts about good practices for publishing a database
We think the solution is to split the package into an internal
Rpolyhedra-lib, open source but not on CRAN, and Rpolyhedra with a
sewhich enables to connect with that repo for downloading scraped
on-demand.
We have to think further about how to connect both repositories, but
before touching any code, we want to listen to experienced package developers
and the community in general, about how to do this.
Do you know of any package that behaves similarly to this one? We