[R-pkg-devel] Installed package size

3 messages · Carsten Croonenbroeck, Ivan Krylov, Dirk Eddelbuettel

#
Hello everyone,

I would like to ask what's the maximum for installed package size. The thing is, I would like to publish a package that contains, sure, a set of functions, but also a very nice data set concerning meteorological data. That data set is required for wind farm layout optimization, a topic the entire package deals with.

Now, if I have RStudio check my package, it correctly reports that the package size is 65.9 MB (which is the size of the tarball), while the uncompressed size is 118.8 MB (which is also correct, as it roughly matches the size of the .RData file). The package uses lazyload. Using the RStudio check mechanism, I get zero errors, zero warnings and one note, namely the one concerning installed package size. If I use devtools::check(), I even get a warning at "checking data for ASCII and uncompressed saves", reporting the same (correct) sizes.

I would like to know what the maximum size is and whether there's a way around that limit. I also seem to remember that I've downloaded other rather large packages from CRAN before. Unfortunately, I can't remember right now which packages those were... :-/

Best regards and thanks in advance

Carsten
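
The warning at "checking data for ASCII and uncompressed saves" usually points at how the data files were saved. As a rough sketch, assuming the data lives in an .rda/.RData file under data/, the compression in use can be inspected and changed with tools::checkRdaFiles() and tools::resaveRdaFiles(). The `wind` data frame and the temporary directory below are made-up stand-ins, not the data set from the post.

```r
# Sketch: compare compression settings for a saved data set.
# `wind` is a hypothetical stand-in, not the poster's data.
set.seed(1)
wind <- data.frame(
  speed = round(rweibull(2e5, shape = 2, scale = 8), 1),
  direction = sample(seq(0, 350, by = 10), 2e5, replace = TRUE)
)

data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)
rda <- file.path(data_dir, "wind.rda")

# Save uncompressed first, as a worst case.
save(wind, file = rda, compress = FALSE)
size_before <- file.size(rda)

# Report the current compression of each save file in the directory.
print(tools::checkRdaFiles(data_dir))

# Re-save in place, letting R pick the smallest of gzip, bzip2 and xz.
tools::resaveRdaFiles(data_dir, compress = "auto")
size_after <- file.size(rda)

print(tools::checkRdaFiles(data_dir))
cat(sprintf("before: %.2f MB, after: %.2f MB\n",
            size_before / 1e6, size_after / 1e6))
```

For lazy-loaded data, the LazyDataCompression field in DESCRIPTION plays a similar role for the installed package. Better compression may shrink the numbers reported by the check, but it does not change the limit itself.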
#
On Thu, 12 Mar 2020 15:16:13 +0000
Carsten Croonenbroeck <carsten.croonenbroeck at uni-rostock.de> wrote:

            
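Here's what CRAN policy [*] says about that:

>> Where a large amount of data is required (even after compression),
>> consideration should be given to a separate data-only package which
>> can be updated only rarely (since older versions of packages are
>> archived in perpetuity).

(According to src/library/tools/R/check.R, 5MB seems to be the limit on
installed size, not compressed tarball size.)
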
If publishing the data separately from the code is acceptable, you
could use drat [**] to set up a repository for the data package
somewhere else, then list the data package in Suggests: and the repo in
Additional_repositories: in the DESCRIPTION of the code package, which
you could submit to CRAN.
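
A minimal sketch of what that setup might look like from the code package's side. The package names (windfarmtools, windfarmdata), the repository URL, and the helper functions are hypothetical placeholders chosen for illustration; only the DESCRIPTION field names Suggests and Additional_repositories come from the suggestion above.

```r
# Sketch of the relevant DESCRIPTION fields of the code package.
# All names and the URL are hypothetical placeholders.
desc <- data.frame(
  Package = "windfarmtools",
  Version = "0.1.0",
  Suggests = "windfarmdata",
  Additional_repositories = "https://example.github.io/drat",
  stringsAsFactors = FALSE
)
desc_file <- file.path(tempdir(), "DESCRIPTION")
write.dcf(desc, file = desc_file)
cat(readLines(desc_file), sep = "\n")

# Because the data package is only suggested, the code package has to
# cope with it being absent.
has_data_pkg <- function(pkg = "windfarmdata") {
  requireNamespace(pkg, quietly = TRUE)
}

load_wind_data <- function(pkg = "windfarmdata",
                           repo = "https://example.github.io/drat") {
  if (!has_data_pkg(pkg)) {
    # Tell the user where to get the data instead of failing hard.
    message("Data package '", pkg, "' is not installed. Try:\n",
            "  install.packages(\"", pkg, "\", repos = \"", repo, "\")")
    return(invisible(NULL))
  }
  # A real package would access its data here, for example via data().
  invisible(TRUE)
}

load_wind_data()
```

Checking with requireNamespace() before touching the data keeps the code package installable and checkable on machines where the data package is not available.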
#
On 12 March 2020 at 20:14, Ivan Krylov wrote:
| On Thu, 12 Mar 2020 15:16:13 +0000
| Carsten Croonenbroeck <carsten.croonenbroeck at uni-rostock.de> wrote:
| >> Where a large amount of data is required (even after compression),
| >> consideration should be given to a separate data-only package which
| >> can be updated only rarely (since older versions of packages are
| >> archived in perpetuity).
| 
| If publishing the data separately from the code is acceptable, you
| could use drat [**] to set up a repository for the data package
| somewhere else, then list the data package in Suggests: and the repo in
| Additional_repositories: in the DESCRIPTION of the code package, which
| you could submit to CRAN.

Thanks for the pointer! And Neal just kindly edited a SO answer describing
this, so I updated the list of CRAN packages doing this. [1]

The best reference, though, may still be our R Journal paper on this. [2]

Hth, Dirk

[1] https://stackoverflow.com/questions/36105257/how-to-make-r-package-recommend-a-package-hosted-on-github/36105343#36105343
[2] https://journal.r-project.org/archive/2017/RJ-2017-026/index.html