CRAN package sizes

Robin Hankin's post reminded me to mention the following recent 
addition to 'Writing R Extensions', in the section on 'Submitting a 
package to CRAN':

   Ensure that the package sources are not unnecessarily large. ...
   As a general rule, doc directories should not exceed 5Mb, and
   where data directories need to be 10Mb or more, consideration should
   be given to a separate package containing just the data. (Similarly
   for external data directories, large jar files and other libraries
   that need to be installed.)

With 2800 packages on CRAN, overall size is becoming a concern: 
installing all of CRAN currently takes 4Gb.  As the attached (I hope) 
graph shows, the 20 packages over 20Mb account for a quarter of that, 
and those over 5Mb for half.  (And this is after we removed 100Mb from 
the largest installed package by re-compression and archived the 
second largest, so Robin's package is currently the largest.)  Some of 
the largest packages are data/jar packages, but there are 55 packages 
with 'doc' directories over 5Mb.  To put that in perspective, PDFs of 
whole books with lots of figures (MASS, Paul's R Graphics) are well 
under 5Mb.

R CMD check in R-devel reports on large packages, and you should 
expect submitted package sizes to be questioned more often in future.
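
For a rough idea of where a package stands before submission, here is 
a minimal sketch in Python.  It is not what R CMD check does; it just 
totals the files under a package's 'doc' and 'data' directories and 
compares them with the 5Mb and 10Mb figures quoted above.  The function 
names, the default directory layout and the reading of 1Mb as 1024^2 
bytes are assumptions made for illustration.

```python
import os

MB = 1024 * 1024  # assumption: 1Mb is taken to mean 1024^2 bytes


def dir_size(path):
    """Total size in bytes of all regular files under path (0 if missing)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full) and not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def over_limit(label, size):
    """Apply the wording of the quoted guidance to one directory size."""
    if label == "doc":
        return size > 5 * MB   # doc directories "should not exceed 5Mb"
    return size >= 10 * MB     # data directories of "10Mb or more"


def check_package(pkg_dir, subdirs=None):
    """Return {label: (size_in_bytes, flagged)} for 'doc' and 'data'.

    subdirs maps each label to a path relative to pkg_dir.  The default
    assumes an installed package; for a source tree one might pass
    {"doc": "inst/doc", "data": "data"} instead.
    """
    if subdirs is None:
        subdirs = {"doc": "doc", "data": "data"}
    report = {}
    for label, rel in subdirs.items():
        size = dir_size(os.path.join(pkg_dir, rel))
        report[label] = (size, over_limit(label, size))
    return report
```

Calling check_package() on an installed package directory returns the 
two sizes in bytes together with a flag for each; a missing directory 
simply counts as zero bytes.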

There are many different reasons why doc directories are large, but 
the major ones are:

- installing files that are unneeded, such as Rplots.pdf and .eps
   figures.
- using PDF figures of images where PNG would be more appropriate.
- including less than relevant material (such as how to install R,
   with screenshots!)
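
The first item is the easiest to look for mechanically.  As a minimal 
sketch in Python, the following lists files named Rplots.pdf and files 
ending in .eps under a doc directory, largest first.  The function name 
is hypothetical, and whether a particular .eps file is really unneeded 
remains a judgement for the package author.

```python
import os


def find_unneeded(doc_dir):
    """Return (relative_path, size_in_bytes) pairs for likely leftovers.

    Only the two patterns named in the list above are matched:
    files called 'Rplots.pdf' and files with an '.eps' extension.
    Results are sorted by size, largest first.
    """
    hits = []
    for root, _dirs, files in os.walk(doc_dir):
        for name in files:
            is_rplots = name == "Rplots.pdf"
            is_eps = name.lower().endswith(".eps")
            if is_rplots or is_eps:
                full = os.path.join(root, name)
                rel = os.path.relpath(full, doc_dir)
                hits.append((rel, os.path.getsize(full)))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

An empty result means neither pattern was found; it says nothing about 
the other two items in the list.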

There are several ways to reduce the sizes of PDFs with no loss in 
quality, e.g. with Adobe Acrobat Standard/Pro.