I am testing the next release of survival, which involves running R CMD check on the 868
CRAN packages that import, depend on, or suggest it.
The survival package has a lot of data sets, most of which are non-trivial real examples
(something I'm proud of). To save space I've bundled many of them, e.g., data/cancer.rda
now holds 19 different data frames.
This caused failures in 4 packages, each because it has a line such as data(lung) or
data(breast, package = "survival"), and the data() command looks for a file of that name.
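
For concreteness, here is a rough sketch of the difference. It assumes a survival release
in which lung is stored inside data/cancer.rda, and that the data sets are lazy-loaded;
the names e and lung2 are only for illustration.

```r
# Sketch only: assumes lung lives inside data/cancer.rda and that the
# package's data sets are lazy-loaded.
library(survival)

# The object is available by name, as in the help page usage ("lung");
# no data() call is needed.
head(lung)

# Equivalent form that does not require attaching the package.
lung2 <- survival::lung

# data() searches by file name, so with bundling the name to ask for is the
# bundle ("cancer"), which loads whatever objects are stored in that file.
e <- new.env()
data(cancer, package = "survival", envir = e)
ls(e)

# With bundling, this form warns that data set 'lung' is not found:
# data(lung, package = "survival")
```
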
This is a question about which option is considered best (perhaps more of a poll),
between two choices:
1. unbundle them again (bundling does save 1/3 of the space, and I do get complaints from
R CMD build about size)
2. send notes to the 4 maintainers. The help files for the data sets have the usage
documented as "lung" or "breast", and not data(lung), so I am technically within my rights
to claim that the mistake is theirs.
A third option, making the data sets a separate package, is not on the table. I use them
heavily in my help files and test suite, and since survival is a recommended package I
can't add library(x) statements for !(x %in% recommended). I am guessing that this
would also break many dependent packages.
Terry T.