RFC: Loading packages at startup
Prof Brian D Ripley writes:
I've been kicking the following idea around for a while, and am now proposing to put some version into 1.7.0. I'd be interested in comments on the desirability and the design, before I start writing any code.
S4 introduced a file .S.chapters which can contain a list of S chapters (equivalent to R packages) to be loaded on start-up. This was the germ of this proposal.
Proposal:
Extend the initialization described in Startup.Rd by optionally having files named, say, R_HOME/etc/Rpackages.site and .Rpackages, the latter in the starting directory or, failing that, the user's home directory. Each would contain a list of packages, one per line, to be loaded when R is started, in the order given in the files.
Some details:
1) I think the packages should be loaded before .Rprofile and .RData are processed, and R_HOME/etc/Rpackages.site before .Rpackages. This can be argued, and the S4 parallel would seem to be to load packages after S.init (the nearest it has to Rprofile). But we would load library/base/Rprofile first of all so the analogy is not close.
2) The present kludge of loading ctest in .First could be replaced by making ctest the default content of R_HOME/etc/Rpackages.site (in the light of point 5).
3) It would be useful to allow the library tree to be specified, as a second field on the line.
4) One problem with saving an R session and then restoring it is that the packages in use are not reloaded. Quitting an R session and saving could write .Rpackages in the current directory (with the library recorded if it were not the default). Then restarting a session in that directory would restore the loaded packages automatically.
5) We might want to allow a .Rpackages file to override Rpackages.site (or we might not). One idea is to allow a minus sign in front of a package name, and to merge the Rpackages.site and .Rpackages files before loading any packages. If we did this we would probably need to be able to save the list of packages to be loaded (and can't easily save those not to be loaded), so perhaps -- as the first line of .Rpackages should empty the list.
6) One could argue for R_HOME/etc/Rpackages as the `system' file as well, and this might be useful if we break base up into smaller components.
7) I would allow comment lines in the files, starting with #.
8) The file name or names could be set by environment variables. It's strange that we allow the site file names for Rprofile and Renviron and the user file name for command histories to be set in that way.
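To make points 3, 5 and 7 concrete, here is a minimal sketch of how the two files might be parsed and merged before anything is loaded. It is only an illustration under stated assumptions: the function names (parse_pkg_lines, merge_pkg_lists, load_pkg_list) are hypothetical and exist nowhere in R, a minus entry is assumed to remove the package wherever it was listed, and "--" is honoured only as the first non-comment line of the user file.

```r
# Hypothetical helpers; none of these names exist in R itself.

# Parse the lines of one package-list file.
# Returns list(reset = <logical>, entries = <data frame>).
parse_pkg_lines <- function(lines) {
  lines <- trimws(lines)
  lines <- lines[nzchar(lines) & !startsWith(lines, "#")]   # point 7: comments
  reset <- length(lines) > 0 && lines[1] == "--"            # point 5: empty the list
  if (reset) lines <- lines[-1]
  fields <- strsplit(lines, "[[:space:]]+")
  name <- vapply(fields, function(f) f[1], character(1))
  # point 3: optional second field names the library tree
  lib <- vapply(fields,
                function(f) if (length(f) > 1) f[2] else NA_character_,
                character(1))
  list(reset = reset,
       entries = data.frame(package = sub("^-", "", name),
                            lib = lib,
                            remove = startsWith(name, "-"),
                            stringsAsFactors = FALSE))
}

# Merge site and user lists (site first), then apply removals.
merge_pkg_lists <- function(site_lines, user_lines) {
  site <- parse_pkg_lines(site_lines)
  user <- parse_pkg_lines(user_lines)
  entries <- if (user$reset) user$entries else rbind(site$entries, user$entries)
  dropped <- entries$package[entries$remove]
  keep <- entries[!entries$remove & !(entries$package %in% dropped), , drop = FALSE]
  keep <- keep[!duplicated(keep$package), c("package", "lib"), drop = FALSE]
  rownames(keep) <- NULL
  keep
}

# Attach the merged list in file order (not called in this example).
load_pkg_list <- function(pkgs) {
  for (i in seq_len(nrow(pkgs))) {
    lib <- if (is.na(pkgs$lib[i])) NULL else pkgs$lib[i]
    library(pkgs$package[i], lib.loc = lib, character.only = TRUE)
  }
}

# Example contents, made up for illustration.
site <- c("# site defaults", "ctest")
user <- c("-ctest", "MASS", "mypkg /home/me/Rlibs")
merged <- merge_pkg_lists(site, user)
print(merged)
```

Merging before loading, as point 5 suggests, means a package the user has excluded is never attached at all, rather than being attached and then detached.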
I'd recommend against going this way. In fact, I am not sure whether we really want to have *user environment* files in the long run. We currently have two user files controlling startup, .Renviron and .Rprofile. A split like this is necessary because not all customization for R can be done from 'inside R', i.e., after R has been started; some of it needs to be done before that. To my understanding, this includes setting LD_LIBRARY_PATH (or the system's equivalent), and maybe some env vars which can be given instead of command line args, but with R_VSIZE and R_HSIZE sort of gone, what else is there? So perhaps the user environment mechanism is not the right thing anyway (and it is not used by R CMD *).
For the things that can be done from inside R, I'd recommend doing them there. E.g., there is really no reason for looking for an env var R_BROWSER when we can portably specify one using options(browser). The fact that the packages to be loaded at startup are specified inside .First(), and that users cannot simply add to .First(), is a problem that needs to be addressed anyway.
As Robert has indicated, the obvious idea is to introduce an (Emacs-style) hooks mechanism to be run at certain times or, more generally, when certain events occur. Unfortunately, the attempt to make this rather general, as discussed in Boston, means that things also take a bit longer to get done. But conceptually, what we want is a suite of setHook(), addHook() and runHooks() functions, and things like .First(), .First.lib() (and the user variants), code to be run when creating a save image etc., can all be integrated into this general mechanism. [Basically, hooks are lists of functions because in some cases we need to call them with certain arguments, e.g. the .First.lib package load hook always has library and package and maybe also version eventually.] For package config, users can then use setHook() to override the system and/or site defaults, or addHook() to add to them.
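As a rough sketch of what such a suite could look like, the following keeps a list of functions per event in an environment. Everything here is an assumption made for illustration: the registry, the event names "startup" and "pkgLoad", and the sketch_ prefix, which is there so that nothing defined in R itself gets masked. It is not a design anyone has agreed on.

```r
# Hypothetical hook registry: one list of functions per event name.
hook_registry <- new.env()

get_hooks <- function(event) {
  if (exists(event, envir = hook_registry, inherits = FALSE))
    get(event, envir = hook_registry)
  else
    list()
}

# Replace whatever hooks are registered for 'event' (override defaults).
sketch_setHook <- function(event, fun) {
  assign(event, list(fun), envir = hook_registry)
  invisible(NULL)
}

# Append a hook to those already registered for 'event'.
sketch_addHook <- function(event, fun) {
  assign(event, c(get_hooks(event), list(fun)), envir = hook_registry)
  invisible(NULL)
}

# Call every hook for 'event' in registration order, passing '...' along.
sketch_runHooks <- function(event, ...) {
  invisible(lapply(get_hooks(event), function(f) f(...)))
}

# Usage: a site default plus a user addition. The hooks only record
# package names here instead of calling library().
record <- new.env()
record$pkgs <- character()
sketch_setHook("startup", function() record$pkgs <- c(record$pkgs, "ctest"))
sketch_addHook("startup", function() record$pkgs <- c(record$pkgs, "MASS"))
sketch_runHooks("startup")
print(record$pkgs)

# A hook that takes arguments, in the spirit of .First.lib(library, package).
sketch_addHook("pkgLoad", function(library, package)
  record$last <- file.path(library, package))
sketch_runHooks("pkgLoad", library = "/some/lib", package = "ctest")
```

Storing functions rather than expressions is what lets the same runner serve both the argument-free startup case and the package-load case.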
As everything happens inside R there is no need for a special format, as discussed in (3), (5), and (7). [If users cannot be expected to code their preferences in R ... then we could provide an interactive tool which eventually emits the R code.] Long term, startup configuration will include setting defaults, loading packages, perhaps playing namespace magic with some of them, perhaps everything according to predefined 'themes' ... and I'd like to have all of this in one place, if possible.
Point (4) is very important, in particular if we think about saving and restoring the 'state' of an R session (as opposed to just the work space). I think many of us (I know of at least Greg, Fritz and myself) have code going in this direction, but then we need more than just the names of the packages. We should perhaps also know about attached non-package objects etc. Also, I am not sure how attempting to load/attach in reverse order will interact with namespace import/export and pre-computing package dependencies before loading them. But in any case, dumping an object that somehow represents the session state would be extremely important, but I do think we need an R object to represent the information.
-k