The install.R and R_PROFILE.R files

11 messages · Seth Falcon, Martin Maechler, Brian Ripley +1 more

#
R-exts says

   Both install.R and R_PROFILE.R should be viewed as experimental; the
   mechanism to execute code before attaching or installing the package may
   change in the near future. With the other facilities available as from R
   2.0.0 they should be removed if possible.

Every usage of these on CRAN is unnecessary. If you want to save the 
image, say so in the SaveImage field in DESCRIPTION (but why not LazyLoad 
instead?).  If you require methods, say so in Depends in DESCRIPTION.
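
As an illustration of the fields mentioned above, a DESCRIPTION fragment for a package that needs the methods package and uses lazy loading might look like the following. The package name is hypothetical and the other mandatory fields are omitted.

```
Package: mypkg
Depends: methods
LazyLoad: yes
```

A package that really needs a saved image would presumably use SaveImage: yes in place of the LazyLoad line.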

I propose that we deprecate/remove this mechanism (as it makes 
installation a lot more convoluted than it needs to be).  Does anyone know of

(a) A usage that requires one of these files to do something which cannot 
now be done more simply via fields in DESCRIPTION, or

(b) A package that requires a saved image (and lazy loading is 
insufficient)?

If so, please let us know.

I am also minded to deprecate/remove the following command-line flags to 
INSTALL

   -s, --save[=ARGS]     save the package source as an image file, and
                         arrange for this file to be loaded when the
                         package is attached; if given, ARGS are passed
                         to R when creating the save image
       --no-save         do not save the package source as an image file
       --lazy            use lazy loading
       --no-lazy         do not use lazy loading
       --lazy-data       use lazy loading for data
       --no-lazy-data    do not use lazy loading for data (current default)

since these have been superseded by the DESCRIPTION fields (and Windows 
does not support --save=ARGS).
#
On 1 Feb 2006, ripley at stats.ox.ac.uk wrote:
I've looked over the packages in the Bioconductor repository and I
believe that every usage of R_PROFILE.R and install.R is also
unnecessary.
Many Bioc packages use SaveImage in their DESCRIPTION file.

Could someone provide more detail on the difference between SaveImage
and LazyLoad.  It is possible that LazyLoad would do just as well.

Thanks,

+ seth
#
On Wed, 1 Feb 2006, Seth Falcon wrote:

            
Thanks for that.
When an R package is installed, a file is prepared which is the 
concatenation of the (valid) files in the R directory.

With SaveImage, that file is loaded into an environment, and the 
environment dumped as an all.rda file.  The R code is then replaced by 
a loader whose sole job is to load the all.rda file.

With LazyLoad, the R file is loaded into an environment, and the objects 
in the environment are dumped individually into a simple database. The R 
code is then replaced by a loader whose sole job is to create an 
environment of promises to load the objects from the database.
(There is an article in R-news about lazy-loading.)
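
The difference between the two loaders can be sketched in a few lines of C. This is only a toy model under stated assumptions, not R's implementation: the stored objects are plain integers, the "database" is an in-memory array, and every name in it (binding, read_object, load_image, load_lazy, get_value) is hypothetical.

```c
#include <assert.h>
#include <string.h>

#define N_OBJECTS 3

/* Hypothetical stand-in for the package's objects as stored on disk. */
static const char *const stored_names[N_OBJECTS] = { "f", "g", "h" };
static const int stored_values[N_OBJECTS] = { 10, 20, 30 };

/* Counts simulated disk reads so the two strategies can be compared. */
static int reads = 0;

int read_object(int i)
{
    reads++;
    return stored_values[i];
}

/* A binding is a symbol plus either a value or an unforced promise. */
typedef struct {
    const char *name;
    int forced;   /* 0 while the value still lives only "on disk" */
    int value;
} binding;

/* SaveImage-like: restore every object when the package is loaded. */
void load_image(binding env[N_OBJECTS])
{
    for (int i = 0; i < N_OBJECTS; i++) {
        env[i].name = stored_names[i];
        env[i].value = read_object(i);
        env[i].forced = 1;
    }
}

/* LazyLoad-like: create the symbols only; the values are promises. */
void load_lazy(binding env[N_OBJECTS])
{
    for (int i = 0; i < N_OBJECTS; i++) {
        env[i].name = stored_names[i];
        env[i].value = 0;
        env[i].forced = 0;
    }
}

/* Look up a symbol, forcing its promise on first use. */
int get_value(binding env[N_OBJECTS], const char *name)
{
    for (int i = 0; i < N_OBJECTS; i++) {
        if (strcmp(env[i].name, name) == 0) {
            if (!env[i].forced) {
                env[i].value = read_object(i);
                env[i].forced = 1;
            }
            return env[i].value;
        }
    }
    assert(0 && "unknown symbol");
    return -1;
}
```

In this model load_image performs one read per object up front, while load_lazy performs none and get_value reads an object only the first time it is asked for.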

Lazy-loading is the norm for all but packages with very small amounts of R 
code.

I don't know when saving an image might be needed in preference to 
lazy-loading.  The differences in space (and hence time) when the package 
is used can be considerable, in favour of lazy-loading.  Since saving the 
objects in an environment and saving an environment are not quite the same 
thing there are potential differences.  (I have forgotten, if I ever knew, 
what happens with lazy-loading when you have an object whose environment 
is another namespace, for example.)

There have been packages ('aod' and 'gamlss' are two) which have both 
SaveImage and LazyLoad true. That works but is wasteful.

I just looked at the 12 non-Windows CRAN packages with SaveImage: yes and 
replaced this by LazyLoad: yes.  All passed R CMD check after the change. 
This included 'debug' and 'mvbutils' which had SaveImage: yes, LazyLoad: 
no which suggests the author thought otherwise.

There is no intention to withdraw SaveImage: yes.  Rather, if lazy-loading 
is not doing a complete job, we could see if it could be improved.
#
Thanks for the explanation of LazyLoad, that's very helpful.
On 1 Feb 2006, ripley at stats.ox.ac.uk wrote:
It seems to me that LazyLoad does something different with respect to
packages listed in Depends and/or how it interacts with namespaces.

I'm testing using the Bioconductor package graph and find that if I
change SaveImage to LazyLoad I get the following:

   ** preparing package for lazy loading
   Error in makeClassRepresentation(Class, properties, superClasses, prototype,  : 
           couldn't find function "getuuid"              

Looking at the NAMESPACE for the graph package, it looks like it is
missing some imports.  I added lines:
  import(Ruuid)
  exportClasses(Ruuid)

Aside: am I correct in my reading of the extension manual that if one
uses S4 classes from another package with a namespace, one
must import the classes and *also* export them?

Now I see this:

    ** preparing package for lazy loading
    Error in getClass("Ruuid") : "Ruuid" is not a defined class
    Error: unable to load R code in package 'graph'
    Execution halted   

But Ruuid _is_ defined and exported in the Ruuid package.

Is there a known difference in how dependencies and imports are
handled with LazyLoad as opposed to SaveImage?  

Thanks,

+ seth
#
    Seth> Thanks for the explanation of LazyLoad, that's very helpful.
    Seth> On 1 Feb 2006, ripley at stats.ox.ac.uk wrote:
    >> There is no intention to withdraw SaveImage: yes.  Rather, if
    >> lazy-loading is not doing a complete job, we could see if it could
    >> be improved.

    Seth> It seems to me that LazyLoad does something different with respect to
    Seth> packages listed in Depends and/or how it interacts with namespaces.

    Seth> I'm testing using the Bioconductor package graph and find that if I
    Seth> change SaveImage to LazyLoad I get the following:

Interesting.

I also had the vague feeling that SaveImage was said to be
important when using S4 classes and methods; particularly when
some methods are for generics from a different package/Namespace
and other methods for `base' classes (or other classes defined
elsewhere).
This is the case of 'Matrix', my primary experience here.
OTOH, we now only use 'LazyLoad: yes' , not (any more?)
'SaveImage: yes' -- and honestly I don't know / remember why.

Martin


#
The short answer is that there are no known (i.e. documented) differences, 
and no examples on CRAN which do not work with lazy-loading (except party, 
which loads the saved image in a test).  And that includes examples of 
packages which share S4 classes.  But my question was to tease things like 
this out.

You do need either SaveImage or LazyLoad in a package that defines S4 
classes and methods, since setClass etc break the `rules' for R files in 
packages in `Writing R Extensions'.

When I have time I will take a closer look at this example.
On Fri, 3 Feb 2006, Martin Maechler wrote:

            

  
    
#
My understanding, and John or others may correct that, is that you need 
SaveImage if you want to have the class hierarchy and generic functions, 
plus associated methods all created and saved at build time. This is 
basically a sort of compilation step, and IMHO, should always be done 
since it only needs to be done once, rather than every time a package is 
loaded. Note that attaching your methods to other people's generics has 
to happen at load time, since you won't necessarily know where they are 
or even what they are until then (using an import directive may 
alleviate some of those issues but I have not tested just what does and 
does not work currently).

I hope that LazyLoad does what it says it does, that is dissociates the 
value from the symbol in such a way that the value lives on disk until 
it is wanted, but the symbol is available at package load time. I do not 
see how this relates to precomputing an image, and would not be very 
happy if the two ideas became one, they really are different and can be 
used to solve very different problems.

best wishes
  Robert
Prof Brian Ripley wrote:

  
    
#
On 3 Feb 2006, ripley at stats.ox.ac.uk wrote:
The issue I was seeing with the graph package is caused by the Ruuid
package creating class instances at the C level using MAKE_CLASS.
MAKE_CLASS doesn't know about namespaces and if it gets called when a
package is loaded via an import, the class def will not be found.

With SaveImage *and* listing Ruuid in Depends, the Ruuid package ends
up in the right place for the class def to be found.  If one uses
LazyLoad, the Ruuid package does not end up in the same place.
Similarly, if one only specifies Ruuid in Imports, then both SaveImage
and LazyLoad fail.

I did a quick test of adding R_do_MAKE_CLASS_NS (see below) to allow
one to get class definitions from a specified namespace.  This seems
to work: I can use LazyLoad on a package (graph) that imports a
package that creates class instances in C code (Ruuid).


/* in src/main/objects.c */
/* Variant of the MAKE_CLASS lookup that evaluates
   getClass(what, FALSE, <namespace named by 'where'>). */
SEXP R_do_MAKE_CLASS_NS(char *what, char *where)
{
    static SEXP s_getClass = NULL;
    SEXP val, call, nsname, namespace, force;
    if(!what)
        error(_("C level MAKE_CLASS macro called with NULL string pointer"));
    if(!s_getClass)
        s_getClass = Rf_install("getClass");
    /* .Force = FALSE: an undefined class is an error */
    PROTECT(force = allocVector(LGLSXP, 1));
    LOGICAL(force)[0] = 0;
    /* protect the name string while the namespace is looked up */
    PROTECT(nsname = mkString(where));
    PROTECT(namespace = R_FindNamespace(nsname));
    /* build the call getClass(what, force, namespace) */
    PROTECT(call = allocVector(LANGSXP, 4));
    SETCAR(call, s_getClass);
    val = CDR(call);
    SETCAR(val, mkString(what));
    val = CDR(val);
    SETCAR(val, force);
    val = CDR(val);
    SETCAR(val, namespace);
    val = eval(call, R_GlobalEnv);
    UNPROTECT(4);
    return(val);
}


+ seth
1 day later
#
On Fri, 3 Feb 2006, Robert Gentleman wrote:

            
That meaning the time of using R CMD INSTALL rather than using R CMD 
build, I guess?  (We do have an unfortunate ambiguity.)
My understanding is that `compilation step' creates objects which are then 
saved in the image.  Such objects would also be saved if the image is 
converted into a lazyload database.
You obviously have this defined a different way to me: I believe (and so 
does my dictionary) that the image is what I save in my camera, not the 
real world scene.  I understand 'save' to save an image of an environment, 
that is to make a representation on a connection that can be used to 
recreate the environment at a later date.
To create a lazyload database you first need an environment to save. On 
loading the package it then recreates not the environment but symbols 
linked to promises that will recreate the values at a later date.  So both 
mechanisms create an environment which they `image' in different ways.

The difference here is an inadvertent difference in the Unix INSTALL 
script, which I have now corrected.

  
    
#
I had a bumpy ride with this one.

Ruuid/src/Makefile.win refers to src/include, which is not in a binary 
distribution so cannot be installed from an installed version of R 2.2.1. 
(That's a bug report.)

graph throws an S4 signature error in R-devel.

After fixing those, it works with LazyLoad on Windows but not in Unix 
where there is an error in the INSTALL script which I have now fixed.
On Thu, 2 Feb 2006, Seth Falcon wrote:

            

  
    
1 day later
#
On 5 Feb 2006, ripley at stats.ox.ac.uk wrote:
Thanks for the report, this has been fixed in the devel version of
Ruuid.
Also fixed in devel, thanks.
Excellent, thanks.  

+ seth