How packages are set up - R-help

Tue, May 1, 2001 9:22 AM

In trying to get more familiar with R I have two questions.

1. For large packages it would be slow to parse all R source
code for the package each time library() is issued.  
Yet I haven't found where a package's functions are stored
in .RData format.  Would someone please clarify this?

2. Have package developers found that it works best to
maintain packages locally using the same directory structure 
described in "Creating R Packages" in the "Writing R
Extensions" manual?

Frank E Harrell Jr              Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine  http://hesweb1.med.virginia.edu/biostat
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Thomas Lumley

Tue, May 1, 2001 9:38 AM

On Tue, 1 May 2001, Frank E Harrell Jr wrote:

Well, the R source code *is* all parsed each time library() is issued, and
it isn't that slow. On my system (with network drives) it is less than a
second even for nlme, survival5, and MASS.  I don't think .RData format
would be that much faster.  The time to load the ratetables dataset in
survival5 (which is in .RData format) and to load the nlme package are
similar, and they are about the same size.
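
A minimal sketch of how such a load time could be measured on your own system. The package name is an assumption chosen only because splines ships with R; substitute any installed package.

```r
# Time how long library() takes to attach a package.
# "splines" is an illustrative choice, not one of the packages discussed here.
pkg <- "splines"

# Detach and unload first so that library() actually has to load the package.
search_name <- paste0("package:", pkg)
if (search_name %in% search()) {
  detach(search_name, character.only = TRUE, unload = TRUE)
}

timing <- system.time(library(pkg, character.only = TRUE))
print(timing["elapsed"])
```

Current R releases load package code from a lazy-load database built at install time rather than parsing the source on each call, so absolute numbers will differ from those quoted in this thread.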

That's what I do.

	-thomas

Thomas Lumley			Asst. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle


Brian Ripley

Tue, May 1, 2001 9:41 AM

On Tue, 1 May 2001, Frank E Harrell Jr wrote:

It is not slow: base is parsed every time R is started, for example.  Have
you tried this on a real package?  The largest current example, nlme,
takes 0.49 secs to load on my system, and that is twice the size of the
next largest.

However, as of about yesterday in the R-devel version there is an
option --save to INSTALL to create an image instead and make library()
load that.  It is not clear that this is appreciably faster unless there
is more going on than parsing (like some actual computations).  (It is also
unclear whether it will ever be possible on a Mac.)

Yes, or something very similar.  (For some packages I keep the master help
in S .d format in subdirectory help, for example.)

Brian

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Frank E Harrell Jr

Tue, May 1, 2001 10:48 AM

Thanks to Thomas Lumley and Brian Ripley.  The source code
for each of the Hmisc and Design libraries is the same size
as that of nlme, so I am encouraged.  nlme also loads in about
0.5s on my system.  Still, the upcoming option to INSTALL to
create a binary image is nice to hear about.

It sounds as if the R directory structure is good for
everyday use for developers too.  If anyone has opinions on how best
to set up a directory structure that is also suitable
for working with a CVS repository I would appreciate
hearing about that.
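
For readers who have not seen the layout under discussion, here is a small sketch that generates a throwaway package in a temporary directory and lists its files. The names demoPkg and hello are hypothetical and used only for illustration; utils::package.skeleton() is one convenient way to get the standard structure, not something the posters say they use.

```r
# A trivial function to place in the demo package (hypothetical example).
hello <- function(name = "world") paste("Hello,", name)

# Work in a temporary directory so nothing is written to the current one.
pkg_parent <- tempfile("pkgdemo")
dir.create(pkg_parent)

# Generate the standard package layout around the function.
utils::package.skeleton(name = "demoPkg", list = "hello",
                        path = pkg_parent, environment = environment())

# Show what was created: DESCRIPTION, R/ for code, man/ for help files, etc.
print(list.files(file.path(pkg_parent, "demoPkg"), recursive = TRUE))
```

Because everything lives under one directory per package, the whole tree can be placed under version control as it stands.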

Thanks

Frank

Luke Tierney

Tue, May 1, 2001 1:26 PM

On Tue, May 01, 2001 at 09:38:01AM -0700, Thomas Lumley wrote:

Just to follow up on the use of .RData format to store package code:
unfortunately this is currently only worthwhile if a significant
amount of computation can be moved from load time to package creation
time.  If this is not the case, .RData currently loses both on speed
and size: On my system loading nlme with library takes about 1.5
seconds; loading a version saved as .RData takes over 3 seconds.  The
R/nlme file in the installed package is also quite a bit smaller than
the .RData version.

This is clearly something we need to work on, and both loading speed
and size of .RData files used for code are likely to improve over
time.  Performance for data already seems to be quite reasonable.
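
A rough way to try this kind of comparison yourself, assuming a file of synthetic functions as a stand-in for real package code. The function names f1, f2, ... and the count of 500 are made up for illustration; results depend heavily on the R version and will not reproduce the nlme figures above.

```r
# Generate many small function definitions to stand in for package source.
n_funs <- 500
src_file <- tempfile(fileext = ".R")
img_file <- tempfile(fileext = ".RData")

code <- sprintf("f%d <- function(x, y = %d) {\n  z <- x + y\n  z * 2\n}",
                seq_len(n_funs), seq_len(n_funs))
writeLines(code, src_file)

# Parse and evaluate the source, then save the resulting objects as an image.
src_env <- new.env()
t_source <- system.time(sys.source(src_file, envir = src_env))
save(list = ls(src_env), envir = src_env, file = img_file)

# Load the saved image into a fresh environment.
img_env <- new.env()
t_load <- system.time(load(img_file, envir = img_env))

# Report elapsed time and on-disk size for each form.
cat("source:", t_source[["elapsed"]], "s,", file.size(src_file), "bytes\n")
cat("load:  ", t_load[["elapsed"]], "s,", file.size(img_file), "bytes\n")
```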

luke

Luke Tierney
University of Minnesota                      Phone:           612-625-7843
School of Statistics                         Fax:             612-624-8868
313 Ford Hall, 224 Church St. S.E.           email:      luke at stat.umn.edu
Minneapolis, MN 55455 USA                    WWW:  http://www.stat.umn.edu

Paul Gilbert

Tue, May 1, 2001 1:37 PM

Yes, except that I keep larger help files in a directory I call mansrc/ and generate
man/ by splitting these into the smaller files for each function. The .R files in my
R/ each contain many functions, and the larger files in mansrc/ correspond to these .R files.

However, I do not run R CMD check from above the package directory, as may be
tempting to do. If you do this you may get extra things like Rcheck directories and
core dumps generated in your source directories.

And while I use the package structure I do not use the bundle structure. This means
that I have to copy my packages into the bundle structure before I use "R CMD build"
to make the tar ball. I have sometimes thought about organizing my source in the
bundle structure and if others have ideas about this I would appreciate hearing. One
of the reasons I have not is that I intend on occasion to move packages into
different bundles. While it is no problem to move a directory around, I am a novice
with CVS and am not so sure that I can move directories around without messing up CVS
badly. (BTW, there is nothing special about setting up CVS with the package
structure. I actually start CVS somewhat higher so that it gets my Makefiles and
other documentation as well.)

Paul Gilbert
