[R-pkg-devel] speeding up package code/build/test cycle

14 messages · Greg Minshall, Wolfgang Viechtbauer, Dirk Eddelbuettel +4 more

#
hi.

when developing packages, my current work flow is to change the code,
(re-)build the package, detach/load the package, test (to find the
N+1'st bug, sigh).

the building step takes tens of seconds.

is there an obvious way to present some code to the R command line and
have it replace the appropriate function in a given package?

or, are there other things people do to speed up this process?

thanks in advance, Greg
#
Dear Greg,

You might want to look into the load_all() function from 'devtools'.

Best,
Wolfgang
#
Hi Greg,
On 24 June 2021 at 12:15, Greg Minshall wrote:
| when developing packages, my current work flow is to change the code,
| (re-)build the package, detach/load the package, test (to find the
| N+1'st bug, sigh).
| 
| the building step takes tens of seconds.

You may benefit from looking into ccache, I wrote about it a few times in a
few places including http://dirk.eddelbuettel.com/blog/2017/11/27/

If you enable it via ~/.R/Makevars as I do then it helps for all package
installations from shell or IDE or ...  I find it rather helpful esp with
compiled code. If you only write R code it will not help you.
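
A minimal sketch of such a ~/.R/Makevars, written via a small shell function so it can be tried on a scratch file first. The compiler names (gcc, g++, gfortran) are assumptions; substitute whatever `R CMD config CC` and `R CMD config CXX` report on your system.

```shell
#!/bin/sh
# Write a ccache-enabled Makevars to the file named in $1.
# gcc/g++/gfortran are assumed here; adjust to your toolchain.
write_ccache_makevars() {
    cat > "$1" <<'EOF'
CCACHE=ccache
CC=$(CCACHE) gcc
CXX=$(CCACHE) g++
CXX11=$(CCACHE) g++
CXX14=$(CCACHE) g++
CXX17=$(CCACHE) g++
FC=$(CCACHE) gfortran
EOF
}

# Try it on a scratch file; copy to ~/.R/Makevars once it looks right.
scratch=$(mktemp)
write_ccache_makevars "$scratch"
cat "$scratch"
```

Running `ccache -s` after a couple of installs shows hit/miss statistics, which is a quick check that the wrapper is actually being used.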

| is there an obvious way to present some code to the R command line and
| have it replace the appropriate function in a given package?
| 
| or, are there other things people do to speed up this process?

There are a few options to R CMD build and R CMD INSTALL that may help. I use
a front-end script 'build.r' (from my littler package) where '-f' enables
fast mode, which skips vignette and (pdf) manual building, both of which tend
to slow me down. If you look at 'R CMD INSTALL --help' you will see a few
more that may help.
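
As a rough sketch of what such front-end helpers could look like in plain shell: the function names `fast_build` and `fast_install` and the `DRY_RUN` switch are made up for illustration, while the flags themselves are the ones listed by `R CMD build --help` and `R CMD INSTALL --help`.

```shell
#!/bin/sh
# Echo a command, and run it unless DRY_RUN is set (hypothetical helper).
run() {
    echo "+ $*"
    [ -n "${DRY_RUN:-}" ] || "$@"
}

# Build the tarball without vignettes or the PDF manual.
fast_build() {
    run R CMD build --no-build-vignettes --no-manual "${1:-.}"
}

# Install from the source directory, skipping help pages and byte compilation.
fast_install() {
    run R CMD INSTALL --no-docs --no-byte-compile "${1:-.}"
}

# Preview only; unset DRY_RUN to execute the commands for real.
DRY_RUN=1
fast_build .
fast_install .
```

Installing from the directory rather than from the tarball also keeps the object files in src/ around, so compiled code is only rebuilt when it changes.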

I find I am happier with write/build/test cycle from the shell as I am then
guaranteed to have clean and pristine R sessions. By having front-end scripts
(to build and/or install) I can then easily have very compact shell
expressions you can simply Ctrl-r for, or, as you too are an Emacs user, use
'Ctrl-x c' to trigger the compile command ... which is both a) editable so I
can replace the 'make -f' default and b) has its own history so I can cycle
through commands and/or get back to recent ones. "Works for me", YMMV.

Cheers, Dirk
#
On 24/06/2021 5:15 a.m., Greg Minshall wrote:
Wolfgang pointed out the devtools way to do this.  I don't use that for 
rgl (a pretty slow 75 seconds from scratch), but I find R CMD INSTALL 
pkgdir works fast enough:  if C++ source has already been compiled, it 
won't be compiled again, as it would be if I installed from a tarball. 
A second build after a clean build takes about 15 seconds.  That's 
acceptable for me.

A disadvantage of the devtools method is that a regular build after 
load_all() seems to do a full 75 sec build:  load_all caches things for 
itself, but doesn't put them in the same place as a regular build, so 
make doesn't see the object files, and rebuilds all of them.  Or at 
least it did that last time I tried it, a few months ago.

As Dirk said, you can also use command line options to suppress building 
vignettes or other things if they take too long.

Duncan Murdoch
#
On Thu, Jun 24, 2021 at 8:55 PM Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
[...]
That is definitely not how load_all() should work, so this might be a
bug. AFAIK it does not use a special cache or anything, just compiles
the files inside the package tree, like a regular package install from
a package tree. This is a regular install after load_all():

❯ system.time(devtools::load_all())
ℹ Loading rgl
This build of rgl does not include OpenGL functions.  Use
 rglwidget() to display results, e.g. via options(rgl.printRglwidget = TRUE).
   user  system elapsed
  0.411   0.024   0.446

❯ q()

❯ time R CMD INSTALL .
* installing to library ‘/Users/gaborcsardi/Library/R/4.0/library’
[...]
* DONE (rgl)
R CMD INSTALL .  9.95s user 1.28s system 97% cpu 11.503 total

So a load_all() that does not need to recompile anything, just reload
the R code, takes less than half a second. After that a regular
install from the command line takes about 12s, most of which is the
byte compilation. If I turn byte compilation off, then it is less than
6s.

But it is rare that you actually need to install the package while
working on it, and you typically just use load_all() to iterate, or
devtools::test() if you use testthat.
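
For those who prefer to drive this from the shell, a sketch of the same cycle with a fresh R session each time. `dev_test` and `DRY_RUN` are hypothetical names; `devtools::test()` takes the package directory as its first argument and loads the source tree before running the testthat suite.

```shell
#!/bin/sh
# Echo a command, and run it unless DRY_RUN is set (hypothetical helper).
run() {
    echo "+ $*"
    [ -n "${DRY_RUN:-}" ] || "$@"
}

# Load the package from source and run its tests in a fresh R session.
dev_test() {
    run Rscript -e "devtools::test('${1:-.}')"
}

# Preview only; unset DRY_RUN to actually start R.
DRY_RUN=1
dev_test .
```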

Gabor

[...]
#
On 24/06/2021 3:44 p.m., Gábor Csárdi wrote:
I'm working in RStudio on a Mac, in case that makes any difference. 
I've just done the following:

1.  devtools::load_all(".")

This does a full compile of the C++ source, so it's slow.

2.  devtools::load_all(".")

No recompile, so really quick.

3.  Click "Install and Restart" button

This does R CMD INSTALL --preclean ....  so it's really slow.

4.  Click "Install and Restart" button again

This is the fast build I was talking about:  no recompiling.

5.  devtools::load_all(".")

This does the full compile again, so it's slow.

So it appears that "Install and Restart" doesn't trust the object files 
that load_all() produced, and load_all() doesn't trust the object files 
that R CMD INSTALL produced.
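
One way to check which step actually recompiles is to drop a marker file before the step and list the object files that are newer afterwards. A sketch, using a throwaway directory in place of a real package tree (the helper name `rebuilt_since` is made up):

```shell
#!/bin/sh
# List object and shared-library files under <pkgdir>/src that were
# modified after the marker file. Usage: rebuilt_since <marker> [pkgdir]
rebuilt_since() {
    find "${2:-.}/src" \( -name '*.o' -o -name '*.so' \) -newer "$1" | sort
}

# Demonstration on a fake package tree.
demo=$(mktemp -d)
mkdir "$demo/src"
touch "$demo/src/a.o"             # pretend this was compiled earlier
marker=$(mktemp)                  # drop the marker before the step under test
sleep 1                           # allow for coarse timestamp resolution
touch "$demo/src/b.o"             # pretend the step recompiled only this file
rebuilt_since "$marker" "$demo"   # lists only the path ending in src/b.o
```

In a real package, create the marker before load_all() or R CMD INSTALL and call `rebuilt_since "$marker"` afterwards; an empty result means nothing was recompiled.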

This might be influenced by my choice not to "Use devtools package 
functions" in the Build Tools menu.  Probably the slow compile in step 3 
is an RStudio issue, not a devtools issue:  it used different command 
line options.  But the slow compile in step 5 looks like a devtools issue.

Your last paragraph is incorrect if "you" is taken to be me rather than 
a generic developer:  I *often* want to install the package while 
working on it.  I like the help system to work; I like to work on the 
source code and help pages together.  That may not be typical. 
Certainly help pages seem to be a low priority in most tidyverse packages.

Duncan Murdoch
#
On Thu, Jun 24, 2021 at 10:31 PM Duncan Murdoch
<murdoch.duncan at gmail.com> wrote:
[...]
Yes, RStudio probably does its own thing in "Install and restart". The
pre-clean of course removes the object files from the package src/
directory, so the next load_all() will need to recompile everything.
I meant 'you' to be generic, and yes, YMMV.

After devtools::load_all(), `?` will show the help pages of the
`load_all()`-d package, not the installed one. So just to get the help
pages of the dev package, you don't need to install it. E.g. after I
`load_all()` rgl, I get
ℹ Rendering development documentation for 'rgl.bbox'

But there are certainly other good reasons to install the package, and
I am sure that with ccache it is not that bad.

[...]

G.
#
On 24/06/2021 4:52 p.m., Dirk Eddelbuettel wrote:
I haven't tried ccache:  shouldn't "make" handle that kind of thing?

Duncan Murdoch
#
On 24/06/2021 5:22 p.m., Gábor Csárdi wrote:
But why did step 5 recompile everything?  Here's what you left out of 
the quote:
Duncan Murdoch
#
Actually, I see somewhat different things in RStudio; for me there is
no pre-clean (with the default options AFAIR).

I can reproduce the rebuilding for rgl, but not for other packages I
tried. I vaguely remember a similar issue for another package, which
also used autoconf. So this indeed might be a bug in devtools/pkgload
or some unfortunate interplay between load_all() and autoconf's
configure.

In general, it seems that both RStudio and devtools just use the
package directory, and the object files are in src/. For all the
packages I tried, I did not see an unneeded full rebuild.

G.

On Thu, Jun 24, 2021 at 11:32 PM Duncan Murdoch
<murdoch.duncan at gmail.com> wrote:
#
On 24/06/2021 5:53 p.m., Gábor Csárdi wrote:
I'm not using the default options, as I said.

Duncan Murdoch
#
On 6/24/21 11:31 PM, Duncan Murdoch wrote:
In principle, yes, and I think one should not need ccache for package 
development when make files are designed well.

But there is a case for ccache. If for whatever reason one decides to do 
a "clean" build in "make" sense often, ccache can help. For example when 
you are working on the build system itself, on the make file or 
compilation options.

MXE (used to build the experimental UCRT toolchain for Windows) uses 
ccache when building external software. There, clean rebuilds are a 
common task (newer versions of that software, slightly changed build 
configuration) and some builds take forever when the projects are large.

ccache could also help when switching between code versions in the same 
workspace (via a versioning system), or when building the same files in 
different directories (and not handled by the build system).

Tomas
#
On Thu, 24 Jun 2021, Greg Minshall writes:
I am not sure if it was already mentioned; but if you feel
comfortable with Emacs (I have a hunch you do), then ESS has
functionality that might help:

    http://ess.r-project.org/Manual/ess.html#Developing-with-ESS

and in particular

    http://ess.r-project.org/Manual/ess.html#Namespaced-Evaluation