
[R-pkg-devel] How to reduce examples in a package that takes more than 5 seconds to run?

7 messages · Ismail Otoakhia, Brian G. Peterson, Ben Bolker +3 more

#
The R package 'ardl.nardl' has some examples that take more than 5 seconds
to run. I was advised by the CRAN team to reduce the run time to less than
5 seconds.



How can this be achieved?



E. I Otoakhia
#
On 12/15/22 08:34, Ismail Otoakhia wrote:
- you can lower the amount of data in the example

- you can use a faster method than your default method

- you can wrap the example in a dontrun tag so it will not run during 
checking
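
A minimal sketch of the first suggestion (less data), with a timing check. Here `mtcars` and `lm()` are only stand-ins for the package's own data and model, and the subset size of 15 rows is an arbitrary assumption:

```r
# Stand-in for a slow example: fit on a small subset instead of the full data.
# mtcars and lm() are placeholders for the package's own data and model.
full_data  <- mtcars
small_data <- head(full_data, 15)

# system.time() returns user, system and elapsed (wall-clock) seconds.
timing <- system.time(
  fit <- lm(mpg ~ wt + hp, data = small_data)
)

elapsed_seconds <- timing[["elapsed"]]
print(elapsed_seconds)
```

Timing the example locally this way gives a rough idea of whether it is likely to stay under the 5 second limit, although the check machines may be slower or faster than yours.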
#
On 2022-12-15 9:57 a.m., Brian G. Peterson wrote:
If your example involves something like a fitted model object that 
takes a long time to compute, you can pre-compute the model fit and 
store the object in an inst/testdata directory, then use something like

fitted_model <- readRDS(system.file("testdata", "my_example.rds", 
package = "mypackage"))

to retrieve it
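
The one-time precompute step might look roughly like the sketch below. It assumes `lm()` on `mtcars` as a stand-in for the slow fit, and it writes to a temporary directory instead of the package's inst/testdata so that it can be run anywhere:

```r
# One-time step, run by the maintainer (not inside the examples themselves).
# lm() on mtcars stands in for the slow model fit.
slow_fit <- lm(mpg ~ wt + hp, data = mtcars)

# In a real package the target would be inst/testdata/my_example.rds;
# a temporary directory keeps this sketch self-contained.
out_dir <- file.path(tempdir(), "testdata")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
rds_path <- file.path(out_dir, "my_example.rds")
saveRDS(slow_fit, rds_path)

# The example then only pays the cost of reading the file back.
fitted_model <- readRDS(rds_path)
print(coef(fitted_model))
```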

  
    
#
On 12/15/22 9:02 AM, Ben Bolker wrote:
The "sos" package includes a function "CRAN", which is used for that 
purpose.  The examples section in "findFn.Rd" includes the following:


# Skip these tests on CRAN,
# because they take more than 5 seconds
if(!CRAN()){
...
}


	  I know that some on this list do not like this construct, but it has 
helped me manage this problem for several years.  NOTE:  This CRAN 
function is NOT maintained by anyone on CRAN, so it is NOT guaranteed to 
work.  However, the "sos" package is currently on CRAN, and I have not 
seen an email asking me to avoid it ;-)


https://github.com/sbgraves237/sos/blob/master/man/findFn.Rd


	  Hope this helps.
	  Spencer Graves
#
Thanks, everyone. From the ideas shared, I adopted the following in the .Rd
files:
\examples{
\dontrun{
...
...
...
}
}
The "..." are the examples which take more than 5 seconds to run. I hope
this resolves the CRAN team's request.


E. I Otoakhia


On Thu, Dec 15, 2022 at 4:26 PM Spencer Graves <
spencer.graves at effectivedefense.org> wrote:

            

  
  
11 days later
#
On 15/12/2022 10:25 a.m., Spencer Graves wrote:

            
I've been reminded today why this is such a bad idea.  I'm preparing an 
update of the rgl package.  As I attempt to be a good citizen, I'm doing 
comparison checks of all of the 300 packages that use it, to see if my 
updates break anything.  I am not CRAN, so if any of those packages 
follow Spencer's scheme, I'll have to wait even longer for the run to 
finish.  (Currently it's about 15 hours predicted time on my slow laptop.)

It's more considerate to have a function that defaults to not running 
slow tests unless you specifically ask for them to be run.  The testthat 
package does this using the NOT_CRAN environment variable:  it assumes 
everyone is CRAN unless they specifically declare themselves not to be. 
Strange choice of name, but it achieves what I'm suggesting.  The 
tinytest package has an "at_home" argument to some functions; when 
testing a whole package, it defaults to FALSE, which is also the 
considerate choice.
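
A minimal sketch of such an opt-in toggle in base R. The helper names and the environment variable `MYPKG_RUN_SLOW` are made up for illustration; this is not how testthat or tinytest implement it:

```r
# Hypothetical helper: slow code runs only if the caller has opted in.
# MYPKG_RUN_SLOW is a made-up variable name for illustration.
run_slow_tests <- function(var = "MYPKG_RUN_SLOW") {
  identical(tolower(Sys.getenv(var, unset = "")), "true")
}

slow_check <- function() {
  if (!run_slow_tests()) {
    return("skipped")
  }
  Sys.sleep(0.1)  # stand-in for an expensive computation
  "ran"
}

Sys.unsetenv("MYPKG_RUN_SLOW")
default_result <- slow_check()   # nobody opted in, so the slow part is skipped

Sys.setenv(MYPKG_RUN_SLOW = "true")
opt_in_result <- slow_check()    # explicitly requested, so it runs
Sys.unsetenv("MYPKG_RUN_SLOW")
```

The point of the design is the default: anyone who has not set the variable, including people running reverse dependency checks, gets the fast path.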

Duncan Murdoch
#
On 27 December 2022 at 13:27, Duncan Murdoch wrote:
| On 15/12/2022 10:25 a.m., Spencer Graves wrote:
| > 	  I know that some on this list do not like this construct, but it has
| > helped me manage this problem for several years.  NOTE:  This CRAN
| > function is NOT maintained by anyone on CRAN, so it is NOT guaranteed to
| > work.  However, the "sos" package is currently on CRAN, and I have not
| > seen an email asking me to avoid it ;-)
| 
| I've been reminded today why this is such a bad idea.  I'm preparing an 
| update of the rgl package.  As I attempt to be a good citizen, I'm doing 
| comparison checks of all of the 300 packages that use it, to see if my 
| updates break anything.  I am not CRAN, so if any of those packages 
| follow Spencer's scheme, I'll have to wait even longer for the run to 
| finish.  (Currently it's about 15 hours predicted time on my slow laptop.)

100% agree. This never struck me as a good idea, no matter how widespread the
practice became.

| It's more considerate to have a function that defaults to not running 
| slow tests unless you specifically ask for them to be run.  The testthat 
| package does this using the NOT_CRAN environment variable:  it assumes 
| everyone is CRAN unless they specifically declare themselves not to be. 
| Strange choice of name, but it achieves what I'm suggesting.  The 
| tinytest package has an "at_home" argument to some functions; when 
| testing a whole package, it defaults to FALSE, which is also the 
| considerate choice.

And its documentation mentions how some packages use their own toggle variable
to turn tests on and off; Rcpp is one of those.

Second email:
On 28 December 2022 at 08:09, Duncan Murdoch wrote:
| On 28/12/2022 7:06 a.m., Daniel Kelley wrote:
| >  > I've been reminded today why this is such a bad idea.  I'm preparing an
| > update of the rgl package.  As I attempt to be a good citizen, I'm doing
| > comparison checks of all of the 300 packages that use it, to see if my
| > updates break anything.
| > 
| > Hi Duncan, I'm wondering whether you are checking against CRAN sources 
| > or github/gitlab/etc sources.
| 
| CRAN sources only.  Unreleased packages may have lots of errors in them, 
| and it's up to their authors to test them.  Users shouldn't rely on 
| those packages, but they might rely on CRAN packages, and I don't want 
| to cause extra headaches for them.

Same for Rcpp, RcppArmadillo, BH, ...  I wrote (and use) a simple (and still
fairly raw) package 'prrd' for 'parallel running of reverse depends' to do
this (across a job queue to which one can add runners as clients), but I do
this via a courtesy shell account on a machine so slow that Rcpp and its
2600+ packages now take 48 hours for one run. Oh well. 

| > I ask because the scheme I use in my packages is that I have lots of 
| > tests that only get done if a certain local file (or directory) is 
| > present.  Those files/directories are listed in .Rbuildignore, so the 
| > CRAN tests skip those tests.  Some of these files/directories are stored 
| > in the same github repo as the package code, and some are not.  The 
| > latter are because the datasets are huge, or because they are private 
| > data sent to me by users to test new features.  (Often users share data 
| > for testing, as I develop new code. They need to keep the data private 
| > until they publish papers and theses.)

I do exactly that (keeping some test files out of the .tar.gz) for some other
packages that have local tests. Running all tests is as simple as this in tinytest
    Rscript -e 'tinytest::run_test_dir("inst/tinytest")'
and the package gets a shorter default run.
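
The file-presence scheme described in the quoted message could be sketched like this in base R. The path and the helper name `run_local_tests` are hypothetical, and a temporary directory stands in for a directory that would be listed in .Rbuildignore:

```r
# Hypothetical location of data that is listed in .Rbuildignore and therefore
# absent from the built tarball; the name is made up for this sketch.
local_data <- file.path(tempdir(), "local_data", "big_dataset.rds")

run_local_tests <- function(path) {
  if (!file.exists(path)) {
    return("skipped: local data not present")
  }
  dat <- readRDS(path)
  stopifnot(is.data.frame(dat), nrow(dat) > 0)
  "ran"
}

unlink(local_data)  # make sure the file is absent to begin with
without_data <- run_local_tests(local_data)

# Simulate the developer's machine, where the data file exists.
dir.create(dirname(local_data), showWarnings = FALSE, recursive = TRUE)
saveRDS(mtcars, local_data)
with_data <- run_local_tests(local_data)
unlink(local_data)
```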

| > I know rgl does not depend on oce, so my question is not exactly 
| > pertinent to you, but I would appreciate it if you could comment on 
| > whether what I am doing seems to be useful.  To be honest, I am a bit 
| > lost in how folks are supposed to handle slow tests; I'd not be happy to 
| > report how many times I've done web searches to find out just what 
| > \dontrun and \donttest do, and when I should use them or should do 
| > something else.
| 
| Your method sounds good.  You choose to run your tests, you don't force 
| anyone else to run them.

Agree! I have not had much luck with \dontrun and \donttest, which still get
overruled. The tinytest package is very good here because "tests files are
just R scripts" so I can condition as I please and `exit_file()` as
needed, which I use quite a lot to skip for various reasons (time, platform,
API additions unavailable in tests with older libraries etc).

Cheers, Dirk