[R-pkg-devel] Submission to CRAN when package needs personal data (API key)

On 7 Sep 2018, at 02:16, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:

On 06/09/2018 10:32 AM, Hadley Wickham wrote:
On Wed, Sep 5, 2018 at 3:03 PM Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
On 05/09/2018 2:20 PM, Henrik Bengtsson wrote:
I take a complementary approach; I condition on, my home-made,
R_TEST_ALL variable.  Effectively, I do:

if (as.logical(Sys.getenv("R_TEST_ALL", "FALSE"))) {
    ...
}

and set R_TEST_ALL=TRUE when I want to run that part of the test.  You
can also imagine refined versions of this, e.g. R_TEST_SETS=foo,bar
and test scripts with:

if ("foo" %in% strsplit(Sys.getenv("R_TEST_SETS"), split="[, ]+")[[1]]) {
    ...
}

That avoids making assumptions on where the tests are submitted/run,
may it be CRAN, Bioconductor, Travis CI, ...
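
The two conditions above could be folded into one helper. This is only a sketch: the name test_set_enabled() is hypothetical and not from any package, and the isTRUE() guard is an addition so that an unparsable R_TEST_ALL value does not raise an error.

```r
# Hypothetical helper (not part of any package): TRUE if `set` is listed in
# the comma/space-separated variable `var`, or if R_TEST_ALL is true.
test_set_enabled <- function(set, var = "R_TEST_SETS") {
  # isTRUE() turns an NA from as.logical() (e.g. R_TEST_ALL=garbage) into FALSE
  if (isTRUE(as.logical(Sys.getenv("R_TEST_ALL", "FALSE")))) {
    return(TRUE)
  }
  # An unset variable yields "", which splits to character(0)
  sets <- strsplit(Sys.getenv(var), split = "[, ]+")[[1]]
  set %in% sets
}
```

A test script would then wrap the optional part in if (test_set_enabled("foo")) { ... }.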
This is the right way to do it.
I would like to gently push back on this assertion: if CRAN set an
environment variable we would have one single convention that all
packages could rely on.
When packages delete tests just for CRAN, the quality of the repository suffers.
Absolutely. But at the moment, in some cases one is forced to use workarounds if tests **cannot** be run on CRAN (API keys, computing times, ...) but should be run locally. It would make much more sense if there were a standardised way of dealing with this.
Users should be able to check an install by running the tests that passed on CRAN and seeing them pass on their system as well.
Also agreed - so if the user sets the environment variable CRAN for the test, the CRAN tests are executed (as today); if it is not set, the extended tests are executed.
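
A minimal sketch of that idea, assuming a convention that does not exist today: the check environment would set a variable named CRAN to any non-empty value (e.g. CRAN=1). The function name select_test_suite() is hypothetical.

```r
# Assumption: a hypothetical convention in which CRAN (or a user who wants
# CRAN-like behaviour) sets the environment variable CRAN to a non-empty value.
select_test_suite <- function() {
  if (nzchar(Sys.getenv("CRAN"))) {
    "cran"       # only the tests that are expected to pass on CRAN
  } else {
    "extended"   # additionally run tests needing API keys, long runtimes, ...
  }
}
```

A test file could then guard its extra tests with if (select_test_suite() == "extended") { ... }.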
The current system relies on each package
author evolving their own solution. This makes life difficult when you
are running local reverse dependency checks: there is no way to
systematically assert that you want to run tests in a way as similar
as possible to CRAN.
Most packages don't need to evolve anything:  the CRAN tests are sufficient.
But there seems to be a need to exclude certain tests, due to various reasons.

I know that the CRAN maintainers already have a very large workload,
and I hate to add to it, but setting CRAN=1 in a few profile files
doesn't seem excessively burdensome.
It would be easy to do that, but then CRAN wouldn't be testing the same things that users would test.
See my comment above.
A user might see a test failure that didn't happen on CRAN, and suspect that there was something wrong with their install, when in fact it was an author trying to hide a deficiency in their package from CRAN.
Only if they execute the extended tests. I can still hide deficiencies in my package by not applying a specific test or doctoring the result, if that is my intention. But the extended tests could be used to test additional setup options, which cannot be tested on CRAN.

This discussion has come up before.  If you want to submit to CRAN, you
should include tests that satisfy their requests.  If you want even more
tests, there are several ways to add them in addition to the CRAN tests.
  Henrik's is one, "R CMD check --test-dir=myCustomTests" is another.

Rainer's package is unusual, in that from his description it can't
really work unless the user obtains an API key.  There are other
packages like that, and those cases need manual handling by CRAN:  they
don't really run full tests by default.  But the vast majority of
packages should be able to live within the CRAN guidelines.
10 years ago, I would have definitely supported this statement. But I
am not sure it is still correct today, as there are now many packages
that require a connection to web API to work (or depend on a package
that uses an API). Additionally, CRAN only allows a limited amount of
compute time for each check, so there are often longer tests that you
want to run locally but not on CRAN. CRAN is a specialised testing
service and it does have different constraints to your local machine,
travis, and bioconductor.
A quick search of the CRAN mirror on github
(https://github.com/search?q=org%3Acran+skip_on_cran&type=Code)
reveals that there are ~2700 tests that use testthat::skip_on_cran().
This is obviously an underestimate of the total number of tests
skipped on CRAN, as many packages don't use testthat, or use an
alternative technique to avoid running code on CRAN.
That's not so obviously an underestimate, as packages that use that technique use it many times, not just once per package.  (A sample I looked at averaged 15 calls per package, but I don't know if that's unbiased.)

But in any case, the skip_on_cran() function implements a version of Henrik's approach.  The name of the function is misleading: it doesn't attempt to distinguish between CRAN and a regular user.
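
To illustrate the point, here is a base-R imitation of the condition behind skip_on_cran() as testthat implemented it around the time of this thread, to the best of my understanding: the test is skipped unless the environment variable NOT_CRAN is exactly "true". The name not_cran() is hypothetical and not part of testthat, and newer testthat versions may apply additional rules.

```r
# Imitation of the opt-in check behind testthat::skip_on_cran() (assumption:
# behaviour as of 2018). Nothing here detects CRAN itself; any machine
# without NOT_CRAN=true is treated as if it were CRAN.
not_cran <- function() {
  identical(Sys.getenv("NOT_CRAN"), "true")
}
```

So a regular user who runs the tests without setting NOT_CRAN=true gets the same reduced set of tests that CRAN does.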
I would guess because it can't. If there were a standardised way of identifying that the test is run on CRAN, I would use it immediately.

Cheers,

Rainer
Duncan Murdoch

______________________________________________
R-package-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel
--
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany)

University of Zürich

Cell:       +41 (0)78 630 66 57
email:      Rainer at krugs.de
Skype:      RMkrug

PGP: 0x0F52F982
