R CMD check and CRAN's Rust policy

23 messages · Mossa Merhi Reimert, Duncan Murdoch, Hiroaki Yutani +6 more

#
Hello everyone!

I'm Mossa, one of the maintainers of extendr, a project that automatically generates bindings for
Rust code, for use in R packages.

I'm writing because R 4.4.3 was just released and there has been no
follow-up on an issue that is important to us. The issue was discussed on r-devel here:
https://stat.ethz.ch/pipermail/r-devel/2024-October/083666.html

A community member has suggested a patch here: https://github.com/r-devel/r-svn/pull/182, and we have also attempted to raise it on
Bugzilla: https://bugs.r-project.org/show_bug.cgi?id=18806

TL;DR: The default `R CMD check` runs additional CRAN-specific checks for Rust,
instead of keeping them behind the `--as-cran` flag.

There is growing interest in Rust within the R community.
More generally, Rust is becoming widely adopted within the Python community (including the scientific part of that community). It is time to deal with the
pain points of using Rust in R.

Therefore, I would kindly ask that we have a dialogue on how to remedy the issue above first, and then on how we may deal with other issues going forward. There are challenges embedded in R itself, and the current CRAN policy for Rust is prohibitive.



Mossa Merhi Reimert
Postdoctoral Researcher

Københavns Universitet
Department of Veterinary and Animal Sciences
Animal Welfare and Disease Control
Grønnegårdsvej 8
1870 Frederiksberg C
Denmark

+45 35324135
mossa at sund.ku.dk
#
Mossa,

the issue you cite is lacking any pertinent information and it's not even clear why it should be an issue. The check is perfectly justified, it just reports whether a package using rust declares this correctly and where it downloads 3rd party content - something that is important to R users in general and not related to CRAN. I don't see how any of this is "prohibitive" it just calls out what the package is already doing.

As discussed before, my hope was that the "R"ust community will mature enough to work on proper support. It is not clear that it happened yet, but once it does it would make sense to talk about support just as we have for C, C++ and Java, so certainly that should be the right discussion. However, it will have to start with some thinking and a proposal on how the associated issues (compiler support, versioning, dependency sources etc.) are to be addressed, as opposed to making random demands. All this has nothing to do with CRAN so the issue you mention seems irrelevant to the progress. Also I'd like to know what are the "challenges embedded in R itself".

Cheers,
Simon
#
Dear Simon Urbanek,

There has been very little engagement with the issue I referred to. If it was decided that this "check" ought to be part of the default checks for R,
then that could have been communicated to us, either on bugs.r-project.org or on the proposed patch. Before we talk about anything else,
it does seem very strange that we cannot get a reasonable dialogue going.

I would like to say that the R/Rust community has grown substantially. To my knowledge, there are three bindings projects: extendr, savvy, and roxido.
There are now also many Rust-based packages on CRAN; see this recently compiled list: https://github.com/nanxstats/r-rust-pkgs.
There is also a proof of concept, https://github.com/r-rust/hellorust, that integrates `cargo`, Rust's official build system, with R's package build system,
and https://github.com/extendr/hellorustc, which showcases how the Rust compiler could be linked directly with R's package system.

Let me say that the current R CMD check is not meant to be "helpful". When a package is built, `cargo` tells the user "Downloading crates".
Thus, this information is already conveyed to the user.

Personally, I do wish we could debate this requirement further. I do not believe that having R packages on CRAN vendor Rust dependencies
is a good policy. Download statistics are a success metric for both R packages and Rust packages. By insisting on vendoring, and thus
side-stepping `cargo` / crates.io, we are robbing upstream authors of their download numbers. I do not think such a policy is honourable.

While C/C++ do not have official package repositories, it could be considered fair game to have CRAN act as a pseudo package manager for C/C++ libraries.
I'm not going to argue for or against this part.

There are many objections from the CRAN side to all things related to Rust. I don't want to open multiple topics in the same thread,
but there is plenty to bring up, and I had hoped we could talk this little issue through before embarking on a larger discussion.
I do not appreciate the "random demands" comment, as this is not a demand, nor is it random.
I have asked my end of the community for suggestions
to compile a larger proposal, but I was afraid that this would be perceived as a big, bulky demand.

Rust is not C/C++/Java, and the support for Rust cannot look like the support for these languages.



On Sunday, 2 March 2025 at 00:39, Simon Urbanek <simon.urbanek at R-project.org> wrote:

  
  
#
You seem to be taking a confrontational tone, which isn't likely to 
encourage a reasonable dialogue.

I've looked for other messages on this, and didn't see any besides this 
one explaining why including check_rust() in the checks is a problem. 
The problem you talk about here is that it encourages vendoring, which 
makes it harder for package authors to count downloads.

To be honest, that doesn't seem like a very serious problem.  I assume 
the packages ("crates") we are talking about are open source, so this is 
entirely in the spirit of how they are allowed to be distributed.

If they aren't open source, then users of those packages should be 
warned about that, and a check failure is a good way to do that.

So you need to explain why it is important to be able to download and 
install software and not be warned about it.

I am not in R Core or CRAN, but I can suggest why it is better to 
include source in the package:  it makes the use of that package more 
reliable in the future.  It's not uncommon to run an R computation that 
was written a few years ago.  Sometimes libraries or R have changed, and 
a user will need to go back to a previous version to reproduce the 
calculation.  Being able to rebuild a system as it would have 
been back then is important.

Is that possible if the package needs to make a download?  The download 
site that worked a few years ago may no longer exist.  If the site 
exists, the code versions there may be different.

Those are some of the issues that Simon was alluding to.

Duncan Murdoch
On 2025-03-02 5:45 a.m., Mossa Merhi Reimert via R-devel wrote:
#
I agree with you. It seems no one explained what problem it causes to them.
In my understanding (disclaimer: I haven't hit this by myself yet), the
problem is that the "Downloading crates ..." log raises a warning, which
makes the CI check fail. Although it's true that none of this is
"prohibitive," it's just inconvenient.

Maybe it's possible to make it a NOTE instead of a WARNING at least?

I'm suggesting so because otherwise this will end up encouraging the
package authors to hide these logs as a workaround to avoid the CI failure.
As it's very easy, I'm personally fine with the status quo, but if it
becomes a common practice, it makes it harder for the CRAN maintainers to
investigate the installation logs.

Best,
Yutani


On Sun, 2 Mar 2025 at 23:49, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:

  
  
#
Well this has surely veered off course!

As the one who filed the BugZilla report, I'd like to redirect the
conversation and provide further context.

The question should be *"how do we get a dialogue started on this Bugzilla
issue before the next minor release of R?"*

The current check for whether a Rust-based R package downloads external
dependencies works by searching
the output logs for the presence of "Downloading crates". This is an
entirely fine requirement for
CRAN; however, because it is an error, packages distributed
through other repositories
fail R CMD check.

Folks who use R-universe or PPM or some mysterious third thing may not
share the same philosophy as
CRAN and prefer the convenience of fetching the dependencies at compile
time and not vendoring them.

An alternative would be for the check to be optionally skipped or become a
NOTE when the CRAN
flag is not set and an ERROR otherwise. Skipping this CRAN check is as easy
as adding `--quiet`
or setting an environment variable, but that is against the spirit of the
check.

Ideally, the check can remain, but scoped appropriately.
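
The scoping proposed above can be sketched in a few lines of shell. This is only an illustration of the proposal, not how R implements the check: the function name `rust_download_severity` and its `true`/`false` flag argument are hypothetical, and only the "Downloading crates" string comes from this thread.

```shell
# Hypothetical sketch: classify an install log the way the proposal describes.
# Usage: rust_download_severity <install-log> <as_cran: true|false>
rust_download_severity() {
    log_file=$1
    as_cran=$2
    # Same idea as the existing check: a fixed-string search of the log.
    if grep -qF "Downloading crates" "$log_file"; then
        if [ "$as_cran" = "true" ]; then
            # Strict severity as worded in this message; a later reply in the
            # thread reports that current R-devel issues a WARNING instead.
            echo "ERROR"
        else
            echo "NOTE"
        fi
    else
        echo "OK"
    fi
}
```

Calling `rust_download_severity install.log false` on a log that mentions downloads would print `NOTE`, so a repository other than CRAN would still see the information without failing the check.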


On Sun, Mar 2, 2025 at 6:49 AM Duncan Murdoch <murdoch.duncan at gmail.com>
wrote:

  
  
#
On 2025-03-02 11:03 a.m., Josiah Parry wrote:
Isn't this exactly that dialogue?
I think you misunderstood me.  CRAN shares the view I gave that you 
should be able to run old code to reproduce old results, but they aren't 
the only ones.  That's always been a goal of R.
If it is that easy to skip the check, then I really don't see the issue. 
  Just ask the repository where you want to put your package to put that 
option or environment variable in place, and there's no longer a problem.

Duncan Murdoch
#
I, like Duncan, am just following along here. I think there might be 
two distinct questions which it would be useful to keep distinct:

  * how to silence the rust-check if desired?

   rather than debating whether the rust-check should be always-on, 
on-for-CRAN-only, etc., would it provide for useful flexibility to add 
an environment variable that enables/disables this functionality?  There 
are already 168 of these environment variables, how much would one more 
cost?

   I'm not sure how adding an environment variable to allow easier 
user/alternate-repository control of the check is "against the spirit of 
the check" ...

   All the existing check-regulating env variables ...

# count the distinct _R_CHECK_* environment variables read in the tools package
cd src/library/tools/R
grep 'Sys.getenv("_R_CHECK' * | sed -e 's/^.*Sys.getenv(//' | \
  sed -e 's/[,)].*//' | sort | uniq | wc


   * should CRAN allow Rust crates to be downloaded?

   This is a much more fundamental policy decision, which I have no 
opinion about.

   cheers
    Ben Bolker
On 2025-03-02 12:21 p.m., Duncan Murdoch wrote:

  
    
#
On 2025-03-02 1:09 p.m., Ben Bolker wrote:
I may have misunderstood Josiah.  I thought his message said that it is 
already easy to silence the check, by stopping the code from issuing the 
message the check is looking for.

Presumably the package shouldn't do that, but if there's an environment 
variable that can be set to do it, then the repository or user can 
choose to do it, so there's no need for R to add another environment 
variable.

BTW, as far as I can see current R-devel doesn't issue an error, it just 
issues warnings about two issues:

  - the package is downloading crates
  - the rustc compiler doesn't report a version number

Duncan Murdoch
#
Mossa,
I don't see anything from you on this list - your first engagement was yesterday. I have no idea what you refer to as "us" and what makes you think you should have been notified if no one heard from you before. A start of any engagement is to start communication, so here we are, perhaps not the most fortunate way to start off, but we have a discussion and there is hope.
I think this is part of the problem: there is no systematic Rust support, so each package author does something different. As much as it is nice to have the freedom to have many different implementations of the same thing, I would argue that in cases like language support it makes more sense to combine the effort into one solution (after everyone has experimented and gained enough experience) that is easy to manage and is well maintained. This is what happened to most mature languages such as C++, Java and Python. That would avoid the "hacks" in place today (I'm referring to the check).
You are jumping issues here: as I said before, this has nothing to do with CRAN. So let us first take CRAN out of the picture and talk about the check. The check does two things: a) it checks that the package correctly declares its Rust dependency, and b) it checks whether the package uses 3rd party dynamic downloads. Since the "R"ust community has yet to come up with any systematic Rust support, both seem very reasonable checks. We want to know whether a package requires Rust by checking the DESCRIPTION file alone, so the user can make an informed decision about whether they want to (or even can) use the package. It is also important to know whether a package accesses 3rd party resources online. Due to rising security threats, it is increasingly common not to allow analytics machines to have access to the Internet, so that sensitive data cannot be leaked. It also raises legal questions, as the resulting software may not adhere to the license of the package and there is no guarantee that the user will still have the license. Moreover, reproducibility is very important to R users, so it should be possible to reproduce the installation, which excludes 3rd party distributed systems that don't have any such guarantees unless they provide a way to fully vendor dependencies. So, in short, there are many reasons why the user should know about the things checked so they can make informed decisions. Whether this is the best way to signal that is up for debate.

Your argument is that the important reason is a popularity contest based on download statistics. I would argue that it is a very weak reason, since the vast majority of R users do not use source installations to install packages, so there is no "robbing" of upstream authors; the statistics don't reflect real usage anyway.

If you want to propose improvements to the check, I'm sure it would be appreciated, but putting it behind --as-cran doesn't seem the right approach, nor does that solve the problem in any way, as the issues are not CRAN-specific. I would think that a proposal to declare Rust requirements (incl. toolchain) and to have a declared way to vendor dependencies, addressing off-line install, licensing and security issues uniformly for Rust packages, would be steps in the right direction.
Why not? They all require compilers, ways to deal with dependencies and produce binaries - so does Rust. It's just one of many similar languages. The key is to have proper support instead of having each package deal with the complexities alone.

Cheers,
Simon
28 days later
#
Following up with this as I address the new R-devel "Compiled code should
not call entry points which might terminate R" WARNING and this issue has
reared its head again.

Would a path forward be an environment variable similar
to _R_CHECK_CRAN_INCOMING_ to skip this check primarily for GitHub Actions
and CI?

Or, alternatively, if this could be a NOTE when the `--as-cran` flag isn't
set but a WARNING when it is?

Re-vendoring dependencies each time they change during the development
lifecycle is quite a burden. However, vendoring once prior to publishing makes
good sense.

Is there a balance we can strike here to lower development friction but
also ensure the robust package installation requirements when expected?

Using




On Sun, Mar 2, 2025 at 11:42 AM Duncan Murdoch <murdoch.duncan at gmail.com>
wrote:

  
  
#
On Mon, 31 Mar 2025, at 4:50 PM, Josiah Parry wrote:
If this is primarily about CI then can you tweak your scripts not to fail on that particular warning? If you are using the r-lib/actions then I believe they utilise rcmdcheck (https://cran.r-project.org/package=rcmdcheck) which does give an output object you can work with.

Tim
#
On 2025-03-31 11:50 a.m., Josiah Parry wrote:
The "Compiled code should not call entry points which might terminate R" 
isn't a new warning.  I think the last change to it was made in 2022.

Maybe your code, or code in one of the libraries you use, has changed?

Duncan Murdoch
#
Duncan, the change to symbol checking was introduced on March 22nd; see
https://bugs.r-project.org/show_bug.cgi?id=18789 and
https://developer.r-project.org/blosxom.cgi/R-devel/NEWS/2025/03/22#n2025-03-22.
But that is unrelated.

To Tim's comment: the check is a simple grep of the installation log for
"Downloading crates." This could be easily circumvented on CRAN and locally
by suppressing stdout/err. But that would be adversarial and I would like
to adhere to the intent of the check.



On Mon, Mar 31, 2025 at 9:23 AM Duncan Murdoch <murdoch.duncan at gmail.com>
wrote:

  
  
#
On 2025-03-31 12:41 p.m., Josiah Parry wrote:
Sorry, I missed that.
I think Tim was suggesting that you modify your Github action to ignore 
this particular warning.  The warning would still appear, but it 
wouldn't cause a check failure.

Duncan Murdoch
#
Josiah - I do sympathise but, irrespective of this particular check, this highlights an inflexibility in your CI setup to handle different warnings as you wish. It is not adversarial to not fail on a warning produced by R CMD check, within your own CI.
#
On 2025-03-31 1:04 p.m., Duncan Murdoch wrote:
At a very quick look, I don't see an easy way to do that (but I am 
admittedly really bad at doing stuff with Github actions). Maybe longer 
term, but it feels like the best way to do this would be to request a 
feature in `rcmdcheck` that allowed the user to request ignoring 
specific warnings (e.g. specified by regexp?), then expose that feature 
in the r-check-package action (or whatever it's called ...)

  
    
#
I don't see an easy way, but I think this is an approach:

Configure r-lib/actions/check-r-package to not fail on warnings, only on 
errors, and to upload the result of the rcmdcheck() run.  I think it 
will contain all errors, warnings, and notes.  There are options 
"error-on" which should be set to "error", and "upload-result", which 
should be set to "true".  So this is the easy part.

Then examine those results, ignore the warnings you don't care about, 
and trigger on warnings you do care about.  I have never written a 
github action, but this one sounds possible.
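
A minimal sketch of that second step, in shell, might look like the following. It assumes the warnings from the uploaded check result have already been written to a text file with one warning per line; that export step is not shown and depends on your setup. The function name `gate_on_warnings` is hypothetical.

```shell
# Hypothetical CI gate: fail only on warnings we have not chosen to ignore.
# Assumes <warnings-file> holds one warning per line.
# Usage: gate_on_warnings <warnings-file> <pattern-to-ignore>
gate_on_warnings() {
    warnings_file=$1
    ignore_pattern=$2
    # -F: treat the pattern as a fixed string; -v: keep non-matching lines.
    # grep exits non-zero when nothing is left, hence the "|| true".
    remaining=$(grep -vF -- "$ignore_pattern" "$warnings_file" || true)
    if [ -n "$remaining" ]; then
        printf 'Unexpected warnings:\n%s\n' "$remaining" >&2
        return 1
    fi
    return 0
}
```

The warning you ignore still appears in the uploaded results; it simply no longer decides whether the job fails.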

Duncan Murdoch
On 2025-03-31 1:48 p.m., Ben Bolker wrote:
#
I took a more extreme version of this approach for a project that keeps many R packages in a monorepo and checks them all at once, where we do a lot of saying "let's ignore this warning _in this package_ until someone has a chance to fix it properly, but still fail the build if it shows up in _other packages_".

The key idea in our approach is for each package you check to cache a previous check result that contains the warning(s) you want to ignore, then compare the current check against it and have your action fail only on _newly added_ warnings.

My implementation[1] is brittle and fussy and could be simplified a lot if you're only checking one package at a time, but may be a useful starting point.

[1] https://github.com/PecanProject/pecan/blob/develop/scripts/check_with_errors.R
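
For a single package, the "fail only on newly added warnings" idea can be sketched in shell as below. This is not the linked script: the function names and the one-warning-per-line file format are assumptions made for illustration.

```shell
# Hypothetical sketch of comparing a check against a cached baseline.
# Both files hold one warning per line; the baseline is the cached result of
# an earlier check whose warnings you have decided to tolerate for now.
# Usage: new_warnings <baseline-file> <current-file>
new_warnings() {
    baseline_file=$1
    current_file=$2
    # -f: read patterns from the baseline, -F: as fixed strings,
    # -x: match whole lines only, -v: print lines absent from the baseline.
    grep -v -x -F -f "$baseline_file" "$current_file" || true
}

# Returns non-zero when the current run has warnings not in the baseline.
# Usage: check_against_baseline <baseline-file> <current-file>
check_against_baseline() {
    added=$(new_warnings "$1" "$2")
    if [ -n "$added" ]; then
        printf 'New warnings:\n%s\n' "$added" >&2
        return 1
    fi
}
```

Whole-line matching keeps the comparison strict: a warning whose text changes even slightly counts as new, which is one reason this kind of approach can be fussy in practice.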
#
Just for fun I forked rcmdcheck and added arguments to it to allow 
particular messages to be changed in severity.

For example, if the WARNING message says something which matches the 
regexp "Compiled code should not call entry points which might terminate 
R" you could run

   rcmdcheck::rcmdcheck(".", demote = list(warnings =
     "Compiled code should not call entry points which might terminate R"))

and the warning will be counted as a NOTE.  The decision about whether 
to signal an error from the run will be based on the value after demotion.

  I haven't done anything with the Github action, but users can play 
with this fork if they like.  It can be installed using

   remotes::install_github("dmurdoch/rcmdcheck")

You can install custom builds in a Github action fairly easily, but it's 
hard to add a new argument to a call deep within the action script.  A 
simpler approach would be to fork my fork and set the default value for 
"demote" to whatever you want, then install your own fork during the 
Github action.

Comments are welcome.

Duncan Murdoch
On 2025-03-31 1:48 p.m., Ben Bolker wrote:
#
It is also pretty straightforward to roll your own actions and / or use
different, simpler YAML setups. I still use a 'rolled forward and maintained
by me now version' of the shell script many of us started with at Travis CI.
It works, is portable across multiple CI backend, and is still a shell script
and easy to customize. I eventually added basic actions that fetch it / do
the setup.  Would be easy to deploy a modified rcmdcheck with it too. It only
covers linux (where it works rather well with r2u for speed, ease of use and
reliability) and macOS, both arm64 and x86_64. See [1] if interested.

Dirk

[1] https://eddelbuettel.github.io/r-ci/
#
On 2025-03-31 4:50 p.m., Duncan Murdoch wrote:
Sorry, that should be

   remotes::install_github("dmurdoch/rcmdcheck@demotions")

Duncan Murdoch
#
One more change to

    remotes::install_github("dmurdoch/rcmdcheck@demotions")

Now the changes in message severity have to be coded into the 
DESCRIPTION file of the package being checked.  For example, add this line:

Config/rcmdcheck/demote/warnings:
   Compiled code should not call entry points which might terminate

The message being matched has to appear all on one line, but multiple 
patterns can be given on separate lines.  Embedded newlines are not 
allowed.  A simple fixed grep is used to match it to the check log.
See the ?rcmdcheck help page for more details.
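
To illustrate the matching rule only (this is not the code in the fork), here is a shell sketch that applies fixed-string patterns, one per line, to a list of warnings and reports how many would be demoted. The function name `demote_summary`, the file layout, and the output format are all hypothetical.

```shell
# Hypothetical illustration of fixed-string demotion matching.
# <patterns-file>: one fixed pattern per line (no blank lines, since with
#                  grep -F an empty pattern matches every line).
# <warnings-file>: one warning per line.
# Usage: demote_summary <patterns-file> <warnings-file>
demote_summary() {
    patterns_file=$1
    warnings_file=$2
    # grep -c prints the number of selected lines; it exits non-zero when
    # that number is 0, hence the "|| true".
    demoted=$(grep -c -F -f "$patterns_file" "$warnings_file" || true)
    kept=$(grep -c -v -F -f "$patterns_file" "$warnings_file" || true)
    echo "warnings=$kept demoted_to_note=$demoted"
}
```

Because the match is a plain substring test, a pattern that stops short of the end of the message, like the one in the DESCRIPTION example above, still matches the full warning text.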

Duncan Murdoch
On 2025-03-31 7:20 p.m., Duncan Murdoch wrote: