
Respecting custom repositories files in interactive/batch R sessions

7 messages · Gabriel Becker, Dirk Eddelbuettel, Kurt Hornik

#
Hi all,

A company I work with mirrors CRAN internally behind its firewall for
security (and reproducibility/consistency/etc) reasons. In that case, we
would like all R processes (across all the R CMD *, as well as interactive
and batch sessions) to automatically hit our cran mirror instead of
prompting the user to select a mirror or failing to contact CRAN at all
(during check).

I recently found out about the ${R_HOME}/etc/repositories file (after
multiple years owning the R installations of a sizable corporate research
outfit in my previous job).

Contrary to my expectations, however, the CRAN entry found in the
repositories file is not respected in interactive or batch sessions.

With the value "https://fakeyfakeyfake" for the CRAN URL, I get this
behavior in an interactive session in Rdevel built from trunk:

R Under development (unstable) (2022-09-14 r82853) -- "Unsuffered Consequences"


<snip>
--- Please select a CRAN mirror for use in this session ---

Secure CRAN mirrors


 <snip>


Selection: 0

Error in contrib.url(repos, type) :
  trying to use CRAN without setting a mirror
<snip>

[11] "menu_name\tURL\tdefault\tsource\twin.binary\tmac.binary"


[12] "CRAN\tCRAN\t\"https://fakeyfakeyfake\"\tTRUE\tTRUE\tTRUE\tTRUE"
  <snip>


R CMD check, on the other hand, *does* use the entry in repositories out
of the box:


gabrielbecker$ Rdevel CMD check switchr_0.14.5.tar.gz

[1]
"/Users/gabrielbecker/local/Rdevelraw/R.framework/Versions/4.3/Resources/library"

* using log directory ‘/Users/gabrielbecker/gabe/checkedout/switchr.Rcheck’

* using R Under development (unstable) (2022-09-14 r82853)

* using platform: x86_64-apple-darwin21.5.0 (64-bit)

* using session charset: UTF-8

* checking for file ‘switchr/DESCRIPTION’ ... OK

* checking extension type ... Package

* this is package ‘switchr’ version ‘0.14.5’

* checking package namespace information ... OK

* checking package dependencies ...Warning: unable to access index for
repository https://fakeyfakeyfake/src/contrib:

  cannot open URL 'https://fakeyfakeyfake/src/contrib/PACKAGES'


This behavior comes from the fact that the repos option is unilaterally
set to c(CRAN = "@CRAN@") in utils:::.onLoad.


I propose instead that this should be set to either a) the CRAN entry in
the repositories file, or even better imho, b) the set of all repos marked as
default in the repositories file, with the caveat that it is set to @CRAN@ in
the case there is no CRAN entry, though comments around the source code in
tools suggest other things will break in that case anyway.
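
A minimal sketch of what option (b) could compute, assuming the repositories file is read as a tab-delimited table with utils::read.delim. The helper name default_repos_from_file and the "Extra" entry are hypothetical, and the temporary file stands in for ${R_HOME}/etc/repositories; this is not the patch itself.

```r
# Hypothetical helper: return the repositories marked as default in a
# repositories-format file, as a named character vector suitable for
# options(repos = ...). Falls back to CRAN = "@CRAN@" if none are marked.
default_repos_from_file <- function(path) {
    db <- utils::read.delim(path, header = TRUE, comment.char = "#",
                            stringsAsFactors = FALSE)
    db <- db[db$default %in% TRUE, , drop = FALSE]
    if (nrow(db) == 0L) return(c(CRAN = "@CRAN@"))
    stats::setNames(db$URL, rownames(db))
}

# Stand-in file modeled on the CRAN entry quoted above, plus one
# made-up non-default entry.
path <- tempfile("repositories")
writeLines(c(
    "menu_name\tURL\tdefault\tsource\twin.binary\tmac.binary",
    "CRAN\tCRAN\t\"https://fakeyfakeyfake\"\tTRUE\tTRUE\tTRUE\tTRUE",
    "Extra\tExtra repo\t\"https://extra.invalid\"\tFALSE\tTRUE\tTRUE\tTRUE"
), path)

repos <- default_repos_from_file(path)
print(repos)
```

Only the entry marked as default ends up in the result, so an unmodified file (where CRAN is the only default and its URL is @CRAN@) would give the same value the option has today.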

The default value of the repositories file has @CRAN@ for the cran entry,
and cran is the only repo marked as default, so this preserves the existing
behavior in what I assume to be the overwhelming majority of cases where
the repositories file is either not custom, or is only appended to.

I have a patch which does option (b) (and can easily be adapted to option
(a)) that I will submit to bugzilla after any discussion here.

Also, as a separate issue, I strongly feel that the R administration manual
section about repositories should be updated to more clearly describe behavior and
best practices around setting the repos R will look in. I will develop a
patch for that separately once I see whether one of the above changes is
likely to go in or not (as I don't want to write it twice).

For completeness, I know that we could put a setRepositories call in
the site Rprofile, but I have to admit I don't really understand why this
should be necessary.

Thoughts?
~G
#
I may be missing something here but aren't you overcomplicating things?  One
can avoid the repetitive dialog by setting   options(repos)   accordingly,
and I have long done so.  The Debian (and hence Ubuntu and other derivatives)
package does so via the Rprofile.site I ship.  See e.g. here

 https://sources.debian.org/src/r-base/4.2.1-2/debian/Rprofile.site/

I have used the same mechanism to point to intra-company repositories, easily
a decade or so ago. I had no problems with R CMD check of in-house packages
using this.
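
For readers who have not seen the mechanism, a minimal sketch of such an Rprofile.site entry follows. The mirror URL is a made-up placeholder, not the one from the Debian file linked above.

```r
# Sketch of an Rprofile.site fragment. The URL below is a hypothetical
# internal mirror; substitute the real one.
local({
    r <- getOption("repos")
    r["CRAN"] <- "https://cran.example.internal"
    options(repos = r)
})

print(getOption("repos"))
```

Wrapping the assignment in local() keeps the temporary variable out of the user's workspace.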

Dirk
#
Hi Dirk,

So there are a couple of things going on. First off, you're correct that that
works generally. A couple of things kept it from working for us. The first
is a bug/design error in RStudio which causes R_PROFILE not to be
adhered to when you build there. I will be filing a bug regarding that with
them, as I know that is irrelevant to this list.  There was some indication
that even raw R CMD check running via an RStudio Server installation was
missing the profile, but that ended up being spurious upon deeper testing.

That said, I do think that there is a case to be made for the ability to
modify what repositories R knows about at a more fundamental level than
setting options in a site profile, and that is, ostensibly, what the
repositories file machinery does. I understand it was initially intended
for, and is currently only (?) used by, the Windows repository GUI menu and
the related setRepositories function, but I still think there is some value in
extending it in the ways I described.

One major difference is that in this case, even when run with --vanilla,
administrators would still be in control of which repositories users hit
(by default only, of course, but there is still value in that).

Best,
~G
On Thu, Sep 15, 2022 at 11:31 AM Dirk Eddelbuettel <edd at debian.org> wrote:

            

  
  
#
Friends,

I always keep forgetting how these things currently/precisely work, but
I guess the principle is that utils:::.onLoad() does

  options(repos = c(CRAN = "@CRAN@"))

unless the repos option was already set (in the user or site profiles).
As the latter are not used when checking, the check code in tools takes
advantage of the repositories file mechanism, see ? setRepositories:

     The default list of known repositories is stored in the file
     ‘R_HOME/etc/repositories’.  That file can be edited for a site, or
     a user can have a personal copy in the file pointed to by the
     environment variable ‘R_REPOSITORIES’, or if this is unset or does
     not exist, in ‘HOME/.R/repositories’, which will take precedence.

(which also points to Startup etc).
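
A rough sketch of the lookup order that help text describes, for illustration only. The function name find_repositories_file is hypothetical and this is not the code tools uses.

```r
# Hypothetical helper: return the first existing repositories file, in
# the order given in ?setRepositories: the file named by R_REPOSITORIES,
# then HOME/.R/repositories, then R_HOME/etc/repositories.
find_repositories_file <- function() {
    candidates <- c(
        Sys.getenv("R_REPOSITORIES", unset = NA_character_),
        file.path(Sys.getenv("HOME"), ".R", "repositories"),
        file.path(R.home("etc"), "repositories")
    )
    candidates <- candidates[!is.na(candidates)]
    hits <- candidates[utils::file_test("-f", candidates)]
    if (length(hits)) hits[[1L]] else NA_character_
}

print(find_repositories_file())
```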

I guess one could teach utils:::.onLoad() to use the user/site
repositories setting instead of the hard-wired CRAN = @CRAN@?  Afaict,
that would make no difference if the repositories file was not
configured, and provide the configured setting in case repos was not set
in the user/site profile ...
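
To illustrate the "set only if not already set" principle, here is a small sketch. The names defaults and set_if_unset are hypothetical, the URL is a placeholder, and this is not the actual utils source.

```r
# Hypothetical stand-in for the default that utils would register.
defaults <- list(repos = c(CRAN = "@CRAN@"))

# Apply each default only when the option is not set yet; return the
# names of the options that were actually changed.
set_if_unset <- function(defaults) {
    unset <- !(names(defaults) %in% names(options()))
    if (any(unset)) options(defaults[unset])
    invisible(names(defaults)[unset])
}

# A profile that sets repos first wins; the default is not applied.
options(repos = c(CRAN = "https://cran.example.internal"))
changed <- set_if_unset(defaults)
print(getOption("repos"))
```

Under this scheme, replacing the hard-wired default with the value from the repositories file would only matter when no profile has set repos.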

Best
-k

        
#
Hi Kurt,

Thanks.
On Fri, Sep 16, 2022 at 12:57 AM Kurt Hornik <Kurt.Hornik at wu.ac.at> wrote:

            
Yes this is exactly what happens.
Precisely, that is my proposal. I have a patch which does this and passes
make check-devel for me (there are some slight technical gotchas because
tools:::.get_repositories calls utils::read.delim, which isn't available yet
during utils:::.onLoad execution, but I have a workaround for that).

If the consensus is that this is a good idea I'm more than happy to submit
the patch, along with an update to the admin manual reflecting the change
(either together or as separate issues).

Best,
~G

  
  
#
Thanks.  Perhaps submit your patch via R-bugs?

(I would have hoped we can simply use tools:::.get_repositories() right
away ...)

Best
-k