Regarding R_LIBS_USER

9 messages · Pavel Krivitsky, Johannes Ranke, Sergio Oller +2 more

#
Hello,

I just subscribed to the list to join the discussion after being
blindsided by the change and reading Dirk Eddelbuettel's reply to my
bug report at https://bugs.debian.org/866768 . 

As far as I can tell the advantages of site library are:

   1. Saves disk space and a little bit of user time spent installing and
      upgrading.
   2. Other package managers on Debian, like pip, default to trying to
      install to a site library.

However, it seems to me that the case for status quo ante is stronger,
even if the jarring behaviour (asking user whether to create a personal
directory, then failing to do so) is fixed:

   1. Correct me if I am wrong, but wouldn't the change make the default
      behaviour of R on Debian different from its default behaviour on
      other distributions?
   2. Site library has no benefit over user library on a single-user
      system.
   3. Making the site library writeable by default is a severe security
      vulnerability, since it enables users to rewrite library code that
      will be blindly executed by other users on the system.
   4. Making the site library not-writeable by default means that users
      relying on site library need to contact an administrator to get a
      package installed or upgraded. This seems to me to largely defeat
      the benefit of a site library for most systems, since most users
      aren't so patient and would just figure out how to use a user
      library.
   5. The use case where defaulting to installing to site library is most
      beneficial---a system shared by few trusted users---is rare compared
      to the single-user and the many-untrusted-user systems.
   6. There are workarounds, and, indeed, users can always use user
      libraries, but the new setup puts the burden of tweaking the system
      on the *least skilled* users, whereas the old setup Just Worked for
      them (since R 2.5.0).
   7. Even in Python and others, the user library takes precedence over
      the site library (which, in turn, takes precedence over the dpkg-
      installed library). This means that .libPaths() should have the user
      library (if it exists) in first position, not as a fallback.

So, overall, I think the change does more harm than good. Am I missing anything?

				Best Regards,
				Pavel
#
Pavel,

Your tone does not exactly help in this discussion.

Briefly, I (like many other people) consider Linux and Unix to be
multi-user systems.  You can argue as passionately for a default
installation in /usr/local/lib/R/site-library as you did against. And
some of your arguments are just silly ("dangerous group": dude, it is
one 'sudo addgroup r-adm' [or another name...] away).

I have used such settings (such as un-setting R_LIBS_USER or its
predecessors) for over a decade; it just works (if you give write
permissions). It clearly helps us at work because everybody sees the
same packages by default. I have also spoken with different R
Core members and several find the default installation below $HOME and
in a versioned directory less than ideal as well.  But it ensures
writeability. Which I cannot do easily from the package.

So maybe the change was too abrupt, and I think I may revert it. I
generally prefer for packagers like myself not to diverge from upstream
unless the changes have good reason or are unintrusive (and eg the added
tab-completion we have here is both).  But leaving newbies without
installable directories is bad, as is possibly hiding existing
installations.

I am a little pressed for time (at useR!) and system (main server is
ill, as is backup machine) but I should get a 3.4.1-2 out.

I'd welcome other comments, for or against.

Dirk
#
Hi,
Could the package make /usr/local/lib/R/site-library owned by a dedicated 
group, e.g. rlibs, so people could just adduser $USER rlibs? I do not see why 
R users should read logs, which is what the adm group was apparently made for.

And then, this tip could be given when install.packages() fails, in addition 
to the option to create a user (but not version) specific library. Usually R 
libraries are not version specific, even though in the case of R 3.0.0 and 
3.4.0 compatibility was interrupted.
I also like the idea of multiuser libraries, and in general the idea is that 
the most current package versions play well together, and are better than 
earlier versions. If this should not be the case, this should be an 
exceptional situation, which may warrant a user specific library.

So an alternative to reverting the change would be to keep your change, but to 
give better support for people that 

a) have user (and version) specific libraries that were created from within R 
in the past ("do you want to create a personal library instead")

b) want to use personal libraries for some good reason

via this list and maybe also via improved messages from R for the case that 
install.packages() fails.
 
Johannes
#
Hi,

As comments are welcome I will give my two cents and a patch suggestion :-)

2017-07-06 10:42 GMT+02:00 Dirk Eddelbuettel <edd at debian.org>:
There are several entangled issues:
1. What should be the default places where packages are installed?
2. What should be the default places where packages are loaded from?
3. Ensuring that with the default R installation any user can install
packages

For 1., I believe packages should be installed at:
- /usr/lib/R/library, if they are given by a .deb package (core R packages)
- /usr/lib/R/site-library/, if they are given by a .deb package (other
packages)
- /usr/local/lib/R/site-library/, if the user has permissions
- ~/R/x86_64-pc-linux-gnu-library/3.4 (or somewhere else under $HOME)
otherwise

For 2., I believe packages should be loaded from:
- ~/R/x86_64-pc-linux-gnu-library/3.4
- /usr/local/lib/R/site-library/
- /usr/lib/R/site-library/
- /usr/lib/R/library

I believe with 3.4.1-1 Dirk tried to move towards that configuration, and
for some reason we ended up in a situation where the R library in the HOME
directory was considered neither as a place to install packages into nor as
a place to load packages from.

This is how R works:

- When we use `library("package_name")`, the package_name is searched in
the directories given by .libPaths().

- When we use `install.packages("package_name")`, by default the package is
installed in the first element of .libPaths().
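
A small sketch of these two rules, using only base R. The helper name `default_install_lib` is made up for illustration and is not part of R; it only reports where a default install would go and whether the current user could write there.

```r
# Directories searched by library(), in order of precedence.
libs <- .libPaths()
print(libs)

# install.packages() defaults to lib = .libPaths()[1].
# Hypothetical helper: report that target and whether it is writable.
default_install_lib <- function(paths = .libPaths()) {
  target <- paths[1L]
  # file.access() returns 0 when the requested access (mode 2 = write) is granted
  list(path = target,
       writable = unname(file.access(target, mode = 2L) == 0L))
}

str(default_install_lib())
```

The output depends on the machine; on a stock setup the first entry is typically either the personal library or the local site library.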

These two things make it very tricky to find a solution to 1, because:

- We can't change .libPaths() as we would break with how packages are
searched
- We may try to change install.packages() so instead of trying to install
to .libPaths()[1] by default it tries /usr/local/lib/R/site-library first.
We don't know how many things may break because they rely on that
assumption, we would need to test CRAN packages like checkpoint and packrat
and worse we don't know if we break some custom setups from users.
- We can't remove the home directory from .libPaths() because some users
won't be able to install packages there anymore nor find their packages,
unless they choose to do that in their Renviron file.

Considering this, the safest approach is to drop the home directory
from the default .libPaths(), as Dirk tried to do. The main problem was
that the fallback to a personal directory then failed, and there was not
enough documentation in install.packages() to fix that.

Maybe a help message in install.packages would have helped. I attach a
DEP-3 compliant patch as a proposal. It works on my computer. I'm no expert
so your review is very welcome. The patch does a small change in how
install.packages works:

If the package can't be installed into .libPaths()[1]
(/usr/local/lib/R/site-library/), and if the user directory is NA (because
it is not set anymore in /etc/Renviron) then:
  * the default user directory is used instead: ~/R/%p-library/%v
  * and "Would you like to use \\n~/R/%p-library/%v \nto in the future to
install packages into?" is asked. If the answer is "yes", then it is
appended to the ~/.Renviron file.
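
The fallback described above can be sketched in a few lines of R. This is not the attached patch: the function names `expand_user_lib` and `choose_install_lib` are invented here, the sketch neither prompts the user nor touches ~/.Renviron, and it assumes R_LIBS_USER holds at most one path.

```r
# Expand the "%p" (platform) and "%v" (major.minor R version) placeholders.
expand_user_lib <- function(template = "~/R/%p-library/%v") {
  minor_first <- sub("\\..*$", "", R.version$minor)   # "4.1" -> "4"
  version <- paste(R.version$major, minor_first, sep = ".")
  out <- gsub("%p", R.version$platform, template, fixed = TRUE)
  out <- gsub("%v", version, out, fixed = TRUE)
  path.expand(out)
}

# Hypothetical decision logic: use the first library if it is writable,
# otherwise R_LIBS_USER, otherwise the default personal directory.
choose_install_lib <- function(paths = .libPaths(),
                               user_lib = Sys.getenv("R_LIBS_USER", unset = NA)) {
  first <- paths[1L]
  if (file.access(first, mode = 2L) == 0L) return(first)
  if (is.na(user_lib) || !nzchar(user_lib)) user_lib <- expand_user_lib()
  user_lib
}

choose_install_lib()
```

The returned directory may not exist yet; creating it (and asking before doing so) is left out of this sketch.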
I hope this helps. I am missing the useR! conference; I hope I can attend
next year :-)

Best,

Sergio
#
Dirk,
My apologies. In my original bug report, I honestly thought that the
change was unintentional.
I agree, but I think that an important part of the multi-user paradigm
is to limit the ability of users to affect each other in non-explicit
ways.
The security issue would arise if R's default configuration allowed any
user (or any user with a shell) to write to
/usr/local/lib/R/site-library. My understanding is that that's what
would need to happen in order to make R package installation work out of
the box with R_LIBS_USER unset by default.

I guess it could be an installation-time configuration option in
Debian.
I understand, and I've used a variety of settings as well. In
particular, even a decade later, I remember 2.5.0 coming out with
R_LIBS_USER and a sensible default, and things Just Working all of a
sudden. (This was on a cluster, and during the 32- to 64-bit transition
as well, so automatically loading libraries with the right architecture
was a godsend, replacing a cumbersome workaround of my own.) Perhaps
that's why I'm so passionate.
I appreciate that it's less than ideal, but it also seems to me that
status quo ante already fulfils the goal of encouraging the use of the
site-library: it checks if the site-library is writeable, and only if
it's not, offers to create a personal library as a fallback.

As far as I can tell, knocking out that last step doesn't help those
users who want to use the site library (since they already know
to make it writeable, so they won't even see the offer to create a
personal library), but it does hurt those users who want to use the
personal library, by requiring them to take extra steps. Is the goal
here to "nudge" users towards the site-library by making it harder to
use the personal library?
Part of my concern is that this change might propagate to Debian
derivatives like Ubuntu, hitting more naive users.

				Best Regards,
				Pavel

P.S. Something else just occurred to me, but I don't have time to test
it at the moment: R --vanilla ignores a lot of environment settings;
which of the workarounds described still work?
#
On Thu, Jul 06, 2017 at 11:43:22AM +0200, Johannes Ranke wrote:
It could. I need to re-read the Debian Policy on this. In the past I
hesitated to create a new group _globally_ just for our one
application language. I don't think Python, Ruby, ... do either. Maybe
still better as a local policy?
And users can currently get this behaviour by setting the env.var (or
altering .libPaths() from their .Rprofile, and so on).
Questions on install are tough. And discouraged.  Also, the installing
user is root, whereas we would really want to ask each user...
I like a lot of the things you suggest, but I am not sure I know a
good way to get there just yet.

Easiest short-term (non-)fix is to re-enable the R default even if
many of us do not like it.  We may need it as fallback.

Dirk
#
On Thu, Jul 06, 2017 at 02:02:52PM +0200, Sergio Oller wrote:
That's both the same -- what .libPaths() shows, and which can be set
in several ways.
Yes, and that went wrong this time.
The first three (in reverse order) have been our default for 15 or so years.

The fourth is what R adds (even if a few people have reservations) and
it provides a fallback.
See above. We have that.
We can. See help(Startup) and other places. But setting good defaults is hard.
Some of this may work, but we need some testing first.

In the near term, reverting the patch may be easiest.

Dirk
#
On 7 July 2017 at 05:06, Dirk Eddelbuettel <edd at debian.org> wrote:
If the first element isn't writable, does `install.packages(.)` move to the
next?
The problem of "finding a good default" affects any package that supports
optional add-ons (TeX Live, Octave, Netbeans Java apps that use modules,
etc).  In some cases, add-ons require versions of dll's that conflict with
standard packages.  For TeX Live, we have distro packages that follow the
distro policies and also the CTAN version that puts everything in one place.
Red Hat has "software collections" which go in `/opt/rh` and
`scl enable <collection> ...` that runs `...` with an environment tailored
to support the chosen collection.

I suspect there are use cases for R that would encounter fewer problems
by creating a new directory tree to contain the complete R system and
libraries. Maybe those are already being handled with virtualization.
Without clear policies in place, the future is different and testing
may not solve the future problems.

1. make things as simple as possible for new or casual users.

A big part of the value of R is in the number of users, so never
do anything that might discourage new users.

2. diligence is needed to avoid barriers to installing complete R systems
in a private location.  This gets around things the R developers and
packagers don't control, such as site policies that prevent users from
creating new groups or installing libraries to `/usr/local/lib/R`.

If lots of people do this, an "R Live" package system might be
useful.
Take the easy route -- I'm sure you have other ways to use
your time and there doesn't seem to be a quick fix that works for everyone.
2 days later
#
Version 3.4.1-2 of the r-base-core package reverts to the old behaviour.

Strictly personally speaking, I find that somewhat distasteful -- but it is
the upstream default. I now set this in ~/.Renviron :

    ## Override possible ~/R/@p/@v directory
    R_LIBS_USER=""

and on the machines at work I may just do what I did before (edit
/etc/R/Renviron) or try the above via /etc/R/Renviron.site.
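
One way to check from a fresh R session whether such an override took effect is sketched below. The helper name `user_lib_in_search_path` is hypothetical, and the result depends on how the local Renviron files are set up.

```r
# Hypothetical helper: is any R_LIBS_USER entry part of the library search path?
user_lib_in_search_path <- function(user_lib = Sys.getenv("R_LIBS_USER"),
                                    paths = .libPaths()) {
  if (!nzchar(user_lib)) return(FALSE)   # empty or unset: no personal library
  entries <- strsplit(user_lib, .Platform$path.sep, fixed = TRUE)[[1L]]
  any(normalizePath(entries, mustWork = FALSE) %in%
        normalizePath(paths, mustWork = FALSE))
}

Sys.getenv("R_LIBS_USER")    # "" if the override above is active
.libPaths()                  # the ~/R/... entry should then be absent
user_lib_in_search_path()
```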

Thanks to everybody who chimed in, especially those who managed to stay calm.

Dirk