CRAN installer for macOS - directory permissions
I'd really rather have just one library on my system. In special circumstances I sometimes want to have two different versions of a package installed, and then I need an extra library, but normally I want just one, because it reduces confusion and prevents errors that can in the worst case be catastrophic. (An example is given down below.)
I think we should distinguish here between "what one wants personally", "how R works on most systems", and "why a user library exists in the first place". Neither Windows nor Linux allows "normal users" to write into the system lib, and both require a user library (which to me is the right approach). -> Aligning macOS to this state would simplify things for users. Also, user libraries are quite common across many languages. R is a bit special here in that it ships much of its "default" functionality split into "base" and "recommended" packages, some of which are mainly there for historic reasons (arguably). I really don't think that everything should be in one library, simply because users can easily destroy the system-wide installation this way.
Users *should* interact with recommended packages. As I said before, recommended packages are contributed packages, and they can be updated between R releases. If they are updated, in normal circumstances users *should* update the default copy. If your proposal makes this harder, then that's a strong negative in my opinion.
I didn't say they should not. But they should install updated recommended packages into the user lib. Updated versions of rec. pkgs in there will take precedence when loading. This is how things have worked on Win and Linux since ever (?). Why would users need to update the default copy in the system lib?
Few MacOS users would be confused by what happens on Linux. Most MacOS users never use Linux; the ones who do are the more sophisticated ones.
I strongly disagree. Modern users don't just live on one operating system these days. They switch between multiple ones even during the day (e.g. by using R in a central RStudio Workbench installation, which usually runs on Linux, and a local installation on their machine running Win or Mac). This is in fact the standard for thousands of people - and I am in contact with many of them in my daily R consulting/system administration work. Also, even if you are a "one OS only" user and you aim to switch at some point, after many years, you would want things to work the same way in your new OS, wouldn't you?
This is impossible, unless by "minor version upgrade" you mean "patch version upgrade". An installed package for x.y.z is not guaranteed to work in a minor upgrade to x.(y+1).0, only in a patch version upgrade to x.y.(z+n). As I said already, a patch version upgrade should migrate packages (or at least offer to do so).
I was not aiming to reuse those for the updated minor version but to preserve them for the previous minor version so that I can switch back to it. This is actually quite a common task and I do this almost on a daily basis (yes really). E.g. right now I am switching between 4.1.3 and 4.2.0 (the former for projects, the latter for pkg dev).
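To make the patch-vs-minor distinction above concrete, here is a minimal shell sketch. The helper names `minor_series` and `upgrade_kind` are made up for illustration and are not part of R or the CRAN installer; the sketch only compares the x.y part of two x.y.z version strings.

```shell
#!/bin/sh
# Hypothetical helper: reduce an R version "x.y.z" to its "x.y" series.
minor_series() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Hypothetical helper: classify an upgrade from version $1 to version $2.
upgrade_kind() {
    if [ "$(minor_series "$1")" = "$(minor_series "$2")" ]; then
        echo "patch"            # same x.y: installed packages are expected to keep working
    else
        echo "minor-or-major"   # different x.y: packages need to be reinstalled
    fi
}

upgrade_kind 4.1.2 4.1.3   # prints: patch
upgrade_kind 4.1.3 4.2.0   # prints: minor-or-major
```

A per-minor-version user library would be keyed on the output of `minor_series`, which is why 4.1.3 and 4.2.0 can each keep their own set of packages side by side.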
* It makes the user interface simpler and less error prone. It is a disaster if a package fixes a serious bug, I install the update, then because I have multiple libraries installed, I use the old one without realizing it.
I think you're confusing things here. If you install an updated version it will go into your preferred library and will be available to you afterwards. You cannot "just use the old one".
It should be harder to end up with two versions of a package installed than just one. Your proposal says I have to jump through the sudo hoop to update recommended packages, or any other packages that I (running as admin) decided should be visible system-wide. That's a flaw.
You should not update recommended pkgs in the system lib; it should not be touched by a normal user. Just interact with your user library and be happy. This is not a flaw at all; this is how it works on Win and Linux and many people are happy with it. And "jump through the sudo hoop" - is it really a problem to call `sudo` once and put in your PW if you aim to interact with the system lib? If this is a real argument, then my time spent arguing about technical details feels worthless.
Perhaps a solution to the flaw would be for the package installer to warn users that they are about to install a second version instead of replacing the defective one, and offer to elevate privileges so the replacement could take place. But that's not a great user interface design, and (since most users always answer yes to such questions) not really any safer than the current design.
I think the concept of a user lib is just fine. If at all, those changes should be implemented and discussed across OS for R and not implemented in one OS only. The way R is installed and behaves on an admin level on Win/Linux/Mac essentially differs in many points but for no real reason often, i.e. the behavior could be aligned in many places.
That's such a rare need that RSwitch is sufficient. I've only used it a few times in the last decade, so it doesn't really make sense to make the install process more confusing to accommodate that need.
I think this is quite subjective, I use such tools all day. And I know many other people who do. One major point is reproducibility (among others) which is otherwise quite hard to achieve. The mileage varies here and also how people use R. I think allowing people to be as flexible as possible should be the goal, no matter how much of this flexibility is used in the end. Having multiple versions installed side-by-side is so much easier for other programming languages and R could really do better here in my opinion. Cheers Patrick
On 2 May 2022, at 22:28, Duncan Murdoch wrote:
On 02/05/2022 3:11 p.m., Patrick Schratz wrote:
Thanks Jim, that's a very well phrased summary!
Duncan,
So when I install R using the MacOS installer, where should it be
installed, and where should it install packages?
There is only one place the official CRAN installer will install things into which is |/Library/Frameworks/R.framework|.
However, the proposal is not about the install location of /R/ or /R.app/ (which resides in |/Applications/R.app|) but about base and recommended packages (which go into the "system lib") and "other" packages.
The installer also installs two kinds of R packages: base and
recommended ones. Base packages are closely tied to the internals
and can't be updated without updating everything, so it makes sense
to install them in the system library if R itself is going there.
Both base and recommended packages are placed in |/Library/Frameworks/R.framework/Resources/library|.
And yes, base packages should not be touched while recommended packages can be. Yet, the discussion is not about these two but about the location where all other packages go, i.e. those the user installs /after/ R has been installed.
But what about recommended packages? They are contributed packages,
and they are often updated between R releases. Should they go by
default into a user library?
Recommended packages can live in both system and user library. If a user library exists, the package instance within the user library will be loaded as it is first in the library path (|.libPaths()|).
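The "first in the library path wins" rule can be sketched in shell as follows. This only simulates the lookup with throwaway directories. `first_lib_with` is a hypothetical helper, not an R function, and the `user` and `system` directories merely stand in for the two real libraries.

```shell
#!/bin/sh
# Hypothetical helper: print the first library directory that contains the package.
# Usage: first_lib_with PKG LIBDIR...
first_lib_with() {
    pkg=$1
    shift
    for lib in "$@"; do
        if [ -f "$lib/$pkg/DESCRIPTION" ]; then
            printf '%s\n' "$lib"
            return 0
        fi
    done
    return 1   # package not found in any library
}

# Throwaway directories standing in for a user library and the system library.
demo=$(mktemp -d)
mkdir -p "$demo/user/MASS" "$demo/system/MASS" "$demo/system/stats"
touch "$demo/user/MASS/DESCRIPTION" "$demo/system/MASS/DESCRIPTION" "$demo/system/stats/DESCRIPTION"

# Libraries are searched in order, user library first.
first_lib_with MASS  "$demo/user" "$demo/system"   # prints the user library path
first_lib_with stats "$demo/user" "$demo/system"   # prints the system library path
```

In this sketch the updated copy of a recommended package in the user library shadows the one in the system library, while a package present only in the system library is still found there.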
I'd really rather have just one library on my system. In special circumstances I sometimes want to have two different versions of a package installed, and then I need an extra library, but normally I want just one, because it reduces confusion and prevents errors that can in the worst case be catastrophic. (An example is given down below.)
One practical issue is that if users upgrade from one minor version to the next (e.g. 4.1 to 4.2), they lose /all/ packages because in this scenario the system lib is overwritten. One could argue that this is needed as packages are not compatible between minor versions anyhow (which is true) but this is not the overall/actual point: actually "the point" is manifold:
* prevent users from accidentally deleting/interacting with base and
recommended packages
Users *should* interact with recommended packages. As I said before, recommended packages are contributed packages, and they can be updated between R releases. If they are updated, in normal circumstances users *should* update the default copy. If your proposal makes this harder, then that's a strong negative in my opinion.
* prompt the creation of a user lib by default by not allowing
unauthorized (sudo) write actions into the system lib
* align the overall experience/flow with R on Linux and by this reduce
confusion for users
Few MacOS users would be confused by what happens on Linux. Most MacOS users never use Linux; the ones who do are the more sophisticated ones.
* retain user packages for a specific minor version between minor
version upgrades and by this simplify the existence of multiple R
versions side-by-side (this would even open the door for multiple R
patch versions of the same minor side by side, which is not
officially supported by the CRAN installer for macOS but only
possible with tools like RSwitch <https://rud.is/rswitch/> or rcli
<https://rcli.pat-s.me/> )
This is impossible, unless by "minor version upgrade" you mean "patch version upgrade". An installed package for x.y.z is not guaranteed to work in a minor upgrade to x.(y+1).0, only in a patch version upgrade to x.y.(z+n). As I said already, a patch version upgrade should migrate packages (or at least offer to do so).
On the flip side, I am having a hard time finding arguments that would speak against the change. What I took away so far from Simon's replies (and apologies if I am wrong here) was mainly along the lines of
* makes admin work easier (I don't understand this point)
* local admin users should be able to write into directories owned by
|admin| (yes and I am not voting against this, but they should
explicitly authenticate with |sudo|)
* a bit of "it has been like this since XY, let's keep it like this"
(apologies if I misinterpreted this potentially :))
I offered one more above: * It makes the user interface simpler and less error prone. It is a disaster if a package fixes a serious bug, I install the update, then because I have multiple libraries installed, I use the old one without realizing it. It should be harder to end up with two versions of a package installed than just one. Your proposal says I have to jump through the sudo hoop to update recommended packages, or any other packages that I (running as admin) decided should be visible system-wide. That's a flaw. Perhaps a solution to the flaw would be for the package installer to warn users that they are about to install a second version instead of replacing the defective one, and offer to elevate privileges so the replacement could take place. But that's not a great user interface design, and (since most users always answer yes to such questions) not really any safer than the current design.
There are additional things which could be simplified WRT the macOS CRAN installer (e.g. the addition of the option to change the root install location to allow for other install locations than |/Library/Frameworks/R.framework/Resources/library| which would allow the installation of multiple R patch versions side-by-side) but this is clearly a separate discussion.
That's such a rare need that RSwitch is sufficient. I've only used it a few times in the last decade, so it doesn't really make sense to make the install process more confusing to accommodate that need. Duncan Murdoch
I am happy to see that more people are joining the discussion and I would also be happy to create an official proposal for such a change (if this is required and backed up by more people than just myself).
Cheers
Patrick
On 2 May 2022, at 20:00, Duncan Murdoch wrote:
On 02/05/2022 1:08 p.m., Jim Hester wrote:
I agree with Patrick that if the macOS behavior was more like the default on Linux it would benefit most users on macOS. His proposal to change the group writability of the system library seems a good one to me.
The current behavior (installing by default into the system library) is quite surprising to users when they install a new version of R (even just a new patch version) and they lose their entire package library. It causes some users to avoid updating R frequently as a result.
As far as R.app having an option for this, that is good, unfortunately many users aren't using R.app, often they are either using R through RStudio or using the command line version of R directly. In these cases it is not as obvious they could install packages to a user library.
I think migrating packages is something the installer could do.
I'm not sure I understand the details of the proposed change. I'm in
the admin group on my laptop, because I'm the only user. So when I
install R using the MacOS installer, where should it be installed,
and where should it install packages?
I would find it strange if the installer wanted to install R.app
anywhere but the system /Applications folder. That's where almost
everything else I use gets installed.
The installer also installs two kinds of R packages: base and
recommended ones. Base packages are closely tied to the internals
and can't be updated without updating everything, so it makes sense
to install them in the system library if R itself is going there.
But what about recommended packages? They are contributed packages,
and they are often updated between R releases. Should they go by
default into a user library?
Duncan Murdoch
Jim
On Sun, May 1, 2022 at 9:12 AM Duncan Murdoch
<murdoch.duncan at gmail.com> wrote:
On 30/04/2022 2:58 p.m., Patrick Schratz wrote:
They don't go there "silently" as in unnoticed - they go
there if
you instruct R to do so. That's why there is an explicit
choice in
the Installer. Otherwise regular R rules apply.
Where is this choice in the installer? I don't see a menu/setting which users could change to install packages into a user lib instead of the system lib (if they are part of the |admin| group).
To me, they go there if the lib is writable - and "the common" R user does not know that the system lib is writeable by default.
I think Simon is talking about the package installer in
R.app and you
are talking about the installer for R itself. The package
installer
dialog in R.app has a pretty clear section called Install
Location.
Duncan Murdoch
It only does so for admin users. Unlike "managed" unix
systems, on
macOS you have essentially two situations:
On a "personal" machine (like laptop) the user is the
main user and
admin. Therefore it makes a lot more sense for the user
to use a
single location for managing packages which is at the
system level.
This also allows easy R upgrades. In addition, locations
in user
home raise a lot of issues (see the various discussion
where this
bites people on Windows) due to interactions with
software that does
mirroring, backups etc.. Hence this approach "just
works" as one
would expect on a Mac.
To be clear: I don't question the system-wide installation and I think this is good as is (this also happens on Linux).
I am questioning the group write permissions for the system lib.
If the user wishes to use his/her private library, it is
trivial -
just click on "At User Level" and from then on all
packages are
managed user's local library just like on any other
platform.
I might be missing something obvious but where is this option "at the user level"? I assumed you're talking about the official |.dmg| installer - which does not have such an option AFAICS?
On a "managed" Mac the user is not an admin and
therefore the
behavior makes no difference. The status quo just makes
it easier
for admins to manage the shared library, but in reality
this doesn't
matter as one would assume the admins know what they are
doing.
I disagree on this, especially the point "makes it easier for admins to manage the shared library".
Admins should (and will) always be able to manage the
system lib /after/
authenticating (as on Linux). The authentication step
does not really
make a difference in practice for admins and is required
in almost all
places where system-wide changes are desired.
This is also the /core/ of the whole discussion: 775 vs 755 (WRT directory permissions).
I don't see any (strong) reason why 775 would be preferable to 755.
(If so, then the default should probably also be changed on Linux.)
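For readers less familiar with octal modes, the difference between 775 and 755 is only the group write bit. A minimal sketch on a throwaway directory (not the real system library; `perm_string` is a hypothetical helper):

```shell
#!/bin/sh
# Hypothetical helper: print the 10-character permission string of a directory.
perm_string() {
    ls -ld "$1" | cut -c1-10
}

lib=$(mktemp -d)      # throwaway directory standing in for the system library

chmod 775 "$lib"      # owner and group may write
perm_string "$lib"    # drwxrwxr-x

chmod 755 "$lib"      # only the owner may write
perm_string "$lib"    # drwxr-xr-x
```

With ownership left at root:admin, mode 755 would mean that members of the |admin| group can still read the library but have to authenticate (e.g. via |sudo|) before writing to it.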
In a previous message, Kevin Ushey also agreed on the
point that
explicit authorization should be required to write into
the system lib.
I assume there might be many more people who would
actually agree on
this 755 being preferable in this situation.
How can a proposal be phrased to reconsider this setting
that is
evaluated by a representative group of people?
I am not claiming to be right but I'd be interested in a multi-person evaluation of this setting rather than keeping this at a person-to-person discussion level.
Well, having administered company-wide R installations
in large
companies for almost two decades I'd strongly disagree.
As an admin
you want as few user-installed packages as possible,
because they
are guaranteed to cause problems. You want to limit this
for things
like development of packages where you want the stable
version
globally and development version locally (and this is
not just me -
have a look how the top tech companies manage their
software). You
have a reliable, stable central location - if you don't
do that then
you'll have n libraries to manage for n users which is
absolutely
not sustainable as users will break their libraries and
you cannot
even upgrade R. Also having a central, shared library is
crucial for
collaboration. Unlike in Python in R it actually works
since R and
CRAN doesn't allow randomly breaking reverse-dependencies.
As a system engineer and admin myself (for several "large" companies/institutions), I kindly disagree with your view here.
User packages are not a problem but /a feature/: everyone can install the versions they need for their project.
They don't interfere with packages from other users and are not forced to go with the update interval of an admin.
With the additional use of renv (thanks Kevin!), redundancy is highly reduced, as a shared cache can be used from which users can simply use symlinks rather than installing the x-th copy of the same R package version. But this is partly off-topic WRT the actual discussion.
Overall this sub-discussion part might come down to the philosophy of having a "centrally managed, inflexible admin installation" or a "centrally managed, partly-flexible admin installation" where only the R versions and system libs are managed but users have the freedom to install any R packages they want.
Also in "my" philosophy, it's not about "upgrading" R and removing the previous version but adding new versions as they come in and keeping previous ones - for the purpose of reproducibility.
I usually keep the latest patch version of a minor
version and aim to
provide a consistent R environment for various minor
versions where
users are guaranteed to be able to work with that minor
version in a
flexible way (i.e. by installing user packages as they
want) for many
years ahead.
As mentioned before and above I disagree. The proposal
doesn't
matter for managed Macs but would negatively affect
users that are
single-user admins and since that is typically the case
for the
majority of Mac R users (as they typically are on their
personal
machines) I don't see any upshot. All it would do is to
prevent
typical R users to install packages directly.
How would it affect single-user admins in a negative way?
They can
* still install packages per R minor version into a
dedicated user library
* install multiple R minor versions side by side
* actually enjoy the same behavior as on Linux
All it would do is to prevent typical R users to install
packages
directly.
I don't understand this point. It would behave similarly to Linux, where users are prompted to create a user library (on first use and if none exists yet).
As you can see, the overall discussion topic is quite
important to me
and I am still convinced that the current state on macOS
is suboptimal.
Thanks for your time and sharing your thoughts.
Cheers
Patrick
On 25 Apr 2022, at 1:46, Simon Urbanek wrote:
Patrick,
sorry for the delayed reply - this was not a quick e-mail
so I had to
find time after the release :)
On Apr 3, 2022, at 8:26 PM, Patrick Schratz
<patrick.schratz at gmail.com> wrote:
Hi Simon,
thanks for your extensive reply.
The choice is deliberate: the admin group on macOS
corresponds
to users that are allowed to install system-wide
software so it
allows all admins on the machine to install packages
which is
the expected way on macOS.
I think this choice is unfortunate as it contrasts with
existing
behavior on other platforms where one needs to explicitly
request admin privileges by either using sudo or
starting R as
an admin.
On macOS, packages just go into the system lib "silently" because of the write permissions granted via the admin group.
See also my comments further down for more details on this.
They don't go there "silently" as in unnoticed - they go
there if
you instruct R to do so. That's why there is an explicit
choice in
the Installer. Otherwise regular R rules apply.
Also the versioning of the R framework as x.y is also
deliberate
- upgrading R to a new patch version does *not* require
re-installation of packages, they work by design so in
fact the
system location is the safest way to do that. Also note
that
packages are never removed by the installer.
Thanks, I am aware that a patch update does not require a
reinstallation as the packages are functional across
minor versions.
I checked again and was indeed wrong, patch updates from
the
CRAN installer do not remove additional custom packages
in the
system lib.
I was confused by some custom approaches of mine which
cause
this behavior - sorry for this!
So out of the items listed in "The problem" they are all
not
true with the exception of the comparison with the other
platforms, but even that difference is very subtle as it
only
affects the default on the first installation and not
regular
use (and I'm not even sure if that is true since admin
users
can still install in the system location on other
platforms).
On Linux you would need to explicitly invoke sudo R
to allow
writing into the system lib.
The issue on macOS is that a regular start of R allows
system
lib write permissions.
In my current view I think a similar behavior as on
Linux would
be great, i.e. to explicitly having to request admin
privileges
for this task.
It only does so for admin users. Unlike "managed" unix
systems, on
macOS you have essentially two situations:
On a "personal" machine (like laptop) the user is the
main user and
admin. Therefore it makes a lot more sense for the user
to use a
single location for managing packages which is at the
system level.
This also allows easy R upgrades. In addition, locations
in user
home raise a lot of issues (see the various discussion
where this
bites people on Windows) due to interactions with
software that does
mirroring, backups etc.. Hence this approach "just
works" as one
would expect on a Mac. If the user wishes to use his/her
private
library, it is trivial - just click on "At User Level"
and from then
on all packages are managed user's local library just
like on any
other platform.
On a "managed" Mac the user is not an admin and
therefore the
behavior makes no difference. The status quo just makes
it easier
for admins to manage the shared library, but in reality
this doesn't
matter as one would assume the admins know what they are
doing.
I don't understand the part "as it only affects the default on the first installation and not regular use" of your reply - could you clarify this?
Unless a user creates a user lib manually, packages will always go into the system lib - not only on first use - but I might be misunderstanding your comment here.
I would argue that the current setup tends to be a lot
safer
than the alternatives, because it allows commonly used
packages
to be installed at the system level and private packages
to be
installed at user level. This is also the design
typically used
on shared machines, where you separate local packages
from user
packages where local ones are installed by
administrators - so
exactly the same setup. Moreover R upgrades are a lot
cleaner,
since you can easily upgrade all system packages at once
so you
don't have to worry about individual users having stale
packages
- the biggest problem for admins.
Yes and no.
Sometimes system admins want to install certain packages
globally - however, I never do so for the following reason:
Often this will lead to multiple package installations,
i.e. one
in the syslib and one in the user lib (if the user
installs the
package again for some reason which quite often happens).
Often these duplicated packages will have different
versions and
users are confused which one is actually loaded (the
user lib
one is as it is first in the path).
Aside from this practical point, Macs are rarely used in a
shared way.
And even if so, I'd highly favor having to request write permissions into the syslib rather than it happening by default.
Imagine a scenario where the admin of a shared Mac
constantly
writes into the syslib (because this is the default). This
syslib is then also used by other non-admin users on the
system.
I don't think this is a desired scenario and it might cause lots of confusion (not even mentioning the question of whether all people in this scenario are aware of what's going on, given that this is a niche topic).
Here I think a one-time central installation of R and
then only
working with user libs (as on Linux) would be preferable.
Well, having administered company-wide R installations
in large
companies for almost two decades I'd strongly disagree.
As an admin
you want as few user-installed packages as possible,
because they
are guaranteed to cause problems. You want to limit this
for things
like development of packages where you want the stable
version
globally and development version locally (and this is
not just me -
have a look how the top tech companies manage their
software). You
have a reliable, stable central location - if you don't
do that then
you'll have n libraries to manage for n users which is
absolutely
not sustainable as users will break their libraries and
you cannot
even upgrade R. Also having a central, shared library is
crucial for
collaboration. Unlike in Python in R it actually works
since R and
CRAN doesn't allow randomly breaking reverse-dependencies.
From a technical perspective, I know that setting
root:root on
macOS is not possible. My proposed change to 755 (and
leaving
root:admin) would however exactly mimic this (and the
one of
Linux installs) behavior:
* admins would need to do sudo R to install into the system library
* otherwise they are prompted to create a user library
Which downsides would this approach have? Currently I don't see any. It would even harmonize CRAN installer behavior across platforms.
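The decision described in the two bullets above boils down to a write-permission check on the system library. A rough shell sketch follows. `choose_lib` is a hypothetical helper and the user library path is a placeholder, not R's real default; R itself handles this interactively when installing packages.

```shell
#!/bin/sh
# Hypothetical helper: choose where a package install would go.
# $1 = system library, $2 = user library (both paths are assumptions).
choose_lib() {
    if [ -w "$1" ]; then
        printf '%s\n' "$1"   # writable (e.g. R started via sudo): system library
    else
        printf '%s\n' "$2"   # not writable: fall back to a user library
    fi
}

syslib=/Library/Frameworks/R.framework/Resources/library
userlib=$HOME/Library/R/user-library   # placeholder path, not R's real default

choose_lib "$syslib" "$userlib"
```

With mode 775 and an admin account the first branch is taken without any prompt; with mode 755 the same account would end up in the second branch unless it elevates privileges first.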
I'd be happy to hear from more Mac user if there are
reasons to
change the default, but as I outlined the choices were
deliberate after weighing the pros and cons. In my view the
major issue with the proposal is that it would prevent
sharing
of packages, make R upgrades a lot harder and prevent admin
users from using the current tools for package
management - and
that includes the ability to separate system and user
packages
on single-user machines.
I'll try to envision the practical changes of this:
* Patch update experience would not change as custom packages will be in the user lib for the respective minor version (by default)
* Admins are still able to install into the system lib when using sudo R
* AFAICS admins will still be able to separate system and user packages as they can use sudo R for syslib installs. To me, the proposed change would even make the behaviour clearer than before (which requires creating a hidden folder (user lib) in the right place to actually use a user lib).
Let me know if I overlook something - but currently I don't see any downside but various positive impacts.
As mentioned before and above I disagree. The proposal
doesn't
matter for managed Macs but would negatively affect
users that are
single-user admins and since that is typically the case
for the
majority of Mac R users (as they typically are on their
personal
machines) I don't see any upshot. All it would do is to
prevent
typical R users to install packages directly.
Last, I wanted to ask if the source code for the CRAN
installer
is publicly available? I could not find it and would be
interested to take a look into it. If this is not
possible for
some reason, I would also be interested in getting to
know the
reason for this decision.
Everything is in the R SVN, the R build and release system is in
https://svn.r-project.org/R-dev-web/trunk/QA/Simon/R4
and Apple Installer packaging is in
https://svn.r-project.org/R-dev-web/trunk/QA/Simon/R4/packaging
and the relevant postflight script is in
https://svn.r-project.org/R-dev-web/trunk/QA/Simon/R4/packaging/scripts-R-fw/postflight
On Apr 13, 2022, at 8:43 PM, Patrick Schratz
<patrick.schratz at gmail.com> wrote:
Related to this Q: Are the macOS CRAN policies actively
discussed by a team of people (who might eventually also be
willing to share their thoughts or could be addressed
with such
questions) or are you solely responsible for it?
CRAN is an entire team, so yes, but as for anything
Mac-related it
includes R-core and other stakeholders that have
expressed interest
before (e.g. Bioconductor). Obviously, this (R-SIG-Mac)
is also a
good place as that includes anyone who cares about R on
macOS.
Cheers,
Simon
_______________________________________________
R-SIG-Mac mailing list
R-SIG-Mac at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-mac
<https://stat.ethz.ch/mailman/listinfo/r-sig-mac>