I've made a small enhancement to R that would help developers better control what versions of code we're using where. Basically, to load a package in R, one currently does:
library(whateverPackage)
and with the enhancement, you can ensure that you're getting at least version X of the package:
library(whateverPackage, version=3.14)
Reasons one might want this include:
* you know that in version X some bug was fixed
* you know that in version X some feature was added
* that's the first version you've actually tested it with & you don't want to vouch for earlier versions without testing
* you develop on one machine & deploy on another machine you don't control, and you want runtime checks that the sysadmin installed what they were supposed to install
In general, I have an interest in helping R get better at various things that would help it play in a "production environment", for various values of that term. =)
The attached patch is made against revision 58980 of https://svn.r-project.org/R/trunk . I think this is the first patch I've submitted to the R core, so please let me know if anything's amiss, or of course if there are reservations about the approach.
Thanks.
--
Ken Williams, Senior Research Scientist
WindLogics
http://windlogics.com
CONFIDENTIALITY NOTICE: This e-mail message is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution of any kind is strictly prohibited. If you are not the intended recipient, please contact the sender via reply e-mail and destroy all copies of the original message. Thank you.
[patch] giving library() a 'version' argument
10 messages · Brian Ripley, Duncan Murdoch, Ken Williams +1 more
On 12-04-11 11:28 AM, Ken Williams wrote:
I've made a small enhancement to R that would help developers better control what versions of code we're using where. Basically, to load a package in R, one currently does:
library(whateverPackage)
and with the enhancement, you can ensure that you're getting at least version X of the package:
library(whateverPackage, version=3.14)
Reasons one might want this include:
* you know that in version X some bug was fixed
* you know that in version X some feature was added
* that's the first version you've actually tested it with & you don't want to vouch for earlier versions without testing
* you develop on one machine & deploy on another machine you don't control, and you want runtime checks that the sysadmin installed what they were supposed to install
I don't really see the need for this. Packages already have a scheme for requiring a particular version of a package, so this would only be useful in scripts run outside of packages.
But what if your script requires a particular (perhaps obsolete) version of a package? This change only puts a lower bound on the version number, and version requirements can be more elaborate than that.
I think my advice would be:
1. Put your code in a package, and use the version specifications there.
2. If you must write it in a script, then put a version test at the top, using packageVersion().
Duncan Murdoch
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
A very important point is that library() *had* a 'version' argument for
several years, and this is not what it did. So Mr Williams needs to do
his homework ....
From such a version of R:
version: A character string denoting a version number of the package
to be loaded, for use with _versioned installs_: see the
section later in this document.
...
On 12/04/2012 13:21, Duncan Murdoch wrote:
[...]
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
-----Original Message-----
From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
Sent: Thursday, April 12, 2012 7:54 AM
To: Duncan Murdoch
Cc: Ken Williams; r-devel at r-project.org
Subject: Re: [Rd] [patch] giving library() a 'version' argument

A very important point is that library() *had* a 'version' argument for several years, and this is not what it did.
That is unfortunate. So such a mechanism would need to use a different argument name.
For completeness in this thread, I dug up the fact that it seems to have been removed in the 2.9.0 release:
o Support for versioned installs (R CMD INSTALL --with-package-versions
and install.packages(installWithVers = TRUE)) has been removed.
Packages installed with versioned names will be ignored.
I'll address Duncan's concerns in a separate message.
-Ken
-----Original Message-----
From: Duncan Murdoch [mailto:murdoch.duncan at gmail.com]
Sent: Thursday, April 12, 2012 7:22 AM
To: Ken Williams
Cc: r-devel at r-project.org
Subject: Re: [Rd] [patch] giving library() a 'version' argument

On 12-04-11 11:28 AM, Ken Williams wrote:
Reasons one might want this include:
* you know that in version X some bug was fixed
* you know that in version X some feature was added
* that's the first version you've actually tested it with & you don't want to vouch for earlier versions without testing
* you develop on one machine & deploy on another machine you don't control, and you want runtime checks that the sysadmin installed what they were supposed to install
I don't really see the need for this. Packages already have a scheme for requiring a particular version of a package, so this would only be useful in scripts run outside of packages.
The main distinction here is that the existing package mechanism enforces version requirements at *install* time, but this mechanism enforces it at *run* time. So this indeed applies well to scripts run outside packages, but it's also useful inside packages when they're loading their dependencies at runtime. I was trying to illustrate that with the 4 bullet points above (especially the last one) but I should have said so explicitly. It can happen very easily that constraints that were satisfied at install time get out of whack by subsequent package installations, but the violations go undetected. The result can be breakage, whether dramatic or subtle. The main hats targeted here are really people (like me, of course) who are trying to "productionize" results, not so much people who are doing offline analysis. In a production system
But what if your script requires a particular (perhaps obsolete) version of a package? This change only puts a lower bound on the version number, and version requirements can be more elaborate than that.
Certainly true; this was meant as a first iteration, and support for the more elaborate requirements specifications could certainly be added. The more elaborate specs actually illustrate the need for a runtime mechanism nicely - if code X (which may be a package, or a script, it doesn't matter) requires exactly version 3.14 of package B, and someone in the production team upgrades version 3.14 to version 3.78 because "it's faster" or "it's less buggy" or "we just like to have the latest version of everything all the time", then someone needs to be alerted to the problem. One alternative solution would be to use a full-fledged package management system like RPM or Deb to track all the dependencies, but yikes, that doesn't sound fun.
I think my advice would be: 1. Put your code in a package, and use the version specifications there. 2. If you must write it in a script, then put a version test at the top, using packageVersion().
Certainly those are alternatives, but to us they are somewhat unsatisfactory. The first option doesn't help with the crux of the problem, which is runtime enforcement. The second is essentially the same solution I've proposed, but doesn't help anyone outside our organization who has the same problem.
-Ken
On 12/04/2012 11:11 AM, Ken Williams wrote:
-----Original Message-----
From: Duncan Murdoch [mailto:murdoch.duncan at gmail.com]
Sent: Thursday, April 12, 2012 7:22 AM
To: Ken Williams
Cc: r-devel at r-project.org
Subject: Re: [Rd] [patch] giving library() a 'version' argument

On 12-04-11 11:28 AM, Ken Williams wrote:
Reasons one might want this include:
* you know that in version X some bug was fixed
* you know that in version X some feature was added
* that's the first version you've actually tested it with & you don't want to vouch for earlier versions without testing
* you develop on one machine & deploy on another machine you don't control, and you want runtime checks that the sysadmin installed what they were supposed to install
I don't really see the need for this. Packages already have a scheme for requiring a particular version of a package, so this would only be useful in scripts run outside of packages.
The main distinction here is that the existing package mechanism enforces version requirements at *install* time, but this mechanism enforces it at *run* time.
I haven't tested it, but according to the documentation in Writing R Extensions, the dependencies are enforced at the time library() is called.
So this indeed applies well to scripts run outside packages, but it's also useful inside packages when they're loading their dependencies at runtime. I was trying to illustrate that with the 4 bullet points above (especially the last one) but I should have said so explicitly.
If the docs are wrong (or I misread them), you could equally put a run-time version test into the .onLoad function in a package.
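A minimal sketch of the .onLoad approach mentioned here, assuming the check is written as a small helper. The helper name check_min_version is hypothetical, and "foo" and "3.14" are only the placeholder package and version used elsewhere in this thread:

```r
# Hypothetical helper: stop unless `pkg` is installed at `required` or newer.
check_min_version <- function(pkg, required) {
  installed <- utils::packageVersion(pkg)  # errors if pkg is not installed
  if (installed < required) {
    stop(sprintf("need %s >= %s, but found %s",
                 pkg, required, as.character(installed)),
         call. = FALSE)
  }
  invisible(installed)
}

# In a package this would typically live in R/zzz.R; R calls .onLoad()
# when the package's namespace is loaded.
.onLoad <- function(libname, pkgname) {
  check_min_version("foo", "3.14")  # placeholder dependency from the thread
}
```

Because the check runs every time the namespace is loaded, it would catch a dependency that was downgraded or replaced after the package was installed.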
It can happen very easily that constraints that were satisfied at install time get out of whack by subsequent package installations, but the violations go undetected. The result can be breakage, whether dramatic or subtle. The main hats targeted here are really people (like me, of course) who are trying to "productionize" results, not so much people who are doing offline analysis. In a production system
But what if your script requires a particular (perhaps obsolete) version of a package? This change only puts a lower bound on the version number, and version requirements can be more elaborate than that.
Certainly true; this was meant as a first iteration, and support for the more elaborate requirements specifications could certainly be added. The more elaborate specs actually illustrate the need for a runtime mechanism nicely - if code X (which may be a package, or a script, it doesn't matter) requires exactly version 3.14 of package B, and someone in the production team upgrades version 3.14 to version 3.78 because "it's faster" or "it's less buggy" or "we just like to have the latest version of everything all the time", then someone needs to be alerted to the problem. One alternative solution would be to use a full-fledged package management system like RPM or Deb to track all the dependencies, but yikes, that doesn't sound fun.
But a single line at the top of the script would fix this:
stopifnot(packageVersion("foo") == "3.14")
Making the library() function more elaborate doesn't seem to add anything.
I think my advice would be: 1. Put your code in a package, and use the version specifications there. 2. If you must write it in a script, then put a version test at the top, using packageVersion().
Certainly those are alternatives, but to us they are somewhat unsatisfactory. The first option doesn't help with the crux of the problem, which is runtime enforcement. The second is essentially the same solution I've proposed, but doesn't help anyone outside our organization who has the same problem.
Another problem with putting this into library() is that packages aren't always loaded by library(): there is require(), and there are implicit loads triggered by dependencies of other packages.
Duncan Murdoch
-----Original Message-----
From: Duncan Murdoch [mailto:murdoch.duncan at gmail.com]
Sent: Thursday, April 12, 2012 12:27 PM
To: Ken Williams
Cc: r-devel at r-project.org
Subject: Re: [Rd] [patch] giving library() a 'version' argument

I haven't tested it, but according to the documentation in Writing R Extensions, the dependencies are enforced at the time library() is called.
Oh, I hadn't suspected that. I can look into testing that, if it's true then of course that changes this all. I probably won't be able to do that for a few days because I'll be traveling though. I've never noticed a package failing to load at runtime because its prereq-version dependency wasn't met though.
[...]
But a single line at the top of the script would fix this:
stopifnot(packageVersion("foo") == "3.14")
For the most common use case, that would look more like:
stopifnot(compareVersion(packageVersion("foo"), "3.14") < 0)
which gets less declarative, and I'd argue less clear about exactly what it's trying to enforce.
And I can see myself (& presumably others) getting that comparison operator backwards a lot, having to look it up each time or copy-paste it from other code.
And then that still doesn't add nice error messages, that would be yet more code.
*And*, it doesn't actually behave correctly if the package is already loaded by other code, because it might have been loaded from a different location than the one that would be found in the packageVersion() call. (Or am I maybe wrong about what packageVersion() does in that case? I don't think the docs specify that behavior.)
For prior art on this whole concept, a useful precedent is the 'use()' function in Perl, which accepts a version argument, even though there is also robust version checking at installation/testing time.
Another problem with putting this into library() is that packages aren't always loaded by library(): there is require(), and there are implicit loads triggered by dependencies of other packages.
That's not really a problem. If someone wants to enforce a runtime dependency, they stick the enforcement line into their code, and it will correctly stop if the criterion is not met.
-Ken
On 4/12/12 10:11 AM, Ken Williams wrote:
On 4/12/12 7:22 AM, Duncan Murdoch wrote:
[SNIP] ... The main hats targeted here are really people (like me, of course) who are trying to "productionize" results, not so much people who are doing offline analysis. In a production system
But what if your script requires a particular (perhaps obsolete) version of a package? This change only puts a lower bound on the version number, and version requirements can be more elaborate than that.
Certainly true; this was meant as a first iteration, and support for the more elaborate requirements specifications could certainly be added. The more elaborate specs actually illustrate the need for a runtime mechanism nicely - if code X (which may be a package, or a script, it doesn't matter) requires exactly version 3.14 of package B, and someone in the production team upgrades version 3.14 to version 3.78 because "it's faster" or "it's less buggy" or "we just like to have the latest version of everything all the time", then someone needs to be alerted to the problem. One alternative solution would be to use a full-fledged package management system like RPM or Deb to track all the dependencies, but yikes, that doesn't sound fun.
I appreciate your contribution of both time and energy.
But I think the existing library() method is sufficient without
this modification. It's essentially syntactic sugar for:
library(MASS); stopifnot(packageVersion("MASS") >= "7.3")
If your package requirements are that exacting, it would be far
simpler to just download all the specific versions to a single
directory and put that directory first in .libPaths().
Prayer never hurt either...
Our style here is to add sessionInfo() to the end of all scripts
and Sweave documents. As such we could reproduce exactly if
required. But I believe it would be impossible to track the
dependencies meaningfully across time.
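The two habits described in this message could be sketched roughly as follows. The directory name pinned-lib is made up for illustration, and a temporary directory stands in for wherever the pinned package versions would really be installed:

```r
# Hypothetical directory holding the exact package versions the script was
# tested with (a temp dir is used here only so the sketch runs anywhere).
pinned_lib <- file.path(tempdir(), "pinned-lib")
dir.create(pinned_lib, showWarnings = FALSE)

# Put it first on the library search path so library() looks there first.
.libPaths(c(pinned_lib, .libPaths()))

# ... analysis code would go here ...

# Record the R version and package versions at the end of the run.
info <- sessionInfo()
print(info)
```

Note that .libPaths() silently drops directories that do not exist, so the pinned directory has to be created before it is added.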
On 12/04/2012 1:46 PM, Ken Williams wrote:
-----Original Message-----
From: Duncan Murdoch [mailto:murdoch.duncan at gmail.com]
Sent: Thursday, April 12, 2012 12:27 PM
To: Ken Williams
Cc: r-devel at r-project.org
Subject: Re: [Rd] [patch] giving library() a 'version' argument

I haven't tested it, but according to the documentation in Writing R Extensions, the dependencies are enforced at the time library() is called.
Oh, I hadn't suspected that. I can look into testing that, if it's true then of course that changes this all. I probably won't be able to do that for a few days because I'll be traveling though. I've never noticed a package failing to load at runtime because its prereq-version dependency wasn't met though.
[...]
But a single line at the top of the script would fix this:
stopifnot(packageVersion("foo") == "3.14")
For the most common use case, that would look more like:
stopifnot(compareVersion(packageVersion("foo"), "3.14") < 0)
The compareVersion call doesn't need to be explicit, i.e. you'll get the
same result from
stopifnot(packageVersion("foo") < "3.14")
which looks pretty clear to me. It works in some quick tests,
recognizing that rgl version 0.92.879 is bigger than 0.92.100 but less
than 0.92.1000.
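The equivalence described here can be checked without rgl installed, using a version object built by hand. This is only a sketch: compareVersion() is documented as taking character strings, so the version object is converted first. It also shows the direction of the test: as written in the thread, `< 0` would pass only when the installed version is older than the one named, whereas an "at least this version" guard would use `>=`.

```r
installed <- package_version("0.92.879")  # stand-in for packageVersion("rgl")

# compareVersion() takes character strings and returns -1, 0 or 1.
cmp_low  <- compareVersion(as.character(installed), "0.92.100")
cmp_high <- compareVersion(as.character(installed), "0.92.1000")

# The comparison operators on version objects give the same ordering.
op_low  <- installed > "0.92.100"
op_high <- installed < "0.92.1000"

# A minimum-version guard passes when the comparison result is >= 0.
stopifnot(compareVersion(as.character(installed), "0.92.100") >= 0)
stopifnot(installed >= "0.92.100")
```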
Duncan Murdoch
which gets less declarative, and I'd argue less clear about exactly what it's trying to enforce.
And I can see myself (& presumably others) getting that comparison operator backwards a lot, having to look it up each time or copy-paste it from other code.
And then that still doesn't add nice error messages, that would be yet more code.
*And*, it doesn't actually behave correctly if the package is already loaded by other code, because it might have been loaded from a different location than the one that would be found in the packageVersion() call. (Or am I maybe wrong about what packageVersion() does in that case? I don't think the docs specify that behavior.)
For prior art on this whole concept, a useful precedent is the 'use()' function in Perl, which accepts a version argument, even though there is also robust version checking at installation/testing time.
Another problem with putting this into library() is that packages aren't always loaded by library(): there is require(), and there are implicit loads triggered by dependencies of other packages.
That's not really a problem. If someone wants to enforce a runtime dependency, they stick the enforcement line into their code, and it will correctly stop if the criterion is not met.
-Ken
-----Original Message-----
From: Roebuck,Paul L [mailto:proebuck at mdanderson.org]
Sent: Thursday, April 12, 2012 1:03 PM
To: R-devel
Cc: Ken Williams
Subject: Re: [Rd] [patch] giving library() a 'version' argument

On 4/12/12 10:11 AM, Ken Williams wrote:
On 4/12/12 7:22 AM, Duncan Murdoch wrote:
[SNIP] ... The main hats targeted here are really people (like me, of course) who are trying to "productionize" results, not so much people who are doing offline analysis. In a production system
But what if your script requires a particular (perhaps obsolete) version of a package? This change only puts a lower bound on the version number, and version requirements can be more elaborate than that.
Certainly true; this was meant as a first iteration, and support for the more elaborate requirements specifications could certainly be added. The more elaborate specs actually illustrate the need for a runtime mechanism nicely - if code X (which may be a package, or a script, it doesn't matter) requires exactly version 3.14 of package B, and someone in the production team upgrades version 3.14 to version 3.78 because "it's faster" or "it's less buggy" or "we just like to have the latest version of everything all the time", then someone needs to be alerted to the problem. One alternative solution would be to use a full-fledged package management system like RPM or Deb to track all the
dependencies, but yikes, that doesn't sound fun.
I appreciate your contribution of both time and energy.
But I think the existing library() method is sufficient without this modification.
It's essentially syntactic sugar for:
library(MASS); stopifnot(packageVersion("MASS") >= "7.3")
I was about to write back & say "that's not correct, if '7.10' is installed, a string comparison will do the wrong thing."
But apparently it does the *right* thing, because 'numeric_version' class implements the comparison operator.
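The difference between the two kinds of comparison can be seen with the "7.10" versus "7.3" case, with no package needed. A small sketch:

```r
# Plain string comparison goes character by character, so "7.10" sorts
# before "7.3" (the "1" is compared with the "3").
as_strings <- "7.10" >= "7.3"

# Version objects compare field by field as integers: 10 > 3 in the second field.
as_versions <- package_version("7.10") >= "7.3"

print(c(strings = as_strings, versions = as_versions))
```

The right-hand side can stay a plain string in the second comparison because the comparison methods for version objects convert it before comparing.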
I'd still prefer to "Huffman-code it" to something shorter, to encourage people to use it, but I can see why others could consider it good enough.
I could contribute a doc patch to the 'numeric_version' man page to make it clearer what's available. The 3 comparisons there happen to turn out the same way when done as a string comparison.
I also do still have a question about what packageVersion() does when a package is already loaded - does it go look for it again, or does it check the version of what's already loaded? A doc patch could help here too.
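One way to probe this question, as a sketch rather than a definitive answer: compare what packageVersion() reports with the version and path recorded for the namespace that is actually loaded. As I read the help page for packageDescription(), which packageVersion() relies on, the default lib.loc = NULL means loaded packages and namespaces are searched before the libraries, but that is worth confirming on the R version in use. The package "stats" is used only because it is always loaded.

```r
pkg <- "stats"  # any package that is already loaded

# Version and path recorded for the namespace that is actually loaded.
loaded_version <- unname(getNamespaceVersion(pkg))
loaded_path    <- getNamespaceInfo(pkg, "path")

# Version that packageVersion() reports with the default lib.loc = NULL.
reported_version <- as.character(packageVersion(pkg))

# Where find.package() locates the package with its default arguments.
found_path <- find.package(pkg)

print(c(loaded = loaded_version, reported = reported_version))
print(c(loaded = loaded_path, found = found_path))
```

If the two versions or paths ever differed, that would indicate the situation described above, where the check looks at a different copy than the one in use.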
-Ken