[patch] giving library() a 'version' argument

10 messages · Brian Ripley, Duncan Murdoch, Ken Williams +1 more

#
I've made a small enhancement to R that would help developers better control what versions of code we're using where.  Basically, to load a package in R, one currently does:

                library(whateverPackage)

and with the enhancement, you can ensure that you're getting at least version X of the package:

                library(whateverPackage, version=3.14)

Reasons one might want this include:

  * you know that in version X some bug was fixed
  * you know that in version X some feature was added
  * that's the first version you've actually tested it with & you don't want to vouch for earlier versions without testing
  * you develop on one machine & deploy on another machine you don't control, and you want runtime checks that the sysadmin installed what they were supposed to install
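
A minimal sketch of the behaviour described above, written as a standalone helper rather than as a change to library() itself. This is not the attached patch; the name `library_at_least` and its arguments are hypothetical, and the sketch only assumes the documented `packageVersion()` and `package_version()` functions.

```r
# Hypothetical helper (not the attached patch): attach a package only if the
# installed version meets a minimum, and fail with a readable message if not.
library_at_least <- function(pkg, min_version) {
  installed <- utils::packageVersion(pkg)
  if (installed < package_version(min_version)) {
    stop(sprintf("package '%s' is at version %s, but at least %s is required",
                 pkg, as.character(installed), min_version),
         call. = FALSE)
  }
  library(pkg, character.only = TRUE)
  invisible(installed)
}

# 'stats' ships with R, so this call should succeed on any installation.
library_at_least("stats", "1.0")
```

Passing the minimum as a string ("3.14" rather than 3.14) avoids any ambiguity from numeric formatting of version numbers.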

In general, I have an interest in helping R get better at various things that would help it play in a "production environment", for various values of that term. =)

The attached patch is made against revision 58980 of https://svn.r-project.org/R/trunk .  I think this is the first patch I've submitted to the R core, so please let me know if anything's amiss, or of course if there are reservations about the approach.

Thanks.

--
Ken Williams, Senior Research Scientist
WindLogics
http://windlogics.com



CONFIDENTIALITY NOTICE: This e-mail message is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution of any kind is strictly prohibited. If you are not the intended recipient, please contact the sender via reply e-mail and destroy all copies of the original message. Thank you.
#
On 12-04-11 11:28 AM, Ken Williams wrote:
I don't really see the need for this.  Packages already have a scheme 
for requiring a particular version of a package, so this would only be 
useful in scripts run outside of packages.  But what if your script 
requires a particular (perhaps obsolete) version of a package?  This 
change only puts a lower bound on the version number, and version 
requirements can be more elaborate than that.

I think my advice would be:

1.  Put your code in a package, and use the version specifications there.

2.  If you must write it in a script, then put a version test at the 
top, using packageVersion().

Duncan Murdoch
#
A very important point is that library() *had* a 'version' argument for 
several years, and this is not what it did.  So Mr Williams needs to do 
his homework ....

 From such a version of R:


  version: A character string denoting a version number of the package
           to be loaded, for use with _versioned installs_: see the
           section later in this document.
...
On 12/04/2012 13:21, Duncan Murdoch wrote:

  
    
#
That is unfortunate.  So such a mechanism would need to use a different argument name.

For completeness in this thread, I dug up the fact that it seems to have been removed in the 2.9.0 release:

    o   Support for versioned installs (R CMD INSTALL --with-package-versions
        and install.packages(installWithVers = TRUE)) has been removed.
        Packages installed with versioned names will be ignored.

I'll address Duncan's concerns in a separate message.

 -Ken


#
The main distinction here is that the existing package mechanism enforces version requirements at *install* time, but this mechanism enforces it at *run* time.  So this indeed applies well to scripts run outside packages, but it's also useful inside packages when they're loading their dependencies at runtime.  I was trying to illustrate that with the 4 bullet points above (especially the last one) but I should have said so explicitly.

It can happen very easily that constraints that were satisfied at install time get out of whack by subsequent package installations, but the violations go undetected.  The result can be breakage, whether dramatic or subtle.

The main hats targeted here are really people (like me, of course) who are trying to "productionize" results, not so much people who are doing offline analysis.  In a production system
Certainly true; this was meant as a first iteration, and support for the more elaborate requirements specifications could certainly be added.

The more elaborate specs actually illustrate the need for a runtime mechanism nicely - if code X (which may be a package, or a script, it doesn't matter) requires exactly version 3.14 of package B, and someone in the production team upgrades version 3.14 to version 3.78 because "it's faster" or "it's less buggy" or "we just like to have the latest version of everything all the time", then someone needs to be alerted to the problem.  One alternative solution would be to use a full-fledged package management system like RPM or Deb to track all the dependencies, but yikes, that doesn't sound fun.
Certainly those are alternatives, but to us they are somewhat unsatisfactory.  The first option doesn't help with the crux of the problem, which is runtime enforcement.  The second is essentially the same solution I've proposed, but doesn't help anyone outside our organization who has the same problem.

 -Ken

#
On 12/04/2012 11:11 AM, Ken Williams wrote:
I haven't tested it, but according to the documentation in Writing R 
Extensions, the dependencies are enforced at the time library() is called.
If the docs are wrong (or I misread them), you could equally put a 
run-time version test into the .onLoad function in a package.
But a single line at the top of the script would fix this:

stopifnot(packageVersion("foo") == "3.14")

Making the library() function more elaborate doesn't seem to add anything.
Another problem with putting this into library() is that packages aren't 
always loaded by library():  there is require(), and there are implicit 
loads triggered by dependencies of other packages.
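
The .onLoad suggestion above could look roughly like the following sketch. It assumes the code lives in a package source file (for example R/zzz.R); the dependency table `required_versions` and the helper `check_required_versions` are made-up names for illustration. Because .onLoad runs whenever the namespace is loaded, the check applies whether the package arrives via library(), require(), or as a dependency of another package.

```r
# Assumption: illustrative minimum versions; replace with real requirements.
required_versions <- c(stats = "1.0", utils = "1.0")

# Stop with an informative message if any dependency is older than required.
check_required_versions <- function(required) {
  for (dep in names(required)) {
    have <- utils::packageVersion(dep)
    if (have < package_version(required[[dep]])) {
      stop(sprintf("'%s' is at version %s, but at least %s is required",
                   dep, as.character(have), required[[dep]]),
           call. = FALSE)
    }
  }
  invisible(TRUE)
}

# Hook called by R when the package namespace is loaded.
.onLoad <- function(libname, pkgname) {
  check_required_versions(required_versions)
}
```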

Duncan Murdoch
#
Oh, I hadn't suspected that.  I can look into testing that; if it's true, then of course that changes all of this.  I probably won't be able to do that for a few days, though, because I'll be traveling.

I've never noticed a package failing to load at runtime because its prereq-version dependency wasn't met though.
For the most common use case, that would look more like:

    stopifnot(compareVersion(as.character(packageVersion("foo")), "3.14") >= 0)

which gets less declarative, and I'd argue less clear about exactly what it's trying to enforce.

And I can see myself (& presumably others) getting that comparison operator backwards a lot, having to look it up each time or copy-paste it from other code.
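
For reference, a short sketch of how compareVersion() reports its result. It takes two character strings and returns -1, 0, or 1, so an "at least" requirement passes when the result is >= 0. The wrapper `at_least` is a hypothetical name used only for illustration.

```r
# compareVersion() compares version strings component by component.
compareVersion("3.2", "3.14")   # -1: 3.2 is older than 3.14
compareVersion("3.14", "3.14")  #  0: equal
compareVersion("3.20", "3.14")  #  1: 3.20 is newer than 3.14

# Hypothetical wrapper: TRUE if the installed package is at least min_version.
at_least <- function(pkg, min_version) {
  compareVersion(as.character(utils::packageVersion(pkg)), min_version) >= 0
}

stopifnot(at_least("stats", "1.0"))
```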

And then that still doesn't add nice error messages; that would be yet more code.

*And*, it doesn't actually behave correctly if the package is already loaded by other code, because it might have been loaded from a different location than the one that would be found in the packageVersion() call.  (Or am I maybe wrong about what packageVersion() does in that case?  I don't think the docs specify that behavior.)


For prior art on this whole concept, a useful precedent is the 'use()' function in Perl, which accepts a version argument, even though there is also robust version checking at installation/testing time.
That's not really a problem.  If someone wants to enforce a runtime dependency, they stick the enforcement line into their code, and it will correctly stop if the criterion is not met.

 -Ken

#
On 4/12/12 10:11 AM, Ken Williams wrote:

            
I appreciate your contribution of both time and energy.

But I think the existing library() method is sufficient without
this modification. It's essentially syntactic sugar for:

library(MASS); stopifnot(packageVersion("MASS") >= "7.3")

If your package requirements are that exacting, it would be far
simpler to just download all the specific versions to a single
directory and put that directory first in .libPaths().
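
A sketch of the .libPaths() approach just described, using a temporary directory as a stand-in for the library of pinned versions. The directory name is an assumption; in practice it would be wherever the specific versions were installed.

```r
# Assumption: 'pinned' stands in for a library holding the exact versions needed.
pinned <- file.path(tempdir(), "pinned-lib")
dir.create(pinned, showWarnings = FALSE)

# Put the pinned library first so library() searches it before the others.
.libPaths(c(pinned, .libPaths()))

# The first entry is now the pinned library (paths are normalized by R).
.libPaths()[1]
```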

Prayer never hurt either...

Our style here is to add sessionInfo() to the end of all scripts
and Sweave documents. As such we could reproduce exactly if
required. But I believe it would be impossible to track the
dependencies meaningfully across time.
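
The sessionInfo() habit mentioned above can be sketched as follows. Writing the output to a file is an addition for illustration, and the file name is an assumption.

```r
# Record the R version and the versions of attached packages at the end of a run.
info <- sessionInfo()
print(info)

# Versions of attached non-base packages (an empty vector if there are none).
vapply(info$otherPkgs, function(p) p$Version, character(1))

# Optionally keep a copy next to the results; the file name is illustrative.
out_file <- file.path(tempdir(), "session-info.txt")
writeLines(capture.output(print(info)), out_file)
```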
#
On 12/04/2012 1:46 PM, Ken Williams wrote:
The compareVersion call doesn't need to be explicit, i.e. you'll get the 
same result from

stopifnot(packageVersion("foo") >= "3.14")


which looks pretty clear to me.  It works in some quick tests, 
recognizing that rgl version 0.92.879 is bigger than 0.92.100 but less 
than 0.92.1000.

Duncan Murdoch
#
I was about to write back & say "that's not correct, if '7.10' is installed, a string comparison will do the wrong thing."

But apparently it does the *right* thing, because 'numeric_version' class implements the comparison operator.
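
A small sketch of the difference, using numeric_version() directly so that it does not depend on any particular package being installed:

```r
# Plain character comparison goes character by character, so "7.10" sorts
# before "7.9".
"7.10" > "7.9"                     # FALSE

# With a numeric_version on one side, the string is converted and the
# comparison is done component by component.
numeric_version("7.10") > "7.9"    # TRUE

# The rgl example from earlier in the thread.
v <- numeric_version("0.92.879")
v > "0.92.100"                     # TRUE
v < "0.92.1000"                    # TRUE
"0.92.879" < "0.92.1000"           # FALSE when compared as plain strings
```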

I'd still prefer to "Huffman-code it" to something shorter, to encourage people to use it, but I can see why others could consider it good enough.

I could contribute a doc patch to the 'numeric_version' man page to make it clearer what's available.  The 3 comparisons there happen to turn out the same way when done as a string comparison.

I also do still have a question about what packageVersion() does when a package is already loaded - does it go look for it again, or does it check the version of what's already loaded?  A doc patch could help here too.
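
One way to check this empirically is to compare what packageVersion() reports with the version and location recorded in the loaded namespace. The helper `loaded_version` is a hypothetical name; the sketch relies only on `loadedNamespaces()`, `getNamespaceVersion()`, and `getNamespaceInfo()`.

```r
# Hypothetical helper: version of a package as recorded in its loaded
# namespace, or NULL if the namespace is not loaded.
loaded_version <- function(pkg) {
  if (!(pkg %in% loadedNamespaces())) return(NULL)
  package_version(unname(getNamespaceVersion(pkg)))
}

loadNamespace("stats")             # make sure the namespace is loaded
loaded_version("stats")            # version of what is actually loaded
packageVersion("stats")            # version reported by packageVersion()
getNamespaceInfo("stats", "path")  # directory the namespace was loaded from
```

If the two versions ever differ for a package installed in more than one library, that would show that packageVersion() is not looking at the loaded copy.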

 -Ken
