declare and validate options

5 messages · Duncan Murdoch, Antoine Fabri, Jiří Moravec

#
Dear r-devel,

options() are basically global variables, and they come with several issues:
* they're not truly owned by a package, aside from loose naming
conventions
* they're not validated
* their documentation is not standardized, and they're often not documented
at all, so it's hard to know what options exist
* in practice they're sometimes used for internal purposes, which is at
odds with their global nature and contributes to the mess. I think they can
almost always be replaced by objects under a `globals` environment in the
namespace; it's just a bit more work

I tried to do as much as possible with static analysis using my package opt,
but it can only go so far: https://github.com/moodymudskipper/opt

I think we can do a bit better and that it's not necessarily so complex.
Here's a draft of a possible design:

We could have something like this in a package to register options along
with an optional validator, triggered on `options(...)` (or a new function).

# similar to registerS3method():
registerOption("mypkg.my_option1")
registerOption("mypkg.my_option2", function(x) stopifnot(is.numeric(x)))
# maybe a `default` arg too, to avoid the .onLoad() gymnastics and invisible
# NULL options

* validation is a breaking change, so we'd have an environment variable to
opt in
* validation occurs when an option is set AND the namespace is already
loaded (so we can still set options without loading a namespace), OR it
occurs later, when an applicable namespace is loaded
* if we register an option that has already been registered by another
package, we get a message and the validator of the last loaded namespace is
used. In practice, due to naming conventions, this rarely happens; CRAN
could also enforce naming conventions for new packages
* new packages must use registerOption() if they define options, and there
must be a standard documentation page for those, separately or together
(with aliases), accessible with `?mypkg.my_option1` etc.
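
To make the draft concrete, here is a minimal sketch of how registration and validation on setting could behave. Everything in it is an assumption for illustration: `registerOption()` does not exist in base R, `setOptions()` is a hypothetical stand-in for a validating `options()`, and the environment-variable opt-in and deferred validation at namespace load are left out.

```r
# Hypothetical sketch: none of these functions exist in base R.
.option_registry <- new.env(parent = emptyenv())

registerOption <- function(name, validator = NULL, default = NULL) {
  if (exists(name, envir = .option_registry, inherits = FALSE)) {
    # Mirrors the proposal: emit a message, the last registration wins.
    message("option '", name, "' was already registered; replacing its validator")
  }
  assign(name, list(validator = validator), envir = .option_registry)
  # Apply the default only if the option has not been set already.
  if (!is.null(default) && is.null(getOption(name))) {
    do.call(options, stats::setNames(list(default), name))
  }
  invisible(name)
}

# Stand-in for a validating options(): registered options are checked,
# unregistered ones pass through untouched.
setOptions <- function(...) {
  new <- list(...)
  for (name in names(new)) {
    entry <- .option_registry[[name]]
    if (!is.null(entry) && !is.null(entry$validator)) {
      entry$validator(new[[name]])
    }
  }
  invisible(do.call(options, new))
}

registerOption("mypkg.my_option1")
registerOption("mypkg.my_option2", function(x) stopifnot(is.numeric(x)), default = 1)

setOptions(mypkg.my_option2 = 10)          # passes validation
try(setOptions(mypkg.my_option2 = "ten"))  # validator raises an error, option unchanged
```

Because validators run before `options()` is called, a failing value never reaches the global option list in this sketch.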

This could certainly be done in different ways and I'd love to hear about
other ideas or obstacles to improvements in this area.

Thanks,

Antoine
#
On 29/03/2024 10:52 a.m., Antoine Fabri wrote:
I think there are too many packages that would need changes under this 
scheme.

A more easily achievable improvement would be to provide functions to 
support registration, validation and documentation, and leave it up to 
the package author to call those.  This wouldn't give you validation at 
the time a user set an option, but could make it easier to validate when 
the package retrieved the value:  specify rules in one place, then 
retrieve from multiple places, without needing to duplicate the rules.

If those functions could be made simple enough and bulletproof and were 
widely adopted, maybe they'd be copied into one of the base packages, 
but really the only need for that would be to support validation on 
setting, rather than validation on retrieval.
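
A minimal sketch of validation on retrieval, assuming a package keeps its rules in a single list: the names `.option_specs` and `get_pkg_option()` are hypothetical, not an existing API, and the option name is borrowed from the earlier draft.

```r
# Hypothetical helpers; rules are declared once and reused by every caller.
.option_specs <- list(
  mypkg.my_option2 = list(default = 1, check = is.numeric)
)

get_pkg_option <- function(name) {
  spec <- .option_specs[[name]]
  if (is.null(spec)) stop("unknown option: ", name)
  # Fall back to the declared default when the user has not set the option.
  value <- getOption(name, default = spec$default)
  if (!isTRUE(spec$check(value))) {
    stop("invalid value for option '", name, "'")
  }
  value
}

options(mypkg.my_option2 = 5)
get_pkg_option("mypkg.my_option2")       # returns 5
options(mypkg.my_option2 = "five")       # nothing complains at set time
try(get_pkg_option("mypkg.my_option2"))  # the error surfaces on retrieval
```

The trade-off is the one described above: a bad value is only caught when the package reads it, not when the user sets it.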

Duncan Murdoch
#
There would be zero packages needing changes if registration of options
were not required for packages first uploaded to CRAN before the feature is
implemented.
If an option is not registered, no validation is triggered and nothing
breaks, even if we opt in to the behavior.

Sure, but realistically few maintainers will opt in to more restrictions.
If Posit did something along those lines maybe it would have a chance, but
otherwise I don't see an optional feature like this spreading very far.
Or we need this package to make working with options much easier for
maintainers themselves as developers, not just beneficial for users in the
long run.

On Fri, 29 Mar 2024 at 16:25, Duncan Murdoch <murdoch.duncan at gmail.com>
wrote:
#
On 29/03/2024 11:59 a.m., Antoine Fabri wrote:
Sorry, I missed that.  Then the objection is that this would require 
CRAN to apply two different sets of rules on submissions. When a 
resubmission arrived, they'd need to look in the archive to find out 
which set of rules applied to it.  They do a bit of that now 
(determining if a submission is a resubmission, for example), but this 
would be a bigger change.  I don't think date of first submission is 
ever currently used.
If this is something that you want CRAN to force on package authors, 
then you need to give some hard evidence that it will fix things that 
cause trouble.  But if you only apply the rule to new packages, not 
updates to old ones, it's hard to believe that it will really make much 
difference, though it will still be extra work for CRAN and R Core.
That should be a goal regardless of who does it.

Think about the development of the pipe operator:  it was in magrittr 
(and I think another package, but I forget the name) first, was widely 
adopted, then a simpler version was brought into base R.

Duncan Murdoch
#
re pipe: It was actually discussed on this mailing list long before 
magrittr, and various pipe operators existed in various packages for a 
long time.
To an outside observer, it really seems that it was magrittr that 
popularized the pipe, and this popularity managed to get it into base R.


re options: It seems that the popular way to handle them is with an 
internal environment: https://r-pkgs.org/data.html#sec-data-state

We have discussed this in the `import` package and came to the conclusion 
that this is much better than hooking into the global `options()`.
This solves several of the issues mentioned: the settings are local to the 
package, which prevents possible conflicts,
and you can easily write a setter that automatically performs 
validation.
I adopted this approach at my work and it leads to simpler and more 
robust code.
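
A minimal sketch of that pattern, assuming a single logical setting: the names `.state`, `set_verbose()` and `get_verbose()` are made up for illustration and are not taken from `import` or any other package.

```r
# Package-local state; in a real package this environment would live in
# the package namespace rather than the global environment.
.state <- new.env(parent = emptyenv())
.state$verbose <- FALSE

# Setter that validates before storing and returns the old value invisibly,
# in the same spirit as options().
set_verbose <- function(value) {
  stopifnot(is.logical(value), length(value) == 1L, !is.na(value))
  old <- .state$verbose
  .state$verbose <- value
  invisible(old)
}

get_verbose <- function() .state$verbose

set_verbose(TRUE)
get_verbose()            # returns TRUE
try(set_verbose("yes"))  # rejected by the setter, state is unchanged
```

Since the environment is only reachable through the package's own functions, other packages cannot clash with the setting by accident.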

I would welcome type safety in R, but as a general feature, such as in 
function definitions.
I don't see much value in shoehorning it into options only.

-- Jirka
On 30/03/24 06:05, Duncan Murdoch wrote: