Dear r-devel,

options() are basically global variables, and they come with several issues:

* they're not truly owned by a package, aside from loose naming conventions
* they're not validated
* their documentation is not standard, and they're often not documented at all, so it's hard to know what options exist
* in practice they're sometimes used for internal purposes, which is at odds with their global nature and contributes to the mess. I think they can almost always be replaced by objects under a `globals` environment in the namespace; it's just a bit more work

I tried to do as much as possible with static analysis using my package opt, but it can only go so far: https://github.com/moodymudskipper/opt

I think we can do a bit better and that it's not necessarily so complex. Here's a draft of a possible design.

We could have something like this in a package to register options along with an optional validator, triggered on `options(..)` (or a new function).

  # similar to registerS3method() :
  registerOption("mypkg.my_option1")
  registerOption("mypkg.my_option2", function(x) stopifnot(is.numeric(x)))
  # maybe a `default` arg too to avoid the .onLoad() gymnastics and invisible NULL options

* validation is a breaking change, so we'd have an environment variable to opt in
* validation occurs when an option is set AND the namespace is already loaded (so we can still set options without loading a namespace), OR it occurs later when an applicable namespace is loaded
* if we register an option that has already been registered by another package, we get a message, and the validator of the last loaded namespace is used. In practice, due to naming conventions, this doesn't really happen; CRAN could also enforce naming conventions for new packages
* new packages must use registerOption() if they define options, and there must be a standard documentation page for those, separately or together (with aliases), accessible with `?mypkg.my_option1` etc.

This could certainly be done in different ways, and I'd love to hear about other ideas or obstacles to improvements in this area.

Thanks,
Antoine
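The draft design above can be approximated at user level to see how the pieces would fit together. The following is only a rough sketch, not an implementation of the proposal: `register_option()`, `set_option()`, `validate_option()`, the `.option_registry` environment and the `R_VALIDATE_OPTIONS` environment variable are all hypothetical names invented here, and nothing in it hooks into the real `options()`.

```r
# Hypothetical user-level sketch; none of these helpers exist in base R.
.option_registry <- new.env(parent = emptyenv())

# Assumed opt-in switch; the variable name is made up for this sketch.
validation_enabled <- function() {
  identical(Sys.getenv("R_VALIDATE_OPTIONS"), "true")
}

# Run the registered validator, if any. Unregistered options pass through.
validate_option <- function(name, value) {
  entry <- .option_registry[[name]]
  if (validation_enabled() && !is.null(entry) && is.function(entry$validator)) {
    entry$validator(value)
  }
  invisible(value)
}

# Record an option name with an optional validator and default.
register_option <- function(name, validator = NULL, default = NULL) {
  if (exists(name, envir = .option_registry, inherits = FALSE)) {
    message("option '", name, "' was already registered; replacing its validator")
  }
  assign(name, list(validator = validator), envir = .option_registry)
  current <- getOption(name)
  if (is.null(current)) {
    # Nothing set yet: install the default, if one was given.
    if (!is.null(default)) do.call(options, stats::setNames(list(default), name))
  } else {
    # The option was set before registration: validate it now.
    validate_option(name, current)
  }
  invisible(name)
}

# Stand-in for a validating options(): check first, then set.
set_option <- function(name, value) {
  validate_option(name, value)
  do.call(options, stats::setNames(list(value), name))
  invisible(value)
}

Sys.setenv(R_VALIDATE_OPTIONS = "true")
register_option("mypkg.my_option1")
register_option("mypkg.my_option2", function(x) stopifnot(is.numeric(x)), default = 1)

set_option("mypkg.my_option2", 10)                           # accepted
bad <- try(set_option("mypkg.my_option2", "a"), silent = TRUE)  # rejected by the validator
set_option("otherpkg.unregistered", "anything")              # not registered, not validated
```

Because the validator runs before the value is stored, a rejected value leaves the previous setting untouched, and options that were never registered behave exactly as they do today.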
declare and validate options
5 messages · Duncan Murdoch, Antoine Fabri, Jiří Moravec
On 29/03/2024 10:52 a.m., Antoine Fabri wrote:
> [...]
I think there are too many packages that would need changes under this scheme.

A more easily achievable improvement would be to provide functions to support registration, validation and documentation, and leave it up to the package author to call those. This wouldn't give you validation at the time a user set an option, but could make it easier to validate when the package retrieved the value: specify rules in one place, then retrieve from multiple places, without needing to duplicate the rules.

If those functions could be made simple enough and bulletproof and were widely adopted, maybe they'd be copied into one of the base packages, but really the only need for that would be to support validation on setting, rather than validation on retrieval.

Duncan Murdoch
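One way to read "specify rules in one place, then retrieve from multiple places" is a small pair of helpers inside a package. The sketch below is only an illustration under that reading; `define_option()`, `get_option()`, the `.rules` environment and the `mypkg.digits` option are hypothetical names, not an existing API.

```r
# Hypothetical helpers; the names are invented for this sketch.
.rules <- new.env(parent = emptyenv())

# Declare an option once: its default and the rule a value must satisfy.
define_option <- function(name, default, check = function(x) TRUE) {
  assign(name, list(default = default, check = check), envir = .rules)
  invisible(name)
}

# Retrieve an option anywhere in the package; the rule is applied here,
# at retrieval time, because options() itself accepts any value.
get_option <- function(name) {
  rule <- .rules[[name]]
  if (is.null(rule)) stop("option '", name, "' was never defined", call. = FALSE)
  value <- getOption(name, default = rule$default)
  if (!isTRUE(rule$check(value))) {
    stop("invalid value for option '", name, "'", call. = FALSE)
  }
  value
}

define_option("mypkg.digits", default = 3,
              check = function(x) is.numeric(x) && length(x) == 1)

get_option("mypkg.digits")                              # falls back to the default
options(mypkg.digits = "many")                          # setting is not checked
res <- try(get_option("mypkg.digits"), silent = TRUE)   # the error surfaces on retrieval
options(mypkg.digits = NULL)                            # clean up
```

The trade-off is the one described above: a bad value is only reported when the package next reads the option, not at the moment the user sets it.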
> I think there are too many packages that would need changes under this scheme.

There would be zero if registration of options is not required for packages first uploaded to CRAN before the feature is implemented. If an option is not registered, no validation is triggered and nothing breaks, even if we opt in to the behavior.

> If those functions could be made simple enough and bulletproof and were widely adopted, maybe they'd be copied into one of the base packages,

Sure, but realistically few maintainers will opt in to more restrictions. If Posit did something along those lines, maybe it would have a chance, but otherwise I don't see an optional feature like this spreading very far. Or we need this package to make working with options much easier for maintainers themselves as developers, not just beneficial for users in the long run.

On Fri, 29 Mar 2024 at 16:25, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
> [...]
On 29/03/2024 11:59 a.m., Antoine Fabri wrote:
>> I think there are too many packages that would need changes under this scheme.
>
> There would be zero if the registration of options is not required for packages first uploaded on CRAN before the feature is implemented. If an option is not registered no validation is triggered and nothing breaks even if we opt in the behavior.
Sorry, I missed that. Then the objection is that this would require CRAN to apply two different sets of rules on submissions. When a resubmission arrived, they'd need to look in the archive to find out which set of rules applied to it. They do a bit of that now (determining if a submission is a resubmission, for example), but this would be a bigger change. I don't think date of first submission is ever currently used.
>> If those functions could be made simple enough and bulletproof and were widely adopted, maybe they'd be copied into one of the base packages,
>
> Sure but realistically few maintainers will opt-in for more restrictions.
If this is something that you want CRAN to force on package authors, then you need to give some hard evidence that it will fix things that cause trouble. But if you only apply the rule to new packages, not updates to old ones, it's hard to believe that it will really make much difference, though it will still be extra work for CRAN and R Core.
> if posit did something on those lines maybe it would have a chance but otherwise I don't see an optional feature like this spread very far. Or we need this package to make working with options really really much easier for themselves as developers, not just beneficial for users in the long run.

That should be a goal regardless of who does it. Think about the development of the pipe operator: it was in magrittr (and I think another package, but I forget the name) first, was widely adopted, then a simpler version was brought into base R.

Duncan Murdoch
re pipe: It was actually discussed on this mailing list long before magrittr, and various pipe operators existed in various packages for a long time. To an outside observer, it really seems that it was magrittr that popularized the pipe, and that this popularity managed to get it into base R.

re options: It seems that the popular way to handle them is with an internal environment: https://r-pkgs.org/data.html#sec-data-state

We discussed this in the `import` package and came to the conclusion that this is much better than hooking into the global `options()`. It solves several of the issues mentioned, such as keeping the values local to the package and thus preventing possible conflicts, and you can easily write a setter that automatically performs validation. I adopted this approach at my work and it leads to simpler and more robust code.

I would welcome type safety in R, but as a general trend, such as in function definitions. I don't see much value in shoehorning it only into options.

-- Jirka
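For readers unfamiliar with the internal-environment pattern mentioned above, here is a minimal sketch of what a package-local setting with a validating setter might look like. It is an illustration only, not code from the `import` package; the environment name `the`, the `verbose` setting and the `set_verbose()` / `get_verbose()` helpers are assumptions made up for this example.

```r
# State lives in an environment owned by the package, not in options().
# All names here are invented for this sketch.
the <- new.env(parent = emptyenv())
the$verbose <- FALSE

# Setter: validate first, then store. Returns the previous value invisibly
# so callers can restore it later.
set_verbose <- function(value) {
  stopifnot(is.logical(value), length(value) == 1, !is.na(value))
  old <- the$verbose
  the$verbose <- value
  invisible(old)
}

# Getter: the value was validated when it was stored, so no check is needed.
get_verbose <- function() the$verbose

old <- set_verbose(TRUE)                        # accepted
res <- try(set_verbose("yes"), silent = TRUE)   # rejected, state is unchanged
```

Since the environment is private to the package, another package cannot overwrite the value by accident, and every write goes through the one function that knows the rules.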
On 30/03/24 06:05, Duncan Murdoch wrote:
> [...]
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel