[R-pkg-devel] visible binding for '<<-' assignment

14 messages · Ben Bolker, Joshua Ulrich, William Dunlap +3 more

#
Hi, all. I am developing a package that includes some global variables.
Because these are non-ASCII, I have escaped them. But then because these
are difficult to read, I want to provide an easy way for users to unescape
all of them up front. Thus I have code like this to create and save the data in
global variables in one file:

pali_vowels <-
  c("a", "\u0101", "i", "\u012b", "u", "\u016b", "e", "o")
pali_consonants <-
  c("k", "kh", "g", "gh", "\u1e45",
    "c", "ch", "j", "jh", "\u00f1",
    "\u1e6d", "\u1e6dh", "\u1e0d", "\u1e0dh", "\u1e47",
    "t", "th", "d", "dh", "n",
    "p", "ph", "b", "bh", "m",
    "y", "r", "l", "v", "s", "h", "\u1e37", "\u1e43")
pali_alphabet <- c(pali_vowels, pali_consonants)
use_data(pali_alphabet, overwrite = TRUE)

and then I try to export a function like this in another file:

pali_string_fix <- function() {
  pali_alphabet <<-
     stringi::stri_unescape_unicode(pali_alphabet)
  # Several more of these...
  }

The idea is that users can run pali_string_fix() once when they load the
package and then they won't need to deal with all the Unicode escape
sequences after that.

However, this is getting rejected by the CRAN checks with the message:

* checking R code for possible problems ... [4s] NOTE
pali_string_fix: no visible binding for '<<-' assignment to
  'pali_alphabet'

I'm guessing this is because the data and the function are defined in
different files, so even though those globals are defined by my package,
that isn't obvious when the check is run on this code.

Does anyone have advice for how to fix this?

     Dan

.
-------------------------
Dan Zigmond
djz at shmonk.com
#
Store the cached data in an environment within the package:

pali_data <- new.env(parent = emptyenv())

pali_string_fix <- function() {
  pali_data$alphabet <-
     stringi::stri_unescape_unicode(pali_alphabet)
...
}
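
A fuller sketch of this pattern, under stated assumptions: pali_get_alphabet() is a hypothetical accessor name, the three-element pali_alphabet is an invented stand-in for the packaged data, and unescape_u() is a base-R stand-in for stringi::stri_unescape_unicode() that handles only \uXXXX sequences. The function writes into an environment the package owns rather than into the user's workspace.

```r
# Environment owned by the package; it is not the user's workspace.
pali_data <- new.env(parent = emptyenv())

# Invented stand-in for the packaged data. The doubled backslash keeps the
# escape sequence as literal text instead of letting the parser convert it.
pali_alphabet <- c("a", "\\u0101", "\\u1e6dh")

# Base-R stand-in for stringi::stri_unescape_unicode(); handles \uXXXX only.
unescape_u <- function(x) {
  m <- gregexpr("\\\\u[0-9a-fA-F]{4}", x)
  regmatches(x, m) <- lapply(regmatches(x, m), function(codes) {
    vapply(codes, function(code) intToUtf8(strtoi(substring(code, 3L), 16L)),
           character(1), USE.NAMES = FALSE)
  })
  x
}

# Fills the cache; nothing is assigned outside pali_data.
pali_string_fix <- function() {
  pali_data$alphabet <- unescape_u(pali_alphabet)
  invisible(NULL)
}

# Hypothetical exported accessor: fills the cache on first use, then reads it.
pali_get_alphabet <- function() {
  if (is.null(pali_data$alphabet)) pali_string_fix()
  pali_data$alphabet
}
```

With an exported accessor along these lines, the cached values stay reachable by users even though the environment itself is internal to the package.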

Gabor
On Thu, Sep 3, 2020 at 9:33 PM Dan Zigmond <djz at shmonk.com> wrote:
#
Thanks, Gabor. I want these to be easily available to package users, though;
that's why they are in the package. So I would rather not "hide" them in
a local environment. This is fundamentally a data package, so access to
this data is the primary point of installing it.

Is there any other solution?

     Dan

.
--------------------------
Dan Zigmond
djz at shmonk.com
On Thu, Sep 3, 2020 at 1:40 PM Gábor Csárdi <csardi.gabor at gmail.com> wrote:
#
Is there a reason that this slightly more explicit version wouldn't work?

pali_string_fix <- function() {
     assign("pali_alphabet", stringi::stri_unescape_unicode(pali_alphabet),
            .GlobalEnv)
}
On 9/3/20 5:25 PM, Dan Zigmond wrote:
#
On Thu, Sep 3, 2020 at 4:36 PM Ben Bolker <bbolker at gmail.com> wrote:
Using assign will also cause R CMD check to throw a NOTE that you will
need to explain upon pkg submission to CRAN.
#
https://cran.r-project.org/web/packages/policies.html
- Packages should not modify the global environment (user's workspace).

Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Thu, Sep 3, 2020 at 2:36 PM Ben Bolker <bbolker at gmail.com> wrote:
#
Given that both trigger a NOTE, is there a reason to favor the assign
solution over just using <<-?

     Dan

.
--------------------------
Dan Zigmond
djz at shmonk.com



On Thu, Sep 3, 2020 at 2:46 PM Joshua Ulrich <josh.m.ulrich at gmail.com> wrote:
#
On 03/09/2020 4:31 p.m., Dan Zigmond wrote:
You shouldn't be doing that.  Write a function that returns those 
results, and tell the user that if they store them in a global variable 
named "string_fixes" (or whatever), then your function will use their 
values instead of your own built in ones.  You should never write to the 
global environment, but you can read from it.
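
One way this suggestion could look, as a sketch only: pali_string_fixes() and pali_current_alphabet() are hypothetical names, the two-element pali_alphabet is an invented stand-in, and unescape_u() is a base-R stand-in for stringi::stri_unescape_unicode() that handles only \uXXXX sequences. The package code reads from the global environment but never assigns into it.

```r
# Invented stand-in for the packaged, escaped data.
pali_alphabet <- c("a", "\\u0101")

# Base-R stand-in for stringi::stri_unescape_unicode(); handles \uXXXX only.
unescape_u <- function(x) {
  m <- gregexpr("\\\\u[0-9a-fA-F]{4}", x)
  regmatches(x, m) <- lapply(regmatches(x, m), function(codes) {
    vapply(codes, function(code) intToUtf8(strtoi(substring(code, 3L), 16L)),
           character(1), USE.NAMES = FALSE)
  })
  x
}

# Returns the unescaped data as a named list; assigns nothing outside itself.
pali_string_fixes <- function() {
  list(pali_alphabet = unescape_u(pali_alphabet))
}

# Prefers a user-created `string_fixes` found in `env` (the global environment
# by default); otherwise falls back to the built-in escaped data. Reads only.
pali_current_alphabet <- function(env = globalenv()) {
  if (exists("string_fixes", envir = env, inherits = FALSE)) {
    get("string_fixes", envir = env, inherits = FALSE)$pali_alphabet
  } else {
    pali_alphabet
  }
}

# In the user's own session, and by the user's own choice:
# string_fixes <- pali_string_fixes()
```

The `env` argument is there mainly so the lookup can be exercised against a scratch environment instead of the real workspace.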

Duncan Murdoch
#
I get that, but these variables are created by the package. It's a data
package so the whole point is to provide access to the data. I'm just
trying to provide an option to make the data more readable since I can't
include Unicode strings directly in the package. In other words, these
variables (e.g., pali_alphabet) will already exist when the user attaches the
package, but is there a way I can tweak them after the package has been
loaded?

     Dan

.
--------------------------
Dan Zigmond
djz at shmonk.com
On Thu, Sep 3, 2020 at 2:56 PM William Dunlap <wdunlap at tibco.com> wrote:
#
On Thu, Sep 3, 2020 at 10:25 PM Dan Zigmond <djz at shmonk.com> wrote:
Well, if you want to put a cache in a package, then the way I showed
you works well.

Possibly more importantly, maybe I misunderstood something, but
stringi::stri_unescape_unicode() is not doing anything to your
character vectors, because they do not contain escaped characters:

fixed <- stringi::stri_unescape_unicode(pali_alphabet)
identical(pali_alphabet, fixed)
#> TRUE
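
A short illustration of why that can happen, using base R only. R's parser converts a "\u0101" escape while reading the source, so the stored value is already the real character; only a doubled backslash leaves an escape sequence in the string. The helper name has_escape() is made up for this sketch.

```r
parsed  <- "\u0101"   # converted by R's parser into one character
literal <- "\\u0101"  # backslash, "u", and four hex digits kept as plain text

# TRUE when a string still contains a literal \uXXXX sequence,
# i.e. when unescaping would actually change it.
has_escape <- function(x) grepl("\\\\u[0-9a-fA-F]{4}", x)

nchar(parsed)                   # number of characters in the parsed form
nchar(literal)                  # number of characters in the literal form
has_escape(c(parsed, literal))  # which of the two still needs unescaping
```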

Gabor
#
OK, trying again.

   Would it work to save the unescaped versions in a .RData file as in 
https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Data-in-packages 
?  Presumably the problems with non-ASCII variables arise when they show 
up in a text-format (e.g. .R) file, not when they are read from a 
binary-format file?

   Then, if you use LazyData: yes in the DESCRIPTION file (this may be 
the default?), these should automatically be accessible to users when 
the package is loaded?
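
A quick way to try the binary-format idea outside a package, as a sketch: it saves the vowels from the original post to a temporary .rda file and loads them back into a scratch environment. It only shows that the strings survive the round trip; it says nothing about what R CMD check reports for such data.

```r
# Vowels as defined in the original post; the parser turns the \u escapes
# into real characters, so the stored values are non-ASCII.
pali_vowels <- c("a", "\u0101", "i", "\u012b", "u", "\u016b", "e", "o")

# Save to a binary .rda file in a temporary location.
rda <- tempfile(fileext = ".rda")
save(pali_vowels, file = rda)

# Load into a scratch environment so nothing in the workspace is overwritten.
restored <- new.env()
load(rda, envir = restored)

# TRUE if the non-ASCII strings came back unchanged.
round_trip_ok <- identical(restored$pali_vowels, pali_vowels)
```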
On 9/3/20 4:31 PM, Dan Zigmond wrote:
#
I chose a bad example. :-) Trust me that I have a bunch of strings with
escaped Unicode.

It seems the consensus is that I should not try to do what I'm trying to
do. I think instead I'll just document how users can fix the escaping if
they want to, since it's not very hard anyway.

     Dan

.
--------------------------
Dan Zigmond
djz at shmonk.com
On Thu, Sep 3, 2020 at 2:59 PM Gábor Csárdi <csardi.gabor at gmail.com> wrote:
#
That was where I started, but for some reason that triggered a WARNING
about these non-ASCII characters, which seemed worse. :-)

     Dan

.
--------------------------
Dan Zigmond
djz at shmonk.com
On Thu, Sep 3, 2020 at 3:26 PM Ben Bolker <bbolker at gmail.com> wrote:
#
You can include Unicode strings in a package, either as data, or just
as character vectors. Or character vectors returned by functions. Many
packages do that. E.g. cli:::symbol_utf8$tick is Unicode (not
exported, but it could be).

The only restriction is that the source file must be ASCII, i.e. you
need to create these vectors with `\u` escapes. For example the tick
in cli is created like this:
https://github.com/r-lib/cli/blob/e3ca34656f5bb82df63bfc1c741e75acc79b13d9/R/symbol.R#L27

When you print it (e.g. with something like
`cat(cli:::symbol_utf8$tick)`) the proper Unicode character is
printed, as long as the platform supports it. E.g. on macOS:

> cli:::symbol_utf8$tick
[1] "✔"
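
The same point can be checked without cli, as a small sketch: write a one-line source file that uses a `\u` escape, confirm that every byte in the file is ASCII, then evaluate the file and look at the resulting code point. The file and variable names are invented for the illustration; U+2714 is assumed here as an example check mark.

```r
# A one-line source file; the doubled backslash puts a single backslash
# into the file, so the file contains the escape sequence as ASCII text.
src <- tempfile(fileext = ".R")
writeLines('tick <- "\\u2714"', src)

# Every byte of the file is below 128, so the source is plain ASCII.
bytes <- readBin(src, what = "raw", n = file.size(src))
source_is_ascii <- all(as.integer(bytes) < 128L)

# Evaluating the file yields a single non-ASCII character.
env <- new.env()
sys.source(src, envir = env)
code_point <- utf8ToInt(env$tick)
```

Whether the character then prints properly depends on the platform, as noted above.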

Gabor
On Thu, Sep 3, 2020 at 11:26 PM Dan Zigmond <djz at shmonk.com> wrote: