pipes and setNames

6 messages · Avi Gross, Gabor Grothendieck, Gabriel Becker +2 more

#
When trying to transform names in a pipeline one can do the following
where for this example we are making names upper case.

  BOD |> (\(x) setNames(x, toupper(names(x))))()

but that seems a bit ugly and verbose.

1. One possibility is to enhance setNames to allow a function as a
second argument.  In that case one could write:

  BOD |> setNames(toupper)

2. One can already do the following with the existing `with`, but it is
quite verbose:
  BOD |> list() |> setNames(".") |> with(setNames(., toupper(names(.))))
It could be made simpler with a utility function.

This utility function is not as good for setNames, but it would still
result in shorter code than the anonymous function in the example at
the top of this email, and it is more general, so it would apply in
other situations too.  Here R would define a function with. (note the dot
at the end), which would be defined and used as follows.

  with. <- function(data, expr, ...) {
    eval(substitute(expr), list(. = data), enclos = parent.frame())
  }

  BOD |> with.(setNames(., toupper(names(.))))

with. is not as efficient as straight pipes, but in many cases such as
this it does not really matter and one just wants to get it done
without the parenthesis-laden anonymous function.

Having both of these would be nice to make it easier to use R pipes.
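
As a rough illustration of proposal 1, here is a minimal sketch of a setter that accepts either a character vector or a function. The name set_names_fn is hypothetical, and this is not how base setNames behaves today; it only suggests how small such a change might be.

```r
# Hypothetical helper illustrating proposal 1; not part of base R.
set_names_fn <- function(object, nm) {
  # If nm is a function, apply it to the current names to get the new ones.
  if (is.function(nm)) nm <- nm(names(object))
  names(object) <- nm
  object
}

# BOD is a data frame from the datasets package with columns Time and demand.
BOD |> set_names_fn(toupper) |> names()
#> [1] "TIME"   "DEMAND"
```

A character vector as the second argument is still handled exactly as before, so existing calls would be unaffected under this sketch.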
#
Gabor,

It is always interesting to see suggestions for how to extend R, especially what is suggested to 
be base R. But generally extensions need to be needed not just wanted and the ramifications 
must be studied.

I am curious if you looked at existing pipe-like implementations in various packages to see how or 
if they support such functionality.

There are many things we can wish for, but some care is needed if nonstandard evaluation is allowed.

For example, should we allow multiple instances of the underscore (or dot) character to each be 
replaced by the same input? That can be tricky. 

But a trivial solution is to not use the anonymous function but write your own small accessory function. 
I wrote this trivial one:

renamer <- function(x, fun) { names(x) <- fun(names(x)); x }


You can then do this:


x |> renamer(toupper)


Or use tolower or sort or lots of other functions that need no additional arguments.

It is easy to generalize this to functions that allow additional arguments:


renamer2 <- function(x, fun, ...) { names(x) <- fun(names(x), ...); x }


x |> renamer2(sort, decreasing=TRUE)


Clearly this is not a general solution. A suggestion that I might like is to allow a
compound condition in the pipeline that lets you capture the argument into a named
variable and then use it as you wish. But in a very real sense the anonymous
function syntax already gives you something like that, even if it looks a tad ugly to you.
Still, constructs that look wrong to you, or that may make no sense to others, may well
be best avoided.

So your suggestion for an extension to existing functions to allow not repeating the 
name twice makes sense but R supports many attributes you may want to be 
able to manipulate in a pipeline including classes, dimensions, column names, 
row names and much more you can add on your own arbitrarily.

What method might be more generalizable to solve many such problems 
if used along with the new or previous pipes?
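
One possible direction, sketched here only as an assumption and not as anything proposed in this thread, is a single helper that applies a function to any attribute. The name modify_attr is hypothetical.

```r
# Hypothetical helper: apply `fun` to one attribute of `x` and return `x`.
modify_attr <- function(x, which, fun, ...) {
  attr(x, which) <- fun(attr(x, which, exact = TRUE), ...)
  x
}

# Transform names in a pipeline.
x <- c(b = 1, a = 2)
x |> modify_attr("names", toupper) |> names()
#> [1] "B" "A"

# The same helper works for other attributes, e.g. reversing dimensions.
m <- matrix(1:6, nrow = 2)
m |> modify_attr("dim", rev) |> dim()
#> [1] 3 2
```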






-----Original Message-----
From: Gabor Grothendieck <ggrothendieck at gmail.com>
To: r-devel at r-project.org <r-devel at r-project.org>
Sent: Sun, Apr 17, 2022 8:21 am
Subject: [Rd] pipes and setNames


#
This is a suggestion for base.  Workarounds using packages are not relevant.

Setting names is something that is done a lot.

On Sun, Apr 17, 2022 at 4:49 PM Avi Gross via R-devel
<r-devel at r-project.org> wrote:

  
    
#
Hi Gabor,

Just my 2c on a few things:

I have to say it feels weird/wrong to me to have setNames do anything
other than, well, set the names. It's a low-level setter, in OOP parlance,
in my mind. That is not to say that there shouldn't or can't be another
function called, I don't know, transform_names, which sets the names of an
object to a function of the existing ones. One could even get "weirder"
with it:

names_apply <- function(X, FUN, ...) {
  names(X) <- vapply(names(X), FUN, "", ...)
  X
}

As a practical matter, I have to say one of the core benefits of pipes is
legibility of the code, and so I wonder, honestly, if

ucase_names <- function(x)  setNames(x, toupper(names(x)))

BOD |> ... |> ucase_names()

isn't overall more desirable code anyway? I have to say I think I would
always write the above rather than having an anonymous function in the
middle of a pipeline, myself.

The issue with with., I think, is that as I understand it the native pipe *by
intentional design* does not involve non-standard evaluation. It is a
parser transformation. While restrictive compared to what magrittr users
are used to, there are benefits to this that Luke has thought very hard
about (as he does before doing anything). R providing with. as you're
describing would essentially walk back that design and muddy the waters by
advancing a weird hybrid situation where a parser transformation is done but
then after that NSE is done anyway in common cases. I won't speak for Luke
in terms of what he might think of such an idea, but on the face of it that
seems like it would be pretty odd, to me.
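
For readers who want to see the parser transformation for themselves, a small check along these lines should work in R 4.1.0 or later (the version requirement is an assumption about the reader's setup). It only quotes the expressions, so nothing is evaluated.

```r
# The native pipe is rewritten by the parser into an ordinary call,
# so the quoted pipeline is already a plain call to setNames().
piped  <- quote(BOD |> setNames(toupper))
direct <- quote(setNames(BOD, toupper))

piped
#> setNames(BOD, toupper)
identical(piped, direct)
#> [1] TRUE
```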


Also, I have to say even if I'm wrong about everything above, such a
function should definitely not be called with. The period is the smallest
displayable glyph, AFAIK, and having the names of two *related* functions
differ only by a trailing period is practically begging for people quickly
reading code to mistake which is in use.

Best,
~G


On Mon, Apr 18, 2022 at 11:00 AM Gabor Grothendieck <ggrothendieck at gmail.com>
wrote:

  
  
#
I agree that would be nice, but this task is indeed manageable with a
one-line function definition, as mentioned above.

For reference, take a look at dplyr::rename_with, which does something
nearly identical to your proposal. Note that even the tidyverse setNames
implementation does not allow functions or lambdas.

Best,
Jan


On Mon, Apr 18, 2022 at 21:50, Gabriel Becker <gabembecker at gmail.com>
wrote:

  
  
#
On Tue, Apr 19, 2022 at 1:56 AM Jan Netík <netikja at gmail.com> wrote:

The closest equivalent to setNames() is rlang::set_names() which does
allow you to supply a character vector, a function, or NULL:

``` r
library(rlang)

x <- c(a = 1, b = 2, c = 3)
x |> set_names(c("x", "y", "z"))
#> x y z
#> 1 2 3
x |> set_names(toupper)
#> A B C
#> 1 2 3
x |> set_names(NULL)
#> [1] 1 2 3
```

When called with only one argument it uses the vector values for the names:

``` r
x |> set_names()
#> 1 2 3
#> 1 2 3
```

(this is often useful before lapply() and friends, if the vector is,
e.g. a vector of paths)

It also supports ... so you can avoid `c()` for simple cases:

```r
x |> set_names("x", "y", "z")
#> x y z
#> 1 2 3
```

And is a little stricter when it comes to the length of supplied names:

``` r
x |> set_names("x", "y")
#> Error in `set_names()`:
#> ! The size of `nm` (2) must be compatible with the size of `x` (3).
x |> setNames(c("x", "y"))
#>    x    y <NA>
#>    1    2    3
```

Hadley