Give update.formula() an option not to simplify or reorder the result -- request for comments

Hi Abs,

Re: your last point:
> You made an interesting comment.
>
>> This is not always the desired behavior, because formulas are increasingly
>> used for purposes other than specifying linear models.
>
> Can I ask what these purposes are?

Not sure how relevant these are/what Pavel was referring to specifically,
but there are a few alternative uses that I'm familiar with in the
tidyverse packages.

Since formulas store both an expression and an environment, they're really
useful for complex evaluation. rlang's "quosures" are a subclass of formula
<https://adv-r.hadley.nz/evaluation.html#quosure-impl>.
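
To make the expression-plus-environment point concrete, here is a minimal base R sketch. The names make_formula() and scale_factor are made up for illustration; this is not how rlang implements quosures, only the underlying idea.

```r
# A formula created inside a function remembers that function's environment.
make_formula <- function() {
  scale_factor <- 10   # local variable, not visible from the caller
  ~ scale_factor * 2   # one-sided formula capturing the expression
}

f <- make_formula()

expr <- f[[2]]          # for a one-sided formula, element 2 is the right-hand side
env  <- environment(f)  # the environment in which the formula was created

# Evaluate the stored expression in the stored environment.
result <- eval(expr, env)
result
```

The expression can be evaluated later, and elsewhere, without losing track of the variables it referred to when it was written.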

Otherwise the main tidyverse use is as a shorthand for specifying anonymous
functions (this is used extensively, particularly in purrr). From
?dplyr::mutate_at:
# You can also pass formulas to create functions on the spot, purrr-style:
starwars %>% mutate_at(c("height", "mass"), ~scale2(., na.rm = TRUE))
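
As a rough illustration of how a one-sided formula can stand in for an anonymous function, here is a base R sketch. formula_to_function() is a hypothetical helper written for this example; it is not purrr's or rlang's actual implementation, and it only supports `.` as the argument name.

```r
# Hypothetical helper: turn a one-sided formula such as ~ . * 2 into a
# function of a single argument named `.`.
formula_to_function <- function(f) {
  stopifnot(inherits(f, "formula"), length(f) == 2L)
  fn <- function(.) NULL
  body(fn) <- f[[2]]                 # right-hand side becomes the function body
  environment(fn) <- environment(f)  # keep the formula's environment for lookups
  fn
}

double_it <- formula_to_function(~ . * 2)
doubled <- sapply(1:3, double_it)
doubled
```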

Also see ?dplyr::case_when:
x <- 1:50
case_when(
  x %% 35 == 0 ~ "fizz buzz",
  x %% 5 == 0 ~ "fizz",
  x %% 7 == 0 ~ "buzz",
  TRUE ~ as.character(x)
)
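
Here the two-sided formula is just a container for a condition (left-hand side) and a value (right-hand side). A minimal base R sketch of that idea follows; simple_case_when() is a hypothetical toy, not dplyr's implementation, and it always returns a character vector.

```r
# Toy version: each argument is a formula `condition ~ value`.
# The first condition that is TRUE for an element decides its value.
simple_case_when <- function(...) {
  cases  <- list(...)
  conds  <- lapply(cases, function(f) eval(f[[2]], environment(f)))  # LHS
  values <- lapply(cases, function(f) eval(f[[3]], environment(f)))  # RHS
  n    <- max(lengths(conds))
  out  <- rep(NA_character_, n)
  done <- rep(FALSE, n)
  for (i in seq_along(cases)) {
    cond  <- rep_len(conds[[i]], n)
    value <- rep_len(as.character(values[[i]]), n)
    fill  <- !done & !is.na(cond) & cond
    out[fill] <- value[fill]
    done <- done | fill
  }
  out
}

x <- 1:10
res <- simple_case_when(
  x %% 6 == 0 ~ "fizz buzz",
  x %% 2 == 0 ~ "fizz",
  x %% 3 == 0 ~ "buzz",
  TRUE ~ as.character(x)
)
res
```

None of these formulas is a model specification, so simplifying or reordering their terms would have no sensible meaning.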

And in base R, formulas are used in the plotting functions, e.g.:
## boxplot on a formula:
boxplot(count ~ spray, data = InsectSprays, col = "lightgray")

Cheers,
Danny

Hi Pavel
(Back On List)

And my two cents...

> At this time, the update.formula() method always performs a number of
> transformations on the result, eliminating redundant variables and
> reordering interactions to be after the main effects.
> The proposal is to add an option simplify= (defaulting to TRUE,
> for backwards compatibility) that if FALSE will skip the simplification
> step.
> Any thoughts? One particular question that Martin raised is whether the
> UI should be just a single logical argument, or something else.
Firstly, note that the constructor for formula objects behaves differently
to the update method, so I think any changes should be consistent between
the two functions.
#constructor - doesn't simplify
y ~ x + x
y ~ x + x
#update method - does simplify
update (y ~ x, ~. + x)
y ~ x

Interestingly, this doesn't simplify.
update (y ~ I (x), ~. + x)
y ~ I(x) + x
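
One way to look at the difference, as a sketch: update.formula() passes its result through terms.formula(simplify = TRUE), so what gets collapsed depends on the term labels. The snippet below only inspects those labels with base R (stats) functions; the variable names are arbitrary.

```r
# Duplicate terms share one label, so they collapse.
labels_dup <- attr(terms(y ~ x + x), "term.labels")
labels_dup        # "x"

# I(x) and x have different labels, so both survive.
labels_I <- attr(terms(y ~ I(x) + x), "term.labels")
labels_I          # "I(x)" "x"

# Interactions are ordered after main effects.
labels_int <- attr(terms(y ~ a:b + a + b), "term.labels")
labels_int        # "a" "b" "a:b"

# Rebuilding a formula from simplified terms drops the duplicate.
simplified <- formula(terms(y ~ x + x, simplify = TRUE))
simplified
```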

I think that simplification could mean different things.
So, there could be something like:
update (y ~ x, ~. + x, strip=FALSE)
y ~ I (2 * x)

I don't know how easy that would be to implement.
(Symbolic computation on par with computer algebra systems is a discussion
in itself...).
And you could have one argument (say, method="simplify") rather than two or
more logical arguments.

It would also be possible to allow partial forms of simplification by
specifying which terms should be collapsed. However, I doubt that any
usefulness of this would justify the complexity.
Feel free to disagree, though.
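
For discussion purposes, here is a rough sketch of what a single method-style argument could look like. Everything in it is an assumption: update_formula2() and the method values "simplify" and "none" are hypothetical names, not part of base R or of the proposed patch. The sketch requires a two-sided old formula and does not add the protective parentheses that update.formula() inserts around the old right-hand side.

```r
# Hypothetical wrapper: method = "simplify" keeps the current behaviour,
# method = "none" substitutes `.` without simplifying or reordering.
update_formula2 <- function(old, new, method = c("simplify", "none")) {
  method <- match.arg(method)
  if (method == "simplify") {
    return(update(old, new))   # existing update.formula() behaviour
  }
  stopifnot(length(old) == 3L) # sketch: old must have a left-hand side
  old_lhs <- old[[2]]
  old_rhs <- old[[3]]
  new_rhs <- new[[length(new)]]
  new_lhs <- if (length(new) == 3L) new[[2]] else quote(.)
  # Replace every `.` in expr with value.
  sub_dot <- function(expr, value) {
    do.call(substitute, list(expr, list(. = value)))
  }
  out <- call("~", sub_dot(new_lhs, old_lhs), sub_dot(new_rhs, old_rhs))
  eval(out, environment(old))  # evaluating `~` builds the formula object
}

kept       <- update_formula2(y ~ x, ~ . + x, method = "none")
simplified <- update_formula2(y ~ x, ~ . + x)
kept
simplified
```

A character argument like this leaves room for further modes later, which a single logical simplify= would not.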

You made an interesting comment.

> This is not always the desired behavior, because formulas are increasingly
> used for purposes other than specifying linear models.

Can I ask what these purposes are?

kind regards
Abs


______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
