Give update.formula() an option not to simplify or reorder the result -- request for comments
Hi Abs, Re: your last point:
You made an interesting comment.
This is not always the desired behavior, because formulas are increasingly used for purposes other than specifying linear models.
Can I ask what these purposes are?
Not sure how relevant these are or what Pavel was referring to specifically, but there are a few alternative uses that I'm familiar with in the tidyverse packages. Since formulas store both an expression and an environment, they're really useful for complex evaluation. rlang's "quosures" are a subclass of formula <https://adv-r.hadley.nz/evaluation.html#quosure-impl>.

Otherwise the main tidyverse use is as a shorthand for specifying anonymous functions (this is used extensively, particularly in purrr). From ?dplyr::mutate_at:

    # You can also pass formulas to create functions on the spot, purrr-style:
    starwars %>% mutate_at(c("height", "mass"), ~scale2(., na.rm = TRUE))

Also see ?dplyr::case_when:

    x <- 1:50
    case_when(
      x %% 35 == 0 ~ "fizz buzz",
      x %% 5 == 0 ~ "fizz",
      x %% 7 == 0 ~ "buzz",
      TRUE ~ as.character(x)
    )

And in base R, formulas are used in the plotting functions, e.g.:

    ## boxplot on a formula:
    boxplot(count ~ spray, data = InsectSprays, col = "lightgray")

Cheers,
Danny
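To make the "expression plus environment" point concrete, here is a minimal base R sketch. The names make_formula and scale_by are made up for illustration; the only assumption is that a formula created inside a function keeps that function's environment, which is what the non-modelling uses above rely on.

```r
# A formula remembers the environment in which it was created.
make_formula <- function() {
  scale_by <- 10        # local variable, not visible from the global environment
  ~ x * scale_by        # one-sided formula that captures this local environment
}

f   <- make_formula()
rhs <- f[[2]]           # the stored expression: x * scale_by
env <- environment(f)   # the environment captured when the formula was created

# Evaluate the stored expression: x comes from the list, scale_by from env.
result <- eval(rhs, list(x = 1:3), env)
print(result)
```

Nothing here involves a linear model; the formula is just a convenient container for code and the scope it should be evaluated in.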
On Mon, May 20, 2019 at 12:12 PM Abby Spurdle <spurdle.a at gmail.com> wrote:
Hi Pavel
(Back On List)

And my two cents...
At this time, the update.formula() method always performs a number of transformations on the results, eliminating redundant variables and reordering interactions to be after the main effects. The proposal is to add an option simplify= (defaulting to TRUE, for backwards compatibility) that if FALSE will skip the simplification step. Any thoughts? One particular question that Martin raised is whether the UI should be just a single logical argument, or something else.
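As a rough illustration of what skipping the simplification step might look like, here is a minimal sketch in plain R. The helper name update_nosimplify is hypothetical (it is not part of base R or of the proposal), and it only substitutes "." with the corresponding side of the old formula; it makes no attempt to mirror the internals of update.formula().

```r
# Hypothetical helper, NOT part of base R: update a formula by substituting "."
# with the matching side of the old formula, without simplifying or reordering.
update_nosimplify <- function(old, new) {
  old <- as.formula(old)
  new <- as.formula(new)

  # "." on the new right-hand side is replaced by the old right-hand side.
  rhs <- do.call(substitute,
                 list(new[[length(new)]], list(. = old[[length(old)]])))

  # Work out the left-hand side, if there is one.
  lhs <- NULL
  if (length(new) == 3L && length(old) == 3L) {
    lhs <- do.call(substitute, list(new[[2L]], list(. = old[[2L]])))
  } else if (length(new) == 3L) {
    lhs <- new[[2L]]
  } else if (length(old) == 3L) {
    lhs <- old[[2L]]   # a one-sided `new` keeps the old response
  }

  # Evaluating a call to `~` yields an object of class "formula".
  out <- if (is.null(lhs)) eval(call("~", rhs)) else eval(call("~", lhs, rhs))
  environment(out) <- environment(old)   # assumption: keep the old formula's environment
  out
}

res <- update_nosimplify(y ~ x, ~ . + x)
print(res)   # y ~ x + x
```

Whether an option on update.formula() itself should behave exactly like this (for example with respect to parenthesising the substituted terms) is a separate question; the sketch is only meant to show that the duplicate term survives when no simplification is applied.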
Firstly, note that the constructor for formula objects behaves differently to the update method, so I think any changes should be consistent between the two functions.
    #constructor - doesn't simplify
    y ~ x + x
    y ~ x + x

    #update method - does simplify
    update (y ~ x, ~. + x)
    y ~ x

Interestingly, this doesn't simplify.

    update (y ~ I (x), ~. + x)
    y ~ I(x) + x

I think that simplification could mean different things. So, there could be something like:

    update (y ~ x, ~. + x, strip=FALSE)
    y ~ I (2 * x)

I don't know how easy that would be to implement. (Symbolic computation on par with computer algebra systems is a discussion in itself...) And you could have one argument (say, method="simplify") rather than two or more logical arguments.

It would also be possible to allow partial forms of simplification, by specifying which terms should be collapsed; however, I doubt that any possible usefulness of this would justify the complexity. However, feel free to disagree.

You made an interesting comment.
This is not always the desired behavior, because formulas are increasingly used for purposes other than specifying linear models.
Can I ask what these purposes are?
kind regards
Abs