Pavel N. Krivitsky
2019-May-17 07:31 UTC
[Rd] Give update.formula() an option not to simplify or reorder the result -- request for comments
Dear All,

Martin Maechler has asked me to send this to R-devel for discussion after I submitted it as an enhancement request (https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17563).

At this time, the update.formula() method always performs a number of transformations on the result, eliminating redundant variables and reordering interactions to come after the main effects. This is not always the desired behaviour, because formulas are increasingly used for purposes other than specifying linear models.

The proposal is to add an option simplify= (defaulting to TRUE, for backwards compatibility) that, if FALSE, will skip the simplification step. That is,

    > update(a~b:c+b, .~.+b) # default: simplify=TRUE
    a ~ b + b:c
    > update(a~b:c+b, .~.+b, simplify=FALSE) # results are a mock-up
    a ~ b:c + b + b

From what I can tell, this can be accomplished by skipping the second line of the implementation of update.formula() ("out <- formula(terms.formula(tmp, simplify = TRUE))").

Any thoughts? One particular question that Martin raised is whether the UI should be just a single logical argument, or something else.

Best Regards,
Pavel

--
Pavel Krivitsky
Lecturer in Statistics
National Institute of Applied Statistics Research Australia (NIASRA)
School of Mathematics and Applied Statistics | Building 39C Room 154
University of Wollongong NSW 2522 Australia
T +61 2 4221 3713
Web (NIASRA): http://niasra.uow.edu.au/index.html
Web (Personal): http://www.krivitsky.net/research
ORCID: 0000-0002-9101-3362
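To see the simplification step in isolation, the sketch below applies the call quoted above, formula(terms.formula(., simplify = TRUE)), to a formula that already contains the repeated term. It only illustrates the step that a simplify=FALSE option would skip; the formula and variable names are made up for the example.

```r
# An unsimplified formula, roughly what update() holds before its second line.
f <- a ~ b:c + b + b

# The step the proposal would make optional: rebuild the formula from its terms.
simplified <- formula(terms.formula(f, simplify = TRUE))

print(f)           # printed as written, with the duplicate and original ordering
print(simplified)  # duplicate dropped, main effect moved before the interaction

# The reordering is visible in the term labels as well.
print(attr(terms(f), "term.labels"))
```

The formula constructor itself leaves f untouched; the changes only appear once the formula is passed through terms.formula() with simplify = TRUE.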
Abby Spurdle
2019-May-20 02:11 UTC
[Rd] Give update.formula() an option not to simplify or reorder the result -- request for comments
Hi Pavel
(Back On List)

And my two cents...

> At this time, the update.formula() method always performs a number of
> transformations on the results, eliminating redundant variables and
> reordering interactions to be after the main effects.
> This the proposal is to add an option simplify= (defaulting to TRUE,
> for backwards compatibility) that if FALSE will skip the simplification
> step.
> Any thoughts? One particular question that Martin raised is whether the
> UI should be just a single logical argument, or something else.

Firstly, note that the constructor for formula objects behaves differently to the update method, so I think any changes should be consistent between the two functions.

    > #constructor - doesn't simplify
    > y ~ x + x
    y ~ x + x

    > #update method - does simplify
    > update (y ~ x, ~. + x)
    y ~ x

Interestingly, this doesn't simplify.

    > update (y ~ I (x), ~. + x)
    y ~ I(x) + x

I think that simplification could mean different things. So, there could be something like:

    > update (y ~ x, ~. + x, strip=FALSE)
    y ~ I (2 * x)

I don't know how easy that would be to implement. (Symbolic computation on par with computer algebra systems is a discussion in itself...). And you could have one argument (say, method="simplify") rather than two or more logical arguments.

It would also be possible to allow partial forms of simplification, by specifying which terms should be collapsed. However, I doubt that any possible usefulness of this would justify the complexity. Feel free to disagree.

You made an interesting comment.

> This is not
> always the desired behavior, because formulas are increasingly used
> for purposes other than specifying linear models.

Can I ask what these purposes are?

kind regards
Abs
Danny Smith
2019-May-20 03:23 UTC
[Rd] Give update.formula() an option not to simplify or reorder the result -- request for comments
Hi Abs,

Re: your last point:

> You made an interesting comment.
>
> > This is not
> > always the desired behavior, because formulas are increasingly used
> > for purposes other than specifying linear models.
>
> Can I ask what these purposes are?

Not sure how relevant these are or what Pavel was referring to specifically, but there are a few alternative uses that I'm familiar with in the tidyverse packages. Since formulas store both an expression and an environment, they're really useful for complex evaluation. rlang's "quosures" are a subclass of formula <https://adv-r.hadley.nz/evaluation.html#quosure-impl>. Otherwise the main tidyverse use is as a shorthand for specifying anonymous functions (this is used extensively, particularly in purrr).

From ?dplyr::mutate_at:

    # You can also pass formulas to create functions on the spot, purrr-style:
    starwars %>% mutate_at(c("height", "mass"), ~scale2(., na.rm = TRUE))

Also see ?dplyr::case_when:

    x <- 1:50
    case_when(
      x %% 35 == 0 ~ "fizz buzz",
      x %% 5 == 0 ~ "fizz",
      x %% 7 == 0 ~ "buzz",
      TRUE ~ as.character(x)
    )

And in base R, formulas are used in the plotting functions, e.g.:

    ## boxplot on a formula:
    boxplot(count ~ spray, data = InsectSprays, col = "lightgray")

Cheers,
Danny
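The point that a formula stores both an expression and an environment can be sketched in base R alone. The helper names below (make_scaler, as_function_sketch) are hypothetical and this is not how purrr or rlang implement their formula shorthand; it only shows why carrying an environment makes such uses possible.

```r
# A one-sided formula created inside a function remembers that function's
# environment, so `k` can still be found later.
make_scaler <- function(k) {
  ~ x * k
}

f <- make_scaler(10)
expr <- f[[2]]           # the unevaluated expression: x * k
env <- environment(f)    # the environment where `k` lives

# `x` is supplied in a list; `k` is looked up in the formula's environment.
scaled <- eval(expr, list(x = 2), env)
print(scaled)

# Hypothetical lambda-style helper: treat the right-hand side of a one-sided
# formula as the body of a function of `.x`.
as_function_sketch <- function(f) {
  function(.x) eval(f[[2]], list(.x = .x), environment(f))
}

add_one <- as_function_sketch(~ .x + 1)
print(sapply(1:3, add_one))
```

None of these uses treats the formula as a model specification, which is why reordering or dropping "redundant" terms can be unwanted there.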
Abby Spurdle
2019-May-24 22:59 UTC
[Rd] Give update.formula() an option not to simplify or reorder the result -- request for comments
> Martin Maechler has asked me to send this to R-devel for discussion
> after I submitted it as an enhancement request (
> https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17563).

I think R needs to provide more support for CAS-style symbolic computation. That is, support by either the R language itself or the standard packages, or both. (And certainly not by interfacing with another interpreted language.)

Obviously, I don't speak for R Core. However, this is how I would like to see R move in the future.
...improved symbolic and symbolic-numeric computation...

I think any changes to formula objects or their methods should be congruent with these symbolic improvements.
Thomas Mailund
2019-May-25 05:40 UTC
[Rd] Give update.formula() an option not to simplify or reorder the result -- request for comments
With a bit of meta programming that manipulates expressions, I don't think this would be difficult to implement in a package. Well, as difficult as it is to implement a CAS, but not harder.

I wrote some code for symbolic differentiation (I don't remember where I put it) and that was easy. But that is because differentiation is just a handful of rules and then the chain rule. I don't have the skills for handling more complex symbolic manipulation, but anyone who could add it to the language could also easily add it as a package, I think.

Whether in a standard package or not, I have no preference whatsoever.

Cheers
Thomas
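As a rough illustration of "a handful of rules and then the chain rule", here is a minimal sketch of symbolic differentiation by walking an R expression. It is not the code mentioned in the message; the function name d_sketch is hypothetical, only a few operations are covered, and no simplification of the result is attempted. The result is checked numerically against stats::D(), which already ships with R.

```r
# Differentiate expression `e` with respect to the variable named `v`.
d_sketch <- function(e, v) {
  if (is.numeric(e)) return(0)                                  # constant rule
  if (is.name(e)) return(if (identical(e, as.name(v))) 1 else 0)
  op <- as.character(e[[1]])
  switch(op,
    "(" = d_sketch(e[[2]], v),
    "+" = call("+", d_sketch(e[[2]], v), d_sketch(e[[3]], v)),  # sum rule
    "*" = call("+",                                             # product rule
               call("*", d_sketch(e[[2]], v), e[[3]]),
               call("*", e[[2]], d_sketch(e[[3]], v))),
    "sin" = call("*", call("cos", e[[2]]), d_sketch(e[[2]], v)),  # chain rule
    stop("unsupported operation: ", op)
  )
}

expr <- quote(x * x + sin(2 * x))
mine <- d_sketch(expr, "x")
base <- D(expr, "x")

print(mine)   # unsimplified, but a valid derivative expression
print(base)

# Both derivatives agree numerically at an arbitrary point.
print(all.equal(eval(mine, list(x = 0.7)), eval(base, list(x = 0.7))))
```

The unsimplified output shows where the harder part begins: collapsing terms such as 1 * x + x * 1 is the kind of simplification the thread is discussing.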
Martin Maechler
2019-Jun-07 14:11 UTC
[Rd] Give update.formula() an option not to simplify or reorder the result -- request for comments
Trying to revive, possibly conclude a forgotten thread ...

>>>>> Abby Spurdle
>>>>>     on Mon, 20 May 2019 14:11:47 +1200 writes:

    > Hi Pavel
    > (Back On List)
    > And my two cents...

    >> At this time, the update.formula() method always performs a number of
    >> transformations on the results, eliminating redundant variables and
    >> reordering interactions to be after the main effects.
    >> This the proposal is to add an option simplify= (defaulting to TRUE,
    >> for backwards compatibility) that if FALSE will skip the simplification
    >> step.
    >> Any thoughts? One particular question that Martin raised is whether the
    >> UI should be just a single logical argument, or something else.

    > Firstly, note that the constructor for formula objects behaves differently
    > to the update method, so I think any changes should be consistent between
    > the two functions.

Not so easily: The ` ~ ` constructor does not sensibly (in my opinion) get optional arguments, whereas Pavel was suggesting a new *optional* argument to update.formula().

    >> #constructor - doesn't simplify
    >> y ~ x + x
    > y ~ x + x

    >> #update method - does simplify
    >> update (y ~ x, ~. + x)
    > y ~ x

    > Interestingly, this doesn't simplify.
    >> update (y ~ I (x), ~. + x)
    > y ~ I(x) + x

Well, I hope so: The whole point of I(.) is to *not* be identical (but close) to its argument.

    > I think that simplification could mean different things.

Good point, I tend to agree with the above (whereas I'm less happy with this example):

    > So, there could be something like:
    >> update (y ~ x, ~. + x, strip=FALSE)
    > y ~ I (2 * x)

    > I don't know how easy that would be to implement.
    > (Symbolic computation on par with computer algebra systems is a discussion
    > in itself...).
    > And you could have one argument (say, method="simplify") rather than two or
    > more logical arguments.

Yes exactly; given what we've heard till now, I'd also go for a new argument (possibly 'method') which should be a string (and keep the current behavior as default), ideally here with a match.arg() setup.
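A string argument with a match.arg() setup could look roughly like the sketch below. Everything here is an assumption for illustration: the function name update_formula2, the argument name method, and its values "simplify" and "none" are hypothetical, and the real update.formula() does its dot expansion in C rather than with substitute().

```r
# Hypothetical user-level wrapper; not part of R.
update_formula2 <- function(old, new, method = c("simplify", "none")) {
  method <- match.arg(method)          # validates and picks the default
  old <- as.formula(old)
  new <- as.formula(new)

  # Current behaviour stays the default.
  if (method == "simplify") return(update(old, new))

  # method == "none": replace `.` with the old sides and leave terms untouched.
  fill <- function(expr, value) {
    do.call(substitute, list(expr, list(. = value)))
  }
  rhs <- fill(new[[length(new)]], old[[length(old)]])
  lhs <- if (length(new) == 3L) new[[2L]] else quote(.)
  if (length(old) == 3L) {
    lhs <- fill(lhs, old[[2L]])
  } else if (identical(lhs, quote(.))) {
    lhs <- NULL                        # neither formula supplies a response
  }
  out <- if (is.null(lhs)) call("~", rhs) else call("~", lhs, rhs)
  eval(out, envir = environment(old))  # keep the old formula's environment
}

print(update_formula2(a ~ b:c + b, . ~ . + b))                   # as update()
print(update_formula2(a ~ b:c + b, . ~ . + b, method = "none"))  # left as is
```

One advantage of the string form is that further variants (for example, dropping duplicates without reordering) could be added later as extra values without changing the signature; match.arg() rejects anything not in the list.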