Gabor Grothendieck
2021-Jan-15 12:27 UTC
[Rd] brief update on the pipe operator in R-devel
These are documented but still seem like serious deficiencies:

> f <- function(x, y) x + 10*y
> 3 |> x => f(x, x)
Error in f(x, x) : pipe placeholder may only appear once
> 3 |> x => f(1+x, 1)
Error in f(1 + x, 1) :
  pipe placeholder must only appear as a top-level argument in the RHS call

Also note:

?"=>"
No documentation for ‘=>’ in specified packages and libraries:
you could try ‘??=>’

On Tue, Dec 22, 2020 at 5:28 PM <luke-tierney at uiowa.edu> wrote:
>
> It turns out that allowing a bare function expression on the
> right-hand side (RHS) of a pipe creates opportunities for confusion
> and mistakes that are too risky. So we will be dropping support for
> this from the pipe operator.
>
> The case of a RHS call that wants to receive the LHS result in an
> argument other than the first can be handled with just implicit first
> argument passing along the lines of
>
>      mtcars |> subset(cyl == 4) |> (\(d) lm(mpg ~ disp, data = d))()
>
> It was hoped that allowing a bare function expression would make this
> more convenient, but it has issues as outlined below. We are exploring
> some alternatives, and will hopefully settle on one soon after the
> holidays.
>
> The basic problem, pointed out in a comment on Twitter, is that in
> expressions of the form
>
>      1 |> \(x) x + 1 -> y
>      1 |> \(x) x + 1 |> \(y) x + y
>
> everything after the \(x) is parsed as part of the body of the
> function. So these are parsed along the lines of
>
>      1 |> \(x) { x + 1 -> y }
>      1 |> \(x) { x + 1 |> \(y) x + y }
>
> In the first case the result is assigned to a (useless) local
> variable. Someone writing this is more likely to have intended to
> assign the result to a global variable, as this would:
>
>      (1 |> \(x) x + 1) -> y
>
> In the second case the 'x' in 'x + y' refers to the local variable 'x'
> in the first RHS function. Someone writing this is more likely to have
> meant
>
>      (1 |> \(x) x + 1) |> \(y) x + y
>
> with 'x' in 'x + y' now referring to a global variable:
>
>      > x <- 2
>      > 1 |> \(x) x + 1 |> \(y) x + y
>      [1] 3
>      > (1 |> \(x) x + 1) |> \(y) x + y
>      [1] 4
>
> These issues arise with any approach in R that allows a bare function
> expression on the RHS of a pipe operation. It also arises in other
> languages with pipe operators. For example, here is the last example
> in Julia:
>
>      julia> x = 2
>      2
>      julia> 1 |> x -> x + 1 |> y -> x + y
>      3
>      julia> ( 1 |> x -> x + 1 ) |> y -> x + y
>      4
>
> Even though proper use of parentheses can work around these issues,
> the likelihood of making mistakes that are hard to track down is too
> high. So we will disallow the use of bare function expressions on the
> right hand side of a pipe.
>
> Best,
>
> luke
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa                  Phone:             319-335-3386
> Department of Statistics and        Fax:               319-335-3017
>    Actuarial Science
> 241 Schaeffer Hall                  email:   luke-tierney at uiowa.edu
> Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
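For reference, the two calls reported at the top of this message can probably be written without the `=>` placeholder at all, using the parenthesized anonymous-function call form from the quoted message. This is only a sketch, assuming a build where `|>` and the `\(x)` shorthand are available; the names `twice` and `nested` are illustrative and not from the thread.

```r
# Same f as in the message above.
f <- function(x, y) x + 10 * y

# The piped value is bound once to the parameter x of an anonymous
# function, so it can be reused freely inside the body.
twice <- 3 |> (\(x) f(x, x))()

# The value may also appear nested inside another expression.
nested <- 3 |> (\(x) f(1 + x, 1))()

print(twice)   # 3 + 10 * 3
print(nested)  # (1 + 3) + 10 * 1
```

The extra parentheses and the trailing `()` make the RHS an ordinary call, so the pipe only has to insert the LHS as its first argument.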
Gabor,

Although it might be nice if all imagined cases worked, there are many ways to work around the restrictions and get the results you want.

You may want to consider that it is easier to recognize the symbol you use (x in the examples) if it stands alone, is used exactly once, and appears in the list of function arguments. If you want x used multiple times, you can write a function that accepts x once and then invokes another function, reusing x as often as needed. The same applies to 1+x.

I do not know whether this choice was made to make the substitution easier and faster to apply, or to avoid possible bad edge cases. Have you tested other ideas, such as:

3 |> x => f(x=5)

or

3 |> x => f(x, y=x)

I mean cases where a default is supplied, not that it makes much sense here.

I am thinking of the concept of substitution as it is often done for text or symbols. Often the substitution is applied only to the first instance found unless you specify that you want a global change. In your examples, if only the first use of x were replaced, the second naked x, left alone, would be an error. If all instances were changed, what anomalies might happen? Duplicating a vector of length 1 containing the number 3 seems harmless enough. But a pipeline can send all kinds of interesting data structures through, including data.frames and arbitrary objects.
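One possible anomaly of replacing every instance textually is that the piped expression would be evaluated once per occurrence. A rough sketch of the difference, assuming a build where `|>` is available; `next_value`, `use_twice`, and the call counters are hypothetical names made up for this illustration, not anything from R-devel:

```r
f <- function(x, y) x + 10 * y

# Hypothetical LHS with a visible side effect: it counts its own calls.
calls <- 0
next_value <- function() {
  calls <<- calls + 1
  3
}

# Helper that accepts x once and reuses it, as suggested above.
use_twice <- function(x) f(x, x)

# Piping binds the LHS to the argument x, which is evaluated only once.
piped <- next_value() |> use_twice()
calls_piped <- calls

# Textual duplication of the LHS evaluates it once per occurrence.
calls <- 0
duplicated <- f(next_value(), next_value())
calls_duplicated <- calls

print(c(piped = calls_piped, duplicated = calls_duplicated))
```

For a constant like 3 the two forms give the same answer, but for an LHS that is expensive, random, or has side effects, the number of evaluations would matter.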