thr3ads.net - R devel - [Rd] New pipe operator [Dec 2020]

If this information is useful, please help other people find it:
Share via:

Kevin Ushey

2020-Dec-07 19:02 UTC

[Rd] New pipe operator

IMHO the use of anonymous functions is a very clean solution to the
placeholder problem, and the shorthand lambda syntax makes it much
more ergonomic to use. Pipe implementations that crawl the RHS for
usages of `.` are going to be more expensive than the alternatives. It
is nice that the `|>` operator is effectively the same as a regular R
function call, and given the identical semantics could then also be
reasoned about the same way regular R function calls are.

I also agree usages of the `.` placeholder can make the code more
challenging to read, since understanding the behavior of a piped
expression then requires scouring the RHS for usages of `.`, which can
be challenging in dense code. Piping to an anonymous function makes
the intent clear to the reader: the programmer is likely piping to an
anonymous function because they care where the argument is used in the
call, and so the reader of code should be aware of that.

Best,
Kevin

On Mon, Dec 7, 2020 at 10:35 AM Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:>
> On Mon, Dec 7, 2020 at 12:54 PM Duncan Murdoch <murdoch.duncan at
gmail.com> wrote:
> > An advantage of the current implementation is that it's simple and
easy
> > to understand.  Once you make it a user-modifiable binary operator,
> > things will go kind of nuts.
> >
> > For example, I doubt if there are many users of magrittr's pipe
who
> > really understand its subtleties, e.g. the example in Luke's paper
where
> > 1 %>% c(., 2) gives c(1,2), but 1 %>% c(c(.), 2) gives c(1, 1,
2). (And
> > I could add 1 %>% c(c(.), 2, .) and  1 %>% c(c(.), 2, . + 2)  to
> > continue the fun.)
>
> The rule is not so complicated.  Automatic insertion is done unless
> you use dot in the top level function or if you surround it with
> {...}.  It really makes sense since if you use gsub(pattern,
> replacement, .) then surely you don't want automatic insertion and if
> you surround it with { ... } then you are explicitly telling it not
> to.
>
> Assuming the existence of placeholders a possible simplification would
> be to NOT do automatic insertion if { ... } is used and to use it
> otherwise although personally having used it for some time I find the
> existing rule in magrittr generally does what you want.
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

Gabor Grothendieck

2020-Dec-07 19:23 UTC

head link

[Rd] New pipe operator

On Mon, Dec 7, 2020 at 2:02 PM Kevin Ushey <kevinushey at gmail.com>
wrote:>
> IMHO the use of anonymous functions is a very clean solution to the
> placeholder problem, and the shorthand lambda syntax makes it much
> more ergonomic to use. Pipe implementations that crawl the RHS for
> usages of `.` are going to be more expensive than the alternatives. It
You wouldn't have to crawl the expression.  This does it at the syntax
level.

  e <- quote( { gsub("x", "y", .) } )
  c(e[[1]], quote(. <- LHS), e[-1])

Gabriel Becker

2020-Dec-07 22:09 UTC

head link

[Rd] New pipe operator

On Mon, Dec 7, 2020 at 11:05 AM Kevin Ushey <kevinushey at gmail.com>
wrote:
> IMHO the use of anonymous functions is a very clean solution to the
> placeholder problem, and the shorthand lambda syntax makes it much
> more ergonomic to use. Pipe implementations that crawl the RHS for
> usages of `.` are going to be more expensive than the alternatives. It
> is nice that the `|>` operator is effectively the same as a regular R
> function call, and given the identical semantics could then also be
> reasoned about the same way regular R function calls are.
>
I agree. That said, one thing that maybe could be done, though I'm not
super convinced its needed, is make a "curry-stuffed pipe", where
something
like

LHS |^pipearg^> RHS(arg1 = 5, arg3 = 7)

Would parse to

RHS(pipearg = LHS, arg1 = 5, arg3 = 7)


(Assuming we could get the parser to handle |^bla^> correctly)

For argument position issues would be sufficient. For more complicated
expressions, e.g., those that would use the placeholder multiple times or
inside compound expressions, requiring anonymous functions seems quite
reasonable to me. And honestly, while I kind of like it, I'm not sure if
that "stuffed pipe" expression (assuming we could get the parser to
capture
it correctly) reads to me as nicer than the following, anyway.

LHS |> \(x) RHS(arg1 = 5, pipearg = x, arg3 = 7)

~G
>
> I also agree usages of the `.` placeholder can make the code more
> challenging to read, since understanding the behavior of a piped
> expression then requires scouring the RHS for usages of `.`, which can
> be challenging in dense code. Piping to an anonymous function makes
> the intent clear to the reader: the programmer is likely piping to an
> anonymous function because they care where the argument is used in the
> call, and so the reader of code should be aware of that.
>
> Best,
> Kevin
>
>
>
> On Mon, Dec 7, 2020 at 10:35 AM Gabor Grothendieck
> <ggrothendieck at gmail.com> wrote:
> >
> > On Mon, Dec 7, 2020 at 12:54 PM Duncan Murdoch <murdoch.duncan at
gmail.com>
> wrote:
> > > An advantage of the current implementation is that it's
simple and easy
> > > to understand.  Once you make it a user-modifiable binary
operator,
> > > things will go kind of nuts.
> > >
> > > For example, I doubt if there are many users of magrittr's
pipe who
> > > really understand its subtleties, e.g. the example in Luke's
paper
> where
> > > 1 %>% c(., 2) gives c(1,2), but 1 %>% c(c(.), 2) gives c(1,
1, 2). (And
> > > I could add 1 %>% c(c(.), 2, .) and  1 %>% c(c(.), 2, . +
2)  to
> > > continue the fun.)
> >
> > The rule is not so complicated.  Automatic insertion is done unless
> > you use dot in the top level function or if you surround it with
> > {...}.  It really makes sense since if you use gsub(pattern,
> > replacement, .) then surely you don't want automatic insertion and
if
> > you surround it with { ... } then you are explicitly telling it not
> > to.
> >
> > Assuming the existence of placeholders a possible simplification would
> > be to NOT do automatic insertion if { ... } is used and to use it
> > otherwise although personally having used it for some time I find the
> > existing rule in magrittr generally does what you want.
> >
> > ______________________________________________
> > R-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
	[[alternative HTML version deleted]]

Hadley Wickham

2020-Dec-08 13:40 UTC

head link

[Rd] New pipe operator

I just wanted to pipe in here (HA HA) to say that I agree with Kevin.
I've never loved the complicated magrittr rule (which has personally
tripped me up a couple of times) and I think the compact inline
function syntax provides a more general solution. It is a bit more
typing, and it will require a little time for your eyes to get used to
the new syntax, but overall I think it's a better solution.

In general, I think the base pipe does an excellent job of taking what
we've learned from 6 years of magrittr, keeping what has been most
successful while discarding complications around the edges.

Hadley

On Mon, Dec 7, 2020 at 1:05 PM Kevin Ushey <kevinushey at gmail.com>
wrote:>
> IMHO the use of anonymous functions is a very clean solution to the
> placeholder problem, and the shorthand lambda syntax makes it much
> more ergonomic to use. Pipe implementations that crawl the RHS for
> usages of `.` are going to be more expensive than the alternatives. It
> is nice that the `|>` operator is effectively the same as a regular R
> function call, and given the identical semantics could then also be
> reasoned about the same way regular R function calls are.
>
> I also agree usages of the `.` placeholder can make the code more
> challenging to read, since understanding the behavior of a piped
> expression then requires scouring the RHS for usages of `.`, which can
> be challenging in dense code. Piping to an anonymous function makes
> the intent clear to the reader: the programmer is likely piping to an
> anonymous function because they care where the argument is used in the
> call, and so the reader of code should be aware of that.
>
> Best,
> Kevin
>
>
>
> On Mon, Dec 7, 2020 at 10:35 AM Gabor Grothendieck
> <ggrothendieck at gmail.com> wrote:
> >
> > On Mon, Dec 7, 2020 at 12:54 PM Duncan Murdoch <murdoch.duncan at
gmail.com> wrote:
> > > An advantage of the current implementation is that it's
simple and easy
> > > to understand.  Once you make it a user-modifiable binary
operator,
> > > things will go kind of nuts.
> > >
> > > For example, I doubt if there are many users of magrittr's
pipe who
> > > really understand its subtleties, e.g. the example in Luke's
paper where
> > > 1 %>% c(., 2) gives c(1,2), but 1 %>% c(c(.), 2) gives c(1,
1, 2). (And
> > > I could add 1 %>% c(c(.), 2, .) and  1 %>% c(c(.), 2, . +
2)  to
> > > continue the fun.)
> >
> > The rule is not so complicated.  Automatic insertion is done unless
> > you use dot in the top level function or if you surround it with
> > {...}.  It really makes sense since if you use gsub(pattern,
> > replacement, .) then surely you don't want automatic insertion and
if
> > you surround it with { ... } then you are explicitly telling it not
> > to.
> >
> > Assuming the existence of placeholders a possible simplification would
> > be to NOT do automatic insertion if { ... } is used and to use it
> > otherwise although personally having used it for some time I find the
> > existing rule in magrittr generally does what you want.
> >
> > ______________________________________________
> > R-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


-- 
http://hadley.nz

R devel - Dec 2020 - New pipe operator

[Rd] New pipe operator

[Rd] New pipe operator

[Rd] New pipe operator

[Rd] New pipe operator