thr3ads.net - R devel - [Rd] New pipe operator [Dec 2020]

If this information is useful, please help other people find it:
Share via:

Gabor Grothendieck

2020-Dec-07 18:31 UTC

[Rd] New pipe operator

On Mon, Dec 7, 2020 at 12:54 PM Duncan Murdoch <murdoch.duncan at
gmail.com> wrote:> An advantage of the current implementation is that it's simple and easy
> to understand.  Once you make it a user-modifiable binary operator,
> things will go kind of nuts.
>
> For example, I doubt if there are many users of magrittr's pipe who
> really understand its subtleties, e.g. the example in Luke's paper
where
> 1 %>% c(., 2) gives c(1,2), but 1 %>% c(c(.), 2) gives c(1, 1, 2).
(And
> I could add 1 %>% c(c(.), 2, .) and  1 %>% c(c(.), 2, . + 2)  to
> continue the fun.)
The rule is not so complicated.  Automatic insertion is done unless
you use dot in the top level function or if you surround it with
{...}.  It really makes sense since if you use gsub(pattern,
replacement, .) then surely you don't want automatic insertion and if
you surround it with { ... } then you are explicitly telling it not
to.

Assuming the existence of placeholders a possible simplification would
be to NOT do automatic insertion if { ... } is used and to use it
otherwise although personally having used it for some time I find the
existing rule in magrittr generally does what you want.

Kevin Ushey

2020-Dec-07 19:02 UTC

head link

[Rd] New pipe operator

IMHO the use of anonymous functions is a very clean solution to the
placeholder problem, and the shorthand lambda syntax makes it much
more ergonomic to use. Pipe implementations that crawl the RHS for
usages of `.` are going to be more expensive than the alternatives. It
is nice that the `|>` operator is effectively the same as a regular R
function call, and given the identical semantics could then also be
reasoned about the same way regular R function calls are.

I also agree usages of the `.` placeholder can make the code more
challenging to read, since understanding the behavior of a piped
expression then requires scouring the RHS for usages of `.`, which can
be challenging in dense code. Piping to an anonymous function makes
the intent clear to the reader: the programmer is likely piping to an
anonymous function because they care where the argument is used in the
call, and so the reader of code should be aware of that.

Best,
Kevin

On Mon, Dec 7, 2020 at 10:35 AM Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:>
> On Mon, Dec 7, 2020 at 12:54 PM Duncan Murdoch <murdoch.duncan at
gmail.com> wrote:
> > An advantage of the current implementation is that it's simple and
easy
> > to understand.  Once you make it a user-modifiable binary operator,
> > things will go kind of nuts.
> >
> > For example, I doubt if there are many users of magrittr's pipe
who
> > really understand its subtleties, e.g. the example in Luke's paper
where
> > 1 %>% c(., 2) gives c(1,2), but 1 %>% c(c(.), 2) gives c(1, 1,
2). (And
> > I could add 1 %>% c(c(.), 2, .) and  1 %>% c(c(.), 2, . + 2)  to
> > continue the fun.)
>
> The rule is not so complicated.  Automatic insertion is done unless
> you use dot in the top level function or if you surround it with
> {...}.  It really makes sense since if you use gsub(pattern,
> replacement, .) then surely you don't want automatic insertion and if
> you surround it with { ... } then you are explicitly telling it not
> to.
>
> Assuming the existence of placeholders a possible simplification would
> be to NOT do automatic insertion if { ... } is used and to use it
> otherwise although personally having used it for some time I find the
> existing rule in magrittr generally does what you want.
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

Gabriel Becker

2020-Dec-07 22:04 UTC

head link

[Rd] New pipe operator

On Mon, Dec 7, 2020 at 10:35 AM Gabor Grothendieck <ggrothendieck at
gmail.com>
wrote:
> On Mon, Dec 7, 2020 at 12:54 PM Duncan Murdoch <murdoch.duncan at
gmail.com>
> wrote:
> > An advantage of the current implementation is that it's simple and
easy
> > to understand.  Once you make it a user-modifiable binary operator,
> > things will go kind of nuts.
> >
> > For example, I doubt if there are many users of magrittr's pipe
who
> > really understand its subtleties, e.g. the example in Luke's paper
where
> > 1 %>% c(., 2) gives c(1,2), but 1 %>% c(c(.), 2) gives c(1, 1,
2). (And
> > I could add 1 %>% c(c(.), 2, .) and  1 %>% c(c(.), 2, . + 2)  to
> > continue the fun.)
>
> The rule is not so complicated.  Automatic insertion is done unless
> you use dot in the top level function or if you surround it with
> {...}.  It really makes sense since if you use gsub(pattern,
> replacement, .) then surely you don't want automatic insertion and if
> you surround it with { ... } then you are explicitly telling it not
> to.
>
>This is the point that I believe Duncan is trying to make (and I agree
with) though. Consider the question "after piping LHS into RHS, what is the
first argument in the resulting call?".

For the base pipe, the answer, completely unambiguously, is LHS. Full stop.
That is easy to understand.

For magrittr the answer is "Well, it depends, let me see your RHS
expression, is it wrapped in braces? If not, are you using the placeholder?
If you are using the placeholder, where/how are you using it?".

That is inherently much more complicated. Yes, you understand how the
magrittr pipe behaves, and yes you find it very convenient. Thats great,
but neither of those things equate to simplicity. They just mean that you,
a very experienced pipe user, carry around the cognitive load necessary to
have that understanding.

More concretely, the current base pipe  is extremely simple, all it does i

   1. Figure out RHS exprssion call
         1. If RHS is an anonymous function declaration, construct a call
         to it for a new RHS
      2. Insert LHS expression into first argument position of RHS call
      expression

Done. And (1) would be removed if anonymous functions required () after
them, which would be consistent, and even simpler, but kind of annoying. I
think it is a good compromise which is guaranteed to be safe because
anonymous functions are something the parser recognizes.  Either way, if
that was dropped, what |> does would be *entirely* trivial to understand
and explain. With a single sentence.

I had the equivalent pseudocode for the magrittr pipe written out here but
honestly felt like overkill that came across as mean, so I'll leave that as
an exercise to interested readers.

~G
> Assuming the existence of placeholders a possible simplification would
> be to NOT do automatic insertion if { ... } is used and to use it
> otherwise although personally having used it for some time I find the
> existing rule in magrittr generally does what you want.
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
	[[alternative HTML version deleted]]

R devel - Dec 2020 - New pipe operator

[Rd] New pipe operator

[Rd] New pipe operator

[Rd] New pipe operator