thr3ads.net - R help - [R] Pipe operator [Jan 2023]

Sorkin, John

2023-Jan-03 16:48 UTC

[R] Pipe operator

I am trying to understand the reason for existence of the pipe operator, %>%,
and when one should use it. It is my understanding that the operator sends the
file to the left of the operator to the function immediately to the right of the
operator:

c(1:10) %>% mean results in a value of 5.5 which is exactly the same as the
result one obtains using the mean function directly, viz. mean(c(1:10)). What is
the reason for having two syntactically different but semantically identical
ways to call a function? Is one more efficient than the other? Does one use less
memory than the other?

P.S. Please forgive what might seem to be a question with an obvious answer. I
am a programmer dinosaur. I have been programming for more than 50 years. When I
started programming in the 1960s the only pipe one spoke about was a bong.

John

Christopher Ryan

2023-Jan-03 16:57 UTC

[R] [External Email] Pipe operator

I think there are probably a number of purposes for (advantages to?)
the pipe operator. One is that it can avoid nested operations:

plot(mean(sqrt(c(1:10))))  ## this is my silly example code

which can get difficult to read.  This is arguably easier to read and
understand:

c(1:10) %>% sqrt() %>% mean() %>% plot()

As the chain of operations becomes longer, and as each "link" in the
chain becomes more complex, the value of the pipe approach, compared
to deep nesting in parentheses, increases, in my view.
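
A minimal sketch of the comparison above, assuming the base R pipe |> (R 4.1.0 or later) so that no package has to be loaded; with magrittr attached, %>% could be used in the same places. The plot() step is left out so the example only produces numbers, and the variable names are illustrative.

```r
# Nested form: read from the innermost call outward.
nested <- mean(sqrt(c(1:10)))

# Piped form: read left to right, one step per call.
piped <- c(1:10) |> sqrt() |> mean()

# Both forms make the same calls, so the results should match.
identical(nested, piped)
```

The gain is small here; it grows as more steps and more arguments are added.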

--Chris Ryan

Ivan Calandra

2023-Jan-03 16:58 UTC

[R] Pipe operator

Dear John,

some more experienced users might give you a different and more helpful 
answer, but I was not really convinced by the pipe operator until I 
tried it out, for the same reasons as you.

In my opinion, the pipe operator is there only to improve the
readability of your code. Think about e.g. format()ing or round()ing the
example you gave: you start having a lot of nested function calls and it
becomes difficult to read (because of lots of brackets, commas and so
on, and it gets worse when adding arguments). The pipe operator makes it
clearer.
An alternative to the pipe operator with good readability is creating
intermediate objects, but then you create a lot of otherwise useless
objects. Depending on the size of the objects, that could become problematic.

Somehow, I just ended up paraphrasing Wickham & Grolemund 
(https://r4ds.had.co.nz/pipes.html); they explain the advantages much 
better than I can.

In any case, once I started using it, I realized that all the pros for 
the pipe operator are real and now I like using it!

Best,
Ivan

Ebert,Timothy Aaron

2023-Jan-03 17:07 UTC

[R] Pipe operator

The pipe shortens code and results in fewer variables because you do not have to
save intermediate steps. Once you get used to the idea it is useful. Note that
there is also the |> pipe that is part of base R. As far as I know it does
the same thing as %>%; at least, at my level of programming I have not
encountered a difference.

Tim

Greg Snow

2023-Jan-03 17:35 UTC

[R] Pipe operator

To expand a little on Christopher's answer.

The short answer is that having the different syntaxes can lead to
more readable code (when used properly).

Note that there are now 2 different (but somewhat similar) pipes
available in R (there could be more in some package(s) that I don't
know about, but will just talk about the main 2).

The %>% pipe comes from the magrittr package, but many other packages
now import that package.  But you need to load the magrittr package,
either directly or indirectly, before you can use that pipe.  The
magrittr pipe is a function call, so there is a small increase in time
and memory for using it, but it is a small fraction of a second and a
few bytes of memory, so you probably will not notice the increased
usage.

The core R language now has a built-in pipe |> which is handled by the
parser, so there are no extra function calls and you do not need to load
any extra packages (though you need a somewhat recent version of R: 4.1.0
or later).

The built-in |> pipe is a little pickier: you need to include the
parentheses in a function call, e.g. 1:10 |> mean(), whereas the magrittr
pipe can work with that call or with the function without parentheses, e.g.
1:10 %>% mean or 1:10 %>% mean(). This makes %>% a little easier to
work with anonymous functions.  If the previous return needs to be
passed to an argument other than the first, then %>% uses "." and |>
uses "_".
The magrittr package has additional versions of the pipe and some
functions that wrap around common operators to make it easier to use
them with pipes, so there are still advantages to loading that package
if any of those are helpful.

For a simple case like your example, the pipe probably does not help
with readability much, but it helps more as we string more function calls together.
For example, here are 3 ways to compute the geometric mean of the data
in a vector "x":

exp(mean(log(x)))

logx <- log(x)
mlx <- mean(logx)
exp(mlx)

x |>
   log() |>
   mean() |>
   exp()

These all do the same thing, but the first option is read from the
middle outward (which can be tricky) and is even more complicated if
you use additional arguments to any of the functions.
The second option reads top down, but requires creating intermediate
variables.  The last reads similar to the second, but without the
extra variables.  Spreading the series of function calls across
multiple rows makes it easier to read and easily lets you insert a
line like `print() |>` for debugging or checking intermediate results,
and single lines can easily be commented out to skip that step.
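
To make the three versions concrete, here is a runnable sketch with a made-up vector x (the data are an assumption, not from the thread). It also drops a print() step into the piped version to show the debugging trick described above.

```r
x <- c(1, 2, 4, 8)   # illustrative data

# 1. Nested calls
gm_nested <- exp(mean(log(x)))

# 2. Intermediate variables
logx <- log(x)
mlx <- mean(logx)
gm_steps <- exp(mlx)

# 3. Pipe, with a print() step inserted to inspect the logs
gm_piped <- x |>
  log() |>
  print() |>     # print() returns its argument, so the chain continues
  mean() |>
  exp()

all.equal(gm_nested, gm_steps)
all.equal(gm_nested, gm_piped)
```

Commenting out the print() line removes the inspection step without touching the rest of the chain.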

I have found myself using code like the following to compute a table,
print it, and compute the proportions all in one step:

table(f, g) |>
  print() |>
  prop.table()

The pipes also work very well with the tidyverse, or even the tidy
data ideas without those packages where we use a single function for
each change, e.g. start with a data frame, select a subset of the
columns, filter to a subset of the rows, mutate a column, join to
another data frame, then pass the final result to a modeling function
like `lm` (and then pass that result to a summary function).  This is
nicely readable when each step is its own line.
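
A rough sketch of that one-step-per-line style using only base R functions and the built-in mtcars data set. The choice of columns, the unit conversion and the model are assumptions made for illustration, and the join step is left out. The "_" placeholder needs R 4.2.0 or later.

```r
# Requires R >= 4.2.0 for the `_` placeholder.
# mtcars ships with R; its wt column is in units of 1000 lb.
fit_summary <- mtcars |>
  subset(cyl == 4, select = c(mpg, wt)) |>   # filter rows, select columns
  transform(wt_kg = wt * 453.592) |>         # add a derived column
  lm(mpg ~ wt_kg, data = _) |>               # data is not lm's first argument
  summary()

fit_summary$r.squared
```

The tidyverse equivalents (select, filter, mutate) would slot into the same positions.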

-- 
Gregory (Greg) L. Snow Ph.D.
538280 at gmail.com

@vi@e@gross m@iii@g oii gm@ii@com

2023-Jan-03 17:40 UTC

[R] Pipe operator

John,

The topic has indeed been discussed here endlessly but new people still
stumble upon it.

Until recently, the formal R language did not have a built-in pipe
functionality. It was widely used through an assortment of packages and
there are quite a few variations on the theme including different
implementations.

Most existing code does use the operator %>% but there is now a built-in
|> operator that is generally faster but is not as easy to use in a few cases.

Please forget the use of the word FILE here. Pipes are a form of syntactic
sugar that generally is about the FIRST argument to a function. They are NOT
meant to be used just for the trivial case you mention where indeed there is
an easy way to do things. Yes, they work in such situations. But consider a
deeply nested expression like this:

Result <- round(max(cos(x), 3.14159/4), 3)

There are MANY deeper nested expressions like this commonly used. The above
can be written linearly as in

Temp1 <- cos(x)
Temp2 <- max(Temp1, 3.14159/4)
Result <- round(Temp2, 3)

Translation, for some variable x, calculate the cosine and take the maximum
value of it as compared to pi/4 and round the result to three decimal
places. Not an uncommon kind of thing to do and sometimes you can nest such
things many layers deep and get hopelessly confused if not done somewhat
linearly.

What pipes allow is to write this closer to the second way while not seeing
or keeping any temporary variables around. The goal is to replace the FIRST
argument to a function with whatever resulted as the value of the previous
expression. That is often a vector or data.frame or list or any kind of
object but can also be fairly complex as in a list of lists of matrices.

So you can still start with cos(x) OR you can write this where the x is
removed from within and leaves cos() empty:

x %>% cos
or
x |> cos()

With the older %>% pipe the parentheses after cos are optional if
there are no additional arguments, but the new |> pipe requires them.

So continuing the above, using multiple lines, the pipe looks like:

Result <-
  x %>%
  cos() %>%
  max(3.14159/4) %>%
  round(3)

This gives the same result but is arguably easier for some to read and
follow. Nobody forces you to use it and for simple cases, most people don't.
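
A quick way to check that the three spellings agree, sketched with the base |> pipe so that no package is needed (the %>% version above behaves the same way once magrittr is loaded). The value of x and the result names are illustrative assumptions.

```r
x <- 0.5   # illustrative value

# Nested
nested <- round(max(cos(x), 3.14159/4), 3)

# Temporary variables
Temp1 <- cos(x)
Temp2 <- max(Temp1, 3.14159/4)
stepwise <- round(Temp2, 3)

# Piped: each result becomes the first argument of the next call
piped <-
  x |>
  cos() |>
  max(3.14159/4) |>
  round(3)

identical(nested, stepwise) && identical(nested, piped)
```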

There is a grouping of packages called the tidyverse that makes heavy,
routine use of pipes: most or all of its functions are written so that the
first argument is the one normally piped to. It can be very handy to write
code that says: read your data into a variable (often a data.frame or
tibble), PIPE IT to a function that renames some columns, PIPE the resulting
modified object to a function that retains only selected rows, pipe that
to a function that drops some of the columns, pipe that to a function
that groups or sorts the items, and pipe that to a function that does a
join with another object, generates a report, or any of many other things.

So the real answer is that piping is another WAY of doing things from a
programmer's perspective. Underneath it all, it is mostly syntactic sugar and
the interpreter rearranges your code and performs the steps in what seems
like a different order at times. Generally, you do not need to care.
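
For the built-in |> pipe, the rearrangement can be seen directly, because it happens when the code is parsed. A small sketch, assuming R 4.1.0 or later (the magrittr %>% is an ordinary function call and does its work at run time instead):

```r
# quote() returns the parsed expression without evaluating it,
# so x does not need to exist.
piped_call  <- quote(x |> cos() |> max(3.14159/4) |> round(3))
nested_call <- quote(round(max(cos(x), 3.14159/4), 3))

print(piped_call)                    # shows the nested form
identical(piped_call, nested_call)
```

By the time R evaluates anything, the piped version has already become the nested one.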



Uwe Ligges

2023-Jan-03 22:32 UTC

[R] Pipe operator

R is a functional language, hence the pipe operator is not needed.
Also it makes the code unreadable as it is less obvious what the call stack
looks like and what the arguments to the function calls are.

It is relevant for a shell for piping text streams.

If people cannot live without the pipe operator (and I wonder why you 
want to add a level of complexity, as it is more obfuscated what the 
actual function calls are), please use R's internal one, as it is known 
by the parser and hence debugging etc is better integrated.

Best,
Uwe Ligges



Richard O'Keefe

2023-Jan-04 00:37 UTC

[R] Pipe operator

The simplest and best answer is "fashion".
In FSharp,
> (|>);;
val it: ('a -> ('a -> 'b) -> 'b)
The ability to turn f x y into y |> f x
makes perfect sense in a programming language
where Currying (representing a function of n
arguments as a function of 1 argument that
returns a function of n-1 arguments, similarly
represented) is a way of life.  It can result
in code that is more readable.  And it is
pretty much unavoidable:
let (|>) x f = f x
is definable in the language.

In programming languages like Erlang and R,
where Currying is *not* a way of life, the
matter is otherwise.

Really, it's all about whether you talk like Luke
or like Yoda talk, it's not about what you say or
efficiency or anything but perceived readability.

Milan Glacier

2023-Jan-04 04:45 UTC

[R] Pipe operator

With 50 years of programming experience, just think about how useful
the pipe operator is in shell scripting. The output of the previous call
becomes the input of the next call... Genius idea from our beloved Unix
convention...

