thr3ads.net - R devel - [Rd] should base R have a piping operator ? [Oct 2019]

Iñaki Ucar

2019-Oct-05 16:30 UTC

[Rd] should base R have a piping operator ?

On Sat, 5 Oct 2019 at 17:15, Hugh Marera <hugh.marera at gmail.com> wrote:
>
> How is your argument different to, say, "Should dplyr or data.table be
> part of base R as they are the most popular data science packages and they
> are used by a large number of users?"

Two packages with many features, dozens of functions and under heavy
development to fix bugs, add new features and improve performance, vs.
a single operator with limited and well-defined functionality, and a
reference implementation that hasn't changed in years (though it is
certainly hackish, in a way that could probably only be improved from
within R itself).

Can't you really spot the difference?

Iñaki

Hugh Marera

2019-Oct-05 17:53 UTC

I exaggerated the comparison for effect. However, it is not very difficult
to find functions in dplyr or data.table or indeed other packages that one
may wish to be in base R. Examples, for me, could include
data.table::fread, dplyr::group_by & dplyr::summari[sZ]e combo, etc. Also,
the "popularity" of magrittr::`%>%` is mostly attributable to the tidyverse
(an advanced superset of R). Many R users don't even know that they are
installing the magrittr package.

Iñaki Ucar

2019-Oct-05 18:08 UTC

On Sat, 5 Oct 2019 at 19:54, Hugh Marera <hugh.marera at gmail.com> wrote:
>
> [...] it is not very difficult to find functions in dplyr or data.table or
> indeed other packages that one may wish to be in base R. Examples, for me,
> could include data.table::fread

You have utils::read.table and the like.

> dplyr::group_by & dplyr::summari[sZ]e combo

base::tapply, base::by, stats::aggregate.

> [...] Many R users don't even know that they are installing the
> magrittr package.

And that's one of the reasons why the proposal makes sense. Another
one is that the pipe plays well with many base R functions, such as
subset, transform, merge, aggregate and reshape.
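
As a rough illustration of the base equivalents listed above for the
group_by/summarise combo, here is a minimal sketch using tapply and
aggregate. The data frame and its column names are made up for the
example; nothing here comes from the thread itself.

```r
# Toy data; the column names are assumptions for this example only.
df <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  value = c(1, 2, 3, 4, 5)
)

# Grouped mean as a named 1-d array.
means_vec <- tapply(df$value, df$group, mean)

# Grouped mean as a data frame, closer in shape to what
# group_by + summarise would return.
means_df <- aggregate(value ~ group, data = df, FUN = mean)

print(means_vec)
print(means_df)
```

tapply is handy when a named vector is enough; aggregate keeps the result
as a data frame, which is usually what a later step expects.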

Iñaki

Ant F

2019-Oct-05 18:26 UTC

Yes, but this exaggeration precisely misses the point.

Concerning your examples:

* I love fread, but I think it makes a lot of subjective choices that are
best associated with a package. I think it has changed a lot over time and
can still change, and we have great developers willing to maintain it and
be responsive to feature requests and bug reports.

* group_by() adds a class that works only (or mostly) with tidyverse verbs,
so it is very easy to dismiss as a candidate for inclusion in base R.

* summarize is an alternative to aggregate; it would be very confusing to
have both.

Now to be fair to your argument we could think of other functions such as
data.table::rleid() which I believe base R misses deeply,
and there is nothing wrong with packaged functions making their way to base
R.
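
For readers who don't know rleid(), here is a minimal sketch of the idea
(one id per run of consecutive equal values) built on base::rle. The
helper name rleid_base is hypothetical; it handles a single vector only,
and its treatment of NA may differ from data.table::rleid.

```r
# Hypothetical helper, not part of base R or data.table:
# assigns an increasing integer id to each run of equal consecutive values.
rleid_base <- function(x) {
  runs <- rle(x)
  rep(seq_along(runs$lengths), times = runs$lengths)
}

rleid_base(c("a", "a", "b", "a", "a", "a"))
# 1 1 2 3 3 3
```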

Maybe there's an existing list of criteria for inclusion in base R, but if
not I can make one up for the sake of this discussion :) :
* 1) the functionality should not already exist
* 2) the function should be general enough
* 3) the function should have a large number of potential users
* 4) the function should be robust, and not require extensive maintenance
* 5) the function should be stable; we shouldn't expect new features every 2
months
* 6) the function should have an intuitive interface in the context of the
rest of base R

I guess 1 and 6 could be held against my proposal, because:
(1) everything can be done without pipes
(6) they are somewhat surprising (though with explicit dots not that much,
and no more surprising than, say, `bquote()`)

In my opinion the pluses outweigh the minuses.

I wouldn't advise taking magrittr's pipe (provided the license allows it),
for instance, because it makes a lot of design choices and has complex
behavior. What I propose is 2 lines of code, very unlikely to evolve or
require maintenance.
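
To make the idea concrete, here is a minimal sketch of what a pipe with
explicit dots could look like. The operator name %then% and this exact
definition are assumptions for illustration, not necessarily the code
proposed in this thread. It evaluates the right-hand side with `.` bound
to the left-hand value and looks up everything else in the caller's
environment.

```r
# Hypothetical operator name; a sketch, not a reference implementation.
`%then%` <- function(lhs, rhs) {
  eval(substitute(rhs), envir = list(. = lhs), enclos = parent.frame())
}

# Toy data; the column names are made up for this example.
df <- data.frame(
  g    = c("a", "a", "b", "b"),
  x    = c(1, 2, 3, 4),
  keep = c(TRUE, TRUE, TRUE, FALSE)
)

# The dot is always explicit, so it can go in any argument position.
res <- df %then%
  subset(., keep) %then%
  transform(., y = x * 10) %then%
  aggregate(y ~ g, data = ., FUN = sum)

print(res)
```

Because the dot must be written out, there is no rule to learn about where
the left-hand side gets inserted, which keeps the behavior small enough to
describe in one sentence.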

Antoine

PS: I just receive the digest once a day, so if you don't "reply all" I can
only react later.
