Hadley Wickham
2017-Mar-19 02:39 UTC
[Rd] RFC: (in-principle) native unquoting for standard evaluation
Would this return a quosure? (i.e. a single sided formula that captures both expression and environment). That's the data structure we've adopted in tidyeval as it already has some built in support. Hadley On Friday, March 17, 2017, Michael Lawrence <lawrence.michael at gene.com> wrote:> Interesting idea. Lazy and non-standard evaluation is going to happen; the > language needs a way to contain it. > > I'll extend the proposal so that prefixing a formal argument with @ in > function() marks the argument as auto-quoting, so it arrives as a language > object without use of substitute(). Kind of like how '*' in C declares a > pointer and dereferences one. > > subset <- function(x, @subset, ...) { } > > This should make it easier to implement such functions, simplify > compilation, and allow detection of potential quoting errors through static > analysis. > > Michael > > On Thu, Mar 16, 2017 at 5:03 PM, Jonathan Carroll <jono at jcarroll.com.au > <javascript:;>> > wrote: > > > (please be gentle, it's my first time) > > > > I am interested in discussions (possibly reiterating past threads -- > > searching didn't turn up much) on the possibility of supporting standard > > evaluation unquoting at the language level. This has been brought up in a > > recent similar thread here [1] and on Twitter [2] where I proposed the > > following desired (in-principle) syntax > > > > f <- function(col1, col2, new_col_name) { > > mtcars %>% mutate(@new_col_name = @col1 + @col2) > > } > > > > or closer to home > > > > x <- 1:10; y <- "x" > > data.frame(z = @y) > > > > where @ would be defined as a unary prefix operator which substitutes the > > quoted variable name in-place, to allow more flexibility of NSE functions > > within a programming context. This mechanism exists within MySQL [3] (and > > likely other languages) and could potentially be extremely useful. > Several > > alternatives have been incorporated into packages (most recently work > > on tidyeval) none of which appear to fully match the simplicity of the > > above, and some of which cut a forceful path through the syntax tree. > > > > The exact syntax isn't my concern at the moment (@ vs unquote() or other, > > though the first requires user-supplied native prefix support within the > > language, as per [1]) and neither is the exact way in which this would be > > achieved (well above my pay grade). The practicality of @ being on the > LHS > > of `=` is also of a lesser concern (likely greater complexity) than the > > RHS. > > > > I hear there exists (justified) reluctance to add new syntax to the > > language, but I think this has sufficient merit (and a growing number of > > workarounds) to warrant continued discussion. > > > > With kindest regards, > > > > - Jonathan. > > > > [1] https://stat.ethz.ch/pipermail/r-devel/2017-March/073894.html > > [2] https://twitter.com/carroll_jono/status/842142292253196290 > > [3] https://dev.mysql.com/doc/refman/5.7/en/user-variables.html > > > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-devel at r-project.org <javascript:;> mailing list > > https://stat.ethz.ch/mailman/listinfo/r-devel > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-devel at r-project.org <javascript:;> mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel >-- http://hadley.nz [[alternative HTML version deleted]]
Jonathan Carroll
2017-Mar-19 03:14 UTC
[Rd] RFC: (in-principle) native unquoting for standard evaluation
Firstly, credit where due: the lazyeval NSE vignette [1] covers so many of
the angles that this proposal needs to address and is extremely well
written (even if it has been superseded). The @ prefix I'm proposing is a
drop-in replacement for `uq()` (as used in that vignette) but for which the
`f_eval()` and `~` steps would not be required by the author/user.
This is proposed as an admittedly naive suggestion which fails to account
for the subtleties raised in [1] such as unquoting of multiple arguments
and scope selection. I am hoping that the discussion can cover how best to
address those matters.
The significant hurdles (apart from implementation which I cannot speak to)
that are dealt with in lazyeval (and presumably tidyeval) seem to be:
- a prefix can be attached to only a single object, so the extra_args
example from [1] would not be possible. I'm not certain why the unquoting
of the variable would not still be possible with the form
variable = "x"
mean(@variable, na.rm = TRUE, trim = 0.9)
since I'm proposing that the call need not be a formula (I may be way off
on this interpretation).
- I am proposing that the new syntax be able to achieve the example
f <- function(col1, col2, new_col_name) {
mtcars %>% mutate(@new_col_name = @col1 + @col2)
}
but this is ambiguous if there is, say, an object "mpg" within that
function scope. [1] handles this with .env and .data pronouns but this
doesn't seem possible with just a prefix. One solution may be to have @@
and @ representing these two options.
I appreciate the significant work that has gone into the tidyverse packages
which use NSE and my intention is not to downplay any of that. I would just
like to be able to use the language more efficiently, so native access to
the unquoting seems like a step forward.
Kindest regards,
- Jonathan.
[1] https://cran.r-project.org/web/packages/lazyeval/vignettes/lazyeval.html
On Sun, Mar 19, 2017 at 1:09 PM, Hadley Wickham <h.wickham at gmail.com>
wrote:
> Would this return a quosure? (i.e. a single sided formula that captures
> both expression and environment). That's the data structure we've
adopted
> in tidyeval as it already has some built in support.
>
> Hadley
>
> On Friday, March 17, 2017, Michael Lawrence <lawrence.michael at
gene.com>
> wrote:
>
>> Interesting idea. Lazy and non-standard evaluation is going to happen;
the
>> language needs a way to contain it.
>>
>> I'll extend the proposal so that prefixing a formal argument with @
in
>> function() marks the argument as auto-quoting, so it arrives as a
language
>> object without use of substitute(). Kind of like how '*' in C
declares a
>> pointer and dereferences one.
>>
>> subset <- function(x, @subset, ...) { }
>>
>> This should make it easier to implement such functions, simplify
>> compilation, and allow detection of potential quoting errors through
>> static
>> analysis.
>>
>> Michael
>>
>> On Thu, Mar 16, 2017 at 5:03 PM, Jonathan Carroll <jono at
jcarroll.com.au>
>> wrote:
>>
>> > (please be gentle, it's my first time)
>> >
>> > I am interested in discussions (possibly reiterating past threads
--
>> > searching didn't turn up much) on the possibility of
supporting standard
>> > evaluation unquoting at the language level. This has been brought
up in
>> a
>> > recent similar thread here [1] and on Twitter [2] where I proposed
the
>> > following desired (in-principle) syntax
>> >
>> > f <- function(col1, col2, new_col_name) {
>> > mtcars %>% mutate(@new_col_name = @col1 + @col2)
>> > }
>> >
>> > or closer to home
>> >
>> > x <- 1:10; y <- "x"
>> > data.frame(z = @y)
>> >
>> > where @ would be defined as a unary prefix operator which
substitutes
>> the
>> > quoted variable name in-place, to allow more flexibility of NSE
>> functions
>> > within a programming context. This mechanism exists within MySQL
[3]
>> (and
>> > likely other languages) and could potentially be extremely useful.
>> Several
>> > alternatives have been incorporated into packages (most recently
work
>> > on tidyeval) none of which appear to fully match the simplicity of
the
>> > above, and some of which cut a forceful path through the syntax
tree.
>> >
>> > The exact syntax isn't my concern at the moment (@ vs
unquote() or
>> other,
>> > though the first requires user-supplied native prefix support
within the
>> > language, as per [1]) and neither is the exact way in which this
would
>> be
>> > achieved (well above my pay grade). The practicality of @ being on
the
>> LHS
>> > of `=` is also of a lesser concern (likely greater complexity)
than the
>> > RHS.
>> >
>> > I hear there exists (justified) reluctance to add new syntax to
the
>> > language, but I think this has sufficient merit (and a growing
number of
>> > workarounds) to warrant continued discussion.
>> >
>> > With kindest regards,
>> >
>> > - Jonathan.
>> >
>> > [1] https://stat.ethz.ch/pipermail/r-devel/2017-March/073894.html
>> > [2] https://twitter.com/carroll_jono/status/842142292253196290
>> > [3] https://dev.mysql.com/doc/refman/5.7/en/user-variables.html
>> >
>> > [[alternative HTML version deleted]]
>> >
>> > ______________________________________________
>> > R-devel at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-devel
>> >
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
>
> --
> http://hadley.nz
>
[[alternative HTML version deleted]]
Michael Lawrence
2017-Mar-19 05:51 UTC
[Rd] RFC: (in-principle) native unquoting for standard evaluation
Yes, it would bind the language object to the environment, like an R-level promise (but "promise" of course refers specifically to just _lazy_ evaluation). For the uqs() thing, expanding calls like that is somewhat orthogonal to NSE. It would be nice in general to be able to write something like mean(x, extra_args...) without resorting to do.call(mean, c(list(x), extra_args)). If we had that then uqs() would just be the combination of unquote and expansion, i.e., mean(x, @extra_args...). The "..." postfix would not work since it's still a valid symbol name, but we could come up with something. Michael On Sat, Mar 18, 2017 at 7:39 PM, Hadley Wickham <h.wickham at gmail.com> wrote:> Would this return a quosure? (i.e. a single sided formula that captures both > expression and environment). That's the data structure we've adopted in > tidyeval as it already has some built in support. > > Hadley > > > On Friday, March 17, 2017, Michael Lawrence <lawrence.michael at gene.com> > wrote: >> >> Interesting idea. Lazy and non-standard evaluation is going to happen; the >> language needs a way to contain it. >> >> I'll extend the proposal so that prefixing a formal argument with @ in >> function() marks the argument as auto-quoting, so it arrives as a language >> object without use of substitute(). Kind of like how '*' in C declares a >> pointer and dereferences one. >> >> subset <- function(x, @subset, ...) { } >> >> This should make it easier to implement such functions, simplify >> compilation, and allow detection of potential quoting errors through >> static >> analysis. >> >> Michael >> >> On Thu, Mar 16, 2017 at 5:03 PM, Jonathan Carroll <jono at jcarroll.com.au> >> wrote: >> >> > (please be gentle, it's my first time) >> > >> > I am interested in discussions (possibly reiterating past threads -- >> > searching didn't turn up much) on the possibility of supporting standard >> > evaluation unquoting at the language level. This has been brought up in >> > a >> > recent similar thread here [1] and on Twitter [2] where I proposed the >> > following desired (in-principle) syntax >> > >> > f <- function(col1, col2, new_col_name) { >> > mtcars %>% mutate(@new_col_name = @col1 + @col2) >> > } >> > >> > or closer to home >> > >> > x <- 1:10; y <- "x" >> > data.frame(z = @y) >> > >> > where @ would be defined as a unary prefix operator which substitutes >> > the >> > quoted variable name in-place, to allow more flexibility of NSE >> > functions >> > within a programming context. This mechanism exists within MySQL [3] >> > (and >> > likely other languages) and could potentially be extremely useful. >> > Several >> > alternatives have been incorporated into packages (most recently work >> > on tidyeval) none of which appear to fully match the simplicity of the >> > above, and some of which cut a forceful path through the syntax tree. >> > >> > The exact syntax isn't my concern at the moment (@ vs unquote() or >> > other, >> > though the first requires user-supplied native prefix support within the >> > language, as per [1]) and neither is the exact way in which this would >> > be >> > achieved (well above my pay grade). The practicality of @ being on the >> > LHS >> > of `=` is also of a lesser concern (likely greater complexity) than the >> > RHS. >> > >> > I hear there exists (justified) reluctance to add new syntax to the >> > language, but I think this has sufficient merit (and a growing number of >> > workarounds) to warrant continued discussion. >> > >> > With kindest regards, >> > >> > - Jonathan. >> > >> > [1] https://stat.ethz.ch/pipermail/r-devel/2017-March/073894.html >> > [2] https://twitter.com/carroll_jono/status/842142292253196290 >> > [3] https://dev.mysql.com/doc/refman/5.7/en/user-variables.html >> > >> > [[alternative HTML version deleted]] >> > >> > ______________________________________________ >> > R-devel at r-project.org mailing list >> > https://stat.ethz.ch/mailman/listinfo/r-devel >> > >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-devel at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-devel > > > > -- > http://hadley.nz
Lionel Henry
2017-Mar-22 19:17 UTC
[Rd] RFC: (in-principle) native unquoting for standard evaluation
ML> For the uqs() thing, expanding calls like that is somewhat orthogonal
ML> to NSE. It would be nice in general to be able to write something like
ML> mean(x, extra_args...) without resorting to do.call(mean, c(list(x),
ML> extra_args)).
This is not completely true because splicing is necessarily linked to
the principle of unquoting (evaluating). You cannot splice something
that you don't know the value of, you have to evaluate the promise of
the splicing operand. In other words, you cannot splice at the parser
level, only at the interpreter level, and the splicing operation has
to be part of the call tree. This implies the important limitation
that you cannot splice a list in a call to a function taking named
arguments, you can only splice when capturing dots. On the plus side,
it seems more R-like to implement it as a regular function call since
all syntactic operations in R are function calls.
Since splicing is conceptually linked to unquoting, I think it would
make sense to have a derivative operator, e.g. @@. In that case it
would simply take its argument by expression and could thus be defined
as:
`@@` <- `~`.
It'd be used like this:
# Equivalent to as.list(mtcars)
list(@@ mtcars)
# Returns a list of symbols
list(@@ lapply(letters, as.symbol))
To make it work we'd have two functions for capturing dots that would
understand arguments wrapped in an `@@` quosure. dotsValues(...)
would expand spliced arguments and then evaluate them, while
dotsExprs(...) would expand and return a list of quosures. Dotted
primitive functions like list() or c() would also need to preprocess
the dots with a C function.
Another reason not to use `...` as syntax for splicing is that it may
be better to reserve it for forwarding operations. I think one other
syntax update that would be worthwile to consider is forwarding of
named arguments. This would allow labelling of arguments to work
transparently across wrappers:
my_plot <- function(x) plot(1:10, ...(x))
# The y axis is correctly labelled as 11:20 in the plot
my_plot(11:20)
And this would also allow to forward named arguments to functions
taking their arguments by expression, just like we forward dots.
Lionel
Reasonably Related Threads
- RFC: (in-principle) native unquoting for standard evaluation
- RFC: (in-principle) native unquoting for standard evaluation
- RFC: (in-principle) native unquoting for standard evaluation
- RFC: (in-principle) native unquoting for standard evaluation
- RFC: (in-principle) native unquoting for standard evaluation