Hi all,

A while back, Hadley and I explored what an iteration protocol for R
might look like. We worked through motivations, design choices, and edge
cases, which we documented here:
https://github.com/t-kalinowski/r-iterator-ideas

At the end of this process, I put together a patch to R (with tests) and
would like to invite feedback from R Core and the broader community:
https://github.com/r-devel/r-svn/pull/130/files?diff=unified&w=1

In summary, the overall design is a minimal patch. It introduces no
breaking changes and essentially no new overhead. There are two parts.

1. Add a new `as.iterable()` S3 generic, with a default identity
method. This provides a user-extensible mechanism for selectively
changing the iteration behavior for some object types passed to
`for`. `as.iterable()` methods are expected to return anything that
`for` can handle directly, namely, vectors or pairlists, or (new) a
closure.

2. `for` gains the ability to accept a closure for the iterable
argument. A closure is called repeatedly for each loop iteration
until the closure returns an `exhausted` sentinel value, which it
received as an input argument.

Here is a small example of using the iteration protocol to implement a
sequence of random samples:

``` r
SampleSequence <- function(n) {
  i <- 0
  function(done = NULL) {
    if (i >= n) {
      return(done)
    }
    i <<- i + 1
    runif(1)
  }
}

for (sample in SampleSequence(2)) {
  print(sample)
}
# [1] 0.7677586
# [1] 0.355592
```
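
For readers who want to experiment without building the patched R, the two parts can be approximated in current R. This is only a sketch under stated assumptions: `as_iterable()` and `iterate()` are hypothetical stand-ins for the proposed `as.iterable()` generic and for the extended `for`, the `countdown` class is invented for illustration, and using a fresh environment as the sentinel is a guess rather than something taken from the patch.

``` r
# Hypothetical stand-in for the proposed as.iterable() generic.
as_iterable <- function(x) UseMethod("as_iterable")
as_iterable.default <- function(x) x  # identity method, as in the proposal

# Invented example class: iterating a "countdown" yields from, from - 1, ..., 1.
as_iterable.countdown <- function(x) {
  i <- x$from
  function(done = NULL) {
    if (i < 1) return(done)
    value <- i
    i <<- i - 1
    value
  }
}

# Hypothetical stand-in for the extended `for`: calls `body` on each element.
iterate <- function(x, body) {
  it <- as_iterable(x)
  if (!is.function(it)) {
    for (elt in it) body(elt)  # vectors and pairlists: ordinary `for`
    return(invisible(NULL))
  }
  exhausted <- new.env()       # fresh object; identical() only matches itself
  repeat {
    elt <- it(exhausted)
    if (identical(elt, exhausted)) break
    body(elt)
  }
  invisible(NULL)
}

cd <- structure(list(from = 3), class = "countdown")
iterate(cd, print)
```

Objects without an `as_iterable()` method fall through the default identity method and are looped over as usual, which mirrors the "no breaking changes" intent of the proposal.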
Best,
Tomasz
1. I'm not sure I see the need for the syntax change. Couldn't this all
be done in a while or repeat loop? E.g. your example could keep the
same definition of SampleSequence, then

    iterator <- SampleSequence(2)
    repeat {
      sample <- iterator()
      if (is.null(sample)) break
      print(sample)
    }

Not as simple as yours, but I think a little clearer because it's more
concrete, less abstract.

2. It's not clear to me how the for() loop chooses a value to pass to
the iterator function. (Sorry, I couldn't figure it out from your
patch.) Is "exhausted" a unique value produced each time for() is
called? Is it guaranteed to be unique? What does a user see if they
look at it?
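
The two points above interact: the `repeat` version relies on `NULL` as the end marker, so it only works when the iterator never yields `NULL` as a real value. The sketch below illustrates the difference. `ListIterator` and the counter variables are hypothetical names, and using a fresh environment as the marker is an assumption made for illustration, not necessarily what the patch does.

``` r
# Iterator over a list whose elements may legitimately be NULL.
ListIterator <- function(x) {
  i <- 0
  function(done = NULL) {
    if (i >= length(x)) return(done)
    i <<- i + 1
    x[[i]]
  }
}

values <- list(1, NULL, 3)

# With the default NULL marker, the loop stops early at the NULL element.
iterator <- ListIterator(values)
n_seen_null <- 0
repeat {
  item <- iterator()
  if (is.null(item)) break
  n_seen_null <- n_seen_null + 1
}

# With a fresh marker object, only true exhaustion ends the loop.
iterator <- ListIterator(values)
marker <- new.env()
n_seen_marker <- 0
repeat {
  item <- iterator(marker)
  if (identical(item, marker)) break
  n_seen_marker <- n_seen_marker + 1
}

c(null_marker = n_seen_null, fresh_marker = n_seen_marker)
```

Because `identical()` compares environments by reference, a marker created inside the loop driver cannot collide with any value the iterator produces on its own.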
Duncan Murdoch
On 2025-08-11 3:23 p.m., Tomasz Kalinowski wrote:
> [...]
Hello,
A couple of comments:
- Regarding the closure + sentinel approach, also implemented in coro
  (https://github.com/r-lib/coro/blob/main/R/iterator.R), it's more robust
  for the sentinel to always be a temporary value. If you store the
  sentinel in a list or a namespace, it might inadvertently close
  iterators when iterating over that collection. That's why the coro
  sentinel is created with `coro::exhausted()` rather than exported from
  the namespace as a constant object. The sentinel can equivalently be
  created with `as.symbol(".__exhausted__.")`; the main thing for
  robustness is to avoid storing it and always create it from scratch.

  The approach of passing the sentinel by argument (which I see in the
  example in your mail but not in the linked documentation of approach 3)
  also works if the iterator loop passes a unique sentinel. Having a
  default of `NULL` makes it likely to get unexpected exhaustion of
  iterators when a sentinel is not passed in, though.

- It's very useful to _close_ iterators for resource cleanup. It's the
  responsibility of an iterator loop (e.g. `for`, but it could be another
  custom tool invoking the iterator) to close them. See
  https://github.com/r-lib/coro/pull/58 for an interesting application of
  iterator closing, allowing robust support of `on.exit()` expressions in
  coro generators.

  To implement iterator closing with the closure approach, an iterator may
  optionally take a `close` argument. A `TRUE` value is passed on exit,
  instructing the iterator to clean up resources.
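
The closing convention described above might look roughly like the following in plain R. This is a sketch under assumptions, not coro's API or the patch's behavior: `LineIterator` and `iterate_and_close()` are hypothetical names, the `function(done, close)` signature is one possible reading of the `close` argument, and the sentinel is a fresh environment created inside the driver so that it is never stored anywhere.

``` r
# Hypothetical iterator over the lines of a connection, with optional closing.
LineIterator <- function(con) {
  closed <- FALSE
  function(done = NULL, close = FALSE) {
    if (close) {
      if (!closed) {
        base::close(con)   # release the resource exactly once
        closed <<- TRUE
      }
      return(invisible(done))
    }
    if (closed) return(done)
    line <- readLines(con, n = 1)
    if (length(line) == 0) return(done)
    line
  }
}

# Hypothetical loop driver: it owns the sentinel and the cleanup duty.
iterate_and_close <- function(it, body) {
  exhausted <- new.env()   # temporary sentinel, created from scratch each time
  # Only iterators that accept a `close` argument are asked to clean up.
  if ("close" %in% names(formals(it))) {
    on.exit(it(close = TRUE), add = TRUE)  # runs on normal exit and on error
  }
  repeat {
    elt <- it(exhausted)
    if (identical(elt, exhausted)) break
    body(elt)
  }
  invisible(NULL)
}

seen <- character()
it <- LineIterator(textConnection(c("a", "b", "c")))
try(
  iterate_and_close(it, function(line) {
    if (line == "b") stop("early failure")
    seen <<- c(seen, line)
  }),
  silent = TRUE
)
```

Even though the loop body fails on the second line, the `on.exit()` handler still asks the iterator to close its connection, which is the point of making the loop responsible for cleanup.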
Best,
Lionel
On Mon, Aug 11, 2025 at 3:24 PM Tomasz Kalinowski <kalinowskit at gmail.com> wrote:
> [...]
Great stuff, and I like the use of a sentinel as a terminator symbol.

One aspect of this I would like to explore is that of a lazy sequence as
a more fundamental language primitive. Generators in for loops are
great, but generators returned by lapply() and friends would enable lazy
functional transformations and efficient combination of processing
steps. At the lowest level I can see this being facilitated by an ALTREP
protocol with a similar API to what you propose.

One big pain point of course is parallel processing. A two level design
splitting the iterator index and data generation (like C++ does) could
be a better fit if parallelization is desired.

Curious to hear your thoughts.

Best,
Taras

On Aug 11, 2025, at 9:23 PM, Tomasz Kalinowski <kalinowskit at gmail.com> wrote:
> [...]
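
As a rough illustration of the lazy functional transformations mentioned in this last message, closure iterators of the proposed shape can already be composed in plain R. The helpers `Counter`, `lazy_map()`, `lazy_filter()`, and `collect()` are hypothetical names invented for this sketch; it says nothing about ALTREP or parallel execution, and it assumes the consumer passes a fresh sentinel that each wrapper forwards unchanged.

``` r
# Source iterator: yields 1, 2, ..., n, then the marker it was given.
Counter <- function(n) {
  i <- 0
  function(done = NULL) {
    if (i >= n) return(done)
    i <<- i + 1
    i
  }
}

# Wrap an iterator so that `f` is applied to each element on demand.
lazy_map <- function(it, f) {
  function(done = NULL) {
    elt <- it(done)
    if (identical(elt, done)) return(done)
    f(elt)
  }
}

# Wrap an iterator so that only elements satisfying `keep` are yielded.
lazy_filter <- function(it, keep) {
  function(done = NULL) {
    repeat {
      elt <- it(done)
      if (identical(elt, done) || keep(elt)) return(elt)
    }
  }
}

# Consumer: drains an iterator into a list, using a fresh marker.
collect <- function(it) {
  marker <- new.env()
  out <- list()
  repeat {
    elt <- it(marker)
    if (identical(elt, marker)) break
    out <- c(out, list(elt))
  }
  out
}

calls <- 0
squares <- lazy_map(Counter(5), function(x) {
  calls <<- calls + 1
  x^2
})
evens <- lazy_filter(squares, function(x) x %% 2 == 0)
calls_before <- calls   # nothing has been computed at this point
result <- collect(evens)
```

Building the pipeline performs no work; the mapping function only runs as `collect()` pulls elements through, one at a time, so intermediate vectors are never materialized.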