Hi all,

A while back, Hadley and I explored what an iteration protocol for R
might look like. We worked through motivations, design choices, and edge
cases, which we documented here:
https://github.com/t-kalinowski/r-iterator-ideas

At the end of this process, I put together a patch to R (with tests) and
would like to invite feedback from R Core and the broader community:
https://github.com/r-devel/r-svn/pull/130/files?diff=unified&w=1

In summary, the overall design is a minimal patch. It introduces no
breaking changes and essentially no new overhead. There are two parts.

1.  Add a new `as.iterable()` S3 generic, with a default identity
    method. This provides a user-extensible mechanism for selectively
    changing the iteration behavior for some object types passed to
    `for`. `as.iterable()` methods are expected to return anything that
    `for` can handle directly, namely, vectors or pairlists, or (new) a
    closure.

2.  `for` gains the ability to accept a closure for the iterable
    argument. A closure is called repeatedly for each loop iteration
    until the closure returns an `exhausted` sentinel value, which it
    received as an input argument.

Here is a small example of using the iteration protocol to implement a
sequence of random samples:

``` r
SampleSequence <- function(n) {
  i <- 0
  function(done = NULL) {
    if (i >= n) {
      return(done)
    }
    i <<- i + 1
    runif(1)
  }
}

for(sample in SampleSequence(2)) {
  print(sample)
}

# [1] 0.7677586
# [1] 0.355592
```

Best,
Tomasz
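
As a rough illustration of how the two parts described above fit together, the sketch below emulates the proposed behavior in current R. It is not the patch: `iterate()` is a hypothetical stand-in for the patched `for`, the `countdown` class is invented for the example, and using a freshly created environment as the sentinel is only an assumption about one way to obtain a unique value.

``` r
# Part 1: the generic with an identity default, as described in the proposal.
as.iterable <- function(x, ...) UseMethod("as.iterable")
as.iterable.default <- function(x, ...) x

# Hypothetical class whose method returns a closure instead of a vector.
as.iterable.countdown <- function(x, ...) {
  i <- x$from
  function(done = NULL) {
    if (i < 1) return(done)   # hand back the sentinel we were given
    value <- i
    i <<- i - 1
    value
  }
}

# Part 2: hypothetical stand-in for the patched `for`.
iterate <- function(x, body) {
  it <- as.iterable(x)
  if (!is.function(it)) {
    # Vectors and lists go through the ordinary `for`.
    for (elt in it) body(elt)
    return(invisible(NULL))
  }
  exhausted <- new.env()      # assumption: fresh object, unique to this loop
  repeat {
    elt <- it(exhausted)
    if (identical(elt, exhausted)) break
    body(elt)
  }
  invisible(NULL)
}

countdown <- structure(list(from = 3), class = "countdown")
seen <- c()
iterate(countdown, function(x) seen <<- c(seen, x))
seen
```

Objects without a method fall through to the identity default, so `iterate(1:3, print)` behaves like an ordinary loop in this emulation.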

1. I'm not sure I see the need for the syntax change. Couldn't this all
be done in a while or repeat loop? E.g. your example could keep the same
definition of SampleSequence, then

    iterator <- SampleSequence(2)
    repeat {
      sample <- iterator()
      if (is.null(sample)) break
      print(sample)
    }

Not as simple as yours, but I think a little clearer because it's more
concrete, less abstract.

2. It's not clear to me how the for() loop chooses a value to pass to the
iterator function. (Sorry, I couldn't figure it out from your patch.) Is
"exhausted" a unique value produced each time for() is called? Is it
guaranteed to be unique? What does a user see if they look at it?

Duncan Murdoch

On 2025-08-11 3:23 p.m., Tomasz Kalinowski wrote:
> 2. `for` gains the ability to accept a closure for the iterable
>    argument. A closure is called repeatedly for each loop iteration
>    until the closure returns an `exhausted` sentinel value, which it
>    received as an input argument.
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
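
To make the uniqueness question in point 2 concrete, here is a minimal sketch contrasting a `NULL` test with a sentinel created for one loop only. `ListIterator` is a hypothetical helper, and the fresh environment used as the sentinel is an assumption for illustration, not a description of what the patch passes.

``` r
# Hypothetical iterator over list elements; NULL is a legitimate element here.
ListIterator <- function(x) {
  i <- 0
  function(done = NULL) {
    if (i >= length(x)) return(done)
    i <<- i + 1
    x[[i]]
  }
}

items <- list("a", NULL, "b")

# Loop 1: test for NULL. The NULL element ends the loop too early.
it <- ListIterator(items)
n_null_test <- 0
repeat {
  value <- it()
  if (is.null(value)) break
  n_null_test <- n_null_test + 1
}

# Loop 2: pass a sentinel created just for this loop, compare by identity.
it <- ListIterator(items)
sentinel <- new.env()   # assumption: any freshly allocated object would do
n_sentinel <- 0
repeat {
  value <- it(sentinel)
  if (identical(value, sentinel)) break
  n_sentinel <- n_sentinel + 1
}

c(null_test = n_null_test, sentinel = n_sentinel)
```

The first loop visits one element and the second visits all three, because no element of `items` can be identical to an object that did not exist before the loop started.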

Hello,

A couple of comments:

- Regarding the closure + sentinel approach, also implemented in coro
  (https://github.com/r-lib/coro/blob/main/R/iterator.R), it's more
  robust for the sentinel to always be a temporary value. If you store
  the sentinel in a list or a namespace, it might inadvertently terminate
  iterators when iterating over that collection. That's why the coro
  sentinel is created with `coro::exhausted()` rather than exported from
  the namespace as a constant object. The sentinel can be equivalently
  created with `as.symbol(".__exhausted__.")`; the main thing for
  robustness is to avoid storing it and to always create it from scratch.

  The approach of passing the sentinel by argument (which I see in the
  example in your mail but not in the linked documentation of approach 3)
  also works if the iterator loop passes a unique sentinel. Having a
  default of `NULL` makes it likely to get unexpected exhaustion of
  iterators when a sentinel is not passed in, though.

- It's very useful to _close_ iterators for resource cleanup. It's the
  responsibility of an iterator loop (e.g. `for`, but it could be other
  custom tools invoking the iterator) to close them. See
  https://github.com/r-lib/coro/pull/58 for an interesting application of
  iterator closing, allowing robust support of `on.exit()` expressions in
  coro generators.

  To implement iterator closing with the closure approach, an iterator
  may optionally take a `close` argument. A `true` value is passed on
  exit, instructing the iterator to clean up resources.

Best,
Lionel
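
The closing idea above can be sketched in base R as follows. This is only an illustration under stated assumptions: `LineIterator` and `iterate()` are hypothetical names, the `close = TRUE` calling convention is one reading of the suggestion, and neither the patch nor coro is claimed to work exactly this way.

``` r
# Hypothetical iterator that owns a connection and honours `close = TRUE`.
LineIterator <- function(text) {
  con <- textConnection(text)
  is_open <- TRUE
  function(done, close = FALSE) {
    if (close) {
      # Release the resource at most once.
      if (is_open) {
        base::close(con)
        is_open <<- FALSE
      }
      return(invisible(done))
    }
    line <- readLines(con, n = 1)
    if (length(line) == 0) return(done)
    line
  }
}

# Hypothetical iterator loop: it is responsible for closing the iterator.
iterate <- function(it, body) {
  exhausted <- as.symbol(".__exhausted__.")  # created here, never stored
  on.exit(it(exhausted, close = TRUE))       # runs on normal exit and on error
  repeat {
    elt <- it(exhausted)
    if (identical(elt, exhausted)) break
    body(elt)
  }
  invisible(NULL)
}

it <- LineIterator(c("first", "second", "third"))
seen <- character()
result <- tryCatch(
  iterate(it, function(line) {
    if (line == "second") stop("stop early")
    seen <<- c(seen, line)
  }),
  error = function(e) "interrupted"
)

# The connection was closed even though the loop body failed.
environment(it)$is_open
```

Because the cleanup is registered with `on.exit()` in the loop driver, the iterator is closed whether the loop runs to exhaustion or is interrupted by an error.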

Great stuff, and I like the use of a sentinel as a terminator symbol.

One aspect of this I would like to explore is that of a lazy sequence as
a more fundamental language primitive. Generators in for loops are great,
but generators returned by lapply() and friends would enable lazy
functional transformations and efficient combination of processing steps.
At the lowest level I can see this being facilitated by an ALTREP
protocol with a similar API to what you propose.

One big pain point, of course, is parallel processing. A two-level design
splitting the iterator index and data generation (like C++ does) could be
a better fit if parallelization is desired.

Curious to hear your thoughts.

Best,
Taras
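
A lazy transformation of the kind described above can be approximated today with plain closures. The sketch below is only a user-level illustration: `lazy_map()` and `Counter()` are hypothetical helpers, and this does not use ALTREP or change what `lapply()` returns.

``` r
# Hypothetical adaptor: wraps an iterator closure and applies `f` on demand.
lazy_map <- function(it, f) {
  function(done = NULL) {
    value <- it(done)
    if (identical(value, done)) return(done)  # propagate exhaustion
    f(value)
  }
}

# Hypothetical source iterator yielding 1, 2, ..., n.
Counter <- function(n) {
  i <- 0
  function(done = NULL) {
    if (i >= n) return(done)
    i <<- i + 1
    i
  }
}

calls <- 0
squares <- lazy_map(Counter(5), function(x) {
  calls <<- calls + 1   # count how often the transformation actually runs
  x^2
})
pipeline <- lazy_map(squares, function(x) x + 1)

calls_before <- calls   # building the pipeline has not computed anything

sentinel <- new.env()   # assumption: fresh object used as the sentinel
first_two <- c(pipeline(sentinel), pipeline(sentinel))
first_two
```

Only the two requested elements are computed: `calls` is 0 after the pipeline is built and 2 after two values are pulled, even though the source could yield five.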