An iteration protocol
Thank you Lionel, Peter, and Duncan! Some responses inline below:
Couldn't this all be done in a while or repeat loop? ... Not as simple as yours, but I think a little clearer because it's more concrete, less abstract.
Indeed, that's the trade-off! Explicit and verbose vs. simple, concise, and abstracted away. There are certainly times when I prefer the former, but the latter is not even an option today. Particularly in a teaching context, I think the concept of iteration is more intuitive and faster to teach than the precise mechanics of iteration. The opportunity to make `for` usable with a broader set of object types is icing on the cake. (Some of these arguments are fleshed out further in the README linked in the first email.)
It's not clear to me how the for() loop chooses a value to pass to the iterator function.
In the draft patch, `for` creates a unique sentinel object, a bare
`OBJSXP`. The iterator closure is called with this sentinel as the
argument, and the closure must return exactly it to indicate
exhaustion.
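As a rough illustration of that protocol in plain R (not the C implementation in the patch), the loop can be emulated with a small helper. `for_each()` and `countdown()` are hypothetical names introduced only for this sketch, and a fresh environment stands in for the bare `OBJSXP` sentinel:

```r
# Hypothetical helper emulating what the patched `for` would do with a closure.
for_each <- function(it, f) {
  # Fresh object per loop; identical() compares environments by reference.
  sentinel <- new.env(parent = emptyenv())
  repeat {
    val <- it(sentinel)
    if (identical(val, sentinel)) break  # iterator handed the sentinel back
    f(val)
  }
  invisible(NULL)
}

# Hypothetical iterator yielding n, n - 1, ..., 1.
countdown <- function(n) {
  function(done = NULL) {
    if (n <= 0) return(done)
    n <<- n - 1
    n + 1
  }
}

seen <- c()
for_each(countdown(3), function(x) seen <<- c(seen, x))
seen
```

Because the sentinel is created inside the loop driver and never stored anywhere else, no value produced by the iterator can be mistaken for it.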
This approach neatly achieves a few design goals. It introduces no
persistent symbols, keeping the API surface small, and avoids
introducing the ugly edge case of a potential false-positive
exhaustion detection. It has less overhead than a signal. Compared to
a signal, it should also encourage a more local coding style, making
code easier to reason about. Treating errors as values is an idea
whose worth Rust has demonstrated to me, and this value-sentinel
approach is a close cousin of that.
The example `SampleSequence` iterator in the initial email had a
default sentinel value of `NULL`. This was to allow convenient manual
iteration with something like:
```r
it <- SampleSequence(9)
it(); it(); it(); ...
```
Or, if you prefer a more explicit approach:
```r
it <- SampleSequence(9)
repeat { val <- it() %||% break; ... }
```
Or:
```r
repeat { val <- it(break); ... }
```
Or:
```r
while (!is.null(val <- it())) { ... }
```
Or, for maximum robustness:
```r
done_sentinel <- new.env(parent = emptyenv())
while (!identical(done_sentinel, val <- it(done_sentinel))) { ... }
```
This enables a variety of usage patterns with different trade-offs
between convenience and robustness, with `for` able to take the most
robust approach, while allowing the iterator's default sentinel to
prioritize convenience.
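To make the trade-off concrete, here is a small sketch using a hypothetical `list_iterator()` (not part of the patch) over a list that contains a legitimate `NULL` element. The `NULL`-default idiom stops early, while a caller-supplied unique sentinel sees every element:

```r
# Hypothetical iterator over a list; returns `done` once exhausted.
list_iterator <- function(x) {
  i <- 0
  function(done = NULL) {
    if (i >= length(x)) return(done)
    i <<- i + 1
    x[[i]]
  }
}

items <- list(1, NULL, 3)

# Convenient: rely on the NULL default. The NULL element looks like exhaustion.
it <- list_iterator(items)
n_convenient <- 0
while (!is.null(val <- it())) n_convenient <- n_convenient + 1

# Robust: pass a fresh sentinel that cannot appear in the data.
it <- list_iterator(items)
done_sentinel <- new.env(parent = emptyenv())
n_robust <- 0
while (!identical(done_sentinel, val <- it(done_sentinel))) n_robust <- n_robust + 1

c(convenient = n_convenient, robust = n_robust)
```

The convenient loop counts one element and the robust loop counts all three.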
It's very useful to *close* iterators for resource cleanup.
This is interesting and, to be honest, not a use case we had considered.
Would using `reg.finalizer()` be sufficient for your use case? It
gives less control over timing than `on.exit()`, but can close
resources with something like:
```r
Stream <- function() {
  r <- open_resource()
  reg.finalizer(environment(), \(e) r$close())
  \(done) r$get_next() %||% done
}
```
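For comparison, deterministic cleanup along the lines of the `close` argument suggested in the quoted message below could look roughly like this. It is only a sketch: `make_stream()`, `consume()`, and the `state` environment standing in for a real resource are hypothetical, and the exact calling convention for `close` is an assumption, not something settled in this thread:

```r
# Hypothetical iterator with an optional `close` argument.
make_stream <- function(values) {
  state <- new.env()        # stand-in for a real resource handle
  state$open <- TRUE
  i <- 0
  it <- function(done = NULL, close = FALSE) {
    if (close) {
      state$open <- FALSE   # release the resource
      return(invisible(done))
    }
    if (i >= length(values)) return(done)
    i <<- i + 1
    values[[i]]
  }
  list(it = it, state = state)
}

# Hypothetical loop driver: closes the iterator on any exit, including errors.
consume <- function(it, f) {
  on.exit(it(close = TRUE), add = TRUE)
  done <- new.env(parent = emptyenv())
  repeat {
    val <- it(done)
    if (identical(val, done)) break
    f(val)
  }
  invisible(NULL)
}

s <- make_stream(c(10, 20, 30))
try(consume(s$it, function(x) if (x == 20) stop("boom")), silent = TRUE)
s$state$open
```

Here the resource is released as soon as the loop driver exits, even though the loop body failed partway through, whereas a finalizer would only run at some later garbage collection.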
On Tue, Aug 12, 2025 at 5:20 AM Lionel Henry <lionel at posit.co> wrote:
Clever! If going for non-local returns, probably best for ergonomics to pass in a closure (see e.g. `callCC()`). If only to avoid accidental jumps while debugging. But... do we need more lazy evaluation tricks in the language or fewer? It's probably more idiomatic to express non-local returns with condition signals like `stopIteration()`. There's something to be said for explicit and simple control flow though, via handling of returned values.
Note that it is trivial to create a unique sentinel value -- any newly created closure (i.e. function() NULL) will do, as it will only compare identical() with itself.
Until you try that in the global env right? Then the risk of collision slightly increases. Unless you make your closure more unique via `body()`, but then might as well use a conventional sentinel.
Best,
Lionel
On Tue, Aug 12, 2025 at 1:45 AM Peter Meilstrup <peter.meilstrup at gmail.com> wrote:
Passing the sentinel value as an argument to the iteration method is
the approach taken in my package `iterors` on CRAN. If the sentinel
value argument is evaluated lazily, this lets you pass calls to things
like 'stop', 'break' or 'return,' which will be called to signal end
of iteration. This makes for some nice compact and performant
iteration idioms:
    iter <- as.iteror(obj)
    total <- 0
    repeat {total <- total + nextOr(iter, break)}
Note that iteror is just a closure with one optional argument and a
class attribute, so you can skip using s3 nextOr method and call it
directly:
    nextElem <- as.iteror(obj)
    repeat {total <- total + nextElem(break)}
For backward compatibility with the iterators package, the default
sentinel value for iterors is `stop("StopIteration")`.
Note that it is trivial to create a unique sentinel value -- any newly
created closure (i.e. function() NULL) will do, as it will only
compare identical() with itself.
    sigil <- \() NULL
    # `next` is a reserved word in R, so the iterator is bound to `nextElem` here
    nextElem <- as.iteror(obj)
    while (!identical(item <- nextElem(sigil), sigil)) {
      doStuff(item)
    }
Peter Meilstrup
On Mon, Aug 11, 2025 at 5:56 PM Lionel Henry via R-devel
<r-devel at r-project.org> wrote:
Hello,
A couple of comments:
- Regarding the closure + sentinel approach, also implemented in coro (https://github.com/r-lib/coro/blob/main/R/iterator.R), it's more robust for the sentinel to always be a temporary value. If you store the sentinel in a list or a namespace, it might inadvertently close iterators when iterating over that collection. That's why the coro sentinel is created with `coro::exhausted()` rather than exported from the namespace as a constant object. The sentinel can be equivalently created with `as.symbol(".__exhausted__.")`, the main thing to ensure robustness is to avoid storing it and always create it from scratch.
  The approach of passing the sentinel by argument (which I see in the example in your mail but not in the linked documentation of approach 3) also works if the iterator loop passes a unique sentinel. Having a default of `NULL` makes it likely to get unexpected exhaustion of iterators when a sentinel is not passed in though.
- It's very useful to _close_ iterators for resource cleanup. It's the responsibility of an iterator loop (e.g. `for` but could be other custom tools invoking the iterator) to close them. See https://github.com/r-lib/coro/pull/58 for an interesting application of iterator closing, allowing robust support of `on.exit()` expressions in coro generators.
  To implement iterator closing with the closure approach, an iterator may optionally take a `close` argument. A `true` value is passed on exit, instructing the iterator to clean up resources.
Best,
Lionel
On Mon, Aug 11, 2025 at 3:24 PM Tomasz Kalinowski <kalinowskit at gmail.com> wrote:
Hi all,
A while back, Hadley and I explored what an iteration protocol for R might look like. We worked through motivations, design choices, and edge cases, which we documented here:
https://github.com/t-kalinowski/r-iterator-ideas
At the end of this process, I put together a patch to R (with tests) and would like to invite feedback from R Core and the broader community:
https://github.com/r-devel/r-svn/pull/130/files?diff=unified&w=1
In summary, the overall design is a minimal patch. It introduces no breaking changes and essentially no new overhead. There are two parts.
1. Add a new `as.iterable()` S3 generic, with a default identity method. This provides a user-extensible mechanism for selectively changing the iteration behavior for some object types passed to `for`. `as.iterable()` methods are expected to return anything that `for` can handle directly, namely, vectors or pairlists, or (new) a closure.
2. `for` gains the ability to accept a closure for the iterable argument. A closure is called repeatedly for each loop iteration until the closure returns an `exhausted` sentinel value, which it received as an input argument.
Here is a small example of using the iteration protocol to implement a sequence of random samples:
```r
SampleSequence <- function(n) {
  i <- 0
  function(done = NULL) {
    if (i >= n) {
      return(done)
    }
    i <<- i + 1
    runif(1)
  }
}

for(sample in SampleSequence(2)) {
  print(sample)
}

# [1] 0.7677586
# [1] 0.355592
```
Best,
Tomasz
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel