Refinement: Iterator adaptor for handling intermediate Results

One of my biggest papercuts with Rust is that it's kind of awkward to handle intermediate Results in an iterator. Usually, the choices are to either create intermediate collections with collect(), or to handle the Results at every subsequent step of the iterator chain. A couple years ago, I posted a proposal here, but it needed some refining. I came back to this problem over the past weekend, and had a flash of inspiration.

The idea is to add this method to Iterator:

/// Continue the iterator stream on `Ok` values, stopping at the first `Err` value.
/// The supplied function `f` takes an iterator over the `Ok` values of `Self`.
/// If all values of `Self` are `Ok`, `continue_with` returns the result of this function, wrapped in `Ok`.
/// If any `Err` is encountered, it returns the first error and does not evaluate further.
fn continue_with<F, T, E, U>(self, f: F) -> Result<U, E>
where
    Self: Sized + Iterator<Item = Result<T, E>>,
    F: FnOnce(ContinueWith<Self, E>) -> U;

A proof-of-concept is on the playground here, along with an example of using this to handle several intermediate Results.
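For readers without the playground open, here is a minimal sketch of how such an adaptor could be written as an extension trait. It is not necessarily what the playground code does: the trait name `ContinueWithExt`, the lifetime parameter and field names on `ContinueWith`, and the helper `sum_all` are all assumptions made for illustration.

```rust
use std::num::ParseIntError;

/// Iterator handed to the closure. It yields the `Ok` payloads and records
/// the first `Err` in a slot owned by `continue_with`.
/// (Layout is an assumption for this sketch.)
pub struct ContinueWith<'a, I, E> {
    iter: I,
    error: &'a mut Option<E>,
}

impl<'a, I, T, E> Iterator for ContinueWith<'a, I, E>
where
    I: Iterator<Item = Result<T, E>>,
{
    type Item = T;

    fn next(&mut self) -> Option<T> {
        // Once an error has been recorded, the stream stays finished.
        if self.error.is_some() {
            return None;
        }
        match self.iter.next() {
            Some(Ok(value)) => Some(value),
            Some(Err(e)) => {
                *self.error = Some(e);
                None
            }
            None => None,
        }
    }
}

/// Hypothetical extension trait providing `continue_with`.
pub trait ContinueWithExt<T, E>: Iterator<Item = Result<T, E>> + Sized {
    fn continue_with<F, U>(self, f: F) -> Result<U, E>
    where
        F: FnOnce(ContinueWith<'_, Self, E>) -> U,
    {
        let mut error = None;
        let output = f(ContinueWith { iter: self, error: &mut error });
        // The closure has finished; report the first error if one was seen.
        match error {
            Some(e) => Err(e),
            None => Ok(output),
        }
    }
}

impl<I, T, E> ContinueWithExt<T, E> for I where I: Iterator<Item = Result<T, E>> {}

/// Illustrative helper: sums the inputs, or returns the first parse error.
pub fn sum_all(inputs: &[&str]) -> Result<i32, ParseIntError> {
    inputs
        .iter()
        .map(|s| s.parse::<i32>())
        .continue_with(|iter| iter.sum::<i32>())
}
```

Note that the closure may already have consumed some `Ok` values by the time an `Err` shows up; its output is simply discarded in that case.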

What do you think? Off the top of my head, I have a couple thoughts for improvements, if anyone has any suggestions:

  • I'm not sure on the name.
  • It would be nice not to expose the ContinueWith struct, but I'm not sure how to do so.

Are you aware of the fact that Result implements the FromIterator trait (which is used by collect())? Why doesn't that cover what you're trying to do here?
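For reference, a minimal sketch of that pattern, using only the standard library (`parse_all` is a made-up helper name):

```rust
use std::num::ParseIntError;

/// Parses every string, stopping at the first failure.
/// `collect()` builds the `Vec` only if every item is `Ok`;
/// otherwise it returns the first `Err` it meets.
pub fn parse_all(inputs: &[&str]) -> Result<Vec<i32>, ParseIntError> {
    inputs.iter().map(|s| s.parse::<i32>()).collect()
}
```

This yields the `Ok` values as a finished collection, so any further iterator steps have to start again from that collection.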

Sometimes you want to treat the Ok values from the iterator as an iterator unto themselves, and not just as individual values to be mapped over -- that's where this comes in handy.

Glancing at this, it looks a lot like process_results in itertools, right?

A process_results function also already exists internally in std used for implementing the abovementioned capability to collect() an iterator of Results.

Note that it's been renamed in master, so if you want to find it there,

I also put a related comment in RFC: add `try_all` and `try_any` to `Iterator` by FlixCoder · Pull Request #3233 · rust-lang/rfcs · GitHub

The iterator adapters usually deliberately expose their structs.

I'm with @chrefr on this one. Is there a reason to not expose the ContinueWith struct?

Ah, I was not aware of process_results (or try_process)! This is sort of an inline version of that, then. If there's not much interest in it for Iterator, I can see if Itertools would be interested.

That makes sense -- my thinking was that iterator adaptors usually expose their structs in the return type, whereas this type was in the argument, but I suppose it's the same either way from a user perspective.

For clarification, are you referring to the fact that it's a method rather than a freestanding function, or is there any other difference?

Yes, exactly that. My humble opinion is that having it as a method leads to a nicer flow, especially when nesting several calls to it.

Can you post an example to show what you mean?

Sure. From the playground link in my original post:

use std::fs::File;
use std::io::{BufRead, BufReader};

use anyhow::Context;
// `continue_with` comes from the extension trait defined in the playground.

fn main() -> anyhow::Result<()> {
    let files = vec!["foo.txt", "bar.txt", "baz.txt"];
    let sum: i64 = files
        .into_iter()
        .map(File::open)
        .continue_with(|iter| {
            iter.map(BufReader::new)
                .flat_map(BufReader::lines)
                .continue_with(|iter| {
                    iter.map(|line| line.parse::<i64>())
                        .continue_with(|iter| iter.sum())
                        .with_context(|| "when parsing as integer")
                })
                .with_context(|| "when parsing line")
                .flatten()
        })
        .with_context(|| "when opening file")
        .flatten()?;
    println!("sum was: {}", sum);
    Ok(())
}

With itertools::process_results(), this looks like:

use std::fs::File;
use std::io::{BufRead, BufReader};

use anyhow::Context;
use itertools::process_results;

fn main() -> anyhow::Result<()> {
    let files = vec!["foo.txt", "bar.txt", "baz.txt"];
    let sum: i64 = process_results(files.into_iter().map(File::open), |iter| {
        process_results(
            iter.map(BufReader::new).flat_map(BufReader::lines),
            |iter| {
                process_results(iter.map(|line| line.parse::<i64>()), |iter| iter.sum())
                    .with_context(|| "when parsing as integer")
            },
        )
        .with_context(|| "when parsing line")
        .flatten()
    })
    .with_context(|| "when opening file")
    .flatten()?;
    println!("sum was: {}", sum);
    Ok(())
}

I personally think the former is easier to read and write, because when you get to the part of the iterator pipeline that produces a Result, you don't have to go back to the beginning of the iterator pipeline to insert a call to process_results() and keep track of the arguments to that.

One thing I'll add here is that this is, to some extent, fundamental to laziness.

The magic thing that would make everything easier would be a function for impl Iterator<Item = Result<T, E>> -> Result<impl Iterator<Item = T>, E>. But that cannot be implemented lazily, because it would need to look at all the items to find out whether it needs to return Ok or Err.
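To make that concrete, here is a sketch of the only straightforward way to write that function, under the assumption that buffering is acceptable (`all_ok` is a made-up name). It has to drain the whole input into a `Vec` before it can decide between `Ok` and `Err`, so nothing downstream runs lazily:

```rust
/// Eager stand-in for the "magic" function: every item is pulled and
/// buffered before we know whether to return `Ok` or `Err`.
pub fn all_ok<I, T, E>(iter: I) -> Result<impl Iterator<Item = T>, E>
where
    I: Iterator<Item = Result<T, E>>,
{
    // `collect` stops at the first `Err`, but on success it has already
    // consumed the entire input.
    let buffered: Vec<T> = iter.collect::<Result<Vec<T>, E>>()?;
    Ok(buffered.into_iter())
}
```

The closure-based adaptors get around this by running the downstream pipeline first and only reporting the error once the closure has returned.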
