Refinement: Iterator adaptor for handling intermediate Results

iox · March 15, 2022, 4:39am

One of my biggest papercuts with Rust is that it's kind of awkward to handle intermediate Results in an iterator. Usually, the choices are to either create intermediate collections with collect(), or to handle the Results at every subsequent step of the iterator chain. A couple years ago, I posted a proposal here, but it needed some refining. I came back to this problem over the past weekend, and had a flash of inspiration.

The idea is to add this method to Iterator:

/// Continue the iterator stream on `Ok` values, stopping at the first `Err` value.
/// The supplied function `f` takes an iterator over `Self::Item`.
/// If all values of `Self` are `Ok`, `continue_with` returns the result of this function.
/// If any `Err`s are encountered, it returns the first error and does not evaluate further.
fn continue_with<F, U>(self, f: F) -> Result<U, Self::Error>
    where F: FnOnce(ContinueWith<Self, Self::Error>) -> U;

A proof-of-concept is on the playground here, along with an example of using this to handle several intermediate Results.
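The playground code is not reproduced in this thread, so the following is only a minimal sketch of how such a method might be built on stable Rust, not the author's actual proof-of-concept. It assumes an extension trait (called `ContinueWithExt` here, a made-up name) in place of a real method on `Iterator`, and it takes the error type from `Item = Result<T, E>` rather than from a `Self::Error` associated type. The helper `sum_numbers` is likewise hypothetical.

```rust
use std::num::ParseIntError;

/// Hypothetical adaptor handed to the closure: yields the `Ok` payloads and
/// parks the first `Err` in `error`, after which it yields nothing more.
pub struct ContinueWith<'a, I, E> {
    iter: I,
    error: &'a mut Option<E>,
}

impl<I, T, E> Iterator for ContinueWith<'_, I, E>
where
    I: Iterator<Item = Result<T, E>>,
{
    type Item = T;

    fn next(&mut self) -> Option<T> {
        if self.error.is_some() {
            return None; // an error was already recorded: stay exhausted
        }
        match self.iter.next()? {
            Ok(value) => Some(value),
            Err(e) => {
                *self.error = Some(e);
                None
            }
        }
    }
}

/// Hypothetical extension trait standing in for a method on `Iterator`.
pub trait ContinueWithExt<T, E>: Iterator<Item = Result<T, E>> + Sized {
    fn continue_with<F, U>(self, f: F) -> Result<U, E>
    where
        F: FnOnce(ContinueWith<'_, Self, E>) -> U,
    {
        let mut error = None;
        let output = f(ContinueWith { iter: self, error: &mut error });
        // The closure's output only counts if no error was recorded.
        match error {
            Some(e) => Err(e),
            None => Ok(output),
        }
    }
}

impl<I, T, E> ContinueWithExt<T, E> for I where I: Iterator<Item = Result<T, E>> {}

/// Illustrative use: sum the parsed values, or return the first parse error.
fn sum_numbers(inputs: &[&str]) -> Result<i32, ParseIntError> {
    inputs
        .iter()
        .map(|s| s.parse::<i32>())
        .continue_with(|oks| oks.sum())
}
```

The closure's return type `U` cannot borrow from the adaptor, which is what lets the recorded error be inspected once the closure has returned.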

What do you think? Off the top of my head, I see a couple of things that could be improved, if anyone has suggestions:

I'm not sure on the name.
It would be nice not to expose the ContinueWith struct, but I'm not sure how to do so.

djc · March 15, 2022, 4:43am

Are you aware of the fact that Result implements the FromIterator trait (which is used by collect())? Why doesn't that cover what you're trying to do here?
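For reference, a small sketch of the `collect()` behavior being described; the function name `parse_all` is made up for illustration.

```rust
use std::num::ParseIntError;

/// Parses every string, stopping at the first one that fails.
/// `Result<Vec<i32>, _>` implements `FromIterator<Result<i32, _>>`,
/// so `collect()` short-circuits on the first `Err`.
fn parse_all(inputs: &[&str]) -> Result<Vec<i32>, ParseIntError> {
    inputs.iter().map(|s| s.parse::<i32>()).collect()
}
```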

iox · March 15, 2022, 5:19am

Sometimes you want to treat the Ok values from the iterator as an iterator unto themselves, and not just as individual values to be mapped over -- that's where this comes in handy.
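To illustrate the kind of case meant here, this is a sketch of the usual workaround when an ordinary iterator consumer (`max` in this example) should run over the `Ok` values: the values are first collected into an intermediate `Vec`, then iterated again. The function name `largest` is an assumption for illustration.

```rust
use std::num::ParseIntError;

fn largest(inputs: &[&str]) -> Result<Option<i32>, ParseIntError> {
    // Step 1: materialize every Ok value, bailing out on the first Err.
    let values: Vec<i32> = inputs
        .iter()
        .map(|s| s.parse::<i32>())
        .collect::<Result<_, _>>()?;
    // Step 2: only now can a plain iterator consumer run over the values.
    Ok(values.into_iter().max())
}
```

The intermediate allocation in step 1 is the cost the proposed adaptor is trying to avoid.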

steffahn · March 15, 2022, 5:45am

Glancing at this, it looks a lot like process_results in itertools, right?

A process_results function also already exists internally in std, where it is used to implement the above-mentioned ability to collect() an iterator of Results.

scottmcm · March 15, 2022, 9:14am

Note that it's been renamed in master, so if you want to find it there, look for try_process:

github.com: rust-lang/rust/blob/0c292c9667f1b202a9150d58bdd2e89e3e803996/library/core/src/iter/adapters/mod.rs#L140-L156

/// Process the given iterator as if it yielded the item's `Try::Output`
/// type instead. Any `Try::Residual`s encountered will stop the inner iterator
/// and be propagated back to the overall result.
pub(crate) fn try_process<I, T, R, F, U>(iter: I, mut f: F) -> ChangeOutputType<I::Item, U>
where
    I: Iterator<Item: Try<Output = T, Residual = R>>,
    for<'a> F: FnMut(GenericShunt<'a, I, R>) -> U,
    R: Residual<U>,
{
    let mut residual = None;
    let shunt = GenericShunt { iter, residual: &mut residual };
    let value = f(shunt);
    match residual {
        Some(r) => FromResidual::from_residual(r),
        None => Try::from_output(value),
    }
}
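Since `try_process` is `pub(crate)`, it cannot be called from user code. The sketch below only shows stable standard-library entry points with the same stop-at-first-failure behavior, for both `Result` and `Option` items. Whether each of them is routed through `try_process` internally is an assumption here, and the function names `total` and `all_present` are made up.

```rust
use std::num::ParseIntError;

/// `Result<i32, E>` implements `Sum<Result<i32, E>>`: the first `Err` wins.
fn total(inputs: &[&str]) -> Result<i32, ParseIntError> {
    inputs.iter().map(|s| s.parse::<i32>()).sum()
}

/// `Option<Vec<u8>>` implements `FromIterator<Option<u8>>`: any `None` wins.
fn all_present(inputs: &[Option<u8>]) -> Option<Vec<u8>> {
    inputs.iter().copied().collect()
}
```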

I also put a related comment in the RFC "add `try_all` and `try_any` to `Iterator`" by FlixCoder (rust-lang/rfcs pull request #3233).

chrefr · March 15, 2022, 11:05am

The iterator adapters usually deliberately expose their structs.

ckaran · March 15, 2022, 12:47pm

I'm with @chrefr on this one. Is there a reason to not expose the ContinueWith struct?

iox · March 16, 2022, 2:32am

Ah, I was not aware of process_results (or try_process)! This is sort of an inline version of that, then. If there's not much interest in it for Iterator, I can see if Itertools would be interested.

That makes sense -- my thinking was that iterator adaptors usually expose their structs in the return type, whereas this type was in the argument, but I suppose it's the same either way from a user perspective.

steffahn · March 16, 2022, 5:08am

For clarification, are you referring to the fact that it's a method rather than a freestanding function, or is there any other difference?

iox · March 16, 2022, 5:32am

Yes, exactly that. My humble opinion is that having it as a method leads to a nicer flow, especially when nesting several calls to it.

ComputerDruid · March 16, 2022, 4:35pm

Can you post an example to show what you mean?

iox · March 17, 2022, 2:19am

Sure. From the playground link in my original post:

use anyhow::Context; // provides with_context on Result
use std::fs::File;
use std::io::{BufRead, BufReader};
// continue_with comes from the extension trait in the playground proof-of-concept.

fn main() -> anyhow::Result<()> {
    let files = vec!["foo.txt", "bar.txt", "baz.txt"];
    let sum: i64 = files
        .into_iter()
        .map(File::open)
        .continue_with(|iter| {
            iter.map(BufReader::new)
                .flat_map(BufReader::lines)
                .continue_with(|iter| {
                    iter.map(|line| line.parse::<i64>())
                        .continue_with(|iter| iter.sum())
                        .with_context(|| "when parsing as integer")
                })
                .with_context(|| "when parsing line")
                .flatten()
        })
        .with_context(|| "when opening file")
        .flatten()?;
    println!("sum was: {}", sum);
    Ok(())
}

With itertools::process_results(), this looks like:

use anyhow::Context; // provides with_context on Result
use itertools::process_results;
use std::fs::File;
use std::io::{BufRead, BufReader};

fn main() -> anyhow::Result<()> {
    let files = vec!["foo.txt", "bar.txt", "baz.txt"];
    let sum: i64 = process_results(files.into_iter().map(File::open), |iter| {
        process_results(
            iter.map(BufReader::new).flat_map(BufReader::lines),
            |iter| {
                process_results(iter.map(|line| line.parse::<i64>()), |iter| iter.sum())
                    .with_context(|| "when parsing as integer")
            },
        )
        .with_context(|| "when parsing line")
        .flatten()
    })
    .with_context(|| "when opening file")
    .flatten()?;
    println!("sum was: {}", sum);
    Ok(())
}

I personally think the former is easier to read and write, because when you get to the part of the iterator pipeline that produces a Result, you don't have to go back to the beginning of the iterator pipeline to insert a call to process_results() and keep track of the arguments to that.

scottmcm · March 17, 2022, 5:13am

One thing I'll add here is that this is, to some extent, fundamental to laziness.

The magic thing that would make everything easier would be a function for impl Iterator<Item = Result<T, E>> -> Result<impl Iterator<Item = T>, E>. But that cannot be implemented lazily, because it would need to look at all the items to find out whether it needs to return Ok or Err.
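A small sketch of the point about laziness, using only stable std: a lazy shunt (built here from `scan`, as one possible stand-in for the adaptors discussed above) has already handed earlier `Ok` values to the consumer by the time it reaches an `Err`. The function name `sum_and_trace` is made up for illustration.

```rust
use std::num::ParseIntError;

/// Sums the parsed values lazily and records which ones the consumer saw.
fn sum_and_trace(inputs: &[&str]) -> (Vec<i32>, Result<i32, ParseIntError>) {
    let mut seen = Vec::new();
    let mut first_err: Option<ParseIntError> = None;
    let total: i32 = inputs
        .iter()
        .map(|s| s.parse::<i32>())
        // Shunt: pass Ok values through, park the first Err and stop.
        .scan(&mut first_err, |slot, item| match item {
            Ok(v) => Some(v),
            Err(e) => {
                **slot = Some(e);
                None
            }
        })
        // Side effect standing in for "the consumer has processed this item".
        .inspect(|v| seen.push(*v))
        .sum();
    // Only after the whole chain has run is it known whether to return Ok or Err.
    let result = match first_err {
        Some(e) => Err(e),
        None => Ok(total),
    };
    (seen, result)
}
```

With an input whose third element fails to parse, `seen` already holds the first two values when the error surfaces, which is why the `Ok`/`Err` decision has to wrap the consumer rather than precede it.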

