An optimistic version of collect::<Result<T, E>>?

I found that .map(f).collect::<Vec<T>>() is as fast as the loop version, but not .map(f).try_collect.

The code:

#![feature(iterator_try_collect)]
#![feature(test)]
extern crate test;
use test::Bencher;

#[inline(never)]
fn handle(v: i32) -> Option<i32> {
    Some(v)
}

fn f1(v: &Vec<i32>) -> Option<Vec<i32>> {
    v.into_iter().map(|x| handle(*x)).try_collect()
}

fn f2(v: &Vec<i32>) -> Option<Vec<i32>> {
    let mut result = Vec::with_capacity(v.len());
    for val in v {
        let res = handle(*val)?;
        result.push(res);
    }
    Some(result)
}

static BENCH_SIZE: usize = 100;

#[bench]
fn bench_f1(b: &mut Bencher) {
    let v = test::black_box(vec![1; BENCH_SIZE]);
    b.iter(|| {
        f1(&v);
    })
}

#[bench]
fn bench_f2(b: &mut Bencher) {
    let v = test::black_box(vec![1; BENCH_SIZE]);
    b.iter(|| {
        f2(&v);
    })
}

The result tested on my macOS:

test bench_f1 ... bench:         403 ns/iter (+/- 244)
test bench_f2 ... bench:         158 ns/iter (+/- 82)

After some investigations, I found that the implementation of collect<Result<>> is a pessimistic version, which will assume the residential is a likely path so the Vec will not use the TrustedLen specified version. In detail, the internal iterator GenericShunt can't implement TrustedLen, and the lower bound of its size_hint is always 0 (Error on the first element). As the result, the Vec can't initialize all items at once.

But we all know that Try::Output is the likely path in almost all cases, should we implement that as an optimistic version? Otherwise we have to expand the loop manually.

1 Like

Or prepare the Vec with the desired capacity and use extend?

4 Likes

Related issue: Collecting into a Result<Vec<_>> doesn't reserve the capacity in advance · Issue #48994 · rust-lang/rust · GitHub

2 Likes

Only the lower-bound of the size_hint is really ever used today: Is .size_hint().1 ever used?

Personally, I think the answer is that size_hint is the wrong API for reserveing -- there should be something like reserve_guess with weaker guarantees so that it can both overestimate and underestimate, depending on what the adapter expects to happen.

1 Like

reserve_guess is very hard for filter to implement, there are too many cases so that any guess is not good enough.

I’d prefer to implement two versions, collect_optimistic and collect_pessimistic, which will use size_hint.0 and size_hint.1, and leave collect as an opaque implementation (even leave for PGO?). When in the critical path, users can choose the version manually.

The collect can use a specialized implementation for Result<I, E> that forward to optimistic version, and keep the original behavior for other types.

We can also introduce a new unsafe trait TrustedUpperBound for optimization.

Perhaps as a stopgap a with_size_hint function could be added to Iterator which injects the appropriate hints. This would be useful for me in typically unguessable cases like filter, sometimes I know that few/most items will be filtered.

1 Like

My thoughts on with_size_hint: https://github.com/rust-lang/rust/issues/68995#issuecomment-588569824.

2 Likes

I don’t think with_size_hint is a good idea. Sometimes, the size is derived with some complicated logical, e.g. repeat, array_chunks. If we need to provide the size_hint value explicitly, this is a violation to DRY. Personally, I’d prefer let users to specify optimistic or pessimistic.