Currently we have an ExactSize trait for an Iterator to claim that its size_hint is precise. Unfortunately, all code is forced to assume that ExactSize is exactly that: an unsubstantiated claim. The most we use it for is to allow certain iterator adaptors to implement DoubleEndedIterator iff they wrap an ExactSize iterator. Nice, but meh.
There is however a very real need for a way to unsafely assert that your iterator has a truly exact size. When we deprecated several Vec constructors in favour of iterators + collect, we introduced what I measured to be a 33% perf regression (3 vs 4 units of time when constructing a Vec of 100, 000 elements), because the Vec::from_iter code has to count and constantly branch on the number of elements inserted in case more than the iterator claimed to have is inserted (this is implemented by just calling push
over and over). If we trusted the iterator, we could instead just unconditionally offset the write-head and set the length at the end.
Here are some candidate solutions to the problem:
Make ExactSize an unsafe
trait
I honestly don’t think there’s any iterator that seriously “isn’t sure” about its precision, and you’ll get busted DoubleEnded impls if you’re wrong anyway. This seems relatively harmless to just do. This design has the benefit of having no redundant notion of exactness. Unfortunately we have basically no facilities for specialization, so implementing ExactSize would still be useless today. No reasonable code would only accept ExactSize iterators just for a performance optimization.
However assuming that we do get some specialization facilities, this would basically be flipping the bird to trait objects, because e.g. a Box<Iterator>
would have its ExactSize-ness erased.
This would also be a breaking change for the current ExactSize trait, which I believe has been stabilized.
Add TrustedExactSize on top of ExactSize
All the same strengths and weakness of the previous design, but with two changes:
- it’s back-compat to add whenever (yay!)
- it further complicates the iterator trait story, while adding basically duplicate notions of ExactSize (boo!)
Add a default-impl’d fn to Iterator
e.g. unsafe exact_size(&self) -> bool { false }
For 99.9% of iterators, this will produce a statically known value, and therefore incur no runtime cost to branch on assuming sufficient optimization (Rust’s favourite kind of optimization).
Over trait-based designs, this design has the following advantages:
- This design is compatible with Trait objects, it just requires an actual runtime branch to occur at the start of the procedure; a fairly trivial cost for
collect
, especially when performing dynamic dispatch on the iterator already. - This design can be used right now, and would be highly ergonomic to use:
if iter.exact_size() { fast } else { slow }
However it has the following drawbacks:
- This leaves us with a duplicate ExactSize trait, as it is still useful for DoubleEnded impls
- The
unsafe
is “wrong” because it should be considered safe to use but not to override. We lack the tooling to express “unsafe overrides”.
That previous thing but with proper unsafe semantics
That’d be p. cool