Chain
currently has a ternary flag to track which of its two sub-iterators it's exhausted, and thus should not call again. In practice, maintaining this correctly is very subtle in anything that can early-exit.
Proposal: Change Chain
from
pub struct Chain<A, B> { a: A, b: B, state: ChainState }
to
pub struct Chain<A, B> { a: Fuse<A>, b: Fuse<B> }
That would have a bunch of advantages:
- All the methods are simpler. For example, count is just
a.count() + b.count()
, nomatch
. Similarly, even things likefold
are easy as it can just fold both of them. It means no state maintenance in&mut self
methods likenext
ortry_fold
. And less code is generally good for compile times. -
For the common case of iterators that are also fused, the type is smaller, as there's no extra flag & padding.Thanks @RustyYato for pointing out that this is incorrect given the current state of specialization. - For already-fused iterators, it's probably cheaper to just look at them than to match on the state flag then look at them.
- Chain would itself always be fused (rather than only when both sub-iterators are fused)
- Two binary flag checks are probably not materially different from the one ternary flag check, perf-wise, if neither underlying iterator is fused.
- Because of the multiple match arms, many of the methods have multiple callsites for the same side. (For example,
Chain::next
has two callsides fora.next()
and two more forb.next()
.) Removing the match would eliminate that, potentially reducing code side in the common case of inlined methods (likenext
) typically is.
Downsides:
-
If neither iterator is fused,the struct is slightly larger as two state flags (+ padding) are bigger than one (+ padding). - Tree-nested chains have more work to do in
.next()
, as they need to look at all the sub-iterators to know to skip that side, rather than just checking the state flag and skipping them.
Mitigations:- Long chains are somewhat of an anti-pattern regardless, and are more often linear than tree'd
- One could wrap a side of a tree with an explicit fused flag to allow skipping that side, if one really wanted it
- Anything using internal iteration would be unaffected by this, as doing so automatically splits the loops. (And
Chain
is already one of the go-to examples where internal iterator is worthwhile.)
- If we eventually made
Chain
variadic, we might end up undoing this change as a numeric range for which iterators have been exhausted would then be valuable again. (But that could still easily be fused -- because the numeric range would empty -- so doesn't impact anything concretely today. And it's not like variadics are coming soon enough that this change would get undone a few weeks after being made.)
Thoughts? Worth trying? I'd make the initial PR not include the Fused bound expansion so it could be reverted at any time if problems do happen.
As a simple demo, https://rust.godbolt.org/z/F4FQDk shows a basic for
loop that shrinks from 48 lines of assembly to 23 by removing the state maintenance.