[Pre-RFC] iter::Fused trait


#1

Design

Add a marker trait Fused to std::iter and implement it on Fuse<I> and applicable iterators and adapters. By implementing Fused, an iterator promises to behave as if Iterator::fuse() had been called on it (i.e. return None forever after returning None once).

Motivation

Currently, there’s no way to say “I want an iterator that returns None forever after returning None once”. This means that, to be safe, API’s need to manually call fuse() on their iterators if they intend to rely on this behavior. However, many iterators already behave this way so having a way to mark them would be really nice.

Drawbacks

  1. Yet another special iterator trait.
  2. This won’t be fully useful until we get some form of specialization.

/cc @bluss


#2

Can you please add some background or what this actually does? I am fairly confused by this proposal because there is no explanation, only motivation…

Currently, there’s no way to say “I want an iterator that returns None forever after returning None once”.

Iterators already do that? playpen

What is ‘fusion’? why do we want to fuse iterators? Your motivation is not very motivating.


#3

Not always. From the docs:

The Iterator protocol does not define behavior after None is returned.

The point of this proposal is to introduce a trait that defines this behavior.


#4

Alright, that sounds reasonable.

Why Fused though? It sounds as if you are taking another Iterator and combining the two.


#5

The adapter is called Fuse, the method on Iterator is called fuse, and bluss calls it fused.


#6

Libstd had some adaptors with fusing bugs (Chain) and itertools too (several).

Since it’s a huge task I used quickcheck to test some fusing properties by using a “misbehaving” iterator as input and checking the length of the output when using different adaptors, that successfully found some bugs. We should expand this to test all the libstd adaptors too.

I think the effort to fix these bugs is needed, and maybe improve upon documentation too.

I’m not sure the marker trait is useful before specialization, as you note, what would be even more useful would be a different one, but I don’t think we can realize it either: IntoFused. It returns Self if you are already a fused iterator, otherwise it wraps it to produce Fuse<Self>.


#7
use std::iter::Fuse;

trait Fused: Iterator {}

trait IntoFused: Iterator {
    type Fuse: Fused<Item=Self::Item>;
    fn fuse(self) -> Self::Fuse;
}

impl<I: Iterator> Fused for Fuse<I> {}

impl<I> IntoFused for I where I: Fused {
    type Fuse = Self;
    fn fuse(self) -> Self {
        self
    }
}

// Need this.
impl<I> IntoFused for I where I: !Fused {
    type Fuse = Fuse<Self>;
    fn fuse(self) -> Fuse<Self> {
        Iterator::fuse(self)
    }
}

#8

I do like documenting which iterators are Fused or not, but I’m not sure a trait is necessary. Have you measured a perf hit using fuse on any already-Fused iterators (I wouldn’t be surprised, I think it would be quite clever analysis to strip the flag/check)?


#9

It makes a difference. Note: Range already exhibits “fused” behavior (always returns None after returning None once).

test fuse      ... bench:         199 ns/iter (+/- 1)
test fuse_fuse ... bench:         248 ns/iter (+/- 7)
test range     ... bench:          47 ns/iter (+/- 4)
#![feature(test)]

extern crate test;

#[bench]
fn fuse(b: &mut test::Bencher) {
    b.iter(|| {
        for i in (0..100).fuse() {
            test::black_box(i);
        }
    })
}

#[bench]
fn fuse_fuse(b: &mut test::Bencher) {
    b.iter(|| {
        for i in (0..100).fuse().fuse() {
            test::black_box(i);
        }
    });
}

#[bench]
fn range(b: &mut test::Bencher) {
    b.iter(|| {
        for i in (0..100) {
            test::black_box(i);
        }
    })
}

#10

This overhead is ridiculous :frowning: .


#11

That benchmark result and another test I did with Itertools::step (which also needs to use Fuse), reveals some very big gains that could be made if we could conditionally skip the Fuse adaptor for certain well behaved iterators.

For an adaptor like step the difference can be 50% reduction of the runtime if fusing isn’t needed.


#12

I’ve implemented a version of this using specialization that makes Fuse<I> a no-op when I implements FusedIterator. This appears to fix the performance issues noted above. That is:

use std::iter::Fuse;

trait FusedIterator: Iterator {}

impl<I: Iterator> FusedIterator for Fuse<I> {}
impl<A> FusedIterator for Range<A> {}

// etc...

// Specialize Fuse...
impl<I> Iterator for Fuse<I> where I: FusedIterator {
    type Item = I::Item;
    fn next(&mut self) -> Self::Item {
        self.0.next() // pass through.
    }
}

So far, I’ve only implemented FusedIterator for core (see https://github.com/Stebalien/rust/commits/fused). @bluss, are you still interested in this?


#13

That looks like it works very well. I think .fuse() hasn’t been used that much, but with an improvement like this it can maybe be used more.

It is a very minor extra iterator trait, but it’s really low complexity in a number of ways. Not implementing it degrades gracefully, implementing it incorrectly only leads to fusing bugs, nothing else.