Infinite iterators in for

HKalbasi · May 28, 2022, 5:23am

The idea is adding an unsafe marker trait for infinite iterators (or using TrustedLen if it works) and desugaring for loops for infinite iterators in this way:

for x in some_infinite_iterator {
    println!("{x}");
}
// desugars to
loop {
    let x = match some_infinite_iterator.next() {
        Some(val) => val,
        None => unsafe { unreachable_unchecked() },
                    // ^^^^^^^^^^^^^^ this is break for normal iterators
    };
    println!("{x}");
};

It will enable omitting unreachable!() or dummy works after such for:

for i in 1.. {
    if some_condition {
        return some_value;
    }
}
// no unreachable!()
// no Ok(dummy)
// no Err(not_found)

And break with value, similar to loops:

let first_good_even_number = for i in (0..).map(|x| 2 * x) {
    if is_good(i) {
        break i;
    }
};

As a side effect, it will cause better warnings. For example a beginner might think this code will print 0 to 255 and then finished:

for x in 0u8.. {
    println!("{x}");
}
println!("finished");

But in debug builds it will panic and in release builds it will start again from zero. With this feature, compiler can warn that println!("finished"); is unreachable, but currently it might be a legitimate usage like returning a dummy value.

scottmcm · May 28, 2022, 5:31am

It probably doesn't need to be unsafe, really.

It could be something like

trait InfiniteIterator : Iterator {
    fn next_infinite(&mut self) -> Self::Item {
        self.next().expect("should have a value because it's an infinite iterator")
    }
}

Though that would need the desugaring to know it's infinite, and emit a loop instead of the usual while let.

HKalbasi · May 28, 2022, 7:47am

That trait has downside of possible conflicting next and next_infinite implementation, maybe this one is better?

trait InfiniteIterator {
    type Item;
    fn next_infinite(&mut self) -> Self::Item;
}

impl<I: InfiniteIterator> Iterator for I {
    fn next(&mut self) -> Option<Self::Item> {
        Some(self.next_infinite())
    }
}

Anyway, all of those InfiniteIterator traits would work for the propose of diverging for loops.

scottmcm · May 28, 2022, 8:07am

Two problems there:

Conflicting method implementations are always possible, because even in just Iterator you can override nth/fold/etc inconsistently. So that's fine -- and, really, you wouldn't need to override next_infinite anyway because your next implementation is always returning Some(...), so LLVM will almost certainly just optimize out the expect anyway.
That blanket impl can't really be added, because it overlaps with all the other third-party implementations that already exist.

steffahn · May 28, 2022, 8:41am

Edit: Check @scottmcm's answer below for why this is actually different from specialization.

Changing the desugaring of for loops based on whether or not a trait is implemented is a form of specialization, and comes with all its (potential) soundness issues.

With a marker trait, the soundness issues could be debated away, as in the "how should it be infinite or not just depending on lifetimes? If it implements the marker trait for some lifetimes, it ought to work for all lifetimes"-style argument that currently justifies effectively copying values based on specialization even if the type might not implement Copy for that exact combination of lifetimes.
With a "safe" solution, as suggested by @scottmcm, you get into "full" specialization, as a method of the specialized-on trait is called, so this is unsound using the current mechanisms. One would need to solve specialization soundness in general first, before such a feature could become feasible (e.g. solve it by restricting the implementors of the InfiniteIterator trait somehow to avoid implementations that result in soundness issues).

scottmcm · May 28, 2022, 8:57am

I'm not certain about that in this particular case, because it doesn't have the parametricity problems that full specialization has. It wouldn't do something different if the concrete type passed to the generic happened to be an InfiniteIterator, only if at type-checking time there was an InfiniteIterator bound on the iterator type.

At that point it's more like method resolution -- if you know more, you might call something different.

steffahn · May 28, 2022, 8:59am

Oh, nevermind, you're right.

(These are the kinds of details why I usually try to come up with code examples/exploitations for soundness issues; I didn't in this case, but I would have probably noticed this oversight if I did.)

steffahn · May 28, 2022, 9:11am

Note that for method resolution, you can however get a match of the method, and a type error later. This doesn't necessarily seem acceptable for the for-loop desugaring of a type that does implement Iterator.

(I'm ignoring the .into_iter() step by the way in all of this discussion.)

Code example:

struct Foo<S, T>(S, T);

trait X {
    fn f(&mut self) {}
}
trait Y {
    fn f(&self) {}
}

impl<S, T> X for Foo<S, T> {}
impl<T> Y for Foo<T, T> {}

// you can tell which one of the method was called by
// the mutability marker. (If wrong, it would have errored
// about missing `mut`, or warned about unnecessary `mut`.)

fn works1() {
    let mut x = Foo(true, 42);
    x.f();
}

fn works2() {
    let x = Foo(0, 42);
    x.f();
}

use std::cell::Cell;

fn works3<'a>(a: Cell<&'a ()>, b: Cell<&'a ()>) {
    let x = Foo(a, b);
    x.f();
}

// lifetime mismatch
fn doesnt_work<'a, 'b>(a: Cell<&'a ()>, b: Cell<&'b ()>) {
    let mut x = Foo(a, b);
    x.f();
}

So, unless this kind of behavior is desired for for loop desugaring, the check would need to work different from how method resolution operates at the moment.

Nemo157 · May 28, 2022, 9:33am

I think adding a blanket impl based on a brand new trait is ok, there can be no third party types implementing that trait already; and I think they could validly change from implementing Iterator to implementing InfiniteIterator and retain their API. It might block other blanket impls that could be added in the future though (like one based on GAT “streaming iterators”).

cuviper · May 28, 2022, 3:06pm

I think the blanket impl would still conflict on iterators that should be conditionally infinite, especially adaptors like Map<I, F>. That wants Iterator where I: Iterator, and InfiniteIterator where I: InfiniteIterator, but then the latter makes the blanket impl apply and conflict with the former.

HKalbasi · May 28, 2022, 5:00pm

I feel that it won't happen in practical cases since (if I understand correctly) it's about lifetimes and lifetimes can't affect infinite-ness of an iterator, so that kind of implementation of Iterator and InfiniteIterator are at least imprecise and there should be a more general implementation of InfiniteIterator.

I guess that every practical implementation of InfiniteIterator would have the same parameters of Iterator impl, some of them restricted via an additional InfiniteIterator bound (or similar traits like InfiniteDataStructure which satisfy the same property). Assuming this property, I think that scenario won't happen.

And since it wouldn't cause unsoundness, IMO it is tolerable, if not desired.

afetisov · May 28, 2022, 10:55pm

That's not possible. The desugaring of for loops happens before the type inference happens. Similarly to macro expansion, it must happen purely syntactically.

It would also have poor compositionality. An iterator adapter applied to an infinite iterator is, in general, finite. You can't call filter, you can't zip with a finite iterator. While possible to specify, this will likely cause churn and unexpected lack of impls in the ecosystem. It will also be necessarily conservative, i.e. many iterators which in practice will be infinite won't be marked as such, since in the general case they may terminate.

I also very much doubt the premise that truly infinite iterators is something common enough and performance-sensitive enough that such optimizations are worthwile.

Finally, if it really matters for your use case, it's quite simple to implement manually via loop and unwrap_unchecked. This also allows break with value out of the box, doesn't cause any ecosystem issues and allows to use fine-grained information about termination.

The issue with 0u8.. should really be handled by a Clippy lint. There is no reasonable case where it would be better than explicitly using 0u8..u8::MAX.

CAD97 · May 28, 2022, 11:07pm

First off: 0..=u8::MAX.

Secondly: yes, if the intent is to break out of the loop manually, and iterating past u8::MAX is a bug, 0u8.. is theoretically better (because it would panic in debug mode when doing so). We don't have a for else to properly handle the fallout case if it happens, and arbitrary block break to emulate it is unstable, so you have to do a worse polyfill.

let sample = 'loop loop {
    for i in 0..=u8::MAX {
        if rand() { break 'loop i; }
    }
    panic!();
};

But in general I agree that this is a good lint candidate. For fewer false positives, it could check for the presence of an early-exit in the loop.

afetisov · May 28, 2022, 11:28pm

It would also become an infinite loop in realease mode, which is a very troubling failure mode. Regardless, I doubt how useful such iterators are in practice. Simply iterating by index is already a very rare thing in Rust.

scottmcm · May 28, 2022, 11:40pm

I don't think that's a law of the universe. It would be possible for HIR to have a node for for loops, and thus a customized typing rule so that they have type () currently but ! if infinite (and they don't break).

I don't think that's necessarily a good idea -- particularly if it makes it a breaking change to add impl InfiniteIterator for an existing iterator -- but it would be possible.

I don't think this is a big deal, really. zip(infinite, finite) wouldn't be infinite, but zip(exact_size, unknown_size) isn't exact-size either.

And some things do compose. Like finite.chain(infinite) is infinite.

afetisov · May 29, 2022, 12:00am

I'm not saying it's impossible, just that it won't be good enough and predictable enough for the real world. E.g. something like 0..=u64::MAX is infinite in practice, but can't be formally treated as one. Similarly for something like "iterate until you find the preimage of sha2 hash value", or (0..).filter(|x| *x != 10), or a queue which is filled faster than it's iterated.

On the other hand, 0u8.. must be treated as infinite, even though it is infinite only if you hit an annoying bug.

Since infinite iterators must also support break with value, switching somewhere from map to filter_map would suddenly require rewriting the loop entirely.

steffahn · May 29, 2022, 12:09am

I think you might be able to observe it not being infinite easily due to optimizations or special implementations of methods such as Iterator::nth.

~~I don't understand what you're saying here.~~ (Certainly, map or filter_map applied to an infinite iterator still produces an infinite iterator; but even if filter_map wouldn't do that, ~~I don't quite understand what you're saying with regards to "support break with value".~~)

Edit: Oh! I completely missed the suggestion to support break with value in the original post in this thread. I don't know if I like that idea too much in the first place.

scottmcm · May 29, 2022, 12:09am

That's an interesting point!

It also suggests a really easy way to do this:

loop for x in repeat_with(foo) {
    if bar(x) { break x }
}

And it just desugars to loop { let x = it.next_infinite(); BODY } instead of the while let.

afetisov · May 29, 2022, 12:15am

If you just plug it into a for loop, there is no possibility of calling nth, and if you manipulate it as a separate value, then it's no different from calling filter on it or zipping it with a finite range.

In any case, my point was that if you really care about infinite loops, there is much more information to take into account than a compiler can handle.

filter_map, just like filter, can just filter away your entire iterator, so it will now be finite. It will take infinitely long to compute the next value (maybe, if the optimizer doesn't turn it into noop), but from the consumer's perspective the iterator is finite.

steffahn · May 29, 2022, 12:16am

I disagree. If it never returns None from next then it isn't finite, and if it isn't finite, it's infinite. It's still entirely okay for an infinite iterator to panic or never terminate on a call to next, otherwise something like map couldn't preserve infiniteness either.

That should be an invalid optimization AFAICT

But the compiler could still optimize away the entire loop for something like for _ in 0..=u64::MAX {} in case it understands the code well enough. Thus the iterator can't be infinite; e. g. code after it isn't necessarily unreachable even "in practice".

Topic		Replies	Views
Using never_type and unbounded for loops (RangeFrom) language design	5	907	March 25, 2019
Infinite `Iterator`s - on returning `Self::Item` instead of `Option<Self::Item>` libs	3	1824	March 25, 2019
Applying `take()` on an infinite iterator should make it exactsizediterator language design	7	435	July 7, 2024
Thoughts about Iterators - can we have move-or-mutate semantic for functions? language design	6	1120	March 25, 2019
There should be a new trait "Iterable" ideas (deprecated)	4	1301	March 25, 2019

Infinite iterators in for

Related topics