Decide if for-loop iterator is empty given a literal/const iterator

fenollp · August 19, 2024, 3:54pm

Hi! I've encountered a case where the compiler could be smarter, in a backwards-compatible way.

This code fails to compile:

fn main() {
    let mut x: u8;
    for y in [42] {
        x = y;
    }
    println!("{x}");
}

error[E0381]: used binding `x` is possibly-uninitialized
 --> src/main.rs:6:15
  |
2 |     let mut x: u8;
  |         ----- binding declared here but left uninitialized
3 |     for y in [42] {
  |              ---- if the `for` loop runs 0 times, `x` is not initialized
...
6 |     println!("{x}");
  |               ^^^ `x` used here but it is possibly-uninitialized

Note the same error is given for a const array of known non-zero size:

fn main() {
    const ITERABLE: [u8; 1] = [42];
    let mut x: u8;
    for y in ITERABLE {
        x = y;
    }
    println!("{x}");
}

To me, at least in the case of const iterables, rustc (or LLVM?) should be able to peek into the const iterable and count how many times the loop runs.

Surely this is overly naive on my part. I can imagine nested loops and/or ifs that exit the loop before the assignment. And my example may be way too simple and uncommon to be worth the effort. But who knows, maybe I got this wrong and something will spark in a compiler dev's head?

TadaHrd · August 19, 2024, 7:05pm

It could be infinitely smarter but it also could be infinitely slower.

The Rust compiler is slow enough as it is. A couple of seconds isn't too bad, but when you need to, for example, debug build scripts, it gets pretty slow: tens of seconds.

Sure, a little check like this doesn't seem like a lot, but these features to make the compiler "smarter" can add up.

Do you think it's a good trade-off?

cuviper · August 19, 2024, 7:32pm

That for loop expands to something like:

let mut iter = [42].into_iter();
while let Some(y) = iter.next() {
    x = y;
}

Looking at function signatures alone, the borrow checker (which also handles initialization) sees that this loop body might never run if next() were to return None. [1] The compiler would need to use inter-procedural analysis to see that array::IntoIter::next() always returns Some at least once for N > 0. Later optimization passes probably do figure this out, especially after inlining -- but that's not a language semantic guarantee, so you still have to deal with initialization up front.

(I'm pretty sure this has been discussed before, with better explanations than that, but I can't find it now...)

[1] It's not really even that smart -- even while true looks like "this loop might never run."

scottmcm · August 19, 2024, 7:56pm

This is unlikely to happen, for the same reason that the compiler doesn't do this for the much easier case of while true: Rust would rather not special-case stuff like this than have things break when you move a literal into a variable.

In particular, we don't want a situation where something isn't working, you change it to a literal expression while debugging until it works, and then you change it back to a variable and it breaks again.

If you want the compiler to know it runs at least once, use a loop, not a while or a for. Or use a combinator like Iterator::reduce instead of a loop.
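As a rough sketch of those two alternatives (the helper names last_with_loop and last_with_reduce are made up for illustration, and the fixed [u8; 3] input is an assumption to keep the example small): the first assigns inside a loop, whose body always runs at least once, and the second lets reduce report the empty case as None.

```rust
// Sketch only: helper names are hypothetical, chosen for illustration.

/// Returns the last element, assigning inside a `loop`.
/// The body of `loop` always runs at least once, so `x` is
/// definitely initialized by the time `break` is reached.
fn last_with_loop(items: [u8; 3]) -> u8 {
    let mut x: u8;
    let mut i = 0;
    loop {
        x = items[i];
        i += 1;
        if i == items.len() {
            break;
        }
    }
    x
}

/// Returns the last element with `Iterator::reduce`; an empty input
/// would show up as `None` rather than as an uninitialized binding.
fn last_with_reduce(items: [u8; 3]) -> Option<u8> {
    items.iter().copied().reduce(|_previous, y| y)
}

fn main() {
    let items = [40, 41, 42];
    println!("{}", last_with_loop(items));
    println!("{:?}", last_with_reduce(items));
}
```

The loop version only works because the array is known to be non-empty; with a zero-length input the indexing would panic instead, which is the trade-off reduce avoids by returning an Option.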

chrefr · August 19, 2024, 11:35pm

This could be made to work more generally by adding an unsafe trait NonEmptyIterator: Iterator {} and implementing it for things like arrays where len >= 1 (also: repeat, map of a NonEmptyIterator, etc.).

However, the borrow checker then has to check whether the type being iterated in the for loop implements NonEmptyIterator (specialization), and we need to be able to implement NonEmptyIterator only for arrays of length >= 1 (generic const exprs, unless you do it with a macro, which would be weird for a language feature).

It can be done today if we make the trait internal to std and use a macro for arrays, but I don't know how likely that is (also, someone will have to propose an RFC).

scottmcm · August 20, 2024, 12:38am

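That cannot exist on Iterator, because next exists: after one call to next, the same iterator may already be empty, so the guarantee would have to hold forever. It would end up actually being InfiniteIterator.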

It would need to be on IntoIterator instead.

chrefr · August 20, 2024, 8:03am

Yes, you are right. There is another reason we probably want to define it on IntoIterator: if we want to implement it for references to arrays (which we probably do), it has to be on IntoIterator, because references to arrays yield slice iterators, and changing that would be a breaking change.
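To make the idea concrete, here is a minimal user-space sketch of what such a marker trait on IntoIterator might look like. Everything in it is hypothetical: NonEmptyIntoIterator, the impl_non_empty_for_arrays macro, and last_item are illustrative names that do not exist in std, and the macro only covers lengths 1 through 4 as a stand-in for "N >= 1". It does not touch the borrow checker at all; it only shows the trait and the array impls.

```rust
/// Hypothetical marker trait, not part of std.
///
/// Safety: implementors promise that the iterator returned by
/// `into_iter()` yields at least one item.
unsafe trait NonEmptyIntoIterator: IntoIterator {}

// Macro stand-in for "arrays with N >= 1"; only a few lengths are covered.
macro_rules! impl_non_empty_for_arrays {
    ($($n:literal),*) => {
        $(
            unsafe impl<T> NonEmptyIntoIterator for [T; $n] {}
            // References to arrays iterate via slice iterators, which is
            // why the marker lives on IntoIterator and not on Iterator.
            unsafe impl<'a, T> NonEmptyIntoIterator for &'a [T; $n] {}
        )*
    };
}

impl_non_empty_for_arrays!(1, 2, 3, 4);

/// Returns the last item without handing an `Option` to the caller,
/// relying on the trait's contract that at least one item exists.
fn last_item<I: NonEmptyIntoIterator>(iterable: I) -> I::Item {
    iterable
        .into_iter()
        .last()
        .expect("NonEmptyIntoIterator contract violated")
}

fn main() {
    let x: u8 = last_item([42]);
    println!("{}", x);

    let by_ref: &u8 = last_item(&[1, 2, 3]);
    println!("{}", by_ref);
}
```

A zero-length array has no impl here, so last_item([0u8; 0]) would be rejected at compile time. Getting the for loop itself to treat x as initialized would still need the compiler support discussed above.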
