Why does a const-eval not-taken branch still generate LLVM-IR?

Hi, I'm currently looking into pruning code-gen, and I've noticed that a const function returning false will still emit LLVM-IR for that branch, despite rustc being in a position to know that the branch can never be taken. Example: Compiler Explorer
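The linked godbolt isn't reproduced here, but a minimal sketch of the situation being described (the name `branch_fn` comes from the thread; the body is a hypothetical stand-in) might look like this:

```rust
// Hypothetical reconstruction: branch_fn is const and always
// returns false, yet an unoptimized build still emits LLVM IR
// for the body of the `if`.
const fn branch_fn() -> bool {
    false
}

fn main() {
    if branch_fn() {
        // This body still shows up in the unoptimized LLVM IR.
        println!("never reached");
    }
}
```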

Is this on purpose, or is it a bug or known issue?

As far as I’m aware, the call to branch_fn() in your code happens at run-time (ignoring optimizations such as inlining, of course). const fn means the function *can* be called at compile-time, but an ordinary call to that function — one that’s not used to define a const, initialize a static, or appear in yet another const fn which is itself called at compile-time (and probably a few more settings) — will be an ordinary function call executed at run-time.
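To illustrate the distinction (a minimal sketch; the names here are made up):

```rust
const fn answer() -> bool {
    false
}

// Evaluated at compile time: initializer of a const item.
const FLAG: bool = answer();

// Evaluated at compile time: initializer of a static.
static STORED: bool = answer();

fn main() {
    // An ordinary call: executed at run time
    // (before any optimizations kick in).
    let runtime_value = answer();
    assert_eq!(runtime_value, FLAG);
    assert_eq!(runtime_value, STORED);
}
```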

But the compiler could know that, and generating less IR seems like a good idea for better compile times.

Know what, exactly?

Know that the branch cannot be taken at run-time, thus doesn't require LLVM-IR.

Well, it could, possibly, through inlining; but the const-ness of branch_fn is essentially irrelevant. If you want actual const-evaluation at the call-site, you’d need something like if const { branch_fn() } {. I don’t know how much inlining happens in debug-mode (i.e. without any optimizations), and — if I turn optimizations on — I don’t know from what stage of optimization the LLVM IR comes (i.e. whether it’s the IR passed to LLVM, or the result of some LLVM optimizations). So I don’t know what exactly to make of the fact that it looks better-optimized when the optimization level is higher than 0.
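The suggestion above can be sketched like this (inline const blocks require a reasonably recent stable Rust; `branch_fn`'s body is again a stand-in):

```rust
const fn branch_fn() -> bool {
    false
}

fn main() {
    // The inline-const block forces branch_fn() to be evaluated
    // at compile time, so the `if` sees a literal constant even
    // in an unoptimized build.
    if const { branch_fn() } {
        println!("unreachable");
    }
}
```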

I guess my understanding was that const functions can affect code-gen in a way that requires them to be evaluated at the MIR stage, e.g. let local_stack_array = [0i32; some_const_fn()];. From this I concluded that the compiler has to have this information ready somewhere regardless of optimization level. And I would imagine that avoiding generating this IR would help compile times.

Array lengths (as well as any const-generic argument to any type) are one of the places where evaluation always happens at compile-time — like the settings I listed above, such as initializers of statics or definitions of const items.
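A concrete sketch of the array-length case mentioned above (names are hypothetical):

```rust
const fn some_const_fn() -> usize {
    4
}

fn main() {
    // An array length is a const context: some_const_fn() is
    // evaluated at compile time here, even with optimizations off.
    let local_stack_array = [0i32; some_const_fn()];
    assert_eq!(local_stack_array.len(), 4);
}
```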

Yes, so the compiler has to have the ability to evaluate such functions at compile time. I argue it's a good idea to do so for branches.

You can observe particularly well that a const fn is evaluated at run-time: if its evaluation doesn’t terminate (or takes a very long time to compute), or if its evaluation panics, these result in run-time behavior (non-termination, a long computation, or a panic at run-time, respectively).

For example, you could replace branch_fn with

const fn branch_fn() -> bool {
    branch_fn()
}

or

const fn branch_fn() -> bool {
    panic!()
}

and see the effects.

If it's a non-generic const, it'll happen at MIR time.

If it's a generic const, then MIR optimizations -- run on the generic version -- won't do that.

But LLVM generally will. Your example isn't passing any optimization flags, so it's expected that the output is unoptimized.
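A sketch of the generic case described above (hypothetical names): MIR optimizations run on the generic MIR, before the const-generic parameter is known, so the branch survives until LLVM sees the monomorphized code.

```rust
const fn is_big<const N: usize>() -> bool {
    N > 100
}

// MIR optimizations run on this generic body, where the value of N
// is not yet known, so the branch cannot be pruned at the MIR level.
fn check<const N: usize>() -> u32 {
    if is_big::<N>() {
        1
    } else {
        0
    }
}

fn main() {
    assert_eq!(check::<10>(), 0);
    assert_eq!(check::<200>(), 1);
}
```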

Pass -O and it becomes

example::main:
        ret

as expected.

So generally, yes, a non-optimized build being suboptimal is on purpose.

(Change the function to return ackermann(4, 2) > 100 and you no longer want it running on every debug build.)
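One plausible definition of the `ackermann` mentioned above (an assumption — the post doesn't spell one out). ackermann(4, 2) is a number with 19,729 decimal digits, far beyond u64, so forcing it through const eval on every debug build would never finish in practice:

```rust
// The classic two-argument Ackermann function. Only tiny arguments
// are feasible; ackermann(4, 2) is astronomically large and would
// overflow u64 long before the recursion finished.
const fn ackermann(m: u64, n: u64) -> u64 {
    if m == 0 {
        n + 1
    } else if n == 0 {
        ackermann(m - 1, 1)
    } else {
        ackermann(m - 1, ackermann(m, n - 1))
    }
}

fn main() {
    // Small arguments only!
    assert_eq!(ackermann(2, 3), 9);
    assert_eq!(ackermann(3, 3), 61);
}
```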


I know that; that's not my point. My point is that the compiler could potentially become faster by emitting less LLVM-IR. In this scenario it already has all the necessary information and infrastructure.

What I care about here is not the final assembly or its performance, but rather the work the compiler has to do.

Asking rustc to do const eval as aggressively as possible will likely be slower than asking LLVM to optimize it — especially when LLVM is able to turn a loop into a closed-form expression while const eval has to evaluate it entirely. And even when that is not the case, the const-eval engine of rustc is pretty slow. And what if the const-eval'ed code turns out to contain an infinite loop? And finally, it hurts incremental compilation.
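A made-up illustration of the closed-form point: LLVM's scalar-evolution analysis can typically replace a loop like this with roughly n * (n + 1) / 2, whereas rustc's const-eval interpreter would have to execute every iteration.

```rust
// LLVM can collapse this loop into a closed-form expression
// (roughly n * (n + 1) / 2); the rustc const-eval interpreter
// would instead run all n iterations one by one.
fn sum_to(n: u64) -> u64 {
    let mut acc = 0;
    for i in 1..=n {
        acc += i;
    }
    acc
}

fn main() {
    assert_eq!(sum_to(100), 5050);
}
```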


I'm not saying it should do it as aggressively as possible, but if an if false { ... } still generates the body's code, that's wasteful. I would imagine there can be some middle ground that is better than the current status quo. One approach would be to limit the number of instructions Miri is allowed to execute for such an analysis.

if false does get optimized out at the MIR level: https://rust.godbolt.org/z/obcPKv85d. As mentioned, using an if const { branch_fn() } block to force the condition to be pre-evaluated allows this optimization to apply to your example. The missing MIR optimization that would allow it without any usage of const or const fn is inlining.
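For reference, the if false case from the linked godbolt, reconstructed as a small sketch:

```rust
fn dead_branch() -> u32 {
    // A MIR-level simplification pass removes this branch even in
    // a debug build, so its body emits no LLVM IR.
    if false {
        return 1;
    }
    0
}

fn main() {
    assert_eq!(dead_branch(), 0);
}
```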

True, if false gets optimized. I didn't know if const was a thing; I thought that was a hypothetical const-eval syntax. But even then, if const { func_returning_false() } does not get optimized, and produces the LLVM IR of the if body in my testing.

Hmmm, it didn't when I tested based on your godbolt: https://godbolt.org/z/bbac8jGPT.


Hmm, I can't reproduce it anymore. I must have made a mistake, but I could swear I saw it not happen earlier with some other code. Maybe something else was going on, or I missed something.

For comparison, Clang skips generating IR for the contents of an if (0) { … } block even with optimizations disabled. This seems to apply whenever the expression being tested is a language-level 'constant expression'.


Indeed, it even works for more complex code (Compiler Explorer), and if you drive up the loop iterations, it tells you that it takes too long (Compiler Explorer):

constexpr evaluation hit maximum step limit; possible infinite loop?

If Clang does it, I suspect it might not be a bad idea for rustc to do it too.