Why does a const-eval not-taken branch still generate LLVM-IR?

Hi, I'm currently looking into pruning code-gen, and I've noticed that a const function returning false will still emit LLVM-IR for that branch, despite rustc being in a position to know that the branch can never be taken. Example: Compiler Explorer
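The linked godbolt isn't reproduced here, but a minimal sketch of the situation being described (the name `branch_fn` comes from the thread; the body is a hypothetical stand-in) might look like this:

```rust
// Hypothetical reconstruction: branch_fn is const and always
// returns false, yet an unoptimized build still emits LLVM IR
// for the body of the `if`.
const fn branch_fn() -> bool {
    false
}

fn main() {
    if branch_fn() {
        // This body still shows up in the unoptimized LLVM IR.
        println!("never reached");
    }
}
```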

Is this on purpose, or is it a bug or known issue?

As far as I’m aware, the call to branch_fn() in your code happens at run-time (ignoring optimizations such as inlining, of course). const fn means the function *can* be called at compile-time, but an ordinary call to that function — one that’s not used to define a const, initialize a static, or appear in yet another const fn which is itself called at compile-time (and probably a few more settings) — will be an ordinary function call executed at run-time.
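To illustrate the distinction (a minimal sketch; the names here are made up):

```rust
const fn answer() -> bool {
    false
}

// Evaluated at compile time: initializer of a const item.
const FLAG: bool = answer();

// Evaluated at compile time: initializer of a static.
static STORED: bool = answer();

fn main() {
    // An ordinary call: executed at run time
    // (before any optimizations kick in).
    let runtime_value = answer();
    assert_eq!(runtime_value, FLAG);
    assert_eq!(runtime_value, STORED);
}
```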

But the compiler could know that, and generating less IR seems like a good idea for better compile times.

Know what, exactly?

Know that the branch cannot be taken at run-time, thus doesn't require LLVM-IR.

Well, it could, possibly, through inlining; but the const-ness of branch_fn is essentially irrelevant. If you want actual const-evaluation at the call-site, you’d need something like if const { branch_fn() } {. I don’t know how much inlining happens in debug-mode (i.e. without any optimizations), and — if I turn optimizations on — I don’t know from what stage of optimization the LLVM IR comes (i.e. whether it’s the IR passed to LLVM, or the result of some LLVM optimizations). So I don’t know what exactly to make of the fact that it looks better-optimized when the optimization level is higher than 0.
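The suggestion above can be sketched like this (inline const blocks require a reasonably recent stable Rust; `branch_fn`'s body is again a stand-in):

```rust
const fn branch_fn() -> bool {
    false
}

fn main() {
    // The inline-const block forces branch_fn() to be evaluated
    // at compile time, so the `if` sees a literal constant even
    // in an unoptimized build.
    if const { branch_fn() } {
        println!("unreachable");
    }
}
```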

I guess my understanding was that const functions can affect code-gen in a way that requires them to be evaluated at the MIR stage, e.g. let local_stack_array = [0i32; some_const_fn()];. From this I concluded that the compiler has to have this information ready somewhere regardless of optimization level. And I would imagine that avoiding generating this IR would help compile times.

Array lengths (as well as any const-generic argument to any type) are one of the places where evaluation always happens at compile-time — like the settings I listed above, such as initializers of statics or definitions of const items.
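A concrete sketch of the array-length case mentioned above (names are hypothetical):

```rust
const fn some_const_fn() -> usize {
    4
}

fn main() {
    // An array length is a const context: some_const_fn() is
    // evaluated at compile time here, even with optimizations off.
    let local_stack_array = [0i32; some_const_fn()];
    assert_eq!(local_stack_array.len(), 4);
}
```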

Yes, so the compiler has to have the ability to evaluate such functions at compile time. I argue it's a good idea to do so for branches.

You can observe particularly well that a const fn is evaluated at run-time: if its evaluation doesn’t terminate (or takes a very long time to compute), or if its evaluation panics, these result in run-time behavior (non-termination, a long computation, or a panic at run-time, respectively).

For example, you could replace branch_fn with

const fn branch_fn() -> bool {
    branch_fn()
}

or

const fn branch_fn() -> bool {
    panic!()
}

and see the effects.

If it's a non-generic const, it'll happen at MIR time.

If it's a generic const, then MIR optimizations -- run on the generic version -- won't do that.

But LLVM generally will. Your example isn't passing any optimization flags, so it's expected that the output is unoptimized.
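A sketch of the generic case described above (hypothetical names): MIR optimizations run on the generic MIR, before the const-generic parameter is known, so the branch survives until LLVM sees the monomorphized code.

```rust
const fn is_big<const N: usize>() -> bool {
    N > 100
}

// MIR optimizations run on this generic body, where the value of N
// is not yet known, so the branch cannot be pruned at the MIR level.
fn check<const N: usize>() -> u32 {
    if is_big::<N>() {
        1
    } else {
        0
    }
}

fn main() {
    assert_eq!(check::<10>(), 0);
    assert_eq!(check::<200>(), 1);
}
```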

Pass -O and it becomes

example::main:
        ret

as expected.

So generally, yes, a non-optimized build being suboptimal is on purpose.

(Change the function to return ackermann(4, 2) > 100 and you no longer want it running on every debug build.)
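One plausible definition of the `ackermann` mentioned above (an assumption — the post doesn't spell one out). ackermann(4, 2) is a number with 19,729 decimal digits, far beyond u64, so forcing it through const eval on every debug build would never finish in practice:

```rust
// The classic two-argument Ackermann function. Only tiny arguments
// are feasible; ackermann(4, 2) is astronomically large and would
// overflow u64 long before the recursion finished.
const fn ackermann(m: u64, n: u64) -> u64 {
    if m == 0 {
        n + 1
    } else if n == 0 {
        ackermann(m - 1, 1)
    } else {
        ackermann(m - 1, ackermann(m, n - 1))
    }
}

fn main() {
    // Small arguments only!
    assert_eq!(ackermann(2, 3), 9);
    assert_eq!(ackermann(3, 3), 61);
}
```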


I know that; that's not my point. My point is that the compiler could potentially become faster by emitting less LLVM-IR. In this scenario it already has all the necessary information and infrastructure.

What I care about here is not the final assembly or its performance, but rather the work the compiler has to do.

Asking rustc to do const eval as aggressively as possible will likely be slower than asking LLVM to optimize it — especially when LLVM is able to turn a loop into a closed-form expression while const eval has to evaluate it entirely. And even when that is not the case, the const-eval engine of rustc is pretty slow. And what if the const-eval'ed code turns out to contain an infinite loop? And finally, it hurts incremental compilation.
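A made-up illustration of the closed-form point: LLVM's scalar-evolution analysis can typically replace a loop like this with roughly n * (n + 1) / 2, whereas rustc's const-eval interpreter would have to execute every iteration.

```rust
// LLVM can collapse this loop into a closed-form expression
// (roughly n * (n + 1) / 2); the rustc const-eval interpreter
// would instead run all n iterations one by one.
fn sum_to(n: u64) -> u64 {
    let mut acc = 0;
    for i in 1..=n {
        acc += i;
    }
    acc
}

fn main() {
    assert_eq!(sum_to(100), 5050);
}
```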


I'm not saying it should do it as aggressively as possible, but if an if false { ... } still generates the body's code, that's wasteful. I would imagine there can be some middle ground that is better than the current status quo. One approach would be to limit the number of instructions Miri is allowed to execute for such an analysis.

if false does get optimized out at the MIR level: https://rust.godbolt.org/z/obcPKv85d. As mentioned, using an if const { branch_fn() } block to force the condition to be pre-evaluated allows this optimization to apply to your example. The missing MIR optimization that would allow it without any usage of const or const fn is inlining.
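For reference, the if false case from the linked godbolt, reconstructed as a small sketch:

```rust
fn dead_branch() -> u32 {
    // A MIR-level simplification pass removes this branch even in
    // a debug build, so its body emits no LLVM IR.
    if false {
        return 1;
    }
    0
}

fn main() {
    assert_eq!(dead_branch(), 0);
}
```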

True, if false gets optimized. I didn't know if const was a thing; I thought that was a hypothetical const-eval syntax. But even then, if const { func_returning_false() } does not get optimized, and produces the LLVM IR of the if body in my testing.

Hmmm, it didn't when I tested based on your godbolt: https://godbolt.org/z/bbac8jGPT.


Hmm, I can't reproduce it anymore. I must have made a mistake, but I could swear I saw it not happen earlier with some other code. Maybe something else was going on, or I missed something.

For comparison, Clang skips generating IR for the contents of an if (0) { … } block even with optimizations disabled. This seems to apply whenever the expression being tested is a language-level 'constant expression'.


Indeed, it even works for more complex code (Compiler Explorer), and if you drive up the loop iterations, it tells you that it takes too long (Compiler Explorer):

constexpr evaluation hit maximum step limit; possible infinite loop?

If Clang does it, I suspect it might not be a bad idea for rustc to do it too.