Hi, I'm currently looking into branch pruning in code-gen, and I've noticed that a const function returning false will still cause LLVM IR to be emitted for that branch, despite rustc already knowing the branch will never be taken. Example: Compiler Explorer
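For reference, a minimal sketch of the pattern I mean (the function names are my own, not necessarily the exact snippet from the link):

```rust
// A const fn whose result is known at compile time.
const fn branch_fn() -> bool {
    false
}

fn main() {
    if branch_fn() {
        // In an unoptimized build, LLVM IR is still emitted for this
        // arm, even though branch_fn() always returns false.
        println!("never reached");
    }
    println!("done");
}
```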
As far as I’m aware, the call to branch_fn() in your code happens at run-time (ignoring optimizations such as inlining, of course). const fn means the function can be called at compile-time, but an ordinary call to it, one that isn’t used to define a const, to initialize a static, or to appear inside yet another const fn that is itself called at compile-time (and probably a few more settings), will be an ordinary function call executed at run-time.
Well, it could be removed, possibly, through inlining; but the const-ness of branch_fn is essentially irrelevant there. If you want actual const evaluation at the call site, you’d need something like if const { branch_fn() } {. I don’t know how much inlining happens in debug mode (i.e. without any optimizations), and I don’t know, if I turn optimizations on, from which stage of optimization the LLVM IR comes (i.e. whether it’s the IR passed to LLVM, or the result of some LLVM optimizations), so I don’t know what exactly to make of the fact that it looks better optimized when the optimization level is higher than 0.
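Concretely, something like this (a sketch, reusing the branch_fn name from above; inline const blocks are stable since Rust 1.79):

```rust
const fn branch_fn() -> bool {
    false
}

fn main() {
    // The inline const block forces branch_fn() to be evaluated during
    // compilation, so the condition is a known `false` before codegen.
    if const { branch_fn() } {
        println!("this arm can be pruned before reaching LLVM");
    }
    println!("done");
}
```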
I guess my understanding was that const functions can affect code-gen in a way that forces them to be evaluated at the MIR stage, e.g. let local_stack_array = [0i32; some_const_fn()];. From this I concluded that the compiler has to have this information ready somewhere regardless of optimization level. And I would imagine that avoiding generating this IR would help compile times.
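For example (a sketch with a made-up const fn):

```rust
const fn some_const_fn() -> usize {
    4 + 4
}

fn main() {
    // An array length is a const position, so some_const_fn() must be
    // evaluated at compile time here, regardless of optimization level.
    let local_stack_array = [0i32; some_const_fn()];
    println!("{}", local_stack_array.len()); // prints 8
}
```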
Array lengths (as well as any const-generic argument to any type) are one of the places where evaluation always happens at compile-time, like the things I listed above, such as initializers of statics or definitions of const items.
You can observe particularly well that a const fn is evaluated at run-time: if its evaluation doesn’t terminate (or takes a very long time to compute), or if its evaluation panics, these result in run-time behavior (non-termination at run-time, a long computation at run-time, or a panic at run-time, respectively).
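A sketch of how the same const fn can run at either time (a made-up example):

```rust
const fn triangular(n: u64) -> u64 {
    // A deliberately loopy computation.
    let mut acc = 0;
    let mut i = 0;
    while i <= n {
        acc += i;
        i += 1;
    }
    acc
}

fn main() {
    // Compile-time: the loop runs inside the compiler's const-eval engine.
    const AT_COMPILE_TIME: u64 = triangular(1_000);
    // Run-time: the exact same loop executes when the program runs; if it
    // panicked or didn't terminate, that would happen here, at run-time.
    let at_run_time = triangular(1_000);
    assert_eq!(AT_COMPILE_TIME, at_run_time);
    println!("{at_run_time}"); // prints 500500
}
```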
I know that; that's not my point. My point is that the compiler could potentially become faster by emitting less LLVM IR. In this scenario it already has all the necessary information and infrastructure.
What I care about here is not the final assembly or its performance, but rather the work the compiler has to do.
Asking rustc to do const eval as aggressively as possible will likely be slower than asking LLVM to optimize it, especially when LLVM is able to turn a loop into a closed-form expression while const eval has to evaluate it entirely. And even when that is not the case, rustc's const-eval engine is pretty slow. And what if the const-eval'ed code turns out to contain an infinite loop? And finally, it hurts incremental compilation.
I'm not saying it should do it as aggressively as possible, but if false { ... } still generating the body code is wasteful. I would imagine there is some middle ground that is better than the current status quo. One approach would be to limit the number of instructions Miri is allowed to execute for such an analysis.
if false does get optimized out at the MIR level: https://rust.godbolt.org/z/obcPKv85d. As mentioned using a if const { branch_fn() } block to force the condition to be pre-evaluated allows this optimization to apply to your example. The missing MIR optimization that would allow it without any usage of const or const fn is inlining.
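To put the variants under discussion side by side (a sketch; whether the middle one is pruned in a debug build depends on MIR inlining):

```rust
const fn branch_fn() -> bool {
    false
}

fn main() {
    // Literal condition: removed by MIR optimizations even in debug builds.
    if false {
        println!("pruned at the MIR level");
    }
    // Plain const fn call: an ordinary run-time call unless inlining kicks in.
    if branch_fn() {
        println!("body may still reach LLVM in a debug build");
    }
    // Inline const block: the condition is evaluated at compile time,
    // reducing this to the `if false` case above.
    if const { branch_fn() } {
        println!("pruned before codegen");
    }
    println!("done");
}
```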
True, if false gets optimized. I didn't know if const was a thing; I thought that was hypothetical const-eval syntax. But even then, if const { func_returning_false() } does not get optimized, and produces the LLVM IR of the if body in my testing.
Hmm, I can't reproduce it anymore. I must have made a mistake, but I could swear I saw it not happen earlier with some other code. Maybe it was something else happening, or I missed something.
For comparison, Clang skips generating IR for the contents of an if (0) { … } block even with optimizations disabled. This seems to apply whenever the expression being tested is a language-level 'constant expression'.
Indeed, it even works for more complex code (Compiler Explorer), and if you drive up the loop iterations, it tells you that it takes too long (Compiler Explorer):
constexpr evaluation hit maximum step limit; possible infinite loop?
I suspect that if they do it, it might not be a bad idea for rustc to do it too.