Should const calls with small finite inputs become a lookup table?

Hi,

I have some expensive functions that only need to work with a small range of inputs. I was hoping I could tell the compiler to expect a small set of inputs, via an assertion or enum, and that it would pre-compute a lookup table for the result, rather than including a lot of slow code. Is there a way to hint this to the compiler? Is this an optimisation that you've considered?

Here's a contrived example of what I mean:

pub fn fibonacci(n: u32) -> u32 {
    assert!(n < 7);
    #[inline]
    const fn inner(n: u32) -> u32 {
        match n {
            0 => 1,
            1 => 1,
            n => inner(n-1) + inner(n-2),
        }
    }
    inner(n)
}

Note that I'm hoping it would replace the inner function call with a lookup table, it doesn't.

Even being a lot more explicit doesn't work, I'd expect this to at least to compile-time evaluate the const function calls (it doesn't):

#[repr(u32)]
enum Input {
    N0 = 0,
    N1 = 1,
    N2 = 2,
    N3 = 3,
    N4 = 4,
    N5 = 5,
    N6 = 6,
}

pub fn fibonacci(input: Input) -> u32 {
    const fn inner(n: u32) -> u32 {
        match n {
            0 => 1,
            1 => 1,
            n => inner(n-1) + inner(n-2),
        }
    }
    match input {
        Input::N0 => inner(0),
        Input::N1 => inner(1),
        Input::N2 => inner(2),
        Input::N3 => inner(3),
        Input::N4 => inner(4),
        Input::N5 => inner(5),
        Input::N6 => inner(6),
    }
}

Emits:

example::fibonacci:
    cmp      edi, 2
    jae      example::fibonacci::inner
    mov      eax, 1
    ret

I would expect the compiler to have similar output as something like this, for such a small input range:

pub fn fibonacci(input: u32) -> u32 {
    assert!(input <= 6);
    const fn inner(n: u32) -> u32 {
        match n {
            0 => 1,
            1 => 1,
            n => inner(n-1) + inner(n-2),
        }
    }
    const LUT: [u32; 7] = [
        inner(0),
        inner(1),
        inner(2),
        inner(3),
        inner(4),
        inner(5),
        inner(6),
    ];
    LUT[input as usize]
}

In fact, that does actually evaluate the const functions, so perhaps there's just some optimisation path that's being cut off? Or being run in an order that loses the range information?

So I guess there's a few things missing here:

  • Evaluate const functions when the parameters are known, and the output size doesn't make it a worse-tradeoff
  • Recognise that a known small range of inputs (via enum or assert) may be "unrolled" into a match statement for potential optimisations
  • Replace a const match statement with a lookup table

Thanks!

const fn doesn't at all mean “this should be constant-evaluated”; it means “this can be called from a constant evaluation, rather than that being an error”. Whether a given expression is constant evaluated is 100% determined by syntactic location (inside a const item or block, or similar cases), not by the optimizer.[1]

You already noticed that eagerly const-evaluating a function call could make the program significantly larger, but also, constant evaluation is much more expensive than normal execution so it could make the program enormously slower to compile. There very well might be some room for opportunistic, not-too-expensive const evaluation that hasn’t been implemented yet, but if you want that to happen, you (or someone else interested) are going to have to dig into the compiler internals and figure out what heuristics are both actually implementable and won't make things worse.

So, today, the expected behavior is that if you want a const fn to be called at compile time, you have to ask for that with a const item or const block. Note that you can use const blocks in your match, without having to explicitly construct a lookup table constant:

match input {
    Input::N0 => const { inner(0) },
    Input::N1 => const { inner(1) },
    Input::N2 => const { inner(2) },
    Input::N3 => const { inner(3) },
    Input::N4 => const { inner(4) },
    Input::N5 => const { inner(5) },
    Input::N6 => const { inner(6) },
}

Of course, in a lot of cases, the explicit table is probably nicer overall, but that might help with cases which are less neatly numerical on the surface.


  1. And when the optimizer seems to have replaced code with its result, that's a process of inlining and simplification which is completely separate from const evaluation, so there’s less already there than you might think. ↩︎

2 Likes

Thanks for the detailed explanation!

I didn't realise that it was an explicit choice not to evaluate const functions with const parameters outside of a const item or block. I assumed that if a function was marked as const then the programmer was declaring both that it can be const, and that it isn't too expensive to evaluate at complile time. Although I suppose the cumulative or compounded cost of const calls may be costly.

I expect we already do constant folding for some things like simple integer math operations. Although that cost is very predictable.

In my particular case the performance benefits of a longer compile time are definitely worth it, so I'll make a simple convenience macro for building the LUT from a const function, and stick that all into its own file to avoid it being rebuilt unecessarily.

I'm not saying that it was explicitly chosen that this type of evaluation should not happen — only that const fn isn't an explicit request for it, there are no other ways to ask for it in the language today, and adding it wouldn’t be trivial.

I took a look at the original proposal for const fn, RFC 911 to see what more might have been said at the time, and that RFC is entirely about adding the ability to use constructor functions in const so that structs with private fields (and invariants) can be constructed at compile time. Pre-calculation of values wasn’t being considered at the time — there were no local variables and no flow control. Later RFCs have gradually added more capabilities, until we have the situation today where const fn can do enough things that we think more about what it can’t do than what it can. (News in this area: &mut references in const fn has been stabilized and will be in the 1.83 release.)

FYI, constant folding is done by the LLVM optimizer, is a completely separate process from const evaluation, and is done to all compiled code whether it is marked const fn or not.

Note that in Rust, the compilation unit (thing which a single invocation of the compiler processes) is an entire crate, not a single file, so having a separate file doesn't mean the file won't be processed at all if it is unchanged. However, the query system and incremental compilation mean that even after the compiler has re-read all the crate’s source files, it can notice when inputs are identical to the previous run and reuse outputs, significantly speeding up recompilation after small changes.

There is still benefit to having a separate file: if a file isn't changed, then none of its spans (filename-and-line-number info) have changed, so anything dependent on the spans (e.g. in a panic message or debug information) can be reused. It’s just that the overall process is not an all-or-nothing “rebuild” decision; it looks at more and throws out less than you might think.

3 Likes

that's what I meant thanks.

Excellent news, thanks!

I assumed it was at least done in LLVM, but wasn't sure if we had any MIR level stuff to ensure more stuff could be picked up by LLVM. I wasn't clear in my question, but I wasn't explicitly talking about the const fn attribute only, I was happy with things like inlining being able to resolve it (the function in my first example is marked as inline, but this wasn't sufficient). I should have said something more like:

Why aren't compile-time evaluable calls that can become a small lookup table compiled down to one?

Either way, you've answered that question by saying that we'd need a bunch of heuristics and it's not a clear win with unclear performance costs, thank you :slight_smile:

Yeah, that's why I said unecessarily rebuilding, I was meaning incremental building.

Thanks for all your explanations, it's not the answer I was hoping for, but it did explain why it's not like I wanted.