Pre-RFC: Stabilization of `likely(bool)`, `unlikely(bool)` & `assume(bool)` intrinsics

Introduction

The safe likely(bool) and unlikely(bool) intrinsics signal the optimizer possible branch reorderings for lowering cache misses and improving the CPU branch speculation. These functions have been worked around in stable Rust by using cold attributes (notably in Hasbrown; prior art section), but even if they may result in the same optimized code, the message they are signaling to LLVM is different from their nightly counterparts.

The unsafe assume(bool) intrinsic signals the optimizer the fact that an invariant is always upheld, which can remove entire unnecessary branches if used correctly—or otherwise lead to catastrophic UB for obvious reasons—. This function can be implemented in stable Rust by using the already stabilized hint unreachable_unchecked() in a branch that executes when the invariant is not upheld; thus, there is no issue in stabilizing this specific intrinsic, as it saves two lines of code and is more concise.

Why stabilize the aforementioned functions?

Mostly because there are no roadblocks nor possible API changes, and they're useful. Even though their use-cases are mostly niche, their usefulness in these niche spaces is unarguable. Besides, C supports them.

Bikeshed

This Pre-RFC suggests adding the following wrapper functions to core::hint, which would trickle down to the std; documentation not included.

#[inline]
pub fn likely(b: bool) -> bool {
    crate::intrinsics::likely(b)
}

#[inline]
pub fn unlikely(b: bool) -> bool {
    crate::intrinsics::unlikely(b)
}

#[inline]
pub unsafe fn assume(b: bool) 
    if cfg!(debug_assertions) {
        // Elsewhere the debug_assert would've been optimized
        // away when compiling a release build with
        // `debug_assertions` enabled.
        assert!(crate::hint::black_box(b));
    }
    crate::intrinsics::assume(b);
}

Prior Art

9 Likes

Yes, it'd be great. Small nit: assume should be unsafe, because:

Informs the optimizer that a condition is always true. If the condition is false, the behavior is undefined.

or perhaps leave assume out, since unreachable_unchecked can already emulate it.

2 Likes

It's not clear to me that an RFC is needed for this. You might be able to get away with creating an ACP and going through the normal process for small API additions.

Note though that regardless of what path you take, I would expect to see some motivating examples personally. (Not just links to prior art, but examples that can be isolated and understood on their own.)

7 Likes

AFAIK there are cases where likely and unlikely regress performance even when the indicated case happens 100% of the time due to being a function call that won't get removed until late in the compilation pipeline.

I think you should just open a PR for hint::assume. Enough people write if !b { hint::unreachable_unchecked() } that we should just let them write hint::assume(b).

Unfortunately, last I heard, likely/unlikely didn't actually do anything most of the time. And that using PGO was substantially better.

(I do with there was a #[likely] on match arms or similar, and some kind of unpredictable as exactly the opposite of likely/unlikely for things where select is better than branching.)

8 Likes

That is my recollection as well when I brought up stabilizing it in the past. However, I do not recall seeing any examples of where this is actually the case, nor do I think it's necessarily a blocker (documentation could simply mention it).

The is definitely the case for assume -- libstdc++ was recently removing calls to it because of that.

But yes, it's not a blocker. The documentation for assume just needs to emphasize that it's not something you should use without evidence of benefit.

(And a test for that benefit, so people can remove them later when they not longer provide any value.)

2 Likes

Yes, it'd be great. Small nit: assume should be unsafe, because:

Oh, thank you, that was an accident; though it wouldn't even compile without the unsafe.

I actually think an ACP might be better, then. My personal motivation is getting several projects of mine out of nightly, and this is one of the blockers, since there is no stable counterpart that does the same.

Then we could try to merge both implementations?

#[cold]
#[inline]
fn cold() {}
 
#[inline]
pub fn likely(b: bool) -> bool {
    if !crate::intrinsics::likely(b) { cold(b); }
    b
}

#[inline]
pub fn unlikely(b: bool) -> bool {
    if crate::intrinsics::unlikely(b) { cold(); }
    b
}

Tomorrow I will benchmark Hashbrown and the project that is actually driving me to do this, Lariv with the three different implementations, and I'll post here the results.

1 Like

I just edited the bikeshed code snippet to include a debug assertion of the invariant in the assume implementation, which is common practice in these kinds of hint functions.

I want to boost @scottmcm. I would like to be able to signal to the compiler that a condition is completely unpredictable to avoid having branches inserted into branchless code. In clang there is a __builtin_unpredictable intrinsic that can be used to express this in C.

2 Likes

For core items, it's intrinsics::assert_unsafe_precondition!. (See e.g. unreachable_unchecked.)


For consistency with unreachable_unchecked, this function should probably be spelled as core::hint::assume_unchecked.

1 Like

Interesting. That suggests that we might get better results with a macro-flavored implementation (that turns into a hint very early in compilation) rather than a function (even an always inlined one)?

I am not sure how the macro works (the pattern matching there is confusing me), and I can't find its definition anywhere, so I would appreciate it if someone could implement the correct precondition checking for me.

When you "assume" something, conceptually, you are specifically not checking it, so in my mind that is redundant and unnecessary. I should add that the unreachable_unchecked intrinsic/hint is named that way because there's also a macro that panics at runtime. On the other hand, there's no such a thing for assume, so my vote is to leave the function named that way. The discussion is open, though.

Macros are long gone by the time the code gets to LLVM, that's why they have to be function calls.

I was talking about a built-in operation that expanded as directly as possible to the appropriate LLVM IR (llvm.expect) without having to wait for functions to get inlined, to avoid the problem that @bjorn3 was referring to.

I don't think that there's a way to inject LLVM IR without using intrinsics. We could use attributes, but they're still experimental on expressions, so that's a no. My only guess is that we would need support from the compiler, and have a magic macro like the one used for inline assembly, an "inline LLVM IR" macro. This would also allow us to move a big chunk of the compiler magic into core, though.

1 Like

It needs to end up in the MIR somewhere, so I'm not sure that adding an extra "these are annotations" thing to a mir::Body would necessarily be good. Part of the reason they impact compilation is that if you want them to work, things have to handle them -- like a MIR optimization has to preserve it properly if rewriting SwitchInts -- which is always harder when they're not in-line.

Assume is now a MIR primitive, in part to avoid needing a Call and another BB ever time you use it. We might be able to do the same with some kind of "expect this value".

Alternatively, we could have some kind of builtin# for an unpredictable select, rather than trying to expose the same "here's a pseudo-function that's useless in most places" form that C extensions tend to use.

1 Like

Macros in std can allow inner unstable, so a macro could expand directly to calling intrinsics::assume. And while that looks like a function, the "rust-intrinsic" calling convention means that the calling convention for intrinsics::assume is "do the backend assume bytecode" rather than any sort of function call.

Also, we control the language. We can make new syntax builtin # assume(b) instead of using an intrinsic function if we really wanted. It's not exclusively a T-libs(-api) job if doing so, but it's an available option.

If the assume only being visible to the backend after inlining makes using the assume worse, then using a macro/intrinsic/builtin to codegen the assume directly makes sense, since the purpose of the optimization hint is stunted by late discovery.

1 Like