`core::hint::assume`?

Would it make sense to have a core::hint::assume function, in the same fashion as core::hint::unreachable_unchecked? The relevant intrinsic already exists as core::intrinsics::assume.

I believe this would be functionally equivalent to

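// Safety: `cond` must be true; calling this with `false` is undefined behavior.
unsafe fn assume(cond: bool) {
    if !cond {
        // SAFETY: the caller guarantees `cond` holds, so this branch is never reached.
        unsafe { core::hint::unreachable_unchecked() }
    }
}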

Ultimately this would be an optimization hint, so nothing has to happen, strictly speaking. Users would naturally expect certain improvements, in the same manner as unreachable_unchecked.
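To make the expected use concrete, here is a minimal sketch of how such a hint might be applied. The local `assume` function is a stand-in for the proposed API, and `get_hinted` is a hypothetical helper invented for illustration. Whether the bounds check actually disappears is up to the optimizer; nothing here guarantees it.

```rust
/// Stand-in for the proposed `core::hint::assume` (an assumption, not a real std API).
///
/// # Safety
/// `cond` must be true; passing `false` is undefined behavior.
#[inline(always)]
pub unsafe fn assume(cond: bool) {
    if !cond {
        // SAFETY: the caller promises `cond` holds, so this is never reached.
        unsafe { core::hint::unreachable_unchecked() }
    }
}

/// Hypothetical helper: reads `v[idx]` after telling the compiler the index is in range.
///
/// # Safety
/// The caller must guarantee `idx < v.len()`.
pub unsafe fn get_hinted(v: &[u32], idx: usize) -> u32 {
    // SAFETY: forwarded from this function's own safety contract.
    unsafe { assume(idx < v.len()) };
    // Still a checked index; the optimizer may (or may not) remove the check.
    v[idx]
}
```

Calling `unsafe { get_hinted(&[10, 20, 30], 1) }` returns the element at index 1. The result is the same with or without the hint; only the generated code may differ.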

3 Likes

I agree this might be useful, but I think it would require some major wrinkles to be sorted out first. For example, what should this do?

assume(loop {});

Not long ago, I was playing with the Clang equivalent, and I noticed that __builtin_assume(({ for(;;); 1; })) does emit the infinite loop. Clang will even emit some finite loops. This:

#include <limits.h>

__attribute__((__const__,__always_inline__))
inline static int is_pow2(unsigned j) {
    __auto_type i = j;

    for (int c = 0; c < CHAR_BIT * sizeof(i); ++c) {
        i = (i >> 1) | ((i & 1) << (CHAR_BIT * sizeof(i) - 1));
        if (i == 1)
            break;
    }

    return i == 1;
}

int foo(void) {
    extern unsigned bar(void);
    __auto_type x = bar();

    __builtin_assume(is_pow2(x));
    
    return __builtin_popcount(x);
}

will emit code for the loop in the body of is_pow2. (Not that this is a very idiomatic way to write it.)

And why would anyone want to write such a thing? Well, some people may – inadvertently or deliberately – hint conditions with loops (like ‘this graph contains no cycles’) and/or side effects (like transient memory allocations), and such hints may end up pessimising generated code.

Probably worth mentioning as well: The Regehr™ performed an experiment turning assertions in Clang into compiler hints, and the results were rather underwhelming. There are many caveats here, of course, but it does suggest that compiler hints like these need not gain us very much.

(How do I disable Rust syntax highlighting in a code block?)

3 Likes

Assume is backend-specific and very hard to use correctly, so it would be good to support this with existing success stories of using the intrinsic.

It's known from libcore that assume can break optimizations and pessimize code there too, so it's not always clear when using it is a benefit.

2 Likes

Hmm…I wasn't aware of existing issues with LLVM that mean it might make things worse. That certainly isn't expected.

With regard to assume(loop {}) [or more generally assume(!)], my intuition leans towards it doing absolutely nothing (just running the inner value, no hints emitted).

1 Like

Personally, I'd expect assume, just like any other function, to evaluate the argument before entering the function. The semantics of the function would then be "if the argument is false, UB; if the argument is true, do nothing".

So:

  • For assume(loop {}), we never finish evaluating the argument, so we never enter the function.
  • For something like assume({x = 2; true}), x is guaranteed to be set to 2 afterwards. The compiler can't elide side effects just because the expression happens to be passed to assume.
  • However, for something like println!("hello"); assume(false);, the compiler could remove the println, simply because UB can always travel back in time. In other words, a program's runtime behavior becomes unpredictable as soon as it is inevitable that it will execute UB at some point in the future. This isn't specific to assume.
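The second bullet can be sketched with an ordinary function. The `assume` below is a local stand-in for the proposed API (an assumption, not something in std), and `side_effect_survives` is a hypothetical name. Under the "evaluate the argument first" reading, the assignment inside the argument block is observable afterwards.

```rust
/// Local stand-in for the proposed hint (an assumption, not a real std API).
///
/// # Safety
/// `cond` must be true; passing `false` is undefined behavior.
unsafe fn assume(cond: bool) {
    if !cond {
        // SAFETY: the caller promises `cond` holds.
        unsafe { core::hint::unreachable_unchecked() }
    }
}

/// Returns the value of `x` before and after the hint.
pub fn side_effect_survives() -> (i32, i32) {
    let mut x = 1;
    let before = x;
    // The block is a normal argument expression: it runs to completion,
    // and only its value (`true`) is passed to `assume`.
    // SAFETY: the block always evaluates to `true`.
    unsafe { assume({ x = 2; true }) };
    (before, x)
}
```

With function semantics the pair is `(1, 2)`: the write to `x` cannot be dropped just because it happens inside the argument.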
8 Likes

Would it be reasonable to make assume a macro that expands to nothing, so that its argument is never evaluated? Actually, to keep the argument type-checked, it might be best to expand it to if false && $expr { ::core::hint::unreachable_unchecked() } or something like that. But then you need some kind of internal rustc attribute on $expr so the optimizer can consume the expression and use it when making deductions or proofs, without doing what LLVM does (namely, the compiler wouldn't have to "preserve the instructions only used to form the intrinsic’s input argument").

Of course, this would only be helpful if rustc had its own optimization passes that could consume this...
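A rough sketch of the macro expansion described above, with the hypothetical name `assume_unevaluated!`. It shows only the "type-checked but never evaluated" half of the idea; the internal attribute that would let an optimizer consume the expression does not exist, so as written this gives the compiler no information at all.

```rust
use std::cell::Cell;

/// Hypothetical macro: the condition is type-checked as `bool`,
/// but `false && ...` short-circuits, so it is never run.
macro_rules! assume_unevaluated {
    ($cond:expr) => {
        if false && $cond {
            // Never reached: the left operand of `&&` is always `false`.
            unsafe { ::core::hint::unreachable_unchecked() }
        }
    };
}

/// Counts how many times the condition was evaluated by the macro.
pub fn calls_after_hint() -> u32 {
    let calls = Cell::new(0u32);
    // A condition with an observable side effect.
    let check = || {
        calls.set(calls.get() + 1);
        true
    };
    assume_unevaluated!(check());
    calls.get()
}
```

Here `calls_after_hint()` returns 0, since `check` is never invoked. Passing a non-`bool` expression would still be a compile error, which is the point of expanding to something rather than nothing.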

Re being backend-specific, I don't think it is a large concern since ignoring assume is always a correct (although not optimal) implementation.

2 Likes

This is a coherent stance – but it does mean that some uses of assume may end up pessimising the code. In the case of the most naïve, non-optimising compiler (where assume is a no-op), all of them would, as even assume(2 + 2 == 4) would have to evaluate 2 + 2 == 4 only to discard the result afterwards.
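To illustrate the cost being described, here is a small sketch in which the hint is a plain no-op function, as it would be on a compiler that ignores it. The names `assume_noop`, `is_sorted_counting` and `comparisons_spent` are invented for the example. The predicate's work still happens before the call, even though the result is thrown away.

```rust
/// Models a backend that ignores the hint entirely (hypothetical).
fn assume_noop(_cond: bool) {}

/// Linear-time check that also records how many comparisons it performed.
fn is_sorted_counting(v: &[u32], comparisons: &mut usize) -> bool {
    v.windows(2).all(|w| {
        *comparisons += 1;
        w[0] <= w[1]
    })
}

/// Returns the number of comparisons spent just to feed the ignored hint.
pub fn comparisons_spent(v: &[u32]) -> usize {
    let mut comparisons = 0;
    // The argument is evaluated in full before the no-op call.
    assume_noop(is_sorted_counting(v, &mut comparisons));
    comparisons
}
```

For a sorted five-element slice this spends four comparisons on a hint that does nothing. An optimizing compiler can often remove a side-effect-free predicate, but it is not required to.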

Clearly this is less than satisfactory; to address this, Clang prevents using expressions with side-effects in __builtin_assume (discarding them with a warning) and discards all expressions unconditionally at -O0. As we can see though, this isn’t perfect.

Ideally, I think, assume should tell the compiler to evaluate the expression speculatively, assume it returns true (implying in particular that it should converge/terminate), and mine that assumption for optimisations. That would allow expressing ‘non-local’ properties like connectivity with impunity, and it would mean that assume(loop {}) would be UB while assume({x = 2; true}) a no-op. But if that is to remain within a realm of possibility, it follows that assume cannot be a regular function in the first place.

1 Like

IMO if assume always discards any potential side effects, it should be a macro. Otherwise we'd be introducing a second type of function, so to speak.

4 Likes

I'll try to provide sources. The intrinsic has the same "disclaimer" in the doc that it has had since being added in 2014: assume in std::intrinsics - Rust

Others are on board with the knowledge that it is hard to know when it's appropriate to use: https://github.com/rust-lang/rust/pull/54995#issuecomment-429071477 and LLVM Language Reference Manual — LLVM 18.0.0git documentation

With that said, LLVM is always developing, right? And optimizing compilers are mercurial: small changes in input can sometimes have big effects on the output. So new practical experience is always needed.

Yes, but I'll just make a counterargument - if it's possible that the assume either makes or breaks an optimization, then you might want to use it completely differently depending on the backend. Of course, we don't need new intrinsics for that (llvm_assume, cranelift_assume, or even rustc_assume?) unless that's the only way to conditionally compile code depending on the backend.

Adding an unstable core::hint::assume for improved experimentation sounds like a good idea, but it should IMO only happen if we underline that stabilization is unlikely unless the feature can prove itself in real code.

1 Like
