Pure annotation for stateless functions

I feel like this borders on what an unsafe attribute should be. The programmer 'knows' that the panic path is dead code. The compiler optimization may prove it, but just like any other optimization that is irrelevant to the typing of the program, and it would be really surprising to me if this property, out of all of them, got feedback from the optimizer stage. Really, we must prove it to the type system, and this operation is an axiomatic assertion about possible runtime behaviors, i.e. unsafe { … }.

Incidentally, that is quite a prospect in and of itself. For a pure function, calls to impure functions are unsafe. That is, not all invocations of impure functions are actually impure, yet the pure caller must prove that their particular invocations are in fact pure. Maybe this helps with rolling such a feature out, due to having an escape hatch? Instead of being fully blocked on upstream decisions about purity, we are merely inconvenienced.

#[pure]
fn this_is_pure(a: u32, b: u32) -> u32 {
    let div = 33 - b.leading_zeros();
    // Uh oh, a / div is not statically pure!

    // Safety: panic path is not reachable, div > 0
    unsafe { a / div }
}

Of course, addition panicking in some profiles would be quite the pitfall if this pattern were used in practice.
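As a concrete instance of that pitfall (a minimal sketch, not code from this thread): whether a plain + can panic depends on the overflow-checks profile setting, so a "the panic path is dead" assertion written with release semantics in mind can be wrong in a debug build.

```rust
// With `overflow-checks = true` (the default in the dev profile, and
// configurable per-profile in Cargo.toml), this addition panics on
// overflow; with checks disabled it wraps instead. A manual no-panic
// assertion would have to hold under both configurations.
fn add(a: u32, b: u32) -> u32 {
    a + b
}

fn main() {
    println!("{}", add(2, 3));
}
```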

But well, this strays a bit far from the point. Really, removing a panic needs a soundness assertion, as reaching the hook is straight-up UB. Reordering the evaluations of safe functions shouldn't need one in the common case, except that it could be a precondition for a related unsafe block, where it remains to figure out how to define the scope of reordering such that other crates' soundness properties cannot be violated.

Yes, if we have an explicit side effect between the calls, then we can not reorder the functions. I was talking only about a sequence of function calls which are "pure, but may contain panics". In other words, I was implying the following modification of your snippet:

{
  let y = g();
  let x = f();
  (x, y)
}

IIUC, currently the compiler cannot reorder f and g, which are potentially panicking but do not contain any other side effects. Here is a somewhat practical example which demonstrates horrible codegen caused (IIUC) by the inability to reorder panics (uncomment the asserts to see the difference):
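The linked example isn't reproduced here; as a hedged sketch of the general pattern being described (hypothetical sum4 function, not the original code), consider a function that indexes a slice several times:

```rust
// Each indexing operation below has its own bounds check and its own
// panic site (each reports a different index), so the compiler cannot
// merge them into a single up-front check without reordering panics.
fn sum4(a: &[u32]) -> u32 {
    // Uncommenting this assert gives the optimizer a single dominating
    // check, after which all four bounds checks can be elided:
    // assert!(a.len() >= 4);
    a[0] + a[1] + a[2] + a[3]
}

fn main() {
    println!("{}", sum4(&[1, 2, 3, 4]));
}
```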

Unfortunately, I don't think it can be done in a practical fashion. We would effectively have to commit to code a formally verifiable proof that all panics can be removed for the given function. Doing it manually is extremely annoying and time-consuming. It's simply impractical outside of very limited high-assurance areas. Maybe we could ask the compiler to generate the proof for us, save it into a separate file, and reference it (re-generating it on each code change), but developing such tooling would require a lot of work, with a plethora of open questions.

2 Likes

Yeah, that sounds right. I think ideally the language would let the user express that they don't care about the order here -- it seems awkward if the user first writes the operations in a particular order, and then the compiler reorders them; it makes a lot more sense for the user to say "just do these two in any order", and e.g. the type system can ensure that this is sound and the result is the same (aside from the exact panic location).

4 Likes

True, I considered that implied by deterministic. While there are syscalls that might be deterministic (uname?), there is no reasonable way for rustc to track that.

One interesting effect of this though is that non-escaping heap allocations could not be handled. Logically it should be fine to optimise away the second call if f allocates a Vec internally and then throws it away.

This I don't follow? If e.g. f returns random numbers it will change the observed result? (Random numbers are possible without syscalls or global state, just use inline asm with rdrand).

It seems to me that getting determinism is key here.

If you have two Turing machines, f and g, one that halts for input X and the other that doesn't, you can prove they aren't the same Turing machine.

Even uname results can change (AFAIK) for a running process via:

  • CRIU save/restore on different kernels
  • kexec in-place kernel upgrades
  • hostnamectl or other ways to set the hostname, since the hostname appears in uname output as well

Non-determinism and randomness are very different beasts. Non-determinism has no notion of "distribution"; there is just a set of possible values. The compiler is allowed to shrink that set, but not grow it.

Let's say f: fn() -> bool returns a non-deterministically chosen bool (here is an example of such a function), and the program does:

let a = f();
let b = f();

Now we change this to

let a = f();
let b = a;

This is trivially correct. All we have to show is that every possible behavior of the new program is also a possible behavior of the original program. It is certainly possible for f to return the same value twice in a row, therefore we are done.
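(The function linked above isn't shown here; as a hypothetical sketch of what such an internally non-deterministic bool could look like, one can derive it from the allocator's choice of address. Which bit of the address to use is arbitrary, and a concrete allocator may in practice always pick the same one, but the language semantics do not pin the choice down:)

```rust
// Sketch: the language does not specify which address an allocation
// receives, so a bit of the address acts as a non-deterministically
// chosen bool, even though any concrete allocator behaves predictably.
fn f() -> bool {
    let b = Box::new(0u8);
    let addr = &*b as *const u8 as usize;
    (addr >> 4) & 1 == 0
}

fn main() {
    let a = f();
    let b = f();
    println!("{a} {b}");
}
```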

A realistic example of this is a program that works on some x: &AtomicU32 and does:

let a = x.load(o);
let b = x.load(o);

It is entirely legal for the compiler to make the second line let b = a; (no matter the o). I don't know if it will actually do that, LLVM is sometimes quite conservative when it comes to atomic accesses, but it would be legal.

You seem to be using non-determinism as a term for "behavior controlled by the environment"; this is indeed also often called non-determinism, but it is different from the way I used the term (and my usage is equally common). If you want to be precise you can talk about internal and external non-determinism.

Internal non-determinism is what is often modeled with a non-determinism monad; primary examples are concurrency and memory allocation (as a language primitive) picking an address to place the allocation. The "non-deterministic Turing machine" is also a case of internal non-determinism (but that one is angelic, the one I describe is demonic... I could give an entire lecture on this but sadly I don't have the time :wink: ).

External non-determinism is really just the program interacting with the outside world, and I find it quite confusing to conflate the two by giving them the same name -- which is why I would always explicitly refer to an environment or "syscalls" or something like that when talking about this kind of "non-determinism", and reserve "non-determinism" for situations where the choice is part of the language semantics, not part of the environment.

2 Likes

Ah okay, yes, I meant what you call external non-determinism. I had never heard about this separation before.

I think it's possible to make a UB-free annotation for "pretend this function is pure".

I'm imagining a #[pretend_pure] attribute (idk what a good name is, feel free to bikeshed) applied to a function that comes with the following changes:

  • If the function has any side effects, then those side effects will be observed the first time the function is called [ETA: with a specific set of arguments] and may or may not be observed any subsequent times. And on those subsequent times, the side effects may be the side effects of a prior call [ETA: with the same arguments] instead.
  • If the function has a non-deterministic return value (counting either diverging or unwinding as a "return value"), then it may return the value returned by any previous call with the same inputs instead.
    • An old value will only be memcpyed to the new value if the return type is Copy. For non-Copy values, this optimization will be of limited use (unless the compiler can somehow prove a memcpy is safe in this instance, but I doubt that'll happen often).
  • If the function is called multiple times without data dependencies from the return value of one call to the input(s) of another, then the compiler can freely re-order the calls.

You could use it to annotate a function like:

#[pretend_pure] // safe attribute because I think it can't cause UB on its own
fn foo(a: u32) -> u32 {
    log::info!("Calling `foo({a})`");
    assert!(some_sanity_check(a));
    a.wrapping_add(1)
}

let a = foo(5);
let b = foo(5);

With this attribute, the compiler is allowed to optimize to let b = a;, so you might get 1 or 2 logs, depending on optimizations. And it's also allowed to reorder and optimize to let a = b;, so if the assertion fails then you might see either call in the stack trace. And I don't think there's any way to cause memory unsafety by applying this attribute to otherwise safe code.

I have code in a number of places with a log message and/or an assertion that I expect to always hold (but want to keep in case of logic errors), where I'd like the compiler to apply those optimizations if it can.

And this also works for the "internal non-determinism" some of y'all were talking about above, e.g. you can also apply it to fn foo() -> usize { Box::into_raw(Box::new(0u8)) as usize } to get the compiler to treat this internally-non-deterministic function as pure.

2 Likes

Did you mean "the first time the function is called with a specific set of arguments" here?

1 Like

Yes

If I understand Ralf's example right, the core problem is: purity does not only affect common sub-expression elimination (i.e. a = foo(); b = foo() ==> a = foo(); b = a;, which seems to be OK if we mark a diverging function as pure); it also affects statement reordering. Let's make a concrete example:

fn check_we_are_good() {
    if !we_are_good() {
        panic!("We are not good");
    }
}

fn load_secret_key() -> SecretKey {
    // Get secret key from somewhere in the memory
}

check_we_are_good();
let secret_key = load_secret_key();

load_secret_key is a pure function which does not have any side effects: it simply loads the secret key from somewhere in memory, or from some specially-designed register via a specially-designed mechanism.

If we mark check_we_are_good as pure, then the compiler may reorder the statements; then even if we are not good, the secret key is still loaded, which leaves a small time window during which attackers could retrieve the secret key from the stack.

If I understand it right, then we might need to design purity annotations that only affect sub-expression elimination, without allowing statement reordering?

On top of that, is it specific values or literal arguments? For example, is let x = 4; let a = foo(4); let b = foo(x); eligible for this? How about let (x, y) = (4, 4); let a = foo(x); let b = foo(y);? Does it change if foo takes a T or &T?

Stack contents aren't considered observable.

On the other hand, if two indexing operations are reordered, so that the panic message mentions an index that wasn't actually accessed… well, what if the index depends on a secret value? If the reordering hadn't happened, then the secret index wouldn't have been printed out, but now it would be.

This is the kind of thing that frustrates me about formal semantics. 'Obviously' it's your own fault if you write code like:

let a = array[out_of_bounds_index];
let b = array[secret_index_which_is_also_oob];

It's pathological for the only thing protecting a secret to be an out-of-bounds panic. But there's no formal definition of "pathological". Formality only sees a secret that was leaked when it wasn't supposed to be.

But hold on, let me defend formality. Once you're executing dead code, things can quickly get absurd. What if the second index is actually the value you retrieved using the first index, and by skipping the panic from the first indexing operation, you're indexing an array out of bounds? Then the panic could disclose unrelated memory contents!

// Assume `array` doesn't contain secret data,
// but there's secret data following it in memory.
let a = array[out_of_bounds_index];
let b = array[a];

Oh, but if the second index depends on the first load, it couldn't be reordered before it. But with aggressive value speculation and a specific memory layout, it actually could! Except the only way to get that kind of speculation is with code that's even more pathological, by far. You have to really be trying to leak the secret.

But safe code is supposed to be unable to violate memory safety, even if it's pathological or outright malicious. That's Rust's goal. In practice, there are plenty of known and unknown safety holes, but they are considered bugs. Thus, allowing reordering of indexing panics would create a bug.

In other words, the formality was telling us something.

…On the other hand, 99% of Rust users don't care about defending against malicious safe code. To be sure, they do still benefit from other upsides of formal reasoning, like increased certainty and understandability. But in my opinion, the missed optimization potential is in this context a downside of formal reasoning.

2 Likes

On another topic:

It appears that LLVM's existing optimizations around pure functions would not help with @newpavlov's pathological bounds-checking codegen example.

In the following link, I recreate their example in C, but I replace the indexing operation with a function which is marked __attribute__((const)).

__attribute__((const)) is documented here; it's a stronger version of __attribute__((pure)), and LLVM translates it to memory(none) nounwind willreturn. My indexing function can abort, so marking it __attribute__((const)) is actually undefined behavior. This is intended as an upper bound for what LLVM could do with an actually-sound pure annotation.

Unfortunately, __attribute__((const)) doesn't improve the assembly output at all.

This is partly because the LLVM attributes are lost after inlining. But there seems to be a more general missing optimization. In this version, I've simplified things as much as possible. For one thing I manually inlined get and unrolled the loop. But more importantly, all the 'panics' are replaced by jumps to a single location, without printing the index that failed. Therefore all the failure cases are indistinguishable, therefore in theory there's nothing preventing LLVM from merging all the bounds checks together. Still doesn't help.

I imagine this could be improved on LLVM's end.