Idea: a path toward reducing the need for `unsafe` by making invariant machine checkable

Disclaimer: this post should be seen as the starter for a discussion and not as a direct proposition to change the Rust language.

The main idea is to move safety invariant from "SAFETY" comments, into machine-enforceable checks. I think that all the change I’m proposing are backward-compatible, but I do not expect them to be completely trivial to implement.

TL; DR:

  1. Add const_assert! that has some static analysis capacity to check boolean expression at compile time. Not validating the condition would stop the compilation with an error.

  2. Add unsafe assert_unchecked! macro that gives hint to the compiler (the programmer is responsible to validate that the condition is effectively true). It’s especially useful to help the compiler validating a const_assert when its static analysis isn’t good enough.

  3. Add a way to express function pre-condition. The desugaring uses const_assert! at call site, and assert_unchecked in the body of function with pre-condition.

  4. Optionally, add a way to express predicates that can’t be express using Rust type system (like a non-dangling pointer), and make them usable by assert_unchecked! and const_assert!

All those combined (and only the 3 first points are really needed) should drastically reduce the use of the unsafe keyword.

1. Adding const_assert!

The first change is to introduce a const_assert! macro (similar to static_assert in C++), that is evaluated at compile time. I never used this crate, but I assume that it would be similar to static_assertions. Furthermore, const_assert! should be able to check such kind of code:

let condition: bool = get_bool();

// this line would not compile since `condition` is not guaranteed to be true
// const_assert!(condition);

if !condition {
    return
}

// here condition is guaranteed to be `true` because `false` was diverging
const_assert!(condition); // must compile

Which means that this should transitively be valid:

let condition: bool = get_bool();

assert!(condition);

// here `condition` is guaranteed to be `true` otherwise `assert!` would have diverged
const_assert!(condition);

2. Add unsafe assert_unchecked!

The second change is to add an assert_unchecked! macro which is unsafe to call. I also never used that crate, but I assume it’s what assert_unchecked crate) provides. The programmer is responsible to validate that the condition is true (hence why it’s an unsafe macro). assert_unchecked! doesn’t generate any code, but the compile can rely on its condition being true, especially is const_assert!.

3. Add contracts to functions

The third change is to make it possible to express function pre-conditions:

fn foo(non_zero: i32)
where non_zero != 0
{ ... }
fn do_stuff_with_non_empty_vec(non_empty: &Vec<T>)
where !non_empty.empty()
{ ... }

Adding such where clause has two consequences:

  • At call site, calling such functions is equivalent to do a const_assert! of the pre-conditions before doing the actual call:
let some_vec = …;
do_stuff_with_non_empty_vec(&some_vec);

would be desugared into:

let some_vec = …;
{ // using a block, this way the desugaring also works in expressions
    const_assert!( !non_empty.empty() );
    do_stuff_with_non_empty_vec(&some_vec);
}

Here this would obviously not compile because the const_assert! condition can’t be validated at compile time. Therefore the user would have to add a diverging case, for example:

let some_vec = …;
if (!some_vec.empty()) {
    // this time, the call is valid and will compile successfully
    do_stuff_with_non_empty_vec(&some_vec);
}

Note: this is why we need a relatively powerful static analysis for const_assert

  • Furthermore, in the callee, since the condition is guaranteed to be checked by the caller, the optimizer can use that information, as-if the first line of the body was unsafe { assert_unchecked!(pre_condition) };

So

fn do_stuff_with_non_empty_vec(non_empty: &Vec<T>)
where !non_empty.empty()
{ ... }

is desugared into:

fn do_stuff_with_non_empty_vec(non_empty: &Vec<T>)
{
    assert_unchecked!(!non_empty.empty());
    …
}

Note: even if this is not strictly necessary, this should help calling function with the same pre-conditions (or a subset of them).


Here we should already have a relatively useful feature. It would already be possible to expose invariant of some unsafe functions, and make them safe (since all of there invariant are guaranteed to be checked by the caller).

For example Vec::get_unchecked() and Vec::get() ultimately call SliceIndex::get_unchecked()](index.rs - source) and SliceIndex::get(). But let assume that the internal machinery was a bit simpler:

// before:
impl Vec {
    // SAFETY: `i` must be < self.size()
    unsafe fn get_unchecked(&self, i: Index) -> &Self::Output;

    fn get(&self, i: Index) -> Option<&Self::Output> {
        if i < self.size() {
            // SAFETY: all invariant of `get_unchecked` are validated: i < self.size()
            Some { unsafe { self.get_unchecked(i) } }
        } else {
            None
        }
    }
}

With our change, it would become possible to change those to be:

// after
impl Vec {
    // no longer unsafe
    fn get_unchecked(&self, i: Index) -> &Self::Output
    where i < self.size();

    fn get(&self, i: Index) -> Option<&Self::Output> {
        if i < self.size() {
            // the function isn’t unsafe anymore, so no unsafe block needed
            Some {
                self.get_unchecked(i)
                // the above call is desugared as:
                // { const_assert!(i < self.size); self.get_unchecked(i) }
                // so the invariant are correctly checked
            }
        } else {
            None
        }
    }
}

4. Adding predicates

The last change would be to add predicates. They are pure function, always unsafe, and always return true, be can only be called in const context. They don’t have a body. Since they are pure function an unsafe { assert_unchecked!(some_predicate(some, arg, s)) } followed by a const_assert!(some_predicate(some, arg, s)) is guaranteed to succeed (as long as the arguments where not used by a &mut call, otherwise it’s a compile error).

predicate fn doesNotDangle(p: *T);
predicate fn noReferencesAreActive(p: *T);

Those predicate would only be useful when expressing contract, and in assert_unchecked!:

// no need to make it unsafe, the predicate already express the pre-conditions
fn get_mut(p: *T) -> &mut T
where doesNotDangle(p) && noReferencesAreActive(p)
{
    unsafe {*p} // note: the * operator could be made safe if it had the same contact
}

And when calling it:

let p = …;

// this would fail to compile since it would call predicate at runtime
// assert!(doesNotDangle(p) && noReferenceAreActive(p))

// that’s the only way to validate a predicate
unsafe { assert_unchecked!(doesNotDangle(p) && noReferenceAreActive(p)) }

// We can now is this function safely without unsafe block
let ref: &mut T = get_mut(p);

If I’m not mistaking, calling predicate functions in assert_unchecked! would be the last place where we would use unsafe blocks (except maybe extern C functions).


I did gloss over many points, but you should get the general idea. It would also be possible to make it more practical by being able to express post-condition in order to be able to chain the output of a function into another that has preconditions more easily. I also did not discuss Fn traits and fn pointers, and how contract should be represented in the type system, etc… I also did not discussed about which limitation we intentionally want to add to the static analysis capacity of const_assert!. I don’t see anything fundamentally blocking in those points, so I wanted to open the discussion about the merit of such idea without immediately going into the details. It’s because I may have totally missed something that would render the whole thing infeasible, and I don’t want anyone to lose time on details if the main idea doesn’t make sense.

2 Likes

Some potentially relevant links:

lang-team/2022-11-25-contracts.md at master · rust-lang/lang-team · GitHub.

5 Likes

This affects type signatures, yes? So what happens with iter.for_each(foo)? Where does the const_assert! get inserted? How would I enforce it? By adding a .filter(|x| x != 0)? How do I know that such a condition wasn't already on the iter being passed to me?

This feels like "flow-sensitive typing" which gets brought up from time to time.

3 Likes

That’s a good point.

I realize that directly calling a function precondition should be replaced by the block { const_assert!(preconditions…); function(args…) } and that passing a function with precondition should be replaced by a lambda |args…| { const_assert!(preconditions…); function(args…) }.

This means that for_each, when calling the compiler-generated lambda wrapping the call to foo need to propagate the information that the value returned by filter is != 0. I am not very familiar with flow typing, and I think that it’s indeed required for this to work ergonomically.

In a first implementation (without flow typing), the user could simply be required to wrap the call to foo in a lambda with the appropriate assertions:

iter.filter(|x| x!= 0).for_each(|x| {
    assert(x != 0); // or unsafe { assert_unchecked!(x != 0); }
    foo(x); // the compiler with replace this with
            // { const_assert!(x != 0); foo(x) };
            // which is trivially valid because of the explicit `assert!`
});

A few questions to consider:


How to handle `str::from_utf8_unchecked`, `Arc::from_raw`, `MaybeUninit::assume_init`, `Pin::new_unchecked`, etc? Their predicates are actually quite complicated. How do you plan to handle them?
It is feasible to have stability guarantees? If new compiler version might fail to prove a const_assert, it would be bad. So we would need a way to specify exactly what analysis does the compiler perform. AFAIK, it is a very hard problem with no good solutions (see how optimizing compilers break optimizations).
Consider a simplification. Support only:
assert!(p);
/* some code: g */
const_assert!(p);

What limitations can be put on predicate p and interleaving code g to make this code sound? Obviously code g should not use any variable present in p, but this is not enough.

Note, that all functions here should fail to compile:

Test cases
fn cell(x: &Cell<u8>, y: &Cell<u8>) {
  assert!(x.get() > 0);
  y.set(0);
  const_assert!(x.get() > 0):
}

fn pointer(x: *mut u8, y: *mut u8) {
  assert!(unsafe {*x} > 0);
  *y -= 1;
  const_assert!(unsafe {*x} > 0):
}

fn pointer_read(x: *mut String, y: *mut String) {
  assert!(unsafe {*x}.len() > 0);
  let a = *y;
  const_assert!(unsafe {*x}.len() > 0);
}

fn dirty() {
  assert!(rng() != 0);
  const_assert!(rng() != 0);
}

fn mutable(v: &mut Vec<u8>) {
  assert!(v.pop().is_some());
  const_assert!(v.pop().is_some());
}

P.S. having said that, I am really fascinated by the idea

This requires adding a new type of function. There's no guarantee for an arbitrary fn empty(&self) -> bool that if it.empty() { assert!(it.empty()) } will not panic. In the near future, const fn probably won't mean that either (or separately from that, const refs to cell also removes the purity implication from const fn).

You need some sort of annotation for fn which are guaranteed to be pure and transitively not cause any mutations, whether shared or unshared, global or referenced, to be able to at a language level assume that an earlier call to it will produce the same value as a latter call to it.

Not only that, but every intermediate use also needs to be annotated as pure or somehow annotated as not impacting the checked predicate.

Basically, without a full annotation of precondition effects, the benefit of tracking trivial preconditions for trivial scope is quite small. Not zero -- it's essentially what // SAFETY: comments do by convention -- but small enough that justifying a large addition to the language for it is very difficult.

This is quite different to static_assertC++ or current library const_assert!. Traditionally, const_assert! has been a const item essentially synonymous with const _: () = assert!($expr);. Being an item, the assertion is order-independent and as such can't refer to any local bindings.

To be fair, flow-dependent value refinement is the only way that const_assert! could potentially be allowed to refer to local let bindings, but it's still a large divergence from any other thing which we call const. (Either a const item/generic evaluated entirely at compile time and only referring to other const items, or if not in a const evaluation context, just a plain runtime function with no further special semantics.)

(I think it's reasonably unlikely std will get a const_assert!; rather, the std solution will instead be just to write const { assert!($expr) }, which makes the constant evaluation context obvious. The nonconst formatting machinery makes doing so difficult in the short term, though.)

1 Like

args… could require a move on the closure to work here. Also, asserting on the arguments could cause them to be moved which would make them unavailable for the call itself.

That’s exactly what the 4th point is about. Predicate are just a machine checkable SAFETY comment since they can only be used with assert_unchecked!.

I glossed over, but yes it’s really important to formalize what the compiler should be able to check and intentionally limit it to what we are confident that we can guaranty to check in a future version of the compiler. Having a compiler a bit too dumb can be compensated with a combination of predicate, and explicit call to assert! and uncked_assert!. And assert! may not cost you anything, because even if the compiler doesn’t guaranty that it can optimize it away (otherwise const_assert! would be valid), it is still allowed to do it.

That’s true, and what I totally failed to consider. I tried to find a solution but for the moment I have nothing elegant unfortunately. If I (or anyone else) fails to do so, as you said, my proposition requires too much changes for too few gains.

args… would have exactly the same kind of capture than function (so either move, & or &mut). I fail to see why replacing the direct reference to a function by a lambda would require a move` where the function didn’t need it. Do you have an example in mind?

This is at best unclear. The compiler isn't allowed to remove assert! checks unless it proves that it can never fire.

This is what I was trying to say :wink:

1 Like

Hmm, probably not now that I dwell on it more.