Pre-RFC: Unsafe reasons

Maybe this can be achieved using a newly introduced type of assertions (we'll call them unsafe_assert for now). Then properly_aligned, readable, etc can be new functions which only attach compiler-accessible properties to variables. Then this:

unsafe
    [properly_aligned(x), "this came from a reference so it must be"]    
    [properly_aligned(y), "this came from a reference so it must be"]    
    [...]
{
    let a = ptr::read(x);
    let b = ptr::read(y);
    ptr::write(x, b);
    ptr::write(y, a);
}

Becomes this:

unsafe
{
    // This came from a reference so it must be
    unsafe_assert!(properly_aligned(x));
    ...
    let a = ptr::read(x);
    let b = ptr::read(y);
    ptr::write(x, b);
    ptr::write(y, a);
}

This approach also allows us to encode the non-operlapping requirement of copy_nonoverlapping as:

unsafe_assert!(x + count <= y || y + count <= x);
1 Like

I think it makes all the difference.

I don't like unsafe[aligned, writable] { ... } at all, because well what's aligned and writable? It fails the test I want from an improvement here: I had read(p) before, add read(other), and it still compiles despite me not proving the new requirement.

That's why the variable name is there, and why it makes it ok to have it at a larger scope.

I intentionally used a different syntax in that post than in the linked on because I don't know how best to write the syntax either.

Another version, for example, is defining the names (aligned, readable, etc) as a new kind of item so they can be namespaced, used, rustdoc'd, etc. That might be a really nice way to codify things like the https://doc.rust-lang.org/std/ptr/index.html#safety section -- the descriptions would be on the safety items, and the methods would crosslink back to those precise safety items.

(Or for extra credit, even defining rules on those safety items like "references all meet aligned(p)", and letting those inherit across various things in helpful ways.)

Note that I was intentionally restricting it to multi-parameter names only, not allowing full expressions where someone would expect the compiler to actually understand the expressions.

That way it like easy-mode prolog, something we can definitely implement well.

For full expression contracts, I'd lean more on things like what kani are looking at doing.

1 Like

But it still has the big drawback that adding a reason to an unsafe function becomes a breaking change.

Isn't that the whole point of the idea? If a function becomes unsafe for another reason, the callers may be broken. It's like adding a second unsafe marker -- it should be a breaking change for the same reason adding unsafe in the first place is a breaking change.

1 Like

I'm talking about a function that was already unsafe but without reasons (probably because it is older than this proposal). But it's also true for new functions.

It has always been a breaking change to add new requirements to an unsafe API; the compiler would just be able to tell you about it with such annotations.

1 Like

Not to add new requirements, to add reasons for existing requirements, or even new unsafe functions with reasons.

An old-style unsafe function without specified reasons should be treated as being unsafe for all possible reasons. Specifying a list of reasons is then a reduction of the requirements, and hence not a breaking change. Similarly, an unsafe block without specifying reasons is equivalent to listing all possible reasons, an unsafe block with specific reasons is weaker.

It is a breaking change because code will stop compiling. Consider:

unsafe(reason) {
    unsafe_function_with_reasons();
    unsafe_function_without_reasons_yet();
}

If we add reasons to unsafe_function_without_reasons_yet(), the code will stop compiling, due to the new reasons being not included in the unsafe block's list.

This is true only if you assume that that snippet of code complies. But the unsafe block only provides one reason and then tries to run an unsafe function that requires all possible reasons, so it should be an error.

1 Like

Sure, semver-checks could now diagnose the issue. But the semantic break has always been there. It's analogous to how a C++ API dealing with pointers changing lifetime behaviors (caller-delete vs. callee-delete) would still compile because signatures don't change, but be a breaking change for anything that worked as intended before (maybe a leak, maybe a double-free depending on the direction of the change).

No, the code might have perfectly fine and considered all documented preconditions of unsafe_function_without_reasons_yet(), it just not listed them in the reasons list (because they weren't listed in the function's declaration).

An unsafe function without reason list does not have all possible reasons, it might have a very specific and narrow set of preconditions, documented in a /// # Safety doc comment. It's just that the compiler cannot reason about this.

Anyway it is possible to say that it is an error to call a function without reasons in an unsafe block with reasons, but this means that you cannot do that if you have e.g. two dependencies that only one of them is with MSRV that allows it to adopt reasons, and also, for some functions reasons won't really help (e.g. common in FFI, where there is plenty of unsafety and reasons will just obscure the call) - and this is another situation where with this change you cannot ever put them in the same unsafe block.

This shouldn't compile because reason is not sufficient for the second call. Instead, this code should be written:

unsafe(reason) {
    unsafe_function_with_reasons();
}
unsafe {
    unsafe_function_without_reasons_yet();
}

So make a way around that. For example, allow unsafe [reason1, ..] { ... } to also have other reasons that weren't mentioned (as a transition strategy, you could change all unsafe{} to unsafe[..]{}) and require the .. for old-style unsafe functions that don't have specific reasons yet.

Or just don't make it a hard error in the first place -- say that the unsafe block requirement is the hard error, and that missing reasons is just a warning, which people can upgrade to deny if they want but it still won't break in dependencies because cap-lints.

2 Likes

How would it work when the argument isn't a simple variable name, or when the parameter has destructuring inside it?

1 Like

At least at the function level, this reads a bit like refinement types. It composes really well with flow typing. A semi-concrete example I'd love to exist and work:


impl<T> [T] {
    pub unsafe fn get_unchecked<I>(&self, index: I) -> &I::Output
    where
        I: SliceIndex<Self>,
        unsafe index < self.len,
    {
        ...
    }
}
let v = [1, 2, 3];
// SAFETY: I'm pretty sure `x` is in bounds.
unsafe {
    let _ = v.get_unchecked(x);
}

assert!(y < v.len());
let _ = v.get_unchecked(y); // it's not unsafe!

That's a lot more than this simple proposal, and starts looking like formal provers (like Kani). Even simpler flow typing is a really big change to consider.

I think missig reasons should be a warning not an error (if the compiler does not verify the requirements). For me it seems that actually adding new invariants to an unsafe function is a rare case.

Two examples I came up with are:

For a function some invariants were missed in the reasons list. But the invariant was always present. This will inevitably lead to many functions having a:

"Note: we missed [...] But due to semver we did not add it to the reason".

If it is only a warning any user of the crate would be notified after cargo update.

The second case is adding new functions that interact badly with the first function.

i.e. "func_a must not be called in the scope of func_b"

Now adding func_b is a semver hazard as the interaction can't be added to the reasons in func_a without a breaking change.

(though this second examle also kind of shows the annoyance of negative reasoning)

Flow typing is not a prerequisite. I'm suggesting

  • Reasons should live in the where clause.
  • Future interactions with these other features should be considered. In particular reasons could exist within the outer scope. Syntax here is probably really important, as ideally if reasons exist all unsafe code should use them. This implies a certain amount of inference.
  • "Unsafe" functions which list out their requirements, and have all of those requirements asserted/met, are not unsafe to call.
  • A function which lists out reasons, which does not have its reasons met, could simply be unsafe to call (with a warning!). This allows upgrading existing functions to have reasons.
  • Add a warning for an unsafe function which does not have reasons.