Two Kinds of Invariants: Safety and Validity

When talking about the Rust type system in the context of unsafe code, the discussion often revolves around invariants: Properties that must always hold, because the language generally assumes that they do. In fact, an important part of the mission of the Unsafe Code Guidelines strike force is to deepen our understanding of what these invariants are.

However, in my view, there is also more than one invariant, matching the fact that there are (at least) two distinct parties relying on these invariants: The compiler, and (authors of) safely usable code. This came up often enough in recent discussions that I think it is worth writing it down properly once, so I can just link here in the future. :slight_smile:

https://www.ralfj.de/blog/2018/08/22/two-kinds-of-invariants.html

14 Likes

Thank you for writing this! It is extremely useful to have this written down somewhere and in such a clear way!


If you go back up and look at safety for types like fn(bool) -> i32, you can see that this is inherently not checkable – checking if f is safe at fn(bool) -> i32 requires solving the halting problem!

Can you elaborate on this? Can't we check that the address of this function is the address of a function? If the function happens to be Rust, can't we check its signature? Also, if the function happens to be a C FFI function, can't we find the extern declaration in Rust for that function and check its signature too?


found plenty of cases of unsafe code temporarily violating the safety invariant. [...] We could decide that all of that code is incorrect and should use raw pointers, but I think it is too late for that.

The Rust specification has always said that &T must refer to a valid T. If we change that, can we break valid safe Rust code? When given the choice, I'd rather break unsafe Rust code by enabling a check in Miri than potentially break safe Rust code that doesn't check whether &T refers to a valid T.

1 Like

Checking the signature is far from enough. Here's how you solve the halting problem, assuming you can check safety of fn types:

fn foo() {
    does_this_function_terminate();
    unsafe { *(0usize as *mut i32) = 0; } // null-pointer write: UB if ever reached
}

This function is safe to call if and only if does_this_function_terminate loops forever.


Safety still requires that &T refers to a safe T. That's what safe code can and will be able to assume.

The question is whether validity, which is weaker, requires &T to refer to a valid T. Weakening that cannot break any code, safe or otherwise.

That's the core point of the article: Safety is something that only has to hold in safe code. Validity has to always hold, and it can be weaker than safety.

Aha! I see where you are coming from. I think about fn types differently: calling a fn jumps program execution to the function, and that jump is the operation that has to be safe; whether the function then runs into undefined behavior is, in my mind, irrelevant and, as you mentioned, cannot be checked. However, I think that checking whether the jump is "correct" is worthwhile.

1 Like

Safe code, however, certainly needs to rely on more than just a jump. A safe function promises to its callers not to cause UB, and hence it must require the same from its callees.

True, and calling a function with a mismatched signature is certainly UB; there is no other option. However, just holding a function pointer with a bad signature that is never used -- should that already be UB? That is what validity of the fn type is about.
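A hedged sketch of the distinction (a made-up example; whether merely holding the pointer is UB is exactly the open question):

fn takes_bool(b: bool) -> i32 { b as i32 }

fn main() {
    // Transmute to a mismatched signature. We only ever *store* the pointer.
    let f: fn(u64) -> i32 =
        unsafe { std::mem::transmute(takes_bool as fn(bool) -> i32) };
    // Holding `f` is the validity question; actually calling `f(3)` through
    // the wrong signature would certainly be UB.
    let _ = f;
}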

1 Like

So just for clarity around what you say with partial initialization and &mut:

(Under this model,) It is valid to have an &mut T to a partially initialized T, but it is not safe to do so (as safe code could use the uninitialized part of the value).


I like that this distinction of safety/validity also creates an actual rationale for where to end an unsafe {} block. In a safe function, all values must invariantly be safe. In an unsafe function or unsafe block, values must be valid but may temporarily not be safe. By the time the unsafe block/fn ends, however, everything must be safe again.

So given the very silly type:

struct Silly(i32, i32);
impl Silly {
    fn assert_safety(&self) {
        assert_eq!(self.0, self.1);
    }
}

It would be ok to have a safe method that did unsafe { self.0 = 0; self.1 = 0; } but not ok to have a safe method that did unsafe { self.0 = 0; } unsafe { self.1 = 0; }.
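For concreteness, a sketch of the two variants (extending the Silly type above; note the compiler does not actually require these unsafe blocks and will lint against them -- they only mark the intended scope of the broken invariant):

impl Silly {
    // OK under this reading: the invariant (self.0 == self.1) is only
    // broken inside a single unsafe block and restored before it ends.
    fn zero_together(&mut self) {
        unsafe {
            self.0 = 0;
            self.1 = 0;
        }
    }

    // Not OK under this reading: between the two blocks, `self` sits
    // outside any unsafe block while its invariant is broken.
    fn zero_split(&mut self) {
        unsafe { self.0 = 0; }
        // the invariant is broken here, outside any unsafe block
        unsafe { self.1 = 0; }
    }
}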

1 Like

(Musing on the dereferenceable-N-times thing)

How does auto-deref interact with safety/validity? Would it just be a (safe) use (and thus be part of safety), or, since the compiler does it automatically, part of validity? Since Deref can be user-implemented, I suppose it can’t really be part of validity…


(On safety)

For certain built-in types (&_), the compiler can reason about safety in addition to just validity – that’s where dereferenceable N times comes in.

Would it make sense to introduce optimizations, in safe code only, that rely on safety of references in addition to validity? If I’m not mistaken, that would allow applying the de-ref optimization to safe functions but not unsafe functions.

It’s already explicitly said that unsafe code is not allowed to rely on implementation details of safe code (though you can rely on the behavior/accuracy of your own safe code). That would make me think that unsafe code requiring safe code to just pass a reference through to further unsafe code doesn’t need to be allowed.

Though I may have just thought of a counter-example. If the safe code just has a fully generic T, it can’t do anything with it, so safety doesn’t apply. So any safety-reliant optimization would at least require the type to be spelled out and not generic.

This is a bizarre conclusion; it has you putting unsafe blocks around purely safe code (something that the compiler even lints against!), and I don't think it is what the post says at all. I guess you drew this from the statement

Unsafe code only has to uphold safety invariants at the boundaries to safe code.

but the boundaries to safe code are not the end of an unsafe block or an unsafe fn. They are at the end of safe, public fns.

(How public? Visible outside the crate, or just the module? For "safety invariants," I believe the most sensible answer is "visible outside the crate.")

Is there some sort of optimization that could be made possible by making this UB? If not, it seems silly to make it UB...

Unsafe code (in the sense of "code that must be specially audited to ensure UB-freedom") extends beyond unsafe {} blocks. For example, taking a reference to a partially initialized value and passing it to a safe function can and should be OK if the safe function is in the same module (or even crate, though stylistically I'd prefer to avoid that) and is deliberately written to be able to handle this. And that's for functions -- for unsafe {} blocks, I see no reason at all to ascribe much meaning to the syntactic extent of the block (as long as the code compiles). In particular, this distinction:

just seems adversarial to programmers for no reason. Both should be OK. What matters is that the value with broken safety-invariant doesn't escape to code that assumes the safety-invariant – which is all unknown code, but not necessarily safe code in the same library.
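A hedged sketch of that pattern (hypothetical type and helper, not from any real library):

pub struct Counter {
    // Safety invariant (for outside code): `count == data.len()`.
    count: usize,
    data: Vec<u8>,
}

// A safe, module-private helper that is deliberately written to tolerate
// the broken invariant: it never looks at `count`.
fn grow(c: &mut Counter) {
    c.data.reserve(16);
}

impl Counter {
    pub fn push(&mut self, byte: u8) {
        self.count += 1;      // invariant briefly broken here
        grow(self);           // handing the value to *known* safe code: fine
        self.data.push(byte); // invariant restored
    }
}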

@RalfJung

I am glad to see that some of my previous ideas from comments on @nikomatsakis's last blog post were presented in this blog post. Specifically, the difference between ! and &!. I highlight the following:

while &! does not have a safe inhabitant, it is possible to have data that is valid for &!. After all, it does not have to point to a valid ! (which would be impossible).

By the same logic,

so validity for (T, U) requires that the first field is valid at T , and the second field is valid at U .

means (u32, !) is like !: it can never be valid.

Now the problem becomes: should we allow partial initialization of an invalid place?

I guess this also implies that partial initialization of an invalid place should never be allowed, even in unsafe code.

I fully admit that I may be going a bit aggressive here, and may be over-applying a stretched conclusion. But I hope with the other notes here I can argue why this wouldn't be a horrible place to end up.


The intent of the direction I pulled this in (the end of unsafe {} blocks) is one that I've seen in a few places: no one seems to be exactly certain where an unsafe block should end. The nomicon discusses this: changing safe code can break invariants that unsafe code relies on, because safety is a global concern.

I don't see any cases where this would require unsafe blocks around safe code. Safe code, by definition, cannot put values into an unsafe state. I guess, in the overly-aggressive-sanitizer specification, in between every line of code is an assertion that every place is valid, and in between every safe line of code is an assertion that every place is safe. (In my example I'm using field assignment as a proxy for writing out unsafe setters.)

This means that it's still fine for safe code to wrap calls to unsafe code that temporarily violates safety; it's just that the unsafe {} block delimits the scope in which values may not be in a safe state.

I don't think it's an unreasonable goal to say that structs should minimize the amount of time they're in a not-safe state; after all, that means you have to be really picky about panic points while a type is in an inconsistent state. It seems useful to be able to say "this isn't in an unsafe block, thus the values are in a safe state", which also implies panic safety.
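A sketch of the panic-point concern (a hypothetical type; the invariant is in the same spirit as Silly's above):

struct Pair(i32, i32); // safety invariant: both fields are equal

impl Pair {
    fn update(&mut self, f: impl Fn(i32) -> i32) {
        self.0 = f(self.0);
        // If `f` panics on the next line, an unwinding observer (a Drop
        // impl, or catch_unwind) can see the broken invariant.
        self.1 = f(self.1);
    }
}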

My interpretation boils down to whether this is the desired state of affairs. If we say that a reference is only safe if it points to a safe T, then it falls out of that definition that wherever safety holds, dereferenceable-N-times, and thus the de-ref optimization, is valid.

I don't think it's much of a burden to require arguments to a safe function to be in a safe state. And the de-ref optimization shows that it's useful in at least one case. I don't disagree that a single model covering safe and unsafe code would be simpler, though.

I know at least for my own code, however, that I have a strong preference for the former of the two, as the unsafe {} block should be the unit of unsafety where possible.


I realize that safe code putting a same-module type into an unsafe state complicates things a lot, and the musing I'm outputting is likely more restrictive than the eventual model wants to be. But I think it's an interesting way of considering it, at the very least.

I tend to make and use unsafe setters rather than doing direct field access for most of my types anyway, so sometimes I forget that field access is always safe :stuck_out_tongue:


The way I've always thought of partial initialization is that the compiler treats the fields as distinct variables until the full place is initialized. So to use the common example:

let tuple: (T, U);
// do some work
tuple.0 = make_t();
// do some work with tuple.0
tuple.1 = make_u();
// do some work with tuple.1
return tuple;

This would become, effectively:

// do some work
let tuple_0 = make_t();
// do some work with tuple_0
let tuple_1 = make_u();
// do some work with tuple_1
return (tuple_0, tuple_1);

with the additional guarantee that tuple_0 and tuple_1 are arranged in memory such that tuple is just the memory place covering both of them, and is valid once both tuple_0 and tuple_1 are valid.

So from that, we can derive that partial initialization of a partially uninhabited place is allowed; it is just impossible for the place to ever be fully initialized. The inhabited subset of the place can be initialized, but the uninhabited subset cannot be validly initialized.

3 Likes

I like this idea. It is a way to get around the problematic

let t: (i32, !);
t.0 = 10;
t.1 = abort();

Right now, this causes problems because (i32, !) is implemented as a ZST, and so t.0 points to invalid memory. But with your proposal it would desugar to

let t: (i32, !);
let __t0 = 10;
let __t1 = abort();
let t = (__t0, __t1);

and even with (i32, !) being a ZST, this desugaring will work.

But I would expect that, until a static analysis shows that a value t is fully initialized, we should not allow it to be sent to or returned from other functions, even in unsafe code. If we change the type of t to (i32, &!), then unsafe code should be allowed to do that.

1 Like

Your own example had unsafe { self.0 = 0; self.1 = 0; }.

And I don't believe I am just picking on a strawman. Applying your arguments, the implementation of Vec should never modify len outside of an unsafe block, even though doing so does not actually require unsafe.

I took a shortcut, and I failed to remember that safe code can break safety invariants within the same module. That is my bad. If I were writing Vec, I'd be using self.set_len(len) rather than self.len = len personally, because I like being reminded when I'm doing something potentially unsafe (especially around unwinds, which are invisible).
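A sketch of that "unsafe setter" style (MyVec is a made-up stand-in, not the real Vec source):

pub struct MyVec<T> {
    ptr: *mut T,
    cap: usize,
    len: usize, // safety invariant: the first `len` elements are initialized
}

impl<T> MyVec<T> {
    // An unsafe setter makes every invariant-affecting write stand out,
    // unlike the plain (and always-safe) field assignment `self.len = new_len`.
    //
    // Safety: the caller must ensure `new_len <= self.cap` and that the
    // first `new_len` elements are initialized.
    unsafe fn set_len(&mut self, new_len: usize) {
        debug_assert!(new_len <= self.cap);
        self.len = new_len;
    }
}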


Good news: that's how it works! The compiler will complain about use of partially uninitialized / possibly uninitialized variables. The problem arises around unions/MaybeUninit, and what parts are actually required to be valid when, which is the purview of the other thread, not this one. Let's not hijack this thread to talk about partial initialization :slight_smile:

Partial initialization is pretty much on-topic, as it relates to definitions of what is "valid" and what is "safe", and that is what the blog post is all about.

Partially; as ! can never be initialized, the compiler should also apply the same rule to it, but we cannot, because we want ! to be usable in the return position. For the parameter position, though, I would say NO. If you want to write a function that receives uninhabited values, use indirection like &!.

(I am not saying you cannot declare a function that accepts !; but such a function can never be called, and the compiler can always ignore such a function, even under deny(dead_code).)

How so? The only way to exit a function that returns an uninhabited type is to create a ! -- to diverge, panic, or call a subroutine that returns a !. How does this preclude the exact same initialization check as is done today?

The initialization check doesn't care about validity, safety, or any other measure of whether you actually got a value of this type. It only cares about whether every place has received a value. And if you call a diverging function, you can fully initialize an uninhabited type as far as the type checker is concerned. It's just that you cannot reach this point (or, equivalently, reaching it is UB), as signified by having a value of type !.

It's perfectly allowed to take an uninhabited value as a parameter. It's just impossible to reach code that calls that function. (playground)

#![feature(never_type)] // `!` as a general type requires a nightly compiler

fn conjure<T>(never: !) -> T { never }

fn main() {
    let never = panic!();
    let it_came_from_outer_space: String = conjure(never);
    println!("{}", it_came_from_outer_space);
}

Oh, so we agree. There's nothing wrong with writing the function; it's just not callable. Nobody is arguing that. You can actually use a function that takes !, just not for anything useful. However, it should trigger the unreachable lint just like everything else that you don't use.

As a slight aside, there has been discussion of allowing fields to be marked unsafe as a reminder to the programmer that they're relied on by other unsafe code. len would be a perfect candidate for such a thing.

That is what I am proposing, yes. The terminology of validity vs. safety is, I think, useful independent of what we decide for references: (a) to be able to even have this discussion, and (b) because there will be other differences between validity and safety.

I do not think we want the extent of the unsafe block to have any direct consequences.

However, I think this distinction is useful to define what unsafe code is. It is essentially code that works with unsafe data (data that is valid, but not safe). This goes back to what I wrote two years ago about how far out unsafe "leaks".

All of these discussions, as far as I am concerned, are happening on the MIR -- so autoderef is just a method call like any other.

See what I wrote above about it being hard to define where unsafe code ends. :slight_smile:

Yes! Maybe I should clarify in the post :slight_smile:

I cannot think of any.

I have been sitting on these thoughts for quite a while and wished I had them already spelled out for that discussion -- but then, that discussion certainly helped me to sharpen my thoughts and make them more precise.

Agreed.

Notice, however, that I glossed over one issue in that post: when I said all data must always be valid, I was only referring to data the compiler considers initialized. In

let x: &'static i32;

x certainly does not have to be valid.
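A small sketch of that point (the commented-out line is what the compiler rejects):

fn main() {
    let x: &'static i32; // declared but uninitialized: no validity requirement yet
    // println!("{}", x); // rejected: use of possibly-uninitialized `x`
    x = &42;             // deferred initialization; from here on, `x` must be valid
    println!("{}", x);
}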

It is quite a burden. Unsafe code does things like calling size_of_val on uninitialized references -- I found an instance of this in Arc::drop_slow and I am sure I would have found more if I had gone on looking. See the PR implementing MaybeUninit for a whole bunch of examples where passing an &mut T pointing to uninitialized data to known safe functions is very useful.

I am a fan of saying that a safe function must only be called with safe data, ever -- but first of all, we still cannot make violations of this UB (because safety cannot be checked), and secondly, this is surprising to a lot of people.

That is totally the case. You can only move a value if the compiler thinks it is initialized, which means at that point it must be valid or else you have UB.

If you want to write a function that receives uninhabited values, stop what you are doing.

If you want to write a function that receives uninitialized values, e.g. to initialize them, use &mut T. This will usually be generic code. (That is assuming we adopt my proposal for validity, so your &mut T must be valid, but that does not require it to point to a valid T.)
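A minimal sketch of the "initialize through a reference" pattern as it looks with MaybeUninit (my own example, not one from the PR mentioned above; with MaybeUninit even the pointee is always valid, which sidesteps the question entirely):

use std::mem::MaybeUninit;

// The callee fills in storage it receives by reference. The &mut itself
// must be valid (non-null, aligned); the eventual i32 only has to be
// valid once we assume_init.
fn init_answer(slot: &mut MaybeUninit<i32>) {
    slot.write(42);
}

fn main() {
    let mut slot = MaybeUninit::uninit();
    init_answer(&mut slot);
    // Sound only because `init_answer` initialized the slot.
    let value = unsafe { slot.assume_init() };
    assert_eq!(value, 42);
}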

I consider this the most beautiful part of the whole proposal.

4 Likes

How do unchecked_get and the like fit into this model? In their standard usage, there is no unsafe data involved.
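For concreteness, the kind of call presumably being asked about (the real method is slice::get_unchecked; all data involved here is perfectly safe, and the unsafety is only the bounds contract on the call):

fn first_unchecked(v: &[i32]) -> i32 {
    assert!(!v.is_empty());
    // UB if the index were out of bounds. No unsafe *data* is created
    // anywhere; the obligation sits on the call, not on a value's invariant.
    unsafe { *v.get_unchecked(0) }
}

fn main() {
    println!("{}", first_unchecked(&[1, 2, 3]));
}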