I tried to go around existing discussions about unsafe like the following ones:
- Ambiguity of "unsafe" causes confusion in understanding
- What does `unsafe` mean?
- Unsafe Abstractions
I also tried asking around for discussions I might have missed, including on the wg-formal-methods stream of the rust-lang Zulip chat.
I'm surprised I didn't find anything matching my mental model because it seems pretty intuitive to me and has been my mental model since I learned about Rust in 2013.
The idea is just to add an informal proof system to Rust type system in a way familiar to programmers. Rust developers understand programming-related concepts like function parameter, function result, associated constant, etc. By describing unsafe-related concepts using programming-related concepts, most ambiguities about unsafe should hopefully be lifted.
I wrote Writing down my mental model of unsafe · GitHub with ideally enough details to be understood, but I'll also inline the first (and most important) section here for ease of reading. Please, if anything is still unclear, feel free to ask such that I can improve the document for future readers.
The proof type
Rust doesn't have a formal proof system. Proofs and properties are written in natural language (usually English). In my mental model, Rust provides a way to make those informal proofs and properties more explicit in Rust code and types. This is done using the following type:
/// Mathematical proofs for a given set of properties.
///
/// Proofs have no impact on execution. They only matter for type-checking. In particular, proofs
/// are `Copy`, zero-sized, and do not participate in function ABI.
///
/// The properties are never explicitly written. They are always inferred from the safety section of
/// the documentation of the item in which the type is mentioned.
#[lang = "proof"]
pub struct Proof<Properties> {
/// A proof of the set of properties.
///
/// This field is never explicitly written. It is always inferred from the safety comment of the
/// statement in which the struct is mentioned.
pub proof: Properties,
}
In this example, a proof is taken as argument and a new proof is returned:
/// Combines small numbers into a bigger number.
///
/// # Safety
///
/// The proof parameter properties are:
/// - The absolute value of `x` is smaller than or equal to 100.
/// - The absolute value of `y` is smaller than or equal to 100.
///
/// The proof result property is:
/// - The absolute value of the result is smaller than or equal to 1000.
fn combine(x: i32, y: i32, p: Proof) -> (i32, Proof) {
let r = 3 * x + 2 * y;
// SAFETY: By `p`, we know that `x` and `y` are between -100 and 100. By arithmetic, we know
// that `r` is between -500 and 500. By arithmetic, we conclude that `r` is between -1000 and
// 1000.
let q = Proof;
(r, q)
}
Digression (feel free to skip)
Using documentation and comments instead of types and code is mostly for practical reasons.
It is impractical to write types like Proof<"# Safety\n\nThe absolute value of the result is smaller than or equal to 1000.">
and code like Proof { proof: "// SAFETY: We know that arguments are small, so some small linear combination is only slightly bigger." }
.
However, a consequence of this choice is that it looks like we get some form of type erasure, but
this is only in appearance. Even though the properties are erased from the types, the programmer
must know at runtime the properties of all proofs (both parameters and results). For example, having
a vector of function pointers like Vec<fn(Proof)>
would need the programmer to know for each
element, what are the properties expected to hold to call the function, because they may differ from
one element to the other.
Vocabulary
I'll call pre-conditions (or requirements), proofs taken as parameter. And I'll call
post-conditions (or guarantees), proofs that are returned. In particular, when calling a
function, one has to prove the requirements and get back guarantees that can be used to prove future
requirements. Note that pre-conditions may refer to code that happens after the function. For
example, Pin::get_unchecked_mut()
has a pre-condition that the result must not be moved, which is
a property about what happens after the function is called. Similarly, a post-condition may refer
to data that used to exist before the function was called. For example, slice::is_sorted()
has a
post-condition that the input is sorted if the result is true (assuming the standard library wishes
to expose this post-condition for safety concerns).
I'll call invariants, proofs in any other position (neither parameter nor result of a function type), for example struct fields and associated constants. Invariants on mutable data must be updated at the same time the mutable data they refer to is updated. For proof fields, this is similar to self-referential structs because the proof field references the other fields.
Link with dependent types (feel free to skip)
This mental model comes from dependent type programming languages like Agda and Coq. If the proof type were formal, it would introduce a dependency from types on runtime values. Dependent types are rather inconvenient to use with the current state of research, so avoiding them is a reasonable decision for programming languages that wish to be widely usable. The proof type being informal, in particular its properties being just informal text, it doesn't formally refer to runtime values thus breaking this problematic dependency. But this informality is the reason why manual proofs and manually checking them is required for the type system to be sound. This can be seen as a usability trade-off: it's easier to write programs but it's harder to check their soundness.