Any contribution to the syntax bikeshed is very welcome.
If you do not like the proposed syntax, feel free to use the unabbreviated “unsafe preconditions” and “unsafe postconditions” expressions, which can hopefully be more readily agreed upon.
An “unsafe(pre) trait” is in current syntax simply a trait with all unsafe methods.
While I started with this definition too (along with a variant for unsafe(post) traits), because it looked like a very convenient and sensible shorthand, I backtracked from it after I came to the realization that traits differ from the sum of their public interfaces. If that were not the case, there would be no marker traits.
It all boils down to this question: what does “using a trait” mean? Does it mean using the interfaces provided by that trait? Or does it mean adding that trait to a list of trait bounds in a generic construct?
You can only describe marker traits using the latter definition. And you can achieve the former semantics, albeit in a more laborious way, by marking all functions in the trait as unsafe. So I think that a trait’s safety contract should be defined by what the trait guarantees about a type when used in generic bounds.
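To make the two readings concrete, here is a minimal sketch in current Rust. All names (`RawRead`, `ZeroSafe`, `zeroed`) are hypothetical and invented for illustration; they are not part of the proposal or of the standard library. `RawRead` is "a trait with all unsafe methods", where the obligations fall on whoever calls the method. `ZeroSafe` is a marker trait with no methods at all, so its contract can only be about what the bound guarantees to generic code.

```rust
/// Hypothetical trait whose only method is unsafe: the caller must uphold
/// a precondition at every call site.
trait RawRead {
    /// # Safety
    /// `i` must be a valid index into the underlying storage.
    unsafe fn read_at(&self, i: usize) -> u8;
}

impl RawRead for Vec<u8> {
    unsafe fn read_at(&self, i: usize) -> u8 {
        // SAFETY: the caller promised that `i < self.len()`.
        unsafe { *self.get_unchecked(i) }
    }
}

/// Hypothetical marker trait: it has no methods, so "using" it can only
/// mean writing it in a bound.
///
/// # Safety
/// Implementors promise that the all-zero bit pattern is a valid value.
unsafe trait ZeroSafe: Sized {}

unsafe impl ZeroSafe for u8 {}
unsafe impl ZeroSafe for u32 {}

/// Generic code that relies on the guarantee carried by the bound.
fn zeroed<T: ZeroSafe>() -> T {
    // SAFETY: `T: ZeroSafe` guarantees that all-zero bytes form a valid `T`.
    unsafe { std::mem::zeroed() }
}

fn main() {
    let v = vec![10u8, 20, 30];
    // The call site carries the proof obligation for `read_at`.
    let second = unsafe { v.read_at(1) };
    assert_eq!(second, 20);

    // No unsafe at the use site: the obligation was discharged by the impl.
    let z: u32 = zeroed();
    assert_eq!(z, 0);
}
```

Note how `zeroed` needs no `unsafe` at its call site: the burden sits with the `unsafe impl`, which is the postcondition-style contract discussed here.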
Which, in turn, means that I cannot think of an example in which a trait would be unsafe(pre), in the sense that adding this trait bound to generic code would introduce a risk of memory/type/thread unsafety if some conditions are not met.
“unsafe(post) functions” and “unsafe(post) types” exist as methods and associated types of unsafe traits.
In your opinion, what would an unsafe(post) type be?
“unsafe(post)” is meaningless for freestanding items since their implementation can’t change due to backwards compatibility.
However, it’s true that functions that are called by unsafe code can cause memory-unsafety if their functionality is non-compatibly broken, but that’s more of a property of the calling unsafe code. Requiring “unsafe(post)” on the callee seems inappropriate, since unsafe code may in theory want to rely on any behavior of safe code, so there is no way to figure out whether a function should be marked “unsafe(post)”.
This is, in my opinion at least, a failure of the current unsafe semantics which this proposal aims to address.
It is frequently stated in the Rust community that the safety of unsafe code cannot rely upon the correctness of safe code. For example, the safety of a HashTable cannot rely on the key type’s implementations of Hash and Eq being correct. However, what this negative statement fails to accurately convey is that as a result there is actually remarkably little that unsafe code is allowed to rely upon.
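As an illustration, here is a small sketch that uses `std::collections::HashMap` as a stand-in for the hash table mentioned above. `BrokenKey` is a hypothetical key type whose `Eq` implementation is deliberately wrong. The standard library documents the result of such a logic error as unspecified but never undefined behavior, so the sketch only prints what it observes instead of asserting exact values.

```rust
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical key with a deliberately incorrect `Eq`: no key ever
/// compares equal, not even to itself.
struct BrokenKey(u32);

impl PartialEq for BrokenKey {
    fn eq(&self, _other: &Self) -> bool {
        false
    }
}

impl Eq for BrokenKey {}

impl Hash for BrokenKey {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.0.hash(state);
    }
}

/// Inserts "the same" key twice, then looks it up again.
/// Returns the map length and whether the lookup succeeded.
fn insert_twice() -> (usize, bool) {
    let mut map = HashMap::new();
    map.insert(BrokenKey(1), "first");
    map.insert(BrokenKey(1), "second");
    (map.len(), map.get(&BrokenKey(1)).is_some())
}

fn main() {
    // The map may give surprising answers, but it must stay memory-safe.
    let (len, found) = insert_twice();
    println!("len = {len}, lookup succeeded = {found}");
}
```

The map has to stay memory-safe whatever `eq` returns, which is exactly why its unsafe internals cannot lean on the key's `Hash` and `Eq` being correct.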
By the above standards, it is nearly impossible to safely use any kind of abstraction (from functions to generics) in unsafe code. In doing so, one must painfully attempt to mentally step through all the ways the abstractions could be incorrect (an exercise which the human mind is notoriously bad at), before hopefully convincing oneself that they have been relatively well accounted for. That is, until a sufficiently broken implementation of the abstraction surprises us by showing that yes, we have still not accounted for every possible bug.
Because unsafe code cannot rely on any abstraction, except possibly the ones that were defined in the same crate and by the same author (and even then…), it is very difficult to “scale up” unsafe codebases to complex tasks. Some will argue that this is generally a good thing, as unsafe code is dangerous and should be used very sparingly and in very localized places. However, in practice, crossing the boundary between safe and unsafe code often comes with run-time validation costs, so whenever unsafe is used for performance reasons (such as in the process of building containers or high-performance algorithmic primitives), keeping its usage at a very fine granularity may not be viable.
The notion of unsafe(post) aims to address this lack of scalability of unsafe codebases by introducing the notion of safety-critical contracts. Because Rust’s abstraction vocabulary is not (yet?) rich enough to express such a contract in code, it is to be expressed in comments, like all other contracts that pertain to unsafe code in Rust. But by formulating this contract, an abstraction author states “I recognize that a certain property is critical to memory safety, and I promise to guarantee this property for the entire lifetime of this abstraction, subjecting my code to the same level of scrutiny as any other safety-critical (‘unsafe’) code”.
unsafe(post) is therefore a commitment that restricts an abstraction’s implementation in order to allow it to be used in more contexts, much like const fn is.
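In today's Rust, the closest approximation of such a commitment is an `unsafe trait` whose contract lives in a doc comment. The sketch below is only an illustration under that assumption: `Index16`, `LowNibble` and `lookup` are hypothetical names, and nothing here is syntax from the proposal.

```rust
/// Hypothetical trait carrying a safety-critical postcondition.
///
/// # Safety
/// Implementors promise that `index()` always returns a value strictly
/// less than 16. Unsafe code is allowed to rely on this.
unsafe trait Index16 {
    fn index(&self) -> usize;
}

/// Hypothetical implementor: the low four bits of a byte.
struct LowNibble(u8);

// SAFETY: masking with 0x0F keeps the result in 0..16.
unsafe impl Index16 for LowNibble {
    fn index(&self) -> usize {
        (self.0 & 0x0F) as usize
    }
}

/// Generic code that skips the bounds check by relying on the contract.
fn lookup<I: Index16>(table: &[u8; 16], i: &I) -> u8 {
    // SAFETY: `Index16` guarantees `i.index() < 16 == table.len()`.
    unsafe { *table.get_unchecked(i.index()) }
}

fn main() {
    // Table of squares: table[n] == n * n (15 * 15 = 225 still fits in u8).
    let mut table = [0u8; 16];
    for (n, slot) in table.iter_mut().enumerate() {
        *slot = (n * n) as u8;
    }
    // 0x35 & 0x0F == 5, so this reads table[5].
    assert_eq!(lookup(&table, &LowNibble(0x35)), 25);
}
```

The implementor gives up some freedom (it can never return 16 or more, even in a later version), and in exchange `lookup` can use the abstraction without re-validating its output.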
Having the syntax specify “unsafe(pre)” and “unsafe(post)” would have been better, but changing it now might not be really worth it.
Well, this is the question which I am trying to answer in this pre-RFC thread. If we were in the pre-Rust 1.0 days, then it would be clear to me that clarifying unsafe semantics is called for. Today, where it involves some amount of churn for the Rust ecosystem, I need to prove that the benefits outweigh the cost. Since this is a trade-off which, as the person proposing a change, I come ill-prepared to analyze alone, I am looking for external points of view on this matter.
Regarding “design-by-contract”, the proper solution is to add dependent types to Rust, so that pre and post conditions can be encoded as values of dependent types, and they can be statically checked.
While I would love stronger support for design-by-contract in Rust, allowing desired properties to be expressed in code, I think that it would not fulfill all of the use cases of this feature, for the following reasons:
- Some safety-critical properties cannot be easily expressed in code. Think about the “Do not create two &mut references to the contents of an UnsafeCell” rule, for example: considering that the client gets a *mut and is allowed to do whatever it likes with it, how should this postcondition be expressed?
- Some safety-critical properties cannot be checked statically and are too expensive to check dynamically. Think about “Integers used to index slices with get_unchecked() are in range”, for example: the very reason why get_unchecked() exists is that dynamic bounds checks are sometimes necessary to guarantee safety, but too expensive for indexing-intensive code.
Conversely, I understand that dependent types would be a much larger addition to the language than what this proposal suggests, which would need much stronger motivation, much deeper analysis of the problem and use cases, and would in general have a much weaker chance of being accepted. In contrast, this proposal solves a well-defined and known problem in a minimal way, which I think has been an important characteristic of a good RFC ever since Rust stabilized.
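The second point can be sketched as follows. The function names `gather_sum_unchecked` and `gather_sum` are hypothetical; only `get_unchecked` is a real slice method. The unsafe function states its precondition in a comment, and the safe wrapper shows the run-time validation that a caller has to pay for when it cannot otherwise prove the indices are in range.

```rust
/// Sums `data[i]` for every `i` in `indices`, without bounds checks.
///
/// # Safety
/// Every element of `indices` must be strictly less than `data.len()`.
unsafe fn gather_sum_unchecked(data: &[u64], indices: &[usize]) -> u64 {
    let mut total = 0u64;
    for &i in indices {
        // SAFETY: the caller guarantees `i < data.len()`.
        total += unsafe { *data.get_unchecked(i) };
    }
    total
}

/// Safe wrapper: validates every index first, which is the dynamic check
/// that indexing-intensive code would like to avoid.
fn gather_sum(data: &[u64], indices: &[usize]) -> Option<u64> {
    if indices.iter().all(|&i| i < data.len()) {
        // SAFETY: all indices were just checked against `data.len()`.
        Some(unsafe { gather_sum_unchecked(data, indices) })
    } else {
        None
    }
}

fn main() {
    let data = [10u64, 20, 30];
    assert_eq!(gather_sum(&data, &[0, 2, 2]), Some(70));
    assert_eq!(gather_sum(&data, &[3]), None);
}
```

If the indices came from an abstraction that promised in-range values as a safety-critical postcondition, the wrapper's validation pass could be dropped without giving up soundness.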