It will take me a while to fully process and answer the comments from last night (thanks everyone for your interest and feedback!), but in the meantime, I would like to throw in a possible alternative to this pre-RFC that sprang to mind while reading @H2CO3’s post. Any thoughts on it would be appreciated.
It seems to me that one common theme in this discussion is that the proposed semantic clarifications are often judged useful, but not useful enough to justify a churn-inducing language addition at this point in time.
The obvious way to answer this kind of feedback is to provide a library-based implementation of the proposed concept, which would allow people to experiment with it today and gain experience with it, so that a better-informed decision can later be made about whether language integration is worthwhile.
So far, since this touches on rather deep language semantics, I have struggled to come up with such an implementation. But I think I just had an idea which might be heading in the right direction. Like all library-level implementations of something which really should be implemented at the language level, it is pretty ugly-looking and limited in some ways, but it might still be usable enough to be viable for real-world projects.
Tentative library-level translation
First unsafe(pre) fn sketch
Unsafe function preconditions are often, though not always, about parameters. Often, as discussed in the static analysis section of the pre-RFC, they even target a subset of the function’s parameters.
As it turns out, unlike tagging whole functions as unsafe(pre), tagging individual function parameters as unsafe(pre) can be quite easily done without help from the language. All we need is a suitably designed wrapper type for the parameters:
struct UnsafeData<T>(T);
impl<T> UnsafeData<T> {
    // Need an unsafe block to attest that the contract is respected
    unsafe fn new(inner: T) -> Self { Self(inner) }
    // Accessor methods are unimportant and may be freely bikeshedded,
    // a realistic implementation will probably want to use Deref, or
    // even make the whole type pub.
    fn unpack(self) -> T { self.0 }
    fn get(&self) -> &T { &self.0 }
    fn get_mut(&mut self) -> &mut T { &mut self.0 }
}
// UNSAFE PRECONDITION: "dangerous" must be equal to 42.
fn unsafe_pre(dangerous: UnsafeData<usize>, safe: isize) {
    // Some boilerplate is needed on the callee side because this is
    // a type system hack rather than a language feature. But most
    // importantly, the body of the function is not implicitly unsafe.
    let dangerous = dangerous.unpack();
    // Do something with the parameters, with or without using the
    // unsafe precondition that the input UnsafeData provides
}
fn call_unsafe_pre() {
    // I hereby testify that I understood the contract of unsafe_pre()
    let dangerous = unsafe { UnsafeData::new(42) };
    unsafe_pre(dangerous, 64);
}
If a precondition is not about a single function parameter, but about a relationship between multiple function parameters, then we can state this by putting the parameters together in a tuple or struct and giving an UnsafeData<ParameterPack> to the function.
// UNSAFE PRECONDITION: The provided index must be in range for the
// provided slice for this code to be safe.
fn unsafe_coupling(coupled: UnsafeData<(&[u8], usize)>) -> u8 {
    let coupled = coupled.unpack();
    // get_unchecked() returns a reference, so dereference it to get a u8
    unsafe { *coupled.0.get_unchecked(coupled.1) }
}
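To illustrate the caller side of such a coupled precondition, here is a minimal sketch of a safe wrapper that performs the bounds check itself and only then builds the UnsafeData. The name checked_coupling is hypothetical and only introduced for illustration, and the wrapper type is repeated in reduced form so that the sketch compiles on its own.

```rust
// Reduced copy of the wrapper type, repeated so this sketch stands alone.
struct UnsafeData<T>(T);

impl<T> UnsafeData<T> {
    // Unsafe to call: the caller attests that the contract is respected.
    unsafe fn new(inner: T) -> Self { Self(inner) }
    fn unpack(self) -> T { self.0 }
}

// UNSAFE PRECONDITION: The provided index must be in range for the
// provided slice for this code to be safe.
fn unsafe_coupling(coupled: UnsafeData<(&[u8], usize)>) -> u8 {
    let (slice, index) = coupled.unpack();
    // SAFETY: in-range index is guaranteed by the unsafe precondition.
    unsafe { *slice.get_unchecked(index) }
}

// Hypothetical safe wrapper: it checks the precondition at run time,
// so its own callers do not need any unsafe block.
fn checked_coupling(slice: &[u8], index: usize) -> Option<u8> {
    if index < slice.len() {
        // SAFETY: we have just checked that index is in range for slice.
        let coupled = unsafe { UnsafeData::new((slice, index)) };
        Some(unsafe_coupling(coupled))
    } else {
        None
    }
}
```

The unsafe block sits exactly where the contract is established, which is the property this encoding is after.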
Then there is the problem of unsafe preconditions which are about “ambient state” (the hardware, the operating system, global variables…) rather than unsafe parameters. We can encode them in the type system by using a marker type which actually contains nothing, but is unsafe to create nonetheless:
struct UnsafeContext();
impl UnsafeContext {
    // Again, we need unsafe to attest that the precondition is upheld
    unsafe fn new() -> Self { Self() }
}
// UNSAFE PRECONDITION: Do not use this function outside of April 1st
fn april_fools(joke: &str, _date_checked: UnsafeContext) {
    println!("{}", joke);
}
fn call_april_fools() {
    // Yes, I am sure that it is April 1st today
    let date_checked = unsafe { UnsafeContext::new() };
    april_fools("Rust is getting merged into C++20", date_checked);
}
First unsafe(post) fn sketch, and a problem
unsafe(post) functions promise safety-critical guarantees about either their result or the ambient state at the end of their execution. So we use the same strategy here, this time with a suitable wrapper type on the result side. As it turns out, we can do this using the same UnsafeData and UnsafeContext notions that we have introduced above:
// UNSAFE POSTCONDITION: Will return "42"
fn unsafe_post() -> UnsafeData<usize> {
    /* Perfectly safe work */
    // I'm sure that this is 42, I have double-checked it
    unsafe { UnsafeData::new(42) }
}
// UNSAFE POSTCONDITION: Only returns on April 1st
fn wait_april_1st() -> UnsafeContext {
    /* Safely wait until the day is right */
    // I have made sure that it is time now
    unsafe { UnsafeContext::new() }
}
fn call_unsafe_post() {
    let result = unsafe_post();
    wait_april_1st();
    // Again, some _safe_ boilerplate is needed on the caller side
    let result = result.unpack();
}
Although they introduce some boilerplate at the boundary between unsafe and safe code, these library-level implementations of unsafe(pre) and unsafe(post) compose nicely with each other…
fn compose() {
    april_fools("Rust 2018 will introduce mandatory garbage collection", wait_april_1st());
    unsafe_pre(unsafe_post(), 64);
}
…however, there is also a major problem with this first type system encoding of unsafe preconditions and postconditions: it is excessively permissive. We can plug unsafe data from any function into any other function that expects unsafe data of the same type, even if the associated unsafe preconditions and postconditions are wholly unrelated, and all of this happens without a single unsafe block at the point where the two are connected. This is obviously not good.
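To make the problem concrete, here is a minimal sketch of such a mix-up. The functions power_of_two and mix_up are hypothetical names introduced for illustration, and unsafe_pre is simplified to return its input so that the contract violation can be observed.

```rust
// Reduced copy of the first-sketch wrapper, so this example stands alone.
struct UnsafeData<T>(T);

impl<T> UnsafeData<T> {
    unsafe fn new(inner: T) -> Self { Self(inner) }
    fn unpack(self) -> T { self.0 }
}

// UNSAFE PRECONDITION: "dangerous" must be equal to 42.
// (Simplified: returns the value it received.)
fn unsafe_pre(dangerous: UnsafeData<usize>) -> usize {
    dangerous.unpack()
}

// UNSAFE POSTCONDITION (hypothetical): the result is a power of two.
fn power_of_two() -> UnsafeData<usize> {
    // SAFETY: 64 is a power of two, so *this* contract is upheld.
    unsafe { UnsafeData::new(64) }
}

// No unsafe block here, yet the precondition of unsafe_pre() is broken:
// the type system only sees UnsafeData<usize> on both sides.
fn mix_up() -> usize {
    unsafe_pre(power_of_two())
}
```

Each function upholds its own contract in isolation, but nothing ties the value produced by one to the contract expected by the other.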
Take two: encoding the contract
To address this major safety issue, we need to make the underlying contract part of the UnsafeData or UnsafeContext type. A first rough implementation could use simple marker types:
// Unsafe types
struct UnsafeData<T, Contract>(T, Contract);
struct UnsafeContext<Contract>(Contract);
// Marker types representing contracts
struct DataIs42();
struct DayIsApril1st();
// Shorthands to avoid insanity
type DataWithContract = UnsafeData<usize, DataIs42>;
type ContextWithContract = UnsafeContext<DayIsApril1st>;
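As a rough illustration of how these contract-carrying types might be used, here is a minimal sketch. It assumes a constructor that takes the marker value as a second argument, which is one possible design among others, and it introduces a hypothetical second contract, IsPowerOfTwo, to show a mismatch being rejected.

```rust
// Contract-carrying wrapper, as declared above.
struct UnsafeData<T, Contract>(T, Contract);

impl<T, Contract> UnsafeData<T, Contract> {
    // Assumed constructor shape: the marker value is passed explicitly.
    unsafe fn new(inner: T, contract: Contract) -> Self {
        Self(inner, contract)
    }
    fn unpack(self) -> T { self.0 }
}

// Marker types representing contracts.
struct DataIs42();
struct IsPowerOfTwo(); // hypothetical, for illustration only

// UNSAFE PRECONDITION: "dangerous" must be equal to 42.
fn unsafe_pre(dangerous: UnsafeData<usize, DataIs42>, _safe: isize) -> usize {
    dangerous.unpack()
}

// UNSAFE POSTCONDITION: Will return 42.
fn unsafe_post() -> UnsafeData<usize, DataIs42> {
    // SAFETY: the wrapped value is the literal 42.
    unsafe { UnsafeData::new(42, DataIs42()) }
}

// UNSAFE POSTCONDITION (hypothetical): the result is a power of two.
fn power_of_two() -> UnsafeData<usize, IsPowerOfTwo> {
    // SAFETY: 64 is a power of two.
    unsafe { UnsafeData::new(64, IsPowerOfTwo()) }
}

fn compose() -> usize {
    // Matching contracts still compose without any unsafe block...
    let value = unsafe_pre(unsafe_post(), 64);
    // ...whereas a mismatched contract is a type error:
    // unsafe_pre(power_of_two(), 64); // mismatched types
    value
}
```

Note that the marker types themselves remain freely constructible. What stays unsafe is attaching one to a value.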
By adding suitable trait bounds on the Contract type and modifications to the UnsafeData and UnsafeContext implementation, one could also later extend this implementation into a poor man’s design-by-contract tool, where contracts which are expressible in code are expressed in code, as @gbutler had in mind earlier.
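Here is a minimal sketch of what that extension could look like. Everything in it is an assumption made for illustration: the trait name Contract, its holds method, the checked constructor, and the use of PhantomData in place of a stored marker value are not part of the proposal above.

```rust
use std::marker::PhantomData;

// Hypothetical trait: a contract that can be expressed as executable code.
trait Contract<T> {
    fn holds(value: &T) -> bool;
}

// Variant of UnsafeData which tracks the contract at the type level only.
struct UnsafeData<T, C: Contract<T>>(T, PhantomData<C>);

impl<T, C: Contract<T>> UnsafeData<T, C> {
    // Unchecked constructor: the caller attests that the contract holds.
    // Debug builds additionally verify the claim.
    unsafe fn new(inner: T) -> Self {
        debug_assert!(C::holds(&inner), "unsafe contract violated");
        Self(inner, PhantomData)
    }

    // Safe constructor: pays for a run-time check instead of trusting
    // the caller.
    fn checked(inner: T) -> Option<Self> {
        if C::holds(&inner) {
            Some(Self(inner, PhantomData))
        } else {
            None
        }
    }

    fn unpack(self) -> T { self.0 }
}

// The "data is 42" contract from above, now expressed in code.
struct DataIs42;

impl Contract<usize> for DataIs42 {
    fn holds(value: &usize) -> bool { *value == 42 }
}
```

Contracts which cannot be expressed in code would presumably need a separate marker-only path. This sketch does not attempt to cover them.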
As before, unsafe associated trait methods would be handled with the same formalism as free unsafe functions, with added “every implementation must provide this” caveats.
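As a minimal sketch of the trait method case, the example below moves the unsafe precondition of a method into its parameter type. The trait GetAt, its method get_at, and the contract marker IndexInRange are hypothetical names introduced for illustration.

```rust
// Contract-carrying wrapper, repeated so this sketch stands alone.
struct UnsafeData<T, Contract>(T, Contract);

impl<T, Contract> UnsafeData<T, Contract> {
    unsafe fn new(inner: T, contract: Contract) -> Self {
        Self(inner, contract)
    }
    fn unpack(self) -> T { self.0 }
}

// Hypothetical contract: the index is in range for the receiver.
struct IndexInRange();

// The method is not declared unsafe: its precondition is carried by
// the parameter type, and every implementation may rely on it.
trait GetAt {
    // UNSAFE PRECONDITION: "index" must be in range for "self".
    fn get_at(&self, index: UnsafeData<usize, IndexInRange>) -> u8;
}

impl GetAt for [u8] {
    fn get_at(&self, index: UnsafeData<usize, IndexInRange>) -> u8 {
        let index = index.unpack();
        // SAFETY: in-range index is guaranteed by the unsafe precondition.
        unsafe { *self.get_unchecked(index) }
    }
}
```

A caller would build the index with something like unsafe { UnsafeData::new(2, IndexInRange()) } after making sure that 2 is in range, then call get_at without any further unsafe block.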
What a library-based proposal cannot address
- As stated before, like any type system hack, this is much more boilerplate-intensive than language-level concept integration.
- A library cannot deprecate the old unsafe abstraction vocabulary, leading to a confusing coexistence of the old and new approaches (some will see this as an advantage).
- As far as I can tell, there is no way to express unsafe(post) traits at the type system level in today’s Rust, so they would need to stick with the current syntax.
- Unsafe invariants cannot be expressed as function inputs and outputs, and therefore cannot be expressed at the type system level. They can again be approximated with UnsafeContext, but the approximation is even more detached from the actual contract.
- With a library-level implementation, some forms of static analysis such as linting of dangerous existing usage of unsafe code become much more difficult.