Sure, let's look at the source of Vec
, which is the prime example for the discussed pattern.
pub struct Vec<T, A: Allocator = Global> {
buf: RawVec<T, A>,
len: usize,
}
impl<T, A: Allocator> Vec<T, A> {
pub fn truncate(&mut self, len: usize) {
// This is safe because:
//
// * the slice passed to `drop_in_place` is valid; the `len > self.len`
// case avoids creating an invalid slice, and
// * the `len` of the vector is shrunk before calling `drop_in_place`,
// such that no value will be dropped twice in case `drop_in_place`
// were to panic once (if it panics twice, the program aborts).
unsafe {
// Note: It's intentional that this is `>` and not `>=`.
// Changing it to `>=` has negative performance
// implications in some cases. See #78884 for more.
if len > self.len {
return;
}
let remaining_len = self.len - len;
let s = ptr::slice_from_raw_parts_mut(self.as_mut_ptr().add(len), remaining_len);
// ^^^ UNSAFE
self.len = len;
ptr::drop_in_place(s);
// ^^^^^^^^^^^^^ UNSAFE
}
}
}
I have marked the only operations in the source of Vec::truncate
which are actually unsafe. The rest is, technically, safe code. In practice, it's all a tight knot of invariants. drop_in_place
is only valid because s
is exactly the raw slice *mut [T]
of truncated initialized values, which is the case because self.as_mut_ptr().add(len)
is exactly the start of truncation, and remaining_len
is exactly its length. None of those properties of remaining_len
are expressed via explicit invariants on called unsafe code. The safety of ptr::add
is mostly tangential to the safety of this method. Yes, it's important that it's within the safe allocation as self.as_mut_ptr()
, but that's not nearly strong enough for the safety of the entire function. That actually depends on the self.len < len
check above, as well on the global semantic properties of self.len
.
If we were to dowscope unsafe { }
, we'd end up with
pub fn truncate(&mut self, len: usize) {
if len > self.len {
return;
}
let remaining_len = self.len - len;
let s = ptr::slice_from_raw_parts_mut(unsafe { self.as_mut_ptr().add(len) }, remaining_len);
self.len = len;
unsafe { ptr::drop_in_place(s) };
}
which is just misleading. It's not even possible to properly prove the safety of these unsafe blocks. Both of them depend heavily on the surrounding code. The first one is mostly trivial, while the proof for the second one would start as "looking at the entire body of this function...", which means that it should have been an all-encompassing unsafe { }
block in the first place (just as it really is).
Even if you banned direct field access and demanded a (safe) getter and unsafe setter for self.len
, you still would be unable to properly scope the unsafe blocks. The only call to the setter is on the pre-last line, and it's the least interesting part of this function. The getter calls would still be outside an unsafe block, even though the correctness of the following unsafe code crucially depends on the preceding checks and on the subtle interaction between the vector length and data pointers.
Yes, kinda like the Safe Transmute project proposed. That's a very cool approach, and I hope it works out.
I sometimes dream about "Rust without UB", which would have all preconditions as explicit static checks. There would still be stuff that the compiler couldn't prove, but the UB part would be limited to explicit assertions to the compiler "this unsafe property holds". There's still UB in the language, of course, but it's never implicit. It's always very in your face: you make an assumption that the compiler couldn't check, and it's wrong, so everything breaks. Like a souped up version of current Rust: instead of "UB can only happen due to errors in these code blocks" we'd have "UB can only happen because these specific assertions are wrong". That's still the case in current Rust, e.g. each time we dereference a pointer, we assert that it's not null, live, aligned and contains a valid value of the referent.
I doubt it could really work at the scale of the whole language (at least for Rust as it exists now), since it probably would require dependent types or proof checkers to make it really scalable and/or ergonomic. But in certain scoped cases, like safe transmute or pointer assumptions, it looks like something workable.