Detect and Fix Overscoped unsafe Blocks

Sure, let's look at the source of Vec, which is the prime example of the pattern under discussion.

pub struct Vec<T, A: Allocator = Global> {
    buf: RawVec<T, A>,
    len: usize,
}

impl<T, A: Allocator> Vec<T, A> {
    pub fn truncate(&mut self, len: usize) {
        // This is safe because:
        //
        // * the slice passed to `drop_in_place` is valid; the `len > self.len`
        //   case avoids creating an invalid slice, and
        // * the `len` of the vector is shrunk before calling `drop_in_place`,
        //   such that no value will be dropped twice in case `drop_in_place`
        //   were to panic once (if it panics twice, the program aborts).
        unsafe {
            // Note: It's intentional that this is `>` and not `>=`.
            //       Changing it to `>=` has negative performance
            //       implications in some cases. See #78884 for more.
            if len > self.len {
                return;
            }
            let remaining_len = self.len - len;
            let s = ptr::slice_from_raw_parts_mut(self.as_mut_ptr().add(len), remaining_len);
            //                                                      ^^^  UNSAFE
            self.len = len;
            ptr::drop_in_place(s);
            //   ^^^^^^^^^^^^^  UNSAFE
        }
    }
}

I have marked the only operations in the source of Vec::truncate which are actually unsafe. The rest is, technically, safe code. In practice, it's all a tight knot of invariants. drop_in_place is only valid because s is exactly the raw slice *mut [T] of the truncated, initialized values, which is the case because self.as_mut_ptr().add(len) is exactly the start of the truncated tail, and remaining_len is exactly its length. None of those properties are expressed as explicit invariants on the unsafe code being called. The safety of ptr::add is mostly tangential to the safety of this method. Yes, it's important that the result stays within the same allocation as self.as_mut_ptr(), but that's not nearly strong enough for the safety of the entire function. That actually depends on the len > self.len check above, as well as on the global semantic properties of self.len.

If we were to downscope unsafe { }, we'd end up with

pub fn truncate(&mut self, len: usize) {
    if len > self.len {
        return;
    }
    let remaining_len = self.len - len;
    let s = ptr::slice_from_raw_parts_mut(unsafe { self.as_mut_ptr().add(len) }, remaining_len);
    self.len = len;
    unsafe { ptr::drop_in_place(s) };
}

which is just misleading. It's not even possible to properly prove the safety of these unsafe blocks. Both of them depend heavily on the surrounding code. The first one is mostly trivial, while the proof for the second one would start as "looking at the entire body of this function...", which means that it should have been an all-encompassing unsafe { } block in the first place (just as it really is).

Even if you banned direct field access and demanded a (safe) getter and an unsafe setter for self.len, you still would be unable to properly scope the unsafe blocks (see the sketch below). The only call to the setter is on the second-to-last line, and it's the least interesting part of this function. The getter calls would still be outside an unsafe block, even though the correctness of the following unsafe code crucially depends on the preceding checks and on the subtle interaction between the vector length and data pointers.
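
Concretely, using Vec's existing safe len() getter and unsafe set_len() setter, the downscoped version would look roughly like this (a sketch, not the actual std code):

pub fn truncate(&mut self, len: usize) {
    // Safe getter call, yet the unsafe code below crucially depends on it.
    if len > self.len() {
        return;
    }
    let remaining_len = self.len() - len;
    let s = ptr::slice_from_raw_parts_mut(unsafe { self.as_mut_ptr().add(len) }, remaining_len);
    // The only setter call.
    // SAFETY: `len <= self.len()`, and the first `len` elements stay initialized.
    unsafe { self.set_len(len) };
    unsafe { ptr::drop_in_place(s) };
}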

Yes, kinda like the Safe Transmute project proposed. That's a very cool approach, and I hope it works out.

I sometimes dream about "Rust without UB", which would have all preconditions as explicit static checks. There would still be stuff that the compiler couldn't prove, but the UB part would be limited to explicit assertions to the compiler: "this unsafe property holds". There's still UB in the language, of course, but it's never implicit. It's always very in your face: you make an assumption that the compiler couldn't check, and it's wrong, so everything breaks. Like a souped-up version of current Rust: instead of "UB can only happen due to errors in these code blocks" we'd have "UB can only happen because these specific assertions are wrong". That's already implicitly the case in current Rust, e.g. each time we dereference a pointer, we assert that it's non-null, live, aligned, and points to a valid value of the referent type.

I doubt it could really work at the scale of the whole language (at least for Rust as it exists now), since it probably would require dependent types or proof checkers to make it really scalable and/or ergonomic. But in certain scoped cases, like safe transmute or pointer assumptions, it looks like something workable.
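
In a very narrow form, std::hint::assert_unchecked already has that shape: the assumption is a single explicit statement, and UB can only come from that statement being wrong. A minimal sketch:

use std::hint::assert_unchecked;

/// Caller guarantees `idx <= xs.len()`; the compiler can't see that across
/// this boundary, so we state it explicitly.
fn sum_suffix(xs: &[u32], idx: usize) -> u32 {
    // Explicit assertion to the compiler: if it is wrong, UB enters the
    // program exactly here, not at some implicit precondition elsewhere.
    unsafe { assert_unchecked(idx <= xs.len()) };
    xs[idx..].iter().sum()
}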

3 Likes

Let me summarize how I see it and where I think the point of contention is.

I will ignore @scottmcm's proposal here because, while it's an interesting idea, I see it as a different, alternative mechanism that would work quite differently from the way things work now.

So let's say we have a datastructure. There are several types of code:

  1. Code that calls unsafe APIs. That obviously should be (and has to be) in an unsafe block. That code should ideally have Safety comments that explain why all the preconditions are satisfied. This explanation can make use of the datastructure's invariants that are maintained elsewhere.

  2. Code that modifies the data structure and has to be careful to maintain invariants, without itself using unsafe APIs. I say this should not be in an unsafe block. I think this is the point of contention. The correctness of the invariants maintained here can very well be essential for the safety of the unsafe blocks, but that doesn't mean this code needs to itself be in an unsafe block. If these invariants are tricky, maybe there should be "Correctness: ..." comments explaining how the invariants are maintained. But these are different from the "Safety: ..." comments elsewhere.

  3. Code that exposes unsafe APIs. That should be an unsafe marker on the function. Whether its implementation uses unsafe blocks or not is a separate issue. E.g. Vec::set_len should be an unsafe function but shouldn't use an unsafe block in its implementation, because it doesn't itself use any unsafe APIs (see the sketch after this list). Now of course Rust implicitly creates an unsafe block in every unsafe function. I consider that a design flaw in the language, which is why I do #![deny(unsafe_op_in_unsafe_fn)] if I define any unsafe APIs.
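
A minimal sketch of point 3, with a hypothetical simplified Vec-like type (MyVec is made up for illustration):

#![deny(unsafe_op_in_unsafe_fn)]

pub struct MyVec<T> {
    ptr: *mut T,
    len: usize,
    cap: usize,
}

impl<T> MyVec<T> {
    /// # Safety
    ///
    /// `new_len` must be <= the capacity, and the first `new_len` elements
    /// must be initialized.
    pub unsafe fn set_len(&mut self, new_len: usize) {
        // No unsafe block: the body performs no unsafe operations, it only
        // records the new length. The obligation sits entirely on the caller,
        // which is exactly what the `unsafe fn` marker expresses.
        self.len = new_len;
    }
}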

I much prefer your second version of truncate. Each unsafe block should explain why the call is safe. I disagree that it is "misleading" that the safety relies on surrounding code. As far as I'm concerned, that's normal, that's how it should be.

You seem to want unsafe blocks to mean "this is the only place where bugs can cause UB". But that's not what they mean. Their correctness is allowed to rely on the correctness of the rest of the code.

The std implementation is inconsistent on this issue. Some code is unnecessarily inside unsafe blocks; I think it would be better if it wasn't.

String maintains the invariant that its contents are valid UTF-8, and most of the code that maintains it is, appropriately, not inside unsafe blocks, even though bugs there would cause UB.

Another example -- String::pop:

    pub fn pop(&mut self) -> Option<char> {
        let ch = self.chars().rev().next()?;
        let newlen = self.len() - ch.len_utf8();
        unsafe {
            self.vec.set_len(newlen);
        }
        Some(ch)
    }

A bug in the calculation of ch or of newlen would be bad and would cause UB. But those calculations are, appropriately, outside the unsafe block. I consider this good code, with the unsafe call appropriately marked (except that there is no Safety comment explaining why the call is safe). It is not "misleading" that the safety relies on the rest of the function.
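
With the missing comment filled in (my wording, not anything from std), the call site might read:

unsafe {
    // SAFETY: `newlen` is the current length minus the UTF-8 length of the
    // final char, so it is <= the current length (and thus <= capacity), and
    // the first `newlen` bytes remain initialized and form valid UTF-8
    // because we cut exactly at a char boundary.
    self.vec.set_len(newlen);
}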

That doesn't make it "impossible" to prove their safety.

Even when all of truncate is in one big unsafe block, its safety still depends on code outside of that block because it relies on the correctness of other code in all the other functions that maintain Vec's invariants. So this idea that unsafe blocks should be expanded to contain everything that safety depends on is futile. That's not how unsafe is designed to work.

1 Like

What I think is missing from Rust is a way to annotate that a given safe operation provides safety-related guarantees that some unsafe code defined in the same crate relies upon.

For example, if some unsafe code relies upon the field len being less than or equal to capacity, maybe those fields should have #[something] on them to flag this, and all code (even safe code) that modifies them would need to be inside some something { .. } block.

This is not the same thing as unsafe { .. }, because unsafe blocks have the opposite polarity: unsafe blocks create new demands upon other code that might violate a given invariant (which, thanks to privacy, we generally try to restrict to code in the same module, though unfortunately there is unsafe code that relies on global properties of the program). That is, unsafe blocks introduce new obligations on other code. The something { .. } I'm talking about would do the opposite: it would fulfill obligations.

(I think this is related to the comment that only unsafe operations have safety preconditions)

Now, naming is hard. I saw here a suggestion that the dual of unsafe should be called robust (even though I don't really understand it at a deeper level, I think it's close to what I'm proposing).

With that in mind, a rough sketch would be something like this

struct Vec<T> {
    data: *mut T,
    #[robust]
    len: usize,
    #[robust]
    capacity: usize,
}

impl Vec<T> {
    pub fn truncate(&mut self, len: usize) {
        if len > self.len {
            return;
        }

        unsafe {
            // some things..
        }

        // ROBUST: supposing `self.len <= self.capacity`, and since `len` is
        // smaller or equal to it, `self.len` will continue to be smaller or
        // equal to capacity
        robust {
            self.len = len;
        }


        unsafe {
            // other things..
        }
    }
}

Too verbose? Probably. Looking at the actual Vec implementation there's one improvement that comes to mind: it should be syntactically light to run a robust operation inside an unsafe block, otherwise code becomes too noisy (and noisy code means harder to review and ultimately harder to assess whether it's sound). Maybe using postfix keywords or something.

However, the site I linked above discourages documenting something as robust, which is kind of frustrating, because if this term isn't used across the ecosystem, the whole concept of something dual to unsafe will continue to be ignored by the Rust community at large.

Doesn't all code have to fulfill obligations?

Let's say I build a priority queue data structure. It provides certain guarantees. Some other crate might want to rely on my priority queue for safety. Am I supposed to put all my priority-queue code in robust { ... } markers to make that possible?

2 Likes

Not always.

For example, unsafe code can't generally depend on the correctness of PartialEq to not have UB (it certainly can't depend on it in a generic context, because you can't depend on upstream crates for safety), but you can still write x == y in your unsafe code. It just means that your unsafe code shouldn't trigger UB even if the PartialEq implementation is buggy.
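
A tiny sketch of that situation (hypothetical helper, not from any crate): the unsafe code is free to call ==, as long as its soundness only depends on things it checked itself:

pub fn position_of<T: PartialEq>(haystack: &[T], needle: &T) -> Option<usize> {
    for i in 0..haystack.len() {
        // SAFETY: `i < haystack.len()` by the loop bound. This does not rely
        // on `PartialEq` being correct; a buggy `==` can only produce a wrong
        // result, never UB.
        let item = unsafe { haystack.get_unchecked(i) };
        if item == needle {
            return Some(i);
        }
    }
    None
}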

Likewise, if your unsafe code depends on a priority queue, it could be interesting to know whether it depends on its correctness to not trigger UB, or if a broken data structure couldn't possibly lead to UB. In the first case, the unsafe code is expecting some robustness from the priority queue (and as such, a thorough audit of the unsafe code must also assess the correctness of the priority queue). In the second case, the priority queue itself doesn't have any safety impact and can be skipped in an unsafe audit.

But no, I don't propose any modification at all to the crate that defines the priority queue. I just wondered whether, in the module where the unsafe code actually depends on the correctness of the queue in order to not trigger UB, calls to the queue's methods should be wrapped in a robust { } block, something like robust { queue.pop(); }. Other unsafe code that calls the same queue but doesn't depend on it for safety wouldn't need to wrap it in robust blocks.

The more I think about it the more I dislike the added verbosity. However, this information would be useful to actually audit unsafe code.

And, if you think that most calls will be robust anyway, a solution would be to annotate non-robust calls instead. (However, the benefit of a robust block might not be the block per se, but some // ROBUST comment that says exactly what invariant it upholds, in case it is non-obvious.)

Would this apply to the standard library? E.g. if your safety relies on Vec::pop doing the right thing, would that also need to be wrapped in robust? How about a + b computing the sum correctly on two integers?

1 Like

Yeah, it's fair that the stdlib should be trusted. And if robustness became a thing, there could even be an entry in Cargo.toml to specify that some specific, carefully vetted dependencies are trusted and unsafe code can assume they are correct.

About a + b: we can't suppose it's correct for arbitrary types, but either it is correct for integers or the compiler has a bug. And we can't defend against hypothetical compiler bugs.

1 Like

Another problem with this approach: let's say you update your dependency to a new major version, and a function that used to be safe is now unsafe. Your IDE might start displaying it differently, but you might not notice anyway if the code still compiles.

It's essentially a kind of type safety thing. You want to mark certain things explicitly to make sure the interface is understood the same way from both sides, and this is one of them. You want an explicit unsafe marker on an unsafe call to signify that the caller intended to call an unsafe function.

That's not the interesting property, though. Important, yes, but also mostly trivial. The interesting part is that the first self.len elements are initialized, and the rest are properly dropped.

Up to a point, it's possible. You can make safe functions return zero-sized opaque tokens which represent witnesses to some propositions, which can later be consumed by unsafe operations. Unfortunately, that's not really scalable, since there can be an unbounded number of interesting guaranteed and implied properties, and there is no way to compose and decompose such tokens in general. Currently we really need to use something like kani or Creusot to deal with proofs.
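
A rough sketch of the token idea (hypothetical names; this simplified token carries the checked slice rather than being zero-sized, and a properly zero-sized version needs lifetime branding, which is part of why it doesn't scale):

/// Witness that `idx` was checked to be in bounds for the stored slice.
pub struct InBounds<'a, T> {
    idx: usize,
    slice: &'a [T],
}

/// The only way to obtain the witness is to pass the check.
pub fn check_in_bounds<T>(slice: &[T], idx: usize) -> Option<InBounds<'_, T>> {
    (idx < slice.len()).then_some(InBounds { idx, slice })
}

pub fn get_witnessed<T>(token: InBounds<'_, T>) -> &T {
    // SAFETY: the witness can only be constructed via `check_in_bounds`,
    // which verified `idx < slice.len()`, and a shared slice cannot change
    // length while the borrow stored in the token is live.
    unsafe { token.slice.get_unchecked(token.idx) }
}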

That presupposes that there actually are explicit invariants to maintain. That's a valid assertion for the public API, or at least for private functions. That's not something you can claim when we're just talking about blocks of code inside of a function's implementation. I'd say the purpose of unsafe blocks is that we can explicitly delineate where we need to rely on non-checkable assumptions, or provide some extra assumptions to the following code.

See, I think we can agree that the usage of unsafe blocks in the wild can be inconsistent, and is often subjective. And we can debate whether something like String::pop should have a small or large unsafe block.

But as far as I can tell, you take an extremist position: "there is no single definitive answer, so let's just use single-operation blocks, since that's a rule which can be unambiguously defined". And I think you're throwing out the baby with the bathwater. Small or large, the unsafe block is still important information: the author thinks that subtle stuff is going on here. And you propose to mechanically erase that information, for what benefit exactly?

That sounds like a position held in some circles: comments can be outdated and wrong, nobody reads documentation, so let's not write any comments or docs at all. Code is the documentation, variable/function names are your comments.

And it's not an improvement, it's just erasure of information. An imperfect source of information is still usually better than no information at all, and wrong comments are valuable if only because they show me what the author was thinking at the time, and indicate that the following code should be carefully scrutinized. Erasing that information isn't in any way an improvement, it's laziness with a spin of ideology.

That's a major API break. This just shouldn't be done by responsible maintainers.

Bugs like that are in the same category as "what if this O(1) function changes its implementation to be exponential" or "what if File::read decides to format your hard drive".

Since the unsafe mental model was mentioned, let me chime in quickly to try to answer some questions from my point of view (even though it seems to me the discussion is going off-topic; the original topic seems to have been essentially answered by the recommended lints given in the first 5 comments).

I agree and I also don't like it too much. This is only relevant for unsafe reviews, to avoid forcing unsafe reviewers to review correctness in addition to soundness. However, with better tooling that could track what was actually reviewed during an unsafe review (e.g. which set of features, which APIs, etc.), the unsafe reviewers could document that they didn't review robustness. So I hope to see crate authors providing guarantees about the correctness of their code eventually. Ideally, this would be accompanied by proof artifacts (Prusti, Creusot, etc.).

Not robust { ... } blocks, this is a new concept I'll address below. But you'll need to document your public API as robust (at least the functions that are robust, i.e. have a correctness guarantee). For example:

/// Sorts a slice of integers.
///
/// # Robustness
///
/// The slice is sorted when the lifetime ends.
robust fn sort(xs: &mut [i32]);

In the unsafe mental model, I suggested that the standard library is considered to be robust by default, because they put a lot of effort on correctness and it's used by many, so bugs will quickly be found.

That's an interesting idea. Essentially robust { ... } would be a code location where you can put facts into your proof context. I like the idea, but I'm worried (like you) that this will be either too restrictive (not enough facts in the proof context) or too verbose (essentially needing to annotate a lot of things). I think the current status quo of implicitly adding everything to the proof context is more maintainable (at least without tooling). Being explicit may help unsafe review, but it hinders readability at the same time, so it's hard to judge. I've created Discussion about robust blocks · Issue #19 · ia0/unsafe-mental-model · GitHub to track this idea.

I don't think this is what the unsafe marker signifies. That's conflating a different meaning with what unsafe actually signifies: APIs with conditions that have to be satisfied by the caller.

There are other ways to document subtle code, for example:

// This is extremely subtle: ...

Every API has conditions which must be satisfied by the caller. unsafe specifically means "code here can violate memory safety", and it's up to the author's judgement how big that dangerous part is. It certainly isn't "this specific pointer dereference"; that's practically useless for auditability.

1 Like

I meant specifically: APIs that have undefined behavior unless the caller satisfies certain conditions.

I think "memory" is a red herring. It's about UB in general. But that's a side topic.

You suggest it's about marking parts of code that are error-prone according to some arbitrary judgement. I consider that a complete misconception. It's not about error-prone code on some sliding scale, it's about interfaces that have UB conditions.

Not really the same kind of thing: your examples are regressions, my example is a stricter compile-time check without a regression in functionality. Maybe it was a bug in their library, they forgot that the API is unsafe and forgot to mark it as such. It happens. So in the new release they mark it appropriately to make sure all users have to update their code if they previously used it as safe.

But you're supporting a coding style that basically ignores the details of which APIs exactly are unsafe, just think about the risk in the whole function holistically by some imprecise judgment. That to me is a wrong way to think about unsafe markers.

1 Like

After going through all of the above discussion, I found that the Recommended lints lack a check for the situation where "there is only one unsafe operation in an unsafe block, but there are other safe operations in the same unsafe block." E.g. the else if branch in fn grow:

pub fn grow(&mut self, new_cap: usize) {
    unsafe {
        let (ptr, &mut len, cap) = self.triple_mut();
        let unspilled = !self.spilled();
        assert!(new_cap >= len);
        if new_cap <= self.inline_size() {
            ..............
        } else if new_cap != cap {
            let mut vec = Vec::with_capacity(new_cap);
            let new_alloc = vec.as_mut_ptr();
            mem::forget(vec);
            ptr::copy_nonoverlapping(ptr, new_alloc, len); // only one unsafe op here
            self.data = SmallVecData::from_heap(new_alloc, len);
            self.capacity = new_cap;
            if unspilled {
                return;
            }
        }
        ..........................
    }
}

I think limiting unsafe blocks to the strictly necessary code is relatively easy to do right now (I'll also try adding such a lint to clippy first). Better yet, as suggested above, an unsafe block could contain the prerequisites for an unsafe operation (which makes it easier to review and to read the comment), perhaps by means of something like taint tracking, which I don't have much of an idea about yet. All in all, limiting an unsafe block to the minimum necessary scope is intuitive and easy to implement, and I will try to implement this lint first; at the very least it will ensure that there is not too much code in an unsafe block that has nothing to do with true unsafe operations.

1 Like

This is largely what unsafe fields try to achieve -- they might not cover every such case, but they do cover set_len. I think their latest incarnation is
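
For readers who haven't seen the proposal, the rough idea is a field whose mutation is an unsafe operation (hypothetical syntax; the actual RFC may differ in the details):

struct Vec<T> {
    data: *mut T,
    capacity: usize,
    // Hypothetical unsafe field: reading stays safe, but writing it (or
    // handing out a `&mut` to it) requires an unsafe block, forcing a
    // SAFETY comment at every place where `len <= capacity` could break.
    unsafe len: usize,
}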

2 Likes

This is a recurringly debated topic:

  • should the unsafe scopes be tightly scoped to the directly-capable-of-causing-memory-unsafety operations to be able to spot them and make sure we reason about them?
  • (x)OR should they be larger, encompassing the non-unsafe code which validates or ensures the preconditions needed by said unsafe ops?

Both points are valid and good, so the issue here lies in Rust forcing that (x)OR on us.


FWIW, what I am now personally striving to do is using #[allow(unsafe_code)] {} as the latter scope, so as to be able to use ultra fine-grained unsafe {} scopes for the former:

  1. #![warn(
        unsafe_code,
        unsafe_op_in_unsafe_fn,
        clippy::multiple_unsafe_ops_per_block,
        clippy::undocumented_unsafe_blocks,
    )]
    

    at the root of the src/lib.rs

  2. #[allow(unsafe_code)] {
        // ...
    }
    

    around areas of code (bonus: even atop impl blocks or whatnot) over which the safety reasoning encompasses.

  3. Fine-grained unsafe {} blocks.

The annoyance at the moment is that unsafe fn is still deemed to be unsafe_code, despite the unsafe_op_in_unsafe_fn :pensive:


Example

pub fn grow(&mut self, new_cap: usize) {
    #![allow(unsafe_code)]
    // Safety invariants: ...
    let (ptr, &mut len, cap) = self.triple_mut();
    let unspilled = !self.spilled();
    assert!(new_cap >= len);
    if new_cap <= self.inline_size() {
        if unspilled {
            return;
        }
        self.data = SmallVecData::from_inline(unsafe {
            // SAFETY: ...
            mem::uninitialized()
        });
        let dst = self.data.inline_mut().ptr_mut();
        unsafe {
            // SAFETY: ...
            ptr::copy_nonoverlapping(ptr, dst, len);
        }
    } else if new_cap != cap {
        let mut vec = Vec::with_capacity(new_cap);
        let new_alloc = vec.as_mut_ptr();
        mem::forget(vec);
        unsafe {
            // SAFETY: ...
            ptr::copy_nonoverlapping(ptr, new_alloc, len);
        }
        self.data = unsafe {
            // SAFETY: ...
            SmallVecData::from_heap(new_alloc, len)
        };
        self.capacity = new_cap;
        if unspilled {
            return;
        }
    }
    unsafe {
        // SAFETY: ...
        deallocate(ptr, cap);
    }
}

As long as GitHub and other non-IDE reviewing environments are a thing, I don't think we can consider IDEs, on their own, to be sufficient tooling to help with these things (even if I agree they're great for the majority of cases where such tooling is being used).

2 Likes