Unsafe types and auto traits

So @RalfJung wrote up a pretty interesting blog post about the recent bug fix to MutexGuard:

In there, he draws some lessons about auto traits and in particular their relationship to "unsafe types" that I think are worth digging into some more:

Again, I think the solution here is that types like MutexGuard<T> should not automatically implement unsafe auto traits. What do I mean by “types like MutexGuard<T>”? The point that makes MutexGuard<T> special here is that there is an invariant attached with this type that all instances of MutexGuard<T> satisfy. [..] My proposal is to tell the compiler that this is the case.

In general, I strongly agree. The intention of the auto traits design was things would only be "automatically" Send and Sync when that was safe, obviously, but as this blog post demonstrates, the current design can't achieve that.

What went wrong, specifically, is that MutexGuard contains an &Mutex, which is considered (and righly so) to be a Sync type. But that is true only for external observers, those who are limited to Mutex's public API. MutexGuard has privileged access to the internals of Mutex, and hence it is not Sync. This distinction (between the public/private API surface of a type) was not considered when designing auto traits.

Ralf's proposal is to add unsafe fields, which is something that I am in favor of in general, and then leverage that around auto traits. The idea would be that any type with unsafe fields cannot implement any auto traits. I have mixed feelings about this. On the one hand, it makes a lot of sense, and may be necessary, but on the other hand it doesn't feel sufficient for two reasons:

  • One can easily neglect to call a field unsafe! The original design was intended to be robust against this sort of mistake. It basically said "we will consider a type "safe" if it only has access to safe things". (And then relies on the fact that one cannot engineer a data race (as opposed to a more general race, which we don't claim to prevent) from data-race free components that are composed together.)
  • There may be auto traits for which this is too strong. My main example would be a "deep freeze" trait that scans for UnsafeCell -- I forget just why I had wanted this recently, but I remember there was an example, and it's something that even unsafe code ought not to be violating (technically this is something that the Unsafe Code Guidelines will settle, but I think there is and has always been universal agreement on this point).
    • That said, this depends. Maybe we would say that an unsafe field can be mutated, even when shared.

So, a related thought that I had is that perhaps auto traits ought not to recurse solely on the fields of the type in question, but also on the accessible fields of the types they contain. In this case, MutexGuard would be !Sync by default, because it would have access to the UnsafeCell contained in Mutex.

One final thought is that we may very well have to tinker with the conditions under which one can add an auto trait impl in the first place. I've been reading into recent work on mixing inductive and co-inductive logic programming, and it proposes a basic rule that recursing between inductive (normal traits) and co-inductive (auto traits) predicates is inherently contradictory. We saw this ourselves in the various unsoundnesses that arose from mixing auto traits and super traits, and I suspect it may imply some kind of limits on the kinds of where clauses one can use in an impl Send for T impl.

2 Likes

So what you mean is, it should look at the fields of Mutex instead of looking at its impl Sync, or should it do both and ensure all requirements are satisfied?

Interesting point of view. That's not how I thought about it. I was more thinking along the lines of what @ubsan wrote on GitHub: MutexGuard<'a, T> really contains an &'a mut T. That's pretty clearly visible when you start to write down the invariants of these types formally (well, at least in the formalism we picked); it is also visible in the APIs that are provided (deref_mut and deref, in particular). I guess what you are saying here is that this all comes from the fact that MutexGuard<T> accesses the T in Mutex<T> knowing that it acquired the lock, and hence bypassing the public API. But I feel like there's more here than just accessing additional APIs; there's also no way for a user to even get a &Mutex from a &MutexGuard -- which is why MutexGuard can be Sync even when T is not Send. MutexGuard not just accessing more API than the public has access to, it is just different API.

I don't know =)

I think I meant "instead of", but I wasn't sure.

I'm not sure I understand. If I built a MutexFoo type on my own, independent from the standard library:

struct MutexFoo<'a, T> {
    x: &'a Mutex<T>
}

I could put any "logical invariants" I want on that type and it still would not be able to create a data-race, no?

That part about using different APIs doesn’t lead to races, it leads to a too restrictive auto trait bound. So it’s not a safety issue. Still I thought I should bring it up: The fact that there is an &Mutex field in MutexGuard makes it look like MutexGuard can only be Sync when &Mutex is Sync, but that’s not actually true.

Hm, when I say it this way, it actually sounds like something that should be fairly common, and should not really be an issue because it only leads to bounds being too strict. Never mind.

As I understand what’s going on with the MutexGuard (please correct me if I’m wrong), there are two problems here – first, the &Mutex<T> is not restrictive enough to make MutexGuard<T> to be Sync only when T is Sync. And the second problem is that the presence of &Mutex is overly restrictive – it requires Mutex<T>: Sync, thus T: Send to satisfy MutexGuard<T>: Sync, which is not strictly necessary (note that this overly restrictive issue affects only Send-but-not-Sync types put in the Mutex that you can use only in a single thread).

Both of these problems arise from the fact that the guard uses Mutex with different kind of API than a public one – (1) it accesses the T directly and (2) it promises not to use the lock method. I think the right solution would be to communicate that fact to the type system. I thought of introducing a wrapper type that indicates that we’re using different kind of API, eg.:

struct AssertSync<T>(T);

The meaning of this type is “I will manually provide the Sync property of this object”. This type can have eg. unsafe constructor and implement Deref, or instead of Deref it can have unsafe method to access the interior type. With such a wrapper, we could redefine the guard struct as:

pub struct MutexGuard<'a, T: ?Sized + 'a> {
    __lock: AssertSync<&'a Mutex<T>>, // deals with "overly restrictive Sync"
    __poison: poison::Guard,
    __phantom: PhantomData<&'a mut T>, // deals with "Sync not restrictive enough"
}

Adding just __phantom, as @ubsan suggested, will deal with the first (not restrictive enough) issue, but not with the second (overly restrictive), as @RalfJung noted. That’s why I tried to find a way to “disable a field from auto-trait analysis”.

Alternatively, the wrapper struct could be something different (this may require some help from the compiler, I guess):

/// Store X, but treat as Y for the purpose of borrow-checking and auto-traits.
pub struct IWillUseXAsItWasY<X, Y>(X);

Also, I have a related thought about implementing smart pointers (let’s say that smart pointer is a type that provides a direct access to a reference (eg. through Deref). So Box, Rc, Arc, MutexGuard, but not Mutex). I think that any implementor of a smart pointer that uses unsafe code should provide an appropriate PhantomData field (or other field providing equivalent semantics wrt. auto-traits and borrow-checking). This rule could be enforced with a lint and would lessen the likelyhood for library writers to make the same mistake as the MutexGuard.

As a further thought in this area, and a argument in favor of @RalfJung’s ideas around unsafe fields – which btw don’t seem contradictory from having some kind of understanding about privacy – I was thinking about types that offer a combination of safe+unsafe interfaces. Take Cell, for instance: it offers a helper to get a *mut T to the inside. Now, Cell as it happens is not Sync, but it is Send, and that is based on its safe interface. But if I own a Cell<T> and I use as_ptr(), then I might wind up creating bad aliases to my Cell contents, such that when I send it, bad things will happen.

In other words, it seems like auto traits are naturally referring to the “safe” API of a type, but any type can offer an extended unsafe API, and you should have a way to signal that you are using that API.

1 Like

Agreed.

However, I do not understand how unsafe fields help here? We do want Cell to be Send, after all. I assume you don't want using as_ptr to affect auto trait impls, so I must be missing something.

Well, they don't solve the problem by any means, but I was imagining that if I had a Cell field and I know that I am will use the unsafe API in some non-local way (e.g., escaping the *mut u8 to somewhere else), then it would be appropriate for me to tag the field as unsafe:

struct Foo {
    unsafe x: Cell<u8>
}

After all, there are some extended invariants at play. (The type Cell<u8> doesn't tell the full story.)

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.