In the Reddit and Hacker News discussions of my recent blog post, some people raised concerns about the observation I based the post on: that if your program crashes, the bug can be outside of an unsafe block. I think the main concern was that this makes unsafe significantly less useful as a lint.
Now, of course unsafe is not a lint; it has a fairly precise definition: stuff is marked unsafe if it can violate the safety guarantees of your program; in particular, if it can cause a crash.
But then, unsafe also serves to draw the reader’s attention and to make sure they double-check this piece of code. Just think of the GitHub bots that add extra warnings to PRs when you modify unsafe code.
Certainly, it’d be nice if that bot could have caught evil.
I think there is a way to extend unsafe to answer these concerns. I am not saying this is what I want to happen, I am not decided yet - but well, my brain came up with this idea, so now I’m dropping it here to see whether anybody thinks it is useful.
Proposal: Unsafe types
My proposal is to add the notion of an unsafe type to Rust, e.g.
unsafe pub Vec<T> { ... }
The consequence of this annotation at the type would be that writing to a private field of this type becomes an unsafe operation (this includes “writes” of constructors). Furthermore, taking a mutable borrow of such a field becomes unsafe.
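The proposed annotation does not exist in Rust, but its effect on code outside the defining module can be roughly approximated today. The following is a minimal sketch under that assumption; the module parts, the type Parts, and the function grow_by_one are hypothetical names made up for illustration. The fields stay private, reads are safe, and everything that writes to a field (including construction) is exposed only as an unsafe fn. Unlike the proposal, this does nothing for code inside the module itself, where plain field writes remain safe.

```rust
pub mod parts {
    /// Invariant the compiler knows nothing about: len <= cap.
    pub struct Parts {
        len: usize,
        cap: usize,
    }

    impl Parts {
        /// Constructing a value "writes" every field, so it is unsafe too.
        ///
        /// # Safety
        /// The caller must ensure len <= cap.
        pub unsafe fn new(len: usize, cap: usize) -> Self {
            Parts { len, cap }
        }

        /// Reads cannot break the invariant, so they stay safe.
        pub fn len(&self) -> usize {
            self.len
        }

        pub fn cap(&self) -> usize {
            self.cap
        }

        /// # Safety
        /// The caller must ensure len <= self.cap().
        pub unsafe fn set_len(&mut self, len: usize) {
            self.len = len;
        }
    }
}

/// Code outside the module has to acknowledge the invariant explicitly.
pub fn grow_by_one(p: &mut parts::Parts) -> bool {
    let new_len = p.len() + 1;
    if new_len > p.cap() {
        return false;
    }
    // Sound: we just checked new_len <= cap.
    unsafe { p.set_len(new_len) };
    true
}
```

The unsafe block in grow_by_one is the reminder the proposal is after: whoever changes that line is told that there is an invariant to re-check.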
Motivation
The motivation for this is to “fix” the fact that code like evil in my blog post can cause crashes, without being unsafe. The reason it can cause crashes is that it violates invariants. The reason that this code is not considered unsafe is that Rust does not know that it violates invariants. The solution is to tell it.
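To make the problem concrete, here is a minimal sketch of the situation. It is a stand-in, not the code from the earlier post: the type Stack and the helper invariant_holds are hypothetical, and evil is only modelled on the function of the same name. The module contains a single unsafe block, in top, but the function that breaks things contains none.

```rust
/// A fixed-capacity stack.
/// Invariant (known only to this module): len <= buf.len().
pub struct Stack {
    buf: [u32; 4],
    len: usize,
}

impl Stack {
    pub fn new() -> Self {
        Stack { buf: [0; 4], len: 0 }
    }

    /// Maintains the invariant by refusing to push when full.
    pub fn push(&mut self, x: u32) -> bool {
        if self.len == self.buf.len() {
            return false;
        }
        self.buf[self.len] = x;
        self.len += 1;
        true
    }

    /// The only unsafe block in the module. It is sound only while
    /// the invariant len <= buf.len() holds.
    pub fn top(&self) -> Option<u32> {
        if self.len == 0 {
            return None;
        }
        Some(unsafe { *self.buf.get_unchecked(self.len - 1) })
    }

    /// Entirely safe code, accepted without any warning, yet it can
    /// break the invariant that top relies on.
    pub fn evil(&mut self) {
        self.len += 2;
    }

    /// Checks the invariant explicitly, for demonstration only.
    pub fn invariant_holds(&self) -> bool {
        self.len <= self.buf.len()
    }
}
```

After three pushes and one call to evil, len is 5 while the buffer holds 4 elements, so a later call to top would read out of bounds. The bug is in evil, but the compiler only ever asked anyone to look at top.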
So, semantically speaking, adding unsafe to a type means “the private fields of this type may carry additional invariants that you, compiler, do not know anything about”. This has no operational consequences, but it means that whenever someone is writing to such a field, that could potentially break the invariant. The compiler cannot know if this particular write is okay, but it can at least make you aware that there is something extra to check here.
Public fields cannot carry invariants, and are hence excluded from this treatment.
Of course, invariants could also be violated by taking a mutable borrow of such a field and writing through it, so this also has to be unsafe. What would be safe is taking a raw pointer of this field, so if possible &mut v.len as *mut _ could be considered safe. However, I assume that people actually rarely create pointers to such fields and send these pointers all around the world. That would be dangerous exactly because everybody writing through these pointers has to be aware of the invariants.
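Rust already draws this line for raw pointers: creating one is safe, and only writing through it needs unsafe. A minimal sketch, with Header and len_ptr as hypothetical names; it uses std::ptr::addr_of_mut!, which produces the raw pointer without going through an intermediate mutable borrow.

```rust
use std::ptr::addr_of_mut;

pub struct Header {
    len: usize,
}

impl Header {
    pub fn new() -> Self {
        Header { len: 0 }
    }

    pub fn len(&self) -> usize {
        self.len
    }

    /// Creating the raw pointer is safe: nothing has been written yet.
    pub fn len_ptr(&mut self) -> *mut usize {
        addr_of_mut!(self.len)
    }
}
```

A caller can obtain the pointer freely, but has to write unsafe { *p = 3 } to store through it, which is where the invariant has to be considered.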
Drawbacks
This adds more stuff to the language, and extra complexity should not be added without a good reason.
This does not automatically make anything safer, or point programmers to anything new. People still have to actually tell the compiler that a particular type has additional invariants; if they forget, we’re back to the status quo.
But I think programmers usually, intuitively, know about the distinction between unsafe code that relies on local invariants within the same function, and unsafe code that relies on invariants which are maintained as a coordinated effort of the entire module. They only have to remember once to tell the compiler that a particular type carries such invariants, and then the compiler will keep reminding them that they have to double-check every write.
I don’t think we have a way to mark just the “left-hand part of an assignment” as unsafe, so e.g. an assignment to the len field of a vector would now have to be
unsafe { v.len = f(); }
such that the unsafe block also covers the entire right-hand side of the assignment. That is wider than necessary, since only the write itself needs checking. In principle, we could allow
unsafe { v.len } = f();
but I don’t think unsafe l-values are a thing right now (and this is even worse, since this is only unsafe if the l-value is used for writing). One could write
*unsafe{ &mut v.len } = f();
but oh my, please not^^.
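A less exotic way to keep the right-hand side out of the unsafe block is a temporary binding. The sketch below shows this with the real Vec::set_len method from the standard library, which is already an unsafe fn; the functions f and drop_last are hypothetical, with f standing in for the f() above.

```rust
/// Stand-in for the right-hand side: ordinary safe code.
fn f(old_len: usize) -> usize {
    old_len.saturating_sub(1)
}

pub fn drop_last(v: &mut Vec<u32>) {
    // The right-hand side is evaluated outside of any unsafe block.
    let new_len = f(v.len());
    // Only the write is marked. Sound here: new_len <= v.len(),
    // and u32 has no destructor that would be skipped.
    unsafe { v.set_len(new_len) };
}
```

This costs an extra line and a name, but it keeps the unsafe block down to the one operation that can actually break the invariant.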
Alternatives
The effect of the type annotation could be expanded to public fields. This would make the rule simpler. I think this is useless for public types, since public fields of public types cannot carry any useful invariant, and instead of making the rest of the world write unsafe around writes to this field, you can just make it private. (Are there any examples of public types with additional invariants that also have public fields?) But maybe there’s a use-case here for public fields in private types. I don’t think, however, that this justifies adding an extra dependency on the visibility of the type.
Not just writes, but also reads could be considered unsafe.
This would be necessary if the invariant of the user on this field is actually weaker than the base type, so that reading from this field and assuming it has the announced type would be wrong.
However, we have some types that come with so little (read: no) a-priori promises that I can’t think of any case where this is useful, and having the given type of the field be a lower bound to its actual, semantic type is, I think, a useful piece of documentation.
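As an illustration of a type with essentially no a-priori promises, current Rust has std::mem::MaybeUninit. The sketch below uses it so that the declared field type really is a lower bound: storing a value is safe, and the unsafe step is the read that assumes more than the type says. Slot and its methods are hypothetical names.

```rust
use std::mem::MaybeUninit;

pub struct Slot {
    /// The declared type promises nothing about the contents.
    value: MaybeUninit<u32>,
    /// Invariant: if filled is true, value has been initialized.
    filled: bool,
}

impl Slot {
    pub fn empty() -> Self {
        Slot { value: MaybeUninit::uninit(), filled: false }
    }

    /// Storing into a MaybeUninit is always fine.
    pub fn fill(&mut self, x: u32) {
        self.value = MaybeUninit::new(x);
        self.filled = true;
    }

    pub fn get(&self) -> Option<u32> {
        if self.filled {
            // Sound only because of the invariant on filled.
            Some(unsafe { self.value.assume_init() })
        } else {
            None
        }
    }
}
```

Note that the invariant has moved to filled: a safe write of true to that field on an empty slot would make get unsound, which is the kind of write the proposal wants flagged.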
Instead of unsafe types, we could have unsafe fields, to mark the individual fields that carry additional invariants. These fields would have to be private. This may require more writing for types with many fields that carry invariants, but on the plus side this answers all questions about whether only private or also public fields are covered.
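Per-field marking can be approximated in today's Rust with a small wrapper type. This is a sketch under that assumption, not the proposed language feature: Guarded, Buffer, and all their methods are hypothetical names. Because the wrapped value is private to the guarded module, even the module that defines Buffer has to go through an unsafe fn to write the marked field, while its unmarked field stays freely writable.

```rust
pub mod guarded {
    /// Wraps a single field whose value is tied to an invariant.
    pub struct Guarded<T>(T);

    impl<T> Guarded<T> {
        /// # Safety
        /// The value must satisfy the invariant of the field it is stored in.
        pub unsafe fn new(value: T) -> Self {
            Guarded(value)
        }

        /// Reads stay safe.
        pub fn get(&self) -> &T {
            &self.0
        }

        /// # Safety
        /// The value must satisfy the invariant of the field it is stored in.
        pub unsafe fn set(&mut self, value: T) {
            self.0 = value;
        }
    }
}

pub struct Buffer {
    data: [u8; 8],
    /// Invariant: the stored length is at most data.len().
    len: guarded::Guarded<usize>,
    /// No invariant attached, so plain writes stay safe.
    name: &'static str,
}

impl Buffer {
    pub fn new(name: &'static str) -> Self {
        // Sound: 0 <= data.len().
        Buffer { data: [0; 8], len: unsafe { guarded::Guarded::new(0) }, name }
    }

    pub fn push(&mut self, byte: u8) -> bool {
        let n = *self.len.get();
        if n == self.data.len() {
            return false;
        }
        self.data[n] = byte;
        // Sound: n < data.len(), so n + 1 <= data.len().
        unsafe { self.len.set(n + 1) };
        true
    }

    /// Writing the unmarked field needs no unsafe block.
    pub fn rename(&mut self, name: &'static str) {
        self.name = name;
    }

    pub fn len(&self) -> usize {
        *self.len.get()
    }

    pub fn name(&self) -> &'static str {
        self.name
    }
}
```

The wrapper also shows the cost mentioned above: every field with an invariant has to be wrapped and accessed through get and set individually.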