I think I see where our difference of opinions lies. (I do agree with a large majority of what you've said.) Essentially, I don't want to demand more restrictions on callers than I actually use. Additionally, forcing unsafe on users adds friction, and I'd want them to get something in exchange for that friction. If I don't feel comfortable claiming that what I provide is worth using unsafe (or a hypothetical weak_unsafe), then I won't make e.g. the traits used to interact with the DB be unsafe traits.
But I said I would mark the DB function with preconditions unsafe (based on what little I know about this case).
The DB may want to have a policy of programming defensively and limiting damage in case of its own bugs or violated preconditions causing its invariants being destroyed. That's fine, defensive programming is not contradicted by marking any API unsafe. Defense in depth is OK.
The other option, if you don't want to make the function unsafe, would be to make sure the database is still in a valid consistent state no matter how you call the function, and document what will happen if you call it wrong. It sounds like this doesn't really apply in this scenario because it sounds like the database could be arbitrarily corrupted potentially violating all the invariants assumed by the documentation of other functions, so it's probably better to just make it unsafe.
One could also explicitly allow "corrupted database states" as a legit possibility, and explicitly document what each function will do with corrupted states. Then it's fine, no need for unsafe, "corrupted database states" become part of the abstraction.
My rule is that a safe API should:
- never violate the documented invariants for any possible inputs
- document the behavior correctly for all possible inputs
One could explicitly allow "broken" states, but then I'd have to include that possibility in the documentation of the type and make sure the documentation of all functions is still always correct including in all such broken states. So then those "broken" states no longer violate the stated invariants and become part of the abstraction.
For example, I could explicitly describe TwoDigit like this:
/// A two-digit number or a non-number.
pub struct TwoDigit(u8);
A non-number would be similar to floating point NaN.
And then I'd document functions like top_digit like this:
impl TwoDigit {
// For numbers: returns the higher of its two digits.
// For non-numbers: returns an arbitrary value.
pub fn top_digit(self) -> u8 { self.0 / 10 }
}
and then new_unchecked wouldn't need to be unsafe:
impl TwoDigit {
/// For x < 100: returns the number x.
/// For x >= 100: returns a non-number.
pub fn new_unchecked(x: u8) -> Self {
Self(x)
}
}
But normally that's not the abstraction I want to represent because that's less comprehensible and less usable, less type-safe than ensuring that TwoDigit always represents a two-digit number.
Similarly, the DB could explicitly document, for instance: "if the inputs are such and such, the function may delete arbitrary rows in this table". Then the integrity of the database is maintained, and the function correctly documents behavior, so then such a function can be safe.
[T]::sort is fine: it explicitly allows all possible comparators, doesn't violate any invariants, and documents what happens for all possible comparators. So it doesn't need to be unsafe. The slice is still always in a valid state at the end, no slice invariants are violated. We have it correctly documented what the function will do - it will panic or arbitrarily permute the elements for non-linear orders:
If the implementation of
OrdforTdoes not implement a total order, the function may panic; even if the function exits normally, the resulting order of elements in the slice is unspecified.
I think [T]::sort is actually a good place to look to learn lessons about this. A lot of research went into Rust's current sorting algorithms, and that included a survey of safety properties of existing C, C++ and Rust sorting algorithms (which was mentioned earlier in this thread in another context, but is worth looking at on its own). Here are three of the safety properties of sorts that some sorting algorithms violate:
- Some existing sorting algorithms that have undefined behavior if the
Ordimplementation is incorrect. - Some existing sorting algorithms that don't put the elements back into the list in the correct order if the
Ordimplementation panics, leading to duplicated elements in the list if when the list itself is unwound (or the unwind is caught). - Some existing sorting algorithms discard changes made via interior mutability if the
Ordimplementation panics.
Which of these are memory safety problems? Obviously, 1 is – undefined behavior is always a problem and the sorting library can't rely on the safe trait Ord to be correct. But 2 and 3 are less clear, especially if the sorting algorithm puts requirements on which types it can sort.
One of the sorting algorithms the analysis looked at fails condition 2. It was written entirely in safe Rust, and requires that the types being sorted are Copy + Default (thus doesn't automatically create instant UB when it fails it). Presumably, due to being written in safe Rust, most people would consider it to be memory-safety sound.
But the author of the analysis I linked considered this to be unsound, and I agree – it would be natural to wrap the sorting algorithm so that it could, e.g., be used to sort a list of raw pointers (that were dereferenced during the sort). In this case, the correctness of the algorithm is necessary to ensure memory safety – but the algorithm is correct, as long as the Ord definition is correct. This means that we could "naturally" create a situation in which all the unsafe code is considered sound by reasonable standards, and all the safe code it depends on is considered correct by reasonable standards, and yet somehow there's a memory safety bug in the program anyway (there's a potential double-free if Ord::cmp panics and then the list gets dropped during the resulting unwind).
[T]::sort avoids this sort of problem by documenting extra safety guarantees (above those which would normally be required for memory safety) so that unsafe code can remain memory-safe while using it, and I think that it's correct to do so.
I would explain this differently. Implementation 2 would just be incorrect since it would contradict the docs: the documentation promises the end result is always a permutation. If you changed the docs to permit duplication of elements, then implementation 2 would be correct. But then the calling code that assumes no duplication would no longer be correct because it would be assuming something that's not promised by the docs. Either way you simply have incorrect, buggy code. Unsafe calls relying on such buggy code can easily be unsound.
I'm imagining a situation in which the docs simply don't make a statement either way about the possibility of duplication during panic (which seems quite plausible to me). It's not a problem you'd be likely to think about unless you'd been told about it – most functions don't put guarantees on the state of referenced values if a callback panics.
That sounds like a problem of incomplete docs to me, not a problem of "what do we even define as unsound". If the contract is underspecified there isn't really anything you can say at all. This is a very common problem, and Rust have gone some ways towards higher rigour here than C/C++, where things like "I passed you a raw pointer, who owns it now" is frequently undocumented even.
Well, the alternative is for the documentation of every safe function that takes a mutable reference and a callback (or trait used like a callback) to define what value it leaves in the mutable reference if the callback panics. Maybe that should be happening, but i doubt it is in practice.
(FWIW, I think that this is a design flaw in Rust, and panicking while holding a mutable reference should permanently make the thing that the reference was borrowed from unusable, i.e. the same idea as mutex poisoning but extended to all mutable references. But it would be too much of a backwards compatibility break to do that now.)
Having each mutable reference also carry an addional flag on the side (which you may be able to store in a high bit, unless the architecture uses it for something else) sounds expensive. As does all the checks and logic around it.
The actual better option would be to not have panic or unwind at all. Allocations should be falliable for example. And if the program absolutely cannot continue that should be an abort, just like on embedded.
I wasn't planning to store the spare bits at runtime, but at compile-time: apart from catch_unwind (which already has safety rails although they're designed to be bypassable if necessary), basically the only change from current behaviour would be skipping the destructors of things that were mutably borrowed at the time (there might be creative strategies available to reclaim the memory to avoid a leak, but not the stored data).
I agree with you that going without panic-unwinds would likely be better (and that there's a reason that Rust lets you turn them off). They make the language much harder to reason about, and block useful abstractions like replace_with, so there's a real cost to supporting them. They do have uses for fault isolation, but I feel like panic-unwinds might be on the wrong side of the tradeoff (and that there might hopefully be some other way to isolate failing parts of a program) – although one problem is that Rust tests are currently based around panic-unwinds even though they don't conceptually have to be, which would be a huge backwards comparibility problem.
I think this is starting to get offtopic, though – maybe this needs to be discussed in a different thread.
Trying to follow the arguments that unsafe should be for memory safety only: then why are some methods in std marked unsafe, like str:from_utf8_unchecked or char::from_u32_unchecked?
Especially the latter is documented as " This function is unsafe, as it may construct invalid char values." ... so? AFAICT an invalid char by itself value wouldn't be memory unsafe. As it takes a u32 it also wouldn't fall into potentially uninitialized memory territory. I.e., it strikes me as very much similar to @tczajka's TwoDigit::new_unchecked.
You could model something like panic unwind as an effect, that is part of the function signature. Just like async or const for example. That way it becomes obvious where panics could happen.
I think this was discussed above (or maybe in another recent thread) already. Char has niches based on the valid range for Unicode. Thus it can lead to memory unsoundness.
My question there is: how? And why wouldn't the same niches in TwoDigit::new_unchecked or Prime::new_unchecked not lead to the same potential unsoundness?
An invalid char is actually directly memory-unsafe – for any zero-sized type Zst, the compiler currently stores the Err option of Result<char, Zst> using the bit pattern 0x110000 (which is invalid for a char). If you had a char with that bit pattern, you could create an Ok of it and it would read back as an Err containing a ZST of your choice – and being able to conjure ZSTs breaks memory safety, as they're often used to represent proofs that some memory-safety property applies.
This happens because the compiler is aware (through magic rustc-specific attributes) of a number of invalid bit patterns for char, and thus is able to use those patterns to reduce the size of enums.
There's currently work underway in the compiler internals to make the magic usable even outside the standard library, but it's still experimental and it's likely going to be a while before people can use it on their own types.
But that boils down to "because char is in std, and the compiler makes use of knowledge of its internal layout for optimizations." So either:
a) if I re-implement char in my own crate, then these methods shouldn't be marked unsafe, only because the compiler wouldn't make these optimizations. This sounds highly inconsistent to me.
Also, technically, a 0x11000000 char then is not memory unsafe itself, only its use due the trusting assumptions made by the compiler. By the arguments in this thread (as I read them), these methods instead shouldn't be unsafe - the compiler instead just shouldn't make these optimizations, as those could be potentially memory unsafe, or double-check the char before every time.
b) once the machinery for these niches is made available, it could be applied to Prime::new_unchecked or TwoDigit::new_unchecked, which then means they absolutely should be marked unsafe, which goes against the arguments in this thread.
I suspect expressing niches based on prime numbers will never be possible, but range limited integers is absolutely planned as I understand it. And many people want to use alignment niches or high bits for pointers/references. Alignment niches would be equivalent to the "even" example.
The Prime example can be written to exploit niches in stable Rust today.
The actual examples don't really matter, what matters is the idea that they convey: having APIs enforcing a contract is valuable, independently of the implementation.
Thanks to lifetimes, I only have to look at the function signature to ensure that reference usage is valid. That's good.
Thanks to unsafe, I should only have to check invariants/preconditions at the abstraction boundaries. That would be good.
Modularity is how you get software engineering to scale. Asking people to be careful with "incorrect but not memory-unsafe" code is the opposite of that.
Only practical for u8, not for u32 or u64. Still, this can be useful, I believe Layout uses something like that for storing alignment compactly.
That's fine, the actual impl does not matter. As a user you should not care if my PrimeU64 impl uses an enum or a simple int internally. The niche usage is a nice optimization, but not what defines the API contract.
What defines the API contract is that Prime::get() always returns a prime and that you can trust it to always be the case.
If I were the author of the Prime example, I would ideally want it to fulfill two properties.
- The type invariant (always a prime) can be fully relied upon by downstream crates, including for soundness.
- The crate is easy to audit for use in projects that ban unsafe.
(1) implies that if a new_unchecked method exists, it must be unsafe to call. (2) would make me shy away from using an unsafe field, as, in absence of downstream unsafe code, the crate isn't doing anything inherently unsafe, and requiring an unsafe block for every access of the stored value would be undesirable.
My two cents is that such a field should never exist. Even if it's not insta-UB at the language level as currently implemented, I would consider having an &str to non-UTF8 data to be straight up invalid, regardless of if / how it might actually be observed in practice. Instead, it should be a &[u8] or bstr or some other type that does not imply being valid UTF-8 by its mere existence.
This case seems like it would be served well by the suggested UnsafeRead<T> wrapper. That also has the advantage that UnsafeRead<T> could manually implement Debug just to output unsafe value or some such, thus making it so you could soundly #[derive(Debug)] on your struct.
Overall, I agree with those who have said that making a field unsafe to write feels much more strongly motivated than to make it unsafe to read.
One thing I'm not sure I understand about the Vec use case is why there is so much emphasis on len. After all, safe code within the vec module could just as easily replace (or shrink) buf and similarly trigger undefined behavior. Overall, this seems similar to the way unsafe is designed in general, as exemplified by raw pointers at the language level: creating a raw pointer isn't unsafe, but dereferencing it is. In the end, the onus is on the unsafe code to assert that the pointer is valid, either by leveraging privacy to ensure it knows where the pointer came from, or by using an unsafe fn to bubble the safety requirements up to the caller.
Setting an incorrect len or buf seems exactly as (un)safe as creating an invalid-to-deference raw pointer. The potential UB happens when using that len or buf or pointer to actually do a read, and an unsafe block is already required for that in all cases. It's an invariant of Vec, not the individual fields, that buf has an allocation holding at least len valid elements.
In a sufficiently complicated module, I can see the desire of having guardrails along the lines of "please remind me when I set this value that I have invariants to uphold," but based on my understanding of the proposal and the language, unsafe fields feel like a bigger dilution of unsafe to me than having unsafe fns at the privacy boundary to uphold type invariants that may or may not now or in the future be used by code internal or external for soundness, which itself seems contentious in this discussion.
Perhaps some kind of lint attribute would make more sense to enforce field updates to explain how they are upholding the type's invariants?