Why even unused data needs to be valid

RalfJung · July 15, 2020, 11:35am

I think you are mixing up Rust's two kinds of invariants here. Only a violation of the validity invariant is UB. The validity invariant is fixed by the language spec; the user has no influence over it. In contrast, what you are describing is a safety invariant.

The compiler doesn't care when code violates safety invariants; it doesn't even know what safety invariants are. You only get actual, Miri-detectable, "language UB" once the code does something that is specified as UB in the reference.
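To make the "language UB" side concrete, here is a minimal sketch using `bool`, whose validity invariant only allows the bytes 0 and 1. The helper `bool_from_byte` is a hypothetical name made up for this illustration; the UB-producing line is left commented out so that the snippet itself stays well-defined.

```rust
/// Hypothetical helper: converts a byte to `bool` only if the byte
/// satisfies the validity invariant of `bool` (it must be 0 or 1).
fn bool_from_byte(b: u8) -> Option<bool> {
    match b {
        0 => Some(false),
        1 => Some(true),
        _ => None,
    }
}

fn main() {
    // This would be language UB at the moment the value is produced,
    // even if `b` were never used afterwards:
    // let b: bool = unsafe { std::mem::transmute::<u8, bool>(2) };

    // The checked conversion rejects the invalid byte instead.
    assert_eq!(bool_from_byte(1), Some(true));
    assert_eq!(bool_from_byte(2), None);
}
```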

When libraries specify assumptions they make about user code, violations of those assumptions do not necessarily lead to language-level UB, but they could. We could call this "library UB", and it basically means you are leaving the stability guarantee provided by the library and may encounter undocumented behavior (which may or may not be language UB now, and that could change in the future as well with library upgrades).

For example, it is not UB to create a non-UTF-8 &str. But it could be UB to call a &str-taking method on such an ill-formed str (depending on what that method does, it may crucially rely on the contents being valid UTF-8). The ill-formed str violates the safety invariant but satisfies the validity invariant (the latter is the same as the validity invariant of &[u8]).
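The same split can be sketched with a user-defined type instead of str. Everything below is hypothetical and made up for illustration (`AsciiBytes`, `new_unchecked`, `first_is_digit`); it is not how the standard library implements str. The safety invariant is "every byte is below 128", while the validity invariant is just that of &[u8].

```rust
/// Hypothetical wrapper. Safety invariant: every byte is ASCII (< 128).
/// Validity invariant: the same as for `&[u8]`.
pub struct AsciiBytes<'a>(&'a [u8]);

impl<'a> AsciiBytes<'a> {
    /// Checked constructor: upholds the safety invariant.
    pub fn new(bytes: &'a [u8]) -> Option<Self> {
        if bytes.is_ascii() { Some(AsciiBytes(bytes)) } else { None }
    }

    /// # Safety
    /// The caller promises that every byte is < 128. Breaking that promise
    /// is not language UB by itself, only "library UB".
    pub unsafe fn new_unchecked(bytes: &'a [u8]) -> Self {
        AsciiBytes(bytes)
    }

    /// Does not rely on the safety invariant.
    pub fn len(&self) -> usize {
        self.0.len()
    }

    /// Crucially relies on the safety invariant for the unchecked index.
    pub fn first_is_digit(&self) -> bool {
        const TABLE: [bool; 128] = {
            let mut t = [false; 128];
            let mut i = b'0';
            while i <= b'9' {
                t[i as usize] = true;
                i += 1;
            }
            t
        };
        match self.0.first() {
            // SAFETY: the safety invariant guarantees `b < 128`.
            Some(&b) => unsafe { *TABLE.get_unchecked(b as usize) },
            None => false,
        }
    }
}

fn main() {
    let ok = AsciiBytes::new(b"7up").unwrap();
    assert!(ok.first_is_digit());
    assert!(AsciiBytes::new(&[0xFF]).is_none());

    // Violates the safety invariant, but the value is still valid.
    let bad = unsafe { AsciiBytes::new_unchecked(&[0xFF, b'a']) };
    assert_eq!(bad.len(), 2);
    // Calling `bad.first_is_digit()` would index out of bounds: language UB.
}
```

Creating `bad` corresponds to creating the non-UTF-8 &str; only the call to a method that relies on the invariant would turn the library-level violation into language UB.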
