First, a bit of terminology:
UB is a term of art. It's a bad name, and basically everyone involved in compilers agrees. It no longer means "undefined behavior" in the sense of "behavior which is not defined." Even in the C standard, UB is defined roughly as "behavior for which this standard imposes no requirements," whereas unspecified behavior covers cases where the standard allows two or more possibilities and does not say which one is chosen.
Unspecified behavior is the one somewhat close to the informal meaning, as it means something similar to "any valid behavior of the abstract machine," though in practice it is often limited to taking an unspecified choice of a given set of behaviors.
UB, however, is the mathematical contradiction. UB cannot happen, because it is not a thing that exists. If UB "happens", you no longer have a valid execution of the source language, and all restrictions on the program behavior never existed in the first place. This is why UB can time travel.
As a short, oversimplified summary: unspecified behavior puts no (or limited) restriction on the execution of the operation whose behavior is unspecified. UB puts no (and I mean no) restriction on the behavior of the entire program.
Is this what the original C specification authors had in mind when they first listed some behavior as undefined? Some will say yes, some will say no, but this is the current understanding, and it is how nearly all compiler authors/scholars/philosophers communicate about this.
Now, what this issue is about is that the behavior of std::collections is in fact unspecified without bound in the face of misbehaving implementations of core safe traits. And because the encapsulation boundary of std::collections is in fact all of std, said unspecified behavior could make arbitrary "safe" modifications to other items inside std's encapsulation boundary (via global state or otherwise), which could invalidate any correctness assumptions that consumer unsafe code makes about any std item.
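To make "misbehaving implementation of a core safe trait" concrete, here is a minimal sketch of one way to commit such a logic error in entirely safe code. `ShiftyKey`, `corrupt`, and `lookup_after_corruption` are hypothetical names made up for illustration; the idea is only that the key's `Hash` and `Eq` depend on interior-mutable state that changes while the key sits inside the map.

```rust
use std::cell::Cell;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical key type: its `Hash` and `Eq` depend on a `Cell`,
/// so they can change while the key is stored in a map.
struct ShiftyKey {
    id: Cell<u32>,
}

impl ShiftyKey {
    fn new(id: u32) -> Self {
        ShiftyKey { id: Cell::new(id) }
    }
}

impl PartialEq for ShiftyKey {
    fn eq(&self, other: &Self) -> bool {
        self.id.get() == other.id.get()
    }
}

impl Eq for ShiftyKey {}

impl Hash for ShiftyKey {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.id.get().hash(state);
    }
}

/// The logic error: change every key's identity while it is in the map.
/// No `unsafe` is involved; `Cell` allows this through a shared reference.
fn corrupt(map: &HashMap<ShiftyKey, &'static str>) {
    for key in map.keys() {
        key.id.set(key.id.get() + 100);
    }
}

/// Inserts a key, corrupts it, then looks it up under its original identity.
/// What the lookup returns is unspecified; callers should not rely on it.
fn lookup_after_corruption() -> Option<&'static str> {
    let mut map = HashMap::new();
    map.insert(ShiftyKey::new(1), "one");
    corrupt(&map);
    let found = map.get(&ShiftyKey::new(1)).copied();
    found
}

fn main() {
    // The printed value is an observation about one implementation,
    // not something the documentation promises.
    println!("lookup after corruption: {:?}", lookup_after_corruption());
}
```

The program compiles without `unsafe` and the map is now in a state its documentation calls a logic error. The question in this issue is how far the consequences of that state are allowed to spread.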
The correct solution is to bound the range of unspecified behavior. The minimal and "correct" way to do so is to specify that the unspecified misbehavior of an item is limited to other items/behaviors which are derived-from the misbehaving item. The difficult part is then to actually define which items are derived-from.
My first draft, and I think a reasonable upper bound, would be to say roughly:
The behavior which results from such a logic error is not specified, but will not result in undefined behavior; all safe functions remain safe to call. Misbehavior resulting from a logic error is restricted to just the collection which experienced the logic error, any values holding a borrow of the collection, and any values later produced by misbehaving values. Misbehavior can include (but is not limited to) panics, incorrect results, aborts, memory leaks, and non-termination, but does not include any behavior considered undefined.
This ensures that a HashMap misbehaving does not cause a Vec to misbehave, nor another unrelated HashMap, but allows everything that even potentially observes the misbehaving value to misbehave.
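As a rough sketch of what that bound would mean for user code: in the example below, `UnstableKey` (a hypothetical type, as are `CALLS` and `run`) has a `Hash` impl that returns a different value on every call. Under the proposed wording, only the map keyed by `UnstableKey`, and anything derived from it, may misbehave. The unrelated `HashMap<u32, u32>` and the `Vec<u32>` must keep working normally.

```rust
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical global counter used to make the hash unstable.
static CALLS: AtomicU64 = AtomicU64::new(0);

/// Hypothetical key whose `Hash` is inconsistent with its `Eq`.
#[derive(PartialEq, Eq)]
struct UnstableKey(u32);

impl Hash for UnstableKey {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Logic error: hash a call counter instead of the key's contents,
        // so equal keys (and even the same key) hash differently each time.
        CALLS.fetch_add(1, Ordering::Relaxed).hash(state);
    }
}

/// Exercises one misbehaving map next to two unrelated collections and
/// returns only the state of the unrelated ones.
fn run() -> (Vec<u32>, Option<u32>, usize) {
    let mut bad: HashMap<UnstableKey, u32> = HashMap::new();
    let mut good: HashMap<u32, u32> = HashMap::new();
    let mut plain: Vec<u32> = Vec::new();

    for i in 0..8 {
        bad.insert(UnstableKey(i), i);
        good.insert(i, i * 10);
        plain.push(i);
    }

    // Unspecified result: may or may not find the entry.
    let _maybe = bad.get(&UnstableKey(3));

    // These values come only from the unrelated collections.
    (plain, good.get(&3).copied(), good.len())
}

fn main() {
    let (plain, found, len) = run();
    println!("vec = {:?}, good[3] = {:?}, good.len() = {}", plain, found, len);
}
```

Anything read out of `bad` (its length, its iterators, values returned by `get`) falls inside the allowed misbehavior, which is why the sketch deliberately ignores that result.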
Individual collections can then put further bounds on misbehavior, if so desired.