Unsafe assertion invariants

As I understand Rust conception is compiler with strong invariants in safe mode and the manually maintaing these invariants in unsafe mode.

For now these statements are written in arbitrary form in the documentation:

/// This function is unsafe because it does not check that the bytes passed
/// to it are valid UTF-8. If this constraint is violated, it may cause
/// memory unsafety issues with future users of the `String`, as the rest of
/// the standard library assumes that `String`s are valid UTF-8.
pub unsafe fn from_utf8_unchecked(bytes: Vec<u8>) -> String

My suggestion is to add a standardized format for asserions:

/// #Invariatns
/// ```
/// assert!(String::from(bytes).is_ok());
/// ```
pub unsafe fn from_utf8_unchecked(bytes: Vec<u8>) -> String

These assertions could be placed inside a function and marked as debug_assert. But the difference is that the documentation in fact is part of the function signature and the user can easily read these statements without having to examine the function code itself. Also, the compiler can actually add these assertions (converting them to debug_assert*) to the beginning of the unsafe function or call them in debug mode (but there is problem: what if assert contain mutation? I don't know).

In addition as variant 'partial invariants', which will indicate that their violation is guaranteed to lead to an error, but their observance does not guarantee the absence of errors.

The goal of this proposal is to improve the experience of using unsafe features.

This proposal is just an idea 'it would be nice...'. I don't know how many unsafe function in std and how hard to update them all.

ps Did I just invent contract programming?)

Many unsafe pre-conditions cannot be expressed as simple assertions. So this will be applicable less often than you think.

Many unsafe functions already use debug assertions for verifiable pre-conditions. But doing that for from_utf8_unchecked is problematic, since the assertions has an O(n) cost, while the function itself is O(1). So the assertion would impose an unacceptable performance overhead in many use-cases.

Still, I think adding a machine readable way to specify pre- and post-conditions (be they safe or unsafe) could be valuable.

6 Likes

I think this project goal may meet your requirement. Instrument the Rust standard library with safety contracts

Many libstd functions already have debug_assert checking these things.

But doing that for from_utf8_unchecked is problematic, since the assertions has an O(n) cost, while the function itself is O(1)

Yes, but I talking about debug_assertion which doesn't added to release builds.

Many unsafe functions already use debug assertions for verifiable pre-conditions.

The difference is whether these statements are exhaustive. That is, whether following these statements guarantees the correct operation of the function.

I think this project goal may meet your requirement. Instrument the Rust standard library with safety contracts

Contract programming is some different thing. It looks like not Rust way: type String incapsulate invariant about correct utf8 (and do it in imperative way). Therefore, we pass the String type instead of &[u8] with utf8 assertion.

I think such contraptions are probably harmful.

Many libstd functions already have debug_assert checking these things.

My proposal is not only about automatic validation. Putting assertions in the documentation also makes the feature easier to use.

It's considered too slow, even for debug builds.

1 Like

Not to mention, if the caller is interested, String::from_utf8 already exists. One could make their own cfg-dependent call to _unchecked when needed if the assertion is actually interesting. I believe this is the case with most _unchecked APIs: there is an associated checked API that can be called when enforcement is wanted.

1 Like