Are AsRef and Borrow supposed to be deterministic?

Whoever is familiar with the topic? I assume in the internals forum there are people who can say whether that's a good idea or not. I realize that the final decision is up to the rust-lang teams, but I wanted to get some feedback before I make a PR.

2¢: I think that it would be good if there was a single place this concept was defined, rather than wording it anew for each trait where it comes up.

In a previous thread, there was quite a long discussion about how hard this is to define, and how the existing library documentation for HashMap could be read as meaning “may do absolutely anything, but not UB”. That is not really a useful claim, since “anything” could be taken to include modifying application or system state in ways that are not technically UB but are equally destructive to the desired functioning of a program. Here's what I wrote trying to define this concept; it talks about the other end (the code that uses the trait) and about Hash rather than Borrow, but I think it is the same general principle.

The above is talking about HashMap, not Hash, but that's just the opposite end of the contract — though harder to nail down, since it's not clear what the abstraction/encapsulation-boundary is for “thing that uses Borrow or Hash”.

So, I think that this is a coherent concept, but it is not simple and so it deserves its own name and reference page, not just an attempt to explain it in each trait or type whose contract involves it.


I've really wanted something like a guaranteed deterministic AsRef/Borrow for things like asserting invariants in a constructor and then using unsafe to perform subsequent conversions:

use std::str;

/// Wrapper type which *owns* an inner string-like buffer (no lifetime)
pub struct StringLike<Buf: AsRef<[u8]>> {
    // Owned buffer; deliberately not borrowed
    buffer: Buf
}

impl<Buf: AsRef<[u8]>> StringLike<Buf> {
    pub fn new(buffer: Buf) -> Result<Self, str::Utf8Error> {
        // Ensure buffer contains valid UTF-8
        str::from_utf8(buffer.as_ref())?;
        Ok(Self { buffer })
    }
}

impl<Buf: AsRef<[u8]>> AsRef<str> for StringLike<Buf> {
    fn as_ref(&self) -> &str {
        // (UN)SAFETY: `Buf` was checked for valid UTF-8 in the constructor,
        // but interior mutability could allow it to change(!!!)
        unsafe { str::from_utf8_unchecked(self.buffer.as_ref()) }
    }
}

(Note: per the comments, the goal is for this type to own the buffer and not have a lifetime. Borrowing the value would prevent mutation but would also add a lifetime to StringLike, when the point of the example is to have an owned type without a lifetime.)

...but notably this relies on deterministic AsRef as a safety invariant, which would seem to be a fraught endeavor. Interior mutability provides a backdoor to allow Buf to be changed behind our back, even though StringLike owns it.
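
To make the backdoor concrete, here is a minimal sketch of a buffer type whose AsRef<[u8]> hands out valid UTF-8 on the first call and invalid bytes afterwards. The names Flaky, VALID, INVALID and check_twice are made up for illustration, and the sketch deliberately sticks to the checked str::from_utf8 so that nothing here is actually UB:

```rust
use std::cell::Cell;
use std::str;

// Illustrative byte strings: the first is valid UTF-8, the second is not.
const VALID: &[u8] = b"ok";
const INVALID: &[u8] = &[0xff, 0xfe];

/// Hypothetical buffer that counts `as_ref` calls through a `Cell`,
/// so it can change its answer even behind a shared reference.
pub struct Flaky {
    calls: Cell<u32>,
}

impl Flaky {
    pub fn new() -> Self {
        Flaky { calls: Cell::new(0) }
    }
}

impl AsRef<[u8]> for Flaky {
    fn as_ref(&self) -> &[u8] {
        let n = self.calls.get();
        self.calls.set(n + 1);
        // Valid on the first call only, invalid on every later call.
        if n == 0 { VALID } else { INVALID }
    }
}

/// Runs the UTF-8 check twice, mirroring "validate in the constructor,
/// convert later", and reports whether each check passed.
pub fn check_twice<B: AsRef<[u8]>>(buf: &B) -> (bool, bool) {
    let first = str::from_utf8(buf.as_ref()).is_ok();
    let second = str::from_utf8(buf.as_ref()).is_ok();
    (first, second)
}
```

With a Flaky buffer the constructor-style check passes and the later one fails, which is exactly the situation where from_utf8_unchecked in StringLike::as_ref would be unsound. An ordinary Vec<u8> passes both checks.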


I'd say just submit it and see what T-libs-api says about it.

I don't know if it can say "must", even in just a GIGO-not-UB sense, since that requirement wasn't there before. But at least a "callers will generally expect that this returns conceptually-the-same thing every time, and thus you should stick to that behaviour"-style note seems perfectly reasonable.


"Every time" is insufficient wording. Vec can't do that, for example. "From consecutive calls with no other operations"?


I think interior mutability is a much bigger concern than non-determinism / randomness. Any callback can modify the value, not to mention other threads.

In this particular case, you could create a new function, for example concat_deterministic (kind of like we have sort and sort_unstable), that (safely) assumes that borrow will be deterministic. If it isn't, it will return an incorrect (but still sound, no UB) result.
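
A rough sketch of what such a function could look like, assuming a free function over a slice of Borrow<[T]> values. The name concat_deterministic comes from the suggestion above; the signature and body are my own guess, not an existing std API:

```rust
use std::borrow::Borrow;

/// Hypothetical helper: concatenates the slices borrowed from `parts`.
///
/// It calls `borrow` twice per element, once to size the output and once
/// to copy, and *assumes* both calls return the same slice. If they do
/// not, the capacity hint is merely wrong and the contents may be
/// unexpected, but everything stays in safe code, so there is no UB.
pub fn concat_deterministic<T: Clone, V: Borrow<[T]>>(parts: &[V]) -> Vec<T> {
    // First pass: total length, used only as a capacity hint.
    let total: usize = parts
        .iter()
        .map(|p| Borrow::<[T]>::borrow(p).len())
        .sum();

    let mut out = Vec::with_capacity(total);

    // Second pass: copy the elements. `extend_from_slice` grows the
    // vector if the hint turned out too small.
    for p in parts {
        out.extend_from_slice(Borrow::<[T]>::borrow(p));
    }
    out
}
```

The point of the design is that determinism is only a performance and correctness assumption here, never a safety one: the worst a misbehaving Borrow impl can do is produce a garbage result.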

Adding a semantic requirement on even basic traits like Borrow or Eq to be deterministic even in the face of interior mutability feels pointless, since Cell for example would instantly violate those rules.

Finally, an example of a trait implementation that (at least to me) makes sense to actually randomize: Debug for a password struct. Each debug print would generate a new unique salt, and then show the salt and the password hashed with that salt. That way, you can check from a debug log whether it is a particular password.
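
Something like the following sketch, where Password, salted_hash and log_matches are hypothetical names. To keep it std-only I'm abusing RandomState as a source of fresh salts and DefaultHasher as the hash; neither is cryptographic, and DefaultHasher's algorithm is not guaranteed to stay the same across Rust releases, so a real version would want a proper password hash:

```rust
use std::collections::hash_map::{DefaultHasher, RandomState};
use std::fmt;
use std::hash::{BuildHasher, Hash, Hasher};

/// Hypothetical password wrapper whose `Debug` output is salted.
pub struct Password(String);

impl Password {
    pub fn new(s: &str) -> Self {
        Password(s.to_string())
    }
}

/// Stand-in for a real salted password hash (NOT cryptographic).
pub fn salted_hash(salt: u64, password: &str) -> u64 {
    let mut h = DefaultHasher::new();
    salt.hash(&mut h);
    password.hash(&mut h);
    h.finish()
}

impl fmt::Debug for Password {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Each `RandomState::new()` is seeded differently, so hashing
        // nothing with it gives a fresh salt on every debug print.
        let salt = RandomState::new().build_hasher().finish();
        let hash = salted_hash(salt, &self.0);
        write!(f, "Password(salt={:016x}, hash={:016x})", salt, hash)
    }
}

/// Checks a candidate password against a (salt, hash) pair read from a log.
pub fn log_matches(salt: u64, hash: u64, candidate: &str) -> bool {
    salted_hash(salt, candidate) == hash
}
```

Two debug prints of the same value then show different salts and hashes, yet anyone who knows a candidate password can still test it against a logged line with log_matches.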
