Are AsRef and Borrow supposed to be deterministic?

The docs for AsRef and Borrow don't say they are supposed to always return the exact same value. (Borrow is supposed return something that compares the same, but technically it could still return different values that compare equal.)

I know for safety they can never be trusted, but where safety isn't an issue, would it be appropriate to treat these methods as always returning the exact same value?

For example, could slice::concat be changed from this:

fn concat<T, V: Borrow<[T]>>(slice: &[V]) -> Vec<T> {
    let size = slice.iter().map(|slice| slice.borrow().len()).sum();
    let mut result = Vec::with_capacity(size);
    for v in slice {
        result.extend_from_slice(v.borrow())
    }
    result
}

to this:

fn concat<T, V: Borrow<[T]>>(slice: &[V]) -> Vec<T> {
    let size = slice.iter().map(|slice| slice.borrow().len()).sum();
    let mut result = Vec::with_capacity(size);
    for v in slice {
        let tmp = v.borrow();
        if tmp.len() <= result.capacity() - result.len() {        
            result.extend_from_slice(tmp)
        }
    }
    result
}

The capacity check optimizes out vec reallocation code without compromising safety, but a weird borrow implementation could result in some items getting dropped.

1 Like

I think you wanted your example to be more generic than this. For slice: &[&[T]], slice.borrow().len() is equivalent to just slice.len().

I'm not sure even this is true. The docs say "In particular Eq , Ord and Hash must be equivalent for borrowed and owned values" but I think it's meant to mean "if both types implement these traits".

I'm all for semantic requirements that make sense. That's why we have Eq and not just a Fn(T, T) -> bool, or Ord and not just Fn(T, T) -> Ordering. Or course these cannot be relies upon in unsafe code for soundness, but you should be able to rely on them in safe code for "correctness" let's say.

I'm not sure if that is the right word, but you code should be free to panic, loop infinitely, leak an arbitrary amount of memory or just return wrong results if someone gives you a trait implementation that doesn't meet these semantic constraints.

When it comes to determinism, I'm not sure I can imagine a case where Borrow would benefit from not being deterministic. On the flip side, I'm not sure your example even benefits from it being deterministic that much:

Isn't the "capacity check" just moved out of the extend_from_slice? Is the implementation of extend_from_slice really anyhing else than "check if there is space, if there is not, extend the vector. Then copy the values". If you change that "check if there is space, if there is not, do nothing, otherwise copy the values" i don't see how that is any faster if no actual reallocations happen in either case (which they wouldn't if Borrow was deterministic).

1 Like

When I wrote "compare equal" I specifically meant it as a short for "Eq , Ord and Hash must be equivalent for borrowed and owned values".

Yes, the example could be simplified, but the original is doomed to use Borrow for back-compat:

The difference is in code bloat.

The first version is:

alloc
loop {
    if no capacity { 
       calculate new size and panic if capacity overflows
       realloc storage
       copy data over to new storage 
    }
    append
}

the second is:

alloc
loop {
    if capacity { append }
}

The code for reallocating the vector is a bit much, especially if used in a simple case like [a, b].concat(). The version with hoisted capacity check is almost just pure memcpys, with an extra branch that has to be there for safety.

Indeed, and that interpretation is then insufficient for the assumptions in the OP because you can have

struct NoEq { ... }
struct WeirdBorrow { ... }
impl Borrow<[NoEq]> for WeirdBorrow {
    // borrow that randomly returns
    // different length slices every call
}

and that's "OK" because [NoEq] doesn't implement Eq when NoEq doesn't.

You could also make the newtype implement Eq et al, but in a way that is insensitive to the length changes, and then make borrow() return random-length slice.

This is obviously very weird and doesn't seem useful at all. But my question is: can the libstd have a policy of garbage-in-garbage-out in such case?

And should the docs for Borrow and AsRef be updated to explicitly say that their methods are expected to return exactly the same reference every time?

2 Likes

You can also have:

struct NoEq { ... }

impl Borrow<i32> for NoEq

that returns a reference to a different i32 every time you call borrow, and this doesn't violate the rules because NoEq doesn't implement Eq.

Eq has to be reflexive, so with T: Eq and x: T, x == x, but you can't make two slices equal if their lengths aren't.

Ok, but what about AsRef?

So the first version will jump "far away" after the if evaluates to false, whereas the second will just follow to the next instruction? I'm no expert on how pipelining or branch prediction works to be able to tell if than is faster or not, but I guess that makes sense.

Also less code means happier cache, ram and binary size, right?

Yes, smaller binary size is generally better. The needless realloc code makes concat() not a zero-cost abstraction.