[Pre-RFC]: IntoOwned trait that harmonizes Cow and ToOwned

I often find myself with a dilemma in Rust:

  • Some function foo doesn't know up front whether it will need an owned copy of its (potentially expensive to clone) arg, because it depends on the function's internal control flow. Especially when enum variants are involved.
  • The callers of foo sometimes have a convenient owned value they can donate, but sometimes have only a borrowed reference.

If foo requires owned input, callers who only have a borrowed value are forced to clone it, even though the function may not actually need ownership after all.

If foo requires borrowed input, callers who could have easily donated an owned value are forced to pass a reference, and the function in turn is forced to clone that reference if it needs ownership after all.
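To make the trade-off concrete, here is a minimal sketch of both options. The `remember_owned` and `remember_borrowed` functions are made up for illustration and keep a message only when it is longer than three bytes:

```rust
/// Option 1: require ownership. A caller holding only a `&str` must allocate
/// up front, even when the message ends up being discarded.
fn remember_owned(log: &mut Vec<String>, msg: String) {
    if msg.len() > 3 {
        log.push(msg); // the donated value is moved, no clone
    }
}

/// Option 2: require a borrow. A caller holding a `String` it no longer needs
/// cannot donate it, so the function has to clone when it keeps the message.
fn remember_borrowed(log: &mut Vec<String>, msg: &str) {
    if msg.len() > 3 {
        log.push(msg.to_owned()); // forced clone
    }
}
```

Neither signature avoids the extra allocation for every combination of caller and control flow.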

The standard library suffers from this dilemma as well. To give a few examples:

  • The Ord trait's cmp method requires borrowed args (which is annoying for a caller who has an owned Copy value, e.g. value.cmp(&2)), while the trait's max and min methods require owned args (even though only one of the two needs to be owned; we just can't guess which one). Similar issues plague all the parent traits (PartialOrd, PartialEq, Eq), and these propagate to sugar syntax like < and !=.
  • Similarly, the various mathematical operators like Add and Mul take owned args, which only makes sense for Copy types. Even from an implementation convenience perspective (mutate and return an owned arg), only one of the args needs to be owned. And for a hypothetical type like an arbitrary precision integer, the output of a Mul will almost always be larger than either input, neither of which is mutated. And again, this awkwardness propagates to the sugar syntax like + and *.
  • HashMap::entry takes an owned key, but only needs an owned value if the method returns Entry::Vacant AND the user calls an insertion method such as Entry::or_default.
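As a small illustration of the HashMap::entry case, the sketch below counts words two ways. Both function names are made up for this example. The first allocates a String for every word, even when the key is already present; the second avoids that but hashes the key twice on a miss:

```rust
use std::collections::HashMap;

/// Uses `entry`, which needs an owned key on every call.
fn count_with_entry(words: &[&str]) -> HashMap<String, usize> {
    let mut counts: HashMap<String, usize> = HashMap::new();
    for &w in words {
        // Allocates a `String` even if `w` is already a key.
        *counts.entry(w.to_owned()).or_default() += 1;
    }
    counts
}

/// Looks up by `&str` first and only allocates when the key is missing.
fn count_with_lookup(words: &[&str]) -> HashMap<String, usize> {
    let mut counts: HashMap<String, usize> = HashMap::new();
    for &w in words {
        if let Some(n) = counts.get_mut(w) {
            *n += 1;
        } else {
            counts.insert(w.to_owned(), 1);
        }
    }
    counts
}
```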

Several standard library concepts dance around this issue without really solving it:

  • std::borrow::Borrow and std::convert::AsRef are the opposite of what we need -- they allow a function that always wants a reference to accept a value or a reference from the caller, and the receiver can obtain a reference by calling its borrow or as_ref method, respectively.
  • std::borrow::Cow explicitly captures the ability to pass owned or borrowed data, but is very clunky to use and imposes lifetime noise. And it also hides the borrowed vs. owned status behind enum variants, which forces branching that cannot be eliminated with generics, even if the caller always passes either owned or borrowed.
  • std::borrow::ToOwned is frustratingly close... but its to_owned method receives &self and so would force cloning even for the T -> T case.
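For comparison, this is roughly what the Cow approach looks like today. The `normalize` function is a made-up example; the point is that callers must wrap their argument (or call .into()), and the owned-vs-borrowed decision is made at run time inside into_owned:

```rust
use std::borrow::Cow;

/// Replaces spaces with underscores, reusing the caller's allocation
/// when the input is already owned and needs no change.
fn normalize(input: Cow<'_, str>) -> String {
    if input.contains(' ') {
        input.replace(' ', "_") // always builds a new String
    } else {
        input.into_owned() // clones only for Cow::Borrowed
    }
}
```

A caller writes normalize(Cow::Borrowed("a b")), normalize(Cow::Owned(s)), or normalize("a b".into()), rather than just passing the value or the reference.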

After playing around a bit, I settled on the following:

/// A zero-cost generalization of [`std::borrow::Cow`].
///
/// Some functions do not know up front whether they will need an owned copy
/// of their input, especially when dealing with enum variants. They must decide
/// whether to require callers to pass borrowed or owned values, which leads to
/// unnecessary cloning.
pub trait IntoOwned<T: Clone> {
    /// If true, [`Self::into_owned`] is guaranteed not to clone a borrowed value.
    fn is_owned(&self) -> bool;
    /// Returns a (possibly cloned) instance of `T`, consuming self.
    fn into_owned(self) -> T;
    /// Returns a borrowed reference to `T`
    fn as_ref(&self) -> &T;
}

An indecisive function can then accept impl IntoOwned<Foo> and choose whether to call as_ref or into_owned, knowing that into_owned only triggers a clone if the caller passed a borrowed reference. Callers are under no obligation to pre-emptively clone their borrowed reference, knowing that the function will clone it only if needed.

As the playground example shows, the trait allows blanket impl for T, &T, &mut T, Cow<'a, T>, Box<T>, Arc<T>, etc.
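A rough sketch of how a few of those blanket impls and an "indecisive" function might fit together. This is not the playground code: only the T, &T and Cow impls are shown, and `keep_if_long` is a made-up function for illustration.

```rust
use std::borrow::Cow;

pub trait IntoOwned<T: Clone> {
    fn is_owned(&self) -> bool;
    fn into_owned(self) -> T;
    fn as_ref(&self) -> &T;
}

// An owned value is handed over as-is.
impl<T: Clone> IntoOwned<T> for T {
    fn is_owned(&self) -> bool { true }
    fn into_owned(self) -> T { self }
    fn as_ref(&self) -> &T { self }
}

// A shared reference is cloned only when ownership is requested.
impl<T: Clone> IntoOwned<T> for &T {
    fn is_owned(&self) -> bool { false }
    fn into_owned(self) -> T { T::clone(self) }
    fn as_ref(&self) -> &T { *self }
}

// A Cow defers to its own variant.
impl<'a, T: Clone> IntoOwned<T> for Cow<'a, T> {
    fn is_owned(&self) -> bool { matches!(self, Cow::Owned(_)) }
    fn into_owned(self) -> T { Cow::into_owned(self) }
    fn as_ref(&self) -> &T { &**self }
}

/// Hypothetical "indecisive" function: it only needs ownership on one path.
fn keep_if_long(s: impl IntoOwned<String>, min_len: usize) -> Option<String> {
    if s.as_ref().len() >= min_len {
        Some(s.into_owned()) // clones only if the caller passed a borrow
    } else {
        None // nothing was cloned
    }
}
```

Callers can then write keep_if_long(&name, 3) or keep_if_long(name, 3) without deciding up front whether to clone.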

One could also require impl Into<T> + Copy if they just want to improve ergonomics for functions that take e.g. integers and whose callers often have to dereference a borrowed reference (i.e. because of a match statement). Virtually all of the math and comparison functions suffer this problem, for example.

This is somewhat related to the Ergonomics initiative discussion "Allowing owned values where references are expected", but would not require any language changes. It could even technically be a crate instead of going in the standard library, but that would not allow fixing up the ergonomics of existing standard library functions that suffer this problem.

So the question: Might a trait like this improve ergonomics enough to be worth considering for some future edition of Rust? What explorations would be helpful to answer that question? Did I miss any important details that would either kill this idea or make it even more appealing?

For example, updating library API to allow it would potentially churn several types and classes, but I think it would generally be forward compatible for users. Callers can still pass *x or x.clone() to functions that used to require T, and callers can still pass &x to functions that used to require &T.

Related, there was recently a #t-libs > Cow in core conversation that might be interesting.

There I was musing that arguably it should be FromBorrowed, since that way it's impl<A: Allocator> FromBorrowed<str> for String<A>, which is much better for coherence (since it wants to live in alloc) than impl<A: Allocator> IntoOwned<String<A>> for str which is a blanket on a foreign type.

(Though others pointed out that From<&str> might be fine, without needing the separate FromBorrowed trait.)

Can ToOwned be fixed by adding into_owned to the trait? It could have a default impl that forwards to to_owned.


The T: Clone bound is too restrictive, because there are types that can be constructed by copying/cloning data, but can't be cloned with Clone, e.g. Box<dyn Tr>.

Good point. The trait itself doesn't need T: Clone, even if some implementations do:

  • T and Box<T> don't need T: Clone
  • &T, Arc<T> and Cow<'_, T> do need T: Clone
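A minimal sketch of that split, assuming the bound is dropped from the trait and placed only on the impls that clone. `Ticket` and `take` are hypothetical names used to show that a non-Clone type can still flow through the T and Box<T> impls:

```rust
pub trait IntoOwned<T> {
    fn is_owned(&self) -> bool;
    fn into_owned(self) -> T;
    fn as_ref(&self) -> &T;
}

// No Clone bound needed: the value is simply moved.
impl<T> IntoOwned<T> for T {
    fn is_owned(&self) -> bool { true }
    fn into_owned(self) -> T { self }
    fn as_ref(&self) -> &T { self }
}

// No Clone bound needed: the value is moved out of the box.
impl<T> IntoOwned<T> for Box<T> {
    fn is_owned(&self) -> bool { true }
    fn into_owned(self) -> T { *self }
    fn as_ref(&self) -> &T { &**self }
}

// Clone is required here, because a borrow has to be copied.
impl<T: Clone> IntoOwned<T> for &T {
    fn is_owned(&self) -> bool { false }
    fn into_owned(self) -> T { T::clone(self) }
    fn as_ref(&self) -> &T { *self }
}

/// Hypothetical type that deliberately does not implement Clone.
struct Ticket(u32);

/// Hypothetical helper that always asks for ownership.
fn take<T>(x: impl IntoOwned<T>) -> T {
    x.into_owned()
}
```

Note that the call site usually has to pin down T (for example with a let annotation), since Box<Ticket> satisfies both IntoOwned<Ticket> and IntoOwned<Box<Ticket>>.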

ToOwned is not a template trait, so it can only have two blanket impl: T and [T]. Really, it's defined in terms of Borrow, which does have blanket impl for all the types we'd want (T, &T, Box<T>, Arc<T>, Cow<'_, T>, etc).

How would one "upgrade" the behavior of ToOwned::into_owned for e.g. Box or Cow, when neither implements ToOwned? It seems like the method would have to be defined in terms of a new into_owned method on Borrow?

I may be missing something, but I don't see how that is related to what @kornel asked. Default impls are different from blanket impls.

One could define a default impl. But it would always clone, which defeats the whole purpose of the new API. Presumably, one would want to override that default impl with something that does not clone, when an owned value is already available.

But I couldn't figure out how to do that, given the various trait blanket impl involved? For example, if I wanted <Box<T> as ToOwned>::into_owned to return *self, or <Cow<'_, T> as ToOwned>::into_owned to wrap Cow::into_owned, how would I override the default-provided trait method?

I spent some more time in the playground, and ended up with something that more closely resembles ToOwned. It's still an independent trait, but maybe a step in the right direction?

pub trait IntoOwned<T: ?Sized> {
    type Owned: Borrow<T>;
    fn is_owned(&self) -> bool;
    fn into_owned(self) -> Self::Owned;
    fn borrow(&self) -> &T;
}

The above can still define blanket impl for T, Box<T>, Arc<T>, Cow<'_, T>, etc (with type Owned=T). But it takes on a distinctly ToOwned flavor with additional blanket impl of IntoOwned<[T]> for &[T], Vec<T> and Cow<'_, [T]> that all produce Vec<T>:

impl<T: Clone> IntoOwned<[T]> for &[T] {
    type Owned = Vec<T>;
    fn is_owned(&self) -> bool { false }
    fn into_owned(self) -> Vec<T> { self.to_vec() }
    fn borrow(&self) -> &[T] { self }
}

impl<T: Clone> IntoOwned<[T]> for Vec<T> {
    type Owned = Vec<T>;
    fn is_owned(&self) -> bool { true }
    fn into_owned(self) -> Vec<T> { self }
    fn borrow(&self) -> &[T] { self }
}

impl<'a, T: Clone> IntoOwned<[T]> for std::borrow::Cow<'a, [T]> {
    type Owned = Vec<T>;
    fn is_owned(&self) -> bool { matches!(self, Self::Owned(_)) }
    fn into_owned(self) -> Vec<T> { self.into_owned() }
    fn borrow(&self) -> &[T] { self }
}

Those new overloads rely on impl<T> Borrow<[T]> for Vec<T> and mimic impl<T> ToOwned for [T]. I think we could define similar impl for every other type that has ToOwned, but it would have to be done "manually" (three trait impl for each). Trying to generalize with impl<T: ToOwned> IntoOwned<T::Owned> for Cow<'_, T> produced a conflicting trait impl vs. impl<T> IntoOwned<T> for T.

Overall, it appears that ToOwned covers a strict subset of the types that can impl IntoOwned. And this seems to be because ToOwned is not a template trait (limited to blanket impl for T and [T]), whereas IntoOwned is a template trait and can support many more blanket impl.

This all comes down to a different approach to the trait signature. If ToOwned also took the same template arg as IntoOwned, it could cover the same set of types (playground).

IntoOwned could also provide to_owned and clone_into methods like ToOwned -- but that would require adding either a T: Clone or a T: ToOwned constraint to the trait.

edit: ToOwned can't work, but not because it lacks into_owned, but because when creating String, the self is str rather than &str. It's missing a layer of indirection necessary to make it useful.

use std::borrow::Borrow;
use std::hash::Hash;
use std::collections::HashMap;

fn insert<T, Q, K, V>(hash: &mut HashMap<K, V>, k: T, v: V) 
    where K: Eq + Hash + Borrow<Q>,
    T: AsRef<Q> + ToOwned<Owned = K>,
    Q: Hash + Eq + ?Sized
{
    if !hash.contains_key(k.as_ref()) {
        hash.insert(k.to_owned(), v); // into_owned wouldn't help here
    }
}

fn main() {
    let mut h = HashMap::new();
    h.insert(String::from("owned"), 1u8);
    insert(&mut h, "borrowed", 1); // won't compile
}
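
The missing layer of indirection can be seen in isolation with a hypothetical mirror of ToOwned that has a by-value method added. `MyToOwned`, `my_to_owned` and `my_into_owned` are made-up names (chosen to avoid clashing with the prelude), so treat this as a sketch of the idea rather than a proposed API:

```rust
/// Hypothetical copy of `ToOwned` with an extra consuming method.
trait MyToOwned {
    type Owned;
    fn my_to_owned(&self) -> Self::Owned;
    /// Default impl: falls back to cloning through `my_to_owned`.
    fn my_into_owned(self) -> Self::Owned
    where
        Self: Sized,
    {
        self.my_to_owned()
    }
}

// Mirrors std's blanket impl for Clone types, overriding the default
// so that an owned value is returned without cloning.
impl<T: Clone> MyToOwned for T {
    type Owned = T;
    fn my_to_owned(&self) -> T { T::clone(self) }
    fn my_into_owned(self) -> T { self }
}

// Mirrors std's impl for str. `Self` is the unsized `str`, so the
// by-value method can never be called on this impl.
impl MyToOwned for str {
    type Owned = String;
    fn my_to_owned(&self) -> String { String::from(self) }
}
```

With these impls, String::from("a").my_into_owned() yields a String without cloning, but "b".my_into_owned() resolves to the blanket impl with Self = &str and returns a &str, not a String. Only the by-reference my_to_owned reaches the str impl.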

The IntoOwned signature I originally proposed above doesn't seem to work at all with the hash table insert example. I tried a bunch of tweaks and always hit some compiler error. The closest I got (playground) used the signature:

fn insert<T, Q, K, V>(hash: &mut HashMap<K, V>, k: T, v: V) 
where
    K: Eq + Hash + Borrow<Q>,
    T: IntoOwned<Q, Owned = K>,
    Q: Hash + Eq,
{
    if !hash.contains_key(k.borrow()) {
        hash.insert(k.into_owned(), v);
    }
}

... but that fails because the path from T to Q to K allows multiple Q:

error[E0283]: type annotations needed
  --> src/main.rs:83:5
   |
83 |     insert(&mut h, "borrowed", 1);
   |     ^^^^^^ ------ type must be known at this point
   |     |
   |     cannot infer type of the type parameter `Q` declared on the function `insert`
   |
   = note: multiple `impl`s satisfying `String: Borrow<_>` found in the following crates: `alloc`, `core`:
           - impl Borrow<str> for String;
           - impl<T> Borrow<T> for T
             where T: ?Sized;

A different approach (playground) seems to address most of the issues by "reversing" things -- the template parameter is the owned type, and the associated type is the borrowed type.

pub trait IntoOwned<T>
{
    type Borrowed: ?Sized;
    fn is_owned(&self) -> bool;
    fn into_owned(self) -> T;
    fn to_borrowed(&self) -> &Self::Borrowed;
}

It eliminates the immediate ambiguous impl issue, and also reduces the number of impl needed in general. But it only supports "normal" Cow for T: Clone, not T: ToOwned, and manually adding e.g. IntoOwned<String> for Cow<'_, str> brings back the ambiguous impl.
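
A minimal sketch of how the reversed trait can drive the hash table example. It deliberately uses two hand-written impls for String and &str instead of the blanket impls from the playground, so it sidesteps the ambiguity described above rather than solving it. `insert_if_absent` is a made-up name:

```rust
use std::borrow::Borrow;
use std::collections::HashMap;
use std::hash::Hash;

pub trait IntoOwned<T> {
    type Borrowed: ?Sized;
    fn is_owned(&self) -> bool;
    fn into_owned(self) -> T;
    fn to_borrowed(&self) -> &Self::Borrowed;
}

impl IntoOwned<String> for String {
    type Borrowed = str;
    fn is_owned(&self) -> bool { true }
    fn into_owned(self) -> String { self }
    fn to_borrowed(&self) -> &str { self.as_str() }
}

impl IntoOwned<String> for &str {
    type Borrowed = str;
    fn is_owned(&self) -> bool { false }
    fn into_owned(self) -> String { String::from(self) }
    fn to_borrowed(&self) -> &str { *self }
}

/// Inserts only when the key is absent; returns whether it inserted.
/// The key is converted to an owned `K` only on the insert path.
fn insert_if_absent<T, K, V>(map: &mut HashMap<K, V>, k: T, v: V) -> bool
where
    T: IntoOwned<K>,
    T::Borrowed: Hash + Eq,
    K: Eq + Hash + Borrow<T::Borrowed>,
{
    if map.contains_key(k.to_borrowed()) {
        return false;
    }
    map.insert(k.into_owned(), v);
    true
}
```

Because the lookup type is now an associated type of T, there is no free Q for the compiler to infer, and both insert_if_absent(&mut h, "borrowed", 1) and insert_if_absent(&mut h, String::from("owned"), 1) type-check for a HashMap<String, u8>.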

Closer, but still not quite there...

Not sure if this is a good place to bring this up, or if it should be a separate thread, but one of the big limitations of the Borrow and ToOwned pattern today is it predates GATs and can't work with newtypes that wrap &'a T as MyNewtype<'a>. And unfortunately as there's no way to safely construct a &'a MyNewtype from &'a T, that means trying to make the Borrow/ToOwned pattern work (so you can use Cow) with custom newtypes pretty much requires unsafe code.

However, you can define a pair of traits built on GATs that can support both the Borrow pattern and custom newtype structs that take a lifetime, e.g. RefToOwned and OwnedToRef. It would be nice if there were traits that were this flexible that could also work with Cow.

I do think that's interesting, because it also affects Cow (which has a &).

Makes me wonder if there's space for a "GatCow" type that can go in Core and collects a bunch of these different wishlist items together.

Interesting! How general can that pair of traits be? They seem to be pretty narrowly used in that crate, but could one provide a blanket impl of OwnedToRef that allows both T and &T to act as &T, or a blanket impl of RefToOwned that obtains T from both T and &T? What about other things that impl Borrow or AsRef?

If such traits were in core, I think you could have a blanket impl of Borrow for T: OwnedToRef and ToOwned for T: RefToOwned. They do everything the other traits can, and more.

The existing blanket impls of Borrow for T, &T, &mut T, Cow, etc would almost certainly conflict? Ditto for the blanket impl of ToOwned for T?

Or is the idea that these proposed new blanket impl would replace all existing blanket impl?

I guess we could try it out, by creating doppelganger traits in a build of std/core, and seeing what we can get away with adding and what we cannot?

Last time I tried that though, I ran into the gotcha that std::collections::HashMap is a thin wrapper around the hashbrown crate, which has specific internal code just for std to use. So there's a circular build dependency of some kind that I couldn't figure out how to untangle. Pretty typical for compilers and language bootstrap situations.

aah yes, my bad!