Feedback/Brainstorming: Enhancing Trait Object Ergonomics

This topic is about the ergonomics of cloning trait objects, but it also applies to other operations on trait objects. The more I think about it, the more questions pop up. Is there an RFC out there that has already answered all or most of these questions, perhaps one that has been approved but not yet implemented? I think most (if not all) people working with trait objects would appreciate better (and especially safer) support at the language level.

The following code is copied from dyn-clone (including the example):

use crate::sealed::{Private, Sealed};

mod sealed {
    pub trait Sealed {}
    impl<T: Clone> Sealed for T {}
    pub struct Private;
}

pub trait DynClone: Sealed {
    // Not public API
    #[doc(hidden)]
    fn __clone_box(&self, _: Private) -> *mut ();
}

pub fn clone_box<T>(t: &T) -> Box<T>
where
    T: ?Sized + DynClone,
{
    let mut fat_ptr = t as *const T;
    unsafe {
        let data_ptr = &mut fat_ptr as *mut *const T as *mut *mut ();
        assert_eq!(*data_ptr as *const (), t as *const T as *const ());
        *data_ptr = <T as DynClone>::__clone_box(t, Private);
    }
    unsafe { Box::from_raw(fat_ptr as *mut T) }
}

impl<T> DynClone for T
where
    T: Clone,
{
    fn __clone_box(&self, _: Private) -> *mut () {
        Box::into_raw(Box::new(self.clone())) as *mut ()
    }
}

trait MyTrait: DynClone {
    fn recite(&self);
}

impl MyTrait for String {
    fn recite(&self) {
        println!("{} ?", self);
    }
}

fn main() {
    let line = "The slithy structs did gyre and gimble the namespace";

    // Build a trait object holding a String.
    // This requires String to implement MyTrait and std::clone::Clone.
    let x: Box<dyn MyTrait> = Box::new(String::from(line));
    let x_ref: &dyn MyTrait = &*x;

    x.recite();

    // The type of x2 is a Box<dyn MyTrait> cloned from x.
    let x2 = clone_box(x_ref);

    x2.recite();
}

That this works is actually quite interesting and makes me wonder why there is no easier way to achieve this. The part that strikes me as surprising is <T as DynClone>::__clone_box(t, Private) in fn clone_box<T>. T may be ?Sized and DynClone may be ?Sized as well (it has no Sized supertrait), but DynClone is only implemented for Sized types, which raises the question of why the compiler doesn't report an error when the function is called with an unsized type.
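As far as I can tell, the call type-checks because a trait object automatically implements its own trait and all of its supertraits, so dyn MyTrait implements DynClone through its vtable even though the blanket impl only covers Sized types. A minimal sketch of that rule in isolation (Describe, Named and describe_any are hypothetical names, not part of dyn-clone):

```rust
use std::fmt::Debug;

// Hypothetical stand-in for DynClone.
trait Describe {
    fn describe(&self) -> String;
}

// Blanket impl: `T` is implicitly `Sized` here, so it does NOT cover `dyn Named`.
impl<T: Clone + Debug> Describe for T {
    fn describe(&self) -> String {
        format!("{:?}", self.clone())
    }
}

// Hypothetical stand-in for MyTrait: `Describe` is a supertrait.
trait Named: Describe {
    fn name(&self) -> &'static str;
}

impl Named for u32 {
    fn name(&self) -> &'static str {
        "u32"
    }
}

// Accepts unsized `T`, like `clone_box` does.
fn describe_any<T: ?Sized + Describe>(t: &T) -> String {
    <T as Describe>::describe(t)
}

fn demo() -> String {
    let obj: Box<dyn Named> = Box::new(7u32);
    let obj_ref: &dyn Named = &*obj;
    // `T` is inferred as `dyn Named`; the call is dispatched through the vtable
    // to the blanket impl instantiated for `u32`.
    format!("{}: {}", obj_ref.name(), describe_any(obj_ref))
}

fn main() {
    println!("{}", demo());
}
```

Under that reading, the blanket impl is never instantiated for the unsized type at all; it only fills the vtable slot for the concrete type behind the trait object.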

Since this code compiles, why shouldn't the following code compile, too? (I left out everything I didn't change.)

// [...]

pub trait DynClone: Clone + Sealed {
    // Not public API
    #[doc(hidden)]
    fn __clone_box(&self, _: Private) -> *mut () {
        Box::into_raw(Box::new(self.clone())) as *mut ()
    }
}

impl<T: Clone> DynClone for T {}

// [...]

If it compiled, this should behave the same as the previous example. This means that having a supertrait that requires Sized should not necessarily prevent object safety. It might be treated like a trait with all methods marked where Self: Sized[1], but that's not all there is to it. The first example also shows that it should be possible to call where Self: Sized methods from where Self: ?Sized methods if the only reason they're not object-safe is that they return Self (this could probably be relaxed further; the example simply doesn't prove anything else). In this example, -> Self would be treated like -> impl DynClone, comparable to -> impl Iterator, which is in use today.

[1] This will likely be confusing, though, since a Sized bound is currently inherited from supertraits, while it wouldn't be in my example. This seems like a stability issue and is probably the answer to the question of why it wouldn't work. Maybe it requires some additional syntax to be explicit about it?
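For reference, this is what where Self: Sized does today: a method that returns Self can be opted out of the vtable, and the trait stays usable as a trait object. A small sketch (Shape and Square are hypothetical names):

```rust
trait Shape {
    fn area(&self) -> f64;

    // Returns `Self`, so it is excluded from the vtable via `where Self: Sized`.
    fn scaled(&self, factor: f64) -> Self
    where
        Self: Sized;
}

struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }

    fn scaled(&self, factor: f64) -> Self {
        Square(self.0 * factor)
    }
}

// Only the dyn-compatible method is used through the trait object.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn demo() -> f64 {
    // Fine: `Square` is `Sized`, so `scaled` is callable.
    let big = Square(1.0).scaled(2.0);
    let shapes: Vec<Box<dyn Shape>> = vec![Box::new(Square(1.0)), Box::new(big)];
    // `shapes[0].scaled(2.0)` would be rejected: `dyn Shape` is not `Sized`.
    total_area(&shapes)
}

fn main() {
    println!("{}", demo());
}
```

The opt-out is per method here; a Sized (or Clone) supertrait applies to the whole trait instead, which is why the trait as a whole stops being usable as a trait object.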

This opens up another question. Why wouldn't we be able to just write the following?

pub trait DynClone: Clone + Sealed {
    fn clone_box(&self) -> Box<dyn Self> {
        Box::new(self.clone())
    }
}

This would perform the (unsafe) operations from the free-standing clone_box without having to call it, which would make it more ergonomic to use.

The end goal would be for Clone in the standard library to expose this behavior instead of it being a feature of an external crate. If one deals with a Box<dyn {trait: Clone}>, one would never be able to call the regular clone method. It would be great, though, if we could somehow reuse the same name instead of having to use myfn_box everywhere for the polymorphic counterpart. That means calling Clone::clone on a sized type yields T, while calling it on a DST yields Box<dyn <T as {trait: Clone}>> or similar.

pub trait Clone {
    fn clone(&self) -> Self where Self: Sized { ... }
}

// Kinda what I imagine, but obviously doesn't work, at the moment.
// Perhaps, it's not even the best approach
impl dyn Clone {
    pub fn clone(&self) -> Box<Self> {
        Box::new(Clone::clone(&self))
    }
}
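For comparison, this is roughly how far one gets in today's Rust without unsafe code: a sketch of the commonly used helper-trait pattern, with Clone implemented for the boxed trait object so that the usual .clone() name works. Animal, AnimalClone and Dog are hypothetical names, and the Box is still hard-wired into the signature.

```rust
// Hypothetical user trait that should be clonable as a trait object.
trait Animal: AnimalClone {
    fn name(&self) -> String;
}

// Helper trait carrying the dyn-compatible clone method.
trait AnimalClone {
    fn clone_box(&self) -> Box<dyn Animal>;
}

// Every sized, clonable `Animal` gets `clone_box` for free.
impl<T> AnimalClone for T
where
    T: Animal + Clone + 'static,
{
    fn clone_box(&self) -> Box<dyn Animal> {
        Box::new(self.clone())
    }
}

// Reuse the `clone` name on the boxed trait object.
impl Clone for Box<dyn Animal> {
    fn clone(&self) -> Self {
        // Deref down to `dyn Animal` explicitly so the vtable method is called.
        (**self).clone_box()
    }
}

#[derive(Clone)]
struct Dog {
    name: String,
}

impl Animal for Dog {
    fn name(&self) -> String {
        self.name.clone()
    }
}

fn demo() -> (String, String) {
    let a: Box<dyn Animal> = Box::new(Dog { name: String::from("Rex") });
    let b = a.clone();
    (a.name(), b.name())
}

fn main() {
    let (a, b) = demo();
    println!("{} / {}", a, b);
}
```

The cost is one helper trait and one Clone impl per user trait, which is the boilerplate that language-level support would remove.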

However, being stuck with Box in a generic context is questionable, because other heap-allocating structures or custom allocators might be more appropriate. Custom structures may have different internal layout requirements (minimum alignment) that do not allow conversion from Box<T> without reallocation.

This is where I'm currently stuck. None of the proposed solutions really work out of the box, either. The examples are just conceptual, not what I imagine it would actually look like if it became part of the language.

All in all, I'm neither satisfied with what Rust currently offers nor able to work out a solution I can proudly present as the solution to the problem at hand. This makes me unhappy, because I'm of the opinion that if one wants to criticize someone or something, they should also invest some time in coming up with advice on how to improve. I just can't come up with anything that really works yet, and I didn't want to wait until I did, because who knows how long that will take.

I thought a bit more about the topic. Maybe something like the following could work?

trait Clone {
    dyn fn clone(&self) -> Self;
    // Implies:
    // fn clone(&self) -> Self where Self: Sized;
    // This is implemented by the compiler, derived from the implementation of the clone method for `Sized` types:
    // fn __clone<A: Allocator>(&self, alloc: A) -> *mut u8
    // where Self: !Sized { … }
    // EDIT: Generics don't work, tho, so this is how it has to be defined:
    // fn __clone(&self, alloc: &mut dyn Allocator) -> *mut u8
    // where Self: !Sized { … }
}

Clone::__clone is an implementation detail. This would only solve part of the problem. dyn fns also have to be automatically implemented for DSTs:

// `dyn fn clone …` also implies (conceptually):
impl<T> T where T: Clone + !Sized {
    fn clone<A: Allocator>(&self, alloc: A) -> Box<Self, A> { … }
}

This should enable trait methods returning Self to be object-safe.

Maybe dyn fn isn't even needed and returning Self already implies all of the shown magic, although it raises questions about stability. It'd also only cover the simplest case. What about composite types containing Self? And what about Self as part of a parameter type? I haven't thought about any of those questions yet.

__clone isn't object safe as it is generic. Generic functions can't be made object safe without embedding the compiler in the executable and running codegen at runtime.

Oops! You're totally correct, of course. I had that in mind at the beginning, but somehow forgot about it while writing and didn't address it at all.

This opens up a completely new question: "How do we make generic trait methods object-safe?"

I think, for simple trait bounds, the solution is quite simple: Generate a polymorphic function by "dynamifying" the generic function.

fn __clone<A: Allocator>(&self, alloc: A) -> *mut u8 …

becomes

fn __clone(&self, alloc: Box<dyn Allocator>) -> *mut u8 …

… except that this is me cheating. It's actually

fn __clone<A: Allocator>(&self, alloc: Box<dyn Allocator, A>) -> *mut u8 …

and we're back at square one. The only solution is to work with references:

fn __clone(&self, alloc: &mut dyn Allocator) -> *mut u8 …

Whether this has to be a mutable reference is a question for the allocator WG. I expect a shared reference with interior mutability to end up as the solution, but that's kinda out of scope for this discussion. A mutable reference will always work and thus be good enough for a prototype. This should hopefully work in theory, now.
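A rough sketch of that last signature on stable Rust, to check that it really stays callable through a trait object. Since Allocator is still unstable, the sketch assumes a hypothetical, simplified RawAlloc trait in its place; CloneIn, clone_in, CountingAlloc and Greet are likewise made-up names, and zero-sized types are not handled.

```rust
use std::alloc::{self, Layout};

// Hypothetical, simplified stand-in for the unstable `Allocator` trait.
trait RawAlloc {
    fn allocate(&mut self, layout: Layout) -> *mut u8;
    /// Safety: `ptr` must come from `allocate` on this allocator with the same `layout`.
    unsafe fn deallocate(&mut self, ptr: *mut u8, layout: Layout);
}

// Forwards to the global allocator and counts live allocations.
struct CountingAlloc {
    live: usize,
}

impl RawAlloc for CountingAlloc {
    fn allocate(&mut self, layout: Layout) -> *mut u8 {
        assert!(layout.size() > 0, "this sketch does not handle zero-sized types");
        let ptr = unsafe { alloc::alloc(layout) };
        if ptr.is_null() {
            alloc::handle_alloc_error(layout);
        }
        self.live += 1;
        ptr
    }

    unsafe fn deallocate(&mut self, ptr: *mut u8, layout: Layout) {
        unsafe { alloc::dealloc(ptr, layout) };
        self.live -= 1;
    }
}

// Counterpart of `__clone`: no generic parameter, the allocator is passed by reference.
trait CloneIn {
    fn clone_in(&self, alloc: &mut dyn RawAlloc) -> *mut u8;
}

impl<T: Clone> CloneIn for T {
    fn clone_in(&self, alloc: &mut dyn RawAlloc) -> *mut u8 {
        let ptr = alloc.allocate(Layout::new::<T>()) as *mut T;
        // The allocation has the size and alignment of `T`, so the write is valid.
        unsafe { ptr.write(self.clone()) };
        ptr as *mut u8
    }
}

trait Greet: CloneIn {
    fn greet(&self) -> String;
}

impl Greet for String {
    fn greet(&self) -> String {
        format!("hello, {}", self)
    }
}

fn demo() -> (String, usize, usize) {
    let mut a = CountingAlloc { live: 0 };
    let original: Box<dyn Greet> = Box::new(String::from("world"));
    // Dynamic dispatch: the concrete type is not named at the call site.
    let raw = (*original).clone_in(&mut a);
    let live_during = a.live;
    // The demo knows the concrete type, so it can inspect and free the clone.
    let typed = raw as *mut String;
    let cloned: &String = unsafe { &*typed };
    let text = cloned.greet();
    unsafe {
        std::ptr::drop_in_place(typed);
        a.deallocate(raw, Layout::new::<String>());
    }
    (text, live_during, a.live)
}

fn main() {
    println!("{:?}", demo());
}
```

Turning the returned raw pointer back into a usable trait object would still need the fat-pointer trick from clone_box, or compiler support, so this only covers the allocation half of the idea.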