Pre-RFC: Lightweight cross-casting (Closed)

Unfortunately, generic impl trump the particular scheme exposed below, and suggest that the user should instead advertise exactly which traits they wish to expose. This in turn is (partially?) implemented by the query_interface crate as noticed by @eddyb below. Let crates implement this functionality then!


  • Start Date: (fill me in with today’s date, YYYY-MM-DD)
  • RFC PR: (leave this empty)
  • Rust Issue: (leave this empty)

Summary

Provides barebone and lightweight cross-casting abilities:

  • the ability to determine at run-time whether a given struct or enum implements a given trait, and if so retrieving the virtual pointer.
  • using this virtual pointer, the ability to cross-cast from one trait reference to another, or form a new trait reference from scratch.

Motivation

The ability to query the Run-Time Type Information (thereafter abreviated RTTI) and use this information to cross-cast is essential in a number of situation to minimize dependencies.

An interesting example is the Acyclic Visitor pattern, which improves upon the well-known Visitor pattern by cutting away the dependencies, making it possible to add new elements to be visited without requiring modifying the existing elements.

In C++ (similarly to Java, or others) it would implemented thus:

class FruitVisitor { virtual ~FruitVisitor() {} };

class Fruit {
public:
    virtual void accept(FruitVisitor const& v) const = 0;
    virtual ~Fruit() {}
};

And adding a new fruit is done by providing a dedicated visitor with it:

class Apple: public Fruit {
public:
    virtual void accept(FruitVisitor const& v) const {
        if (AppleVisitor const* av = dynamic_cast<AppleVisitor>(&v)) {
            av->visit(*this);
        }
    }
};

class AppleVisitor: public FruitVisitor {
public:
    virtual void visit(Apple const& a) { ... }
};

This pattern, today, cannot be expressed in Rust as far as I know.

Detailed design

Goals

A quick check-list of intended goals:

  1. Allow querying, at run-time, the traits implemented by a piece of data.
  2. Allow cross-casting from one trait reference to another, not necessarily related, trait.
  3. Allow forming a trait reference from an opaque piece of data.

Constraints:

  • Lightweight: “You don’t pay for what you don’t use”
  • Uncompromising performance: “You could not code it yourself (much) faster”

Framework

Before we start, however, this author would like to first establish a framework which RTTI extensions to the language should follow.

It is paramount, to remain relevant in the embedded world or other memory-constrained environments, that Rust implementations strictly adhere to the “You don’t pay for what you don’t use” philosophy.

RTTI extensions are no exception to this guideline, and therefore this author firmly recommends that they avoid imposing a memory overhead (binary bloat) when unused. Going further, this author recommends that even when used, those extensions should only contribute toward memory footprint for those items with which they are used.

The implementation proposed below honours this quite simply by being lazy: the necessary information on which the run-time querying is done is captured by using a compiler intrinsic before type-erasure.

The main advantage is that the compiler may only emit this information for the types for which the intrinsic is called: it is lazy. Furthermore, it may use unnamed/mergeable symbols so that multiple emissions be folded together by the linker.

RTTI

The core of the implementation requires 2 intrinsics and 2 types:

  • get_known_impled_traits<T>() -> KnownImpledTraits
  • get_known_impling_data<T: ?Sized>() -> KnownImplingData

The two types are opaque, and contain the following methods:

impl KnownImpledTraits {
    pub fn get_data_id(&self) -> TypeId;
    pub fn get_vtable(&self, id: TypeId) -> Option<*mut ()>;
}

impl KnownImplingData {
    pub fn get_trait_id(&self) -> TypeId;
    pub fn get_vtable(&self, id: TypeId) -> Option<*mut ()>;
}

It is suggested that the underlying representation be akin to:

struct KnownImpledTraitsImpl {
    data_id: TypeId,
    traits: [u64],
}

pub struct KnownImpledTraits {
    ptr: &'static KnownImpledTraitsImpl,
}

Where traits is a an array containing the TypeId and virtual pointer (*mut ()) of the traits known to be implemented by this struct, leading to the following in memory representation stored (hopefully) in .rodata:

+--------+--------+--------+--------+-...-+--------+--------+-...
| TypeId |    N   | TypeId | TypeId | ... | *mut() | *mut() | ...
+--------+--------+--------+--------+-...-+--------+--------+-...

RTTI: Handling non object-safe traits

Given that the only purpose of RTTI is to deal with run-time polymorphism, it is proposed that:

  • only object-safe traits TypeId appear in KnownImpledTraits
  • that KnownImplingData be (TypeId, 0) for non object-safe traits

This makes it possible to handle all traits uniformly.

Obtaining a virtual pointer for a given struct and trait

From experience, this author estimates it is faster starting from the struct:

  • a struct probably implements less traits, especially for common traits such as Display or Debug.
  • a struct is probably more likely to know its traits.

This is something that can be tweaked on an individual basis, obviously, however this author proposes the following look-up:

fn get_vtable(data: KnownImpledTraits, trt: KnownImplingData) -> Option<*mut ()>
{
    data
        .get_vtable(trt.get_trait_id())
        .or_else(|| trt.get_vtable(data.get_data_id())
}

Cross-casting

Transforming one trait into another simply requires swapping the virtual pointer for the desired one. Using TraitObject it would boil down to two transmute. However TraitObject cannot be used with traits formed over DSTs, where more than a pointer’s worth of data is stored. This calls for two utilities:

unsafe fn switch_vtable<U: ?Sized, T: ?Sized>(t: &T, vtable: *mut ()) -> &U;

unsafe fn switch_vtable_mut<U: ?Sized, T: ?Sized>(t: &mut T, vtable: *mut ()) -> &mut U;

Extending Any

With this facility, it is trivial to switch the TypeId embedded in Any for the strictly more powerful KnownImpledTraits.

Once done, it is possible to add a cast_ref and cast_ref_mut methods:

pub trait Any: Reflect + 'static {
    fn get_type_id(&self) -> TypeId;
    fn get_known_impled_traits(&self) -> KnownImpledTraits;
}

impl<T: Reflect + 'static + ?Sized> Any for T {
    fn get_type_id(&self) -> TypeId {
        TypeId::of::<T>()
    }

    fn get_known_impled_traits(&self) -> KnownImpledTraits {
        intrinsics::get_known_impled_traits::<T>()
    }
}

impl Any {
    fn cast_ref<T: ?Sized>(&self) -> Option<&T> {
        get_vtable(self.get_known_impled_traits(), get_known_impling_data::<T>())
            .map(|vtable| unsafe { switch_vtable::<T>(self, vtable) })
    }

    fn cast_mut<T: ?Sized>(&mut self) -> Option<&mut T> {
        get_vtable(self.get_known_impled_traits(), get_known_impling_data::<T>())
            .map(|vtable| unsafe { switch_vtable_mut::<T>(self, vtable) })
    }
}

As well as adding a cast method to Box<Any>:

impl Box<Any> {
    fn cast<T: ?Sized>(self) -> Result<Box<T>, Box<Any>>;
}

Drawbacks

Complexity

This adds extra complexity to the language, and further RTTI extensions (such as reflection) would add further complexity.

On the other hand, it is opt-in, and if unused, cost-free.

Incompleteness

The scheme is incomplete and only handles a subset of the desired “inheritance” functionality. It does not address thin pointers or virtual fields.

On the other hand, it is a building brick that can be added immediately and it is minimal enough to be reusable in whole or in parts by any further scheme.

The compiler changes simply provide the minimum of functionality necessary for libraries to build upon.

Alternatives

Enriching the V-Table

This is the approach unfortunately taken by most C++ implementations. The v-table of each virtual type embeds the RTTI about this type, making it available to any user of the class.

The major drawback is binary bloat. All classes with a virtual function, whether they end up relying on RTTI or not, must pre-emptively embed this information.

As a result of the binary bloat, common implementation of C++ compilers have a switch available to disable the generation of RTTI. Unfortunately, the usage of this switch fractures the community, and libraries designed with or without RTTI often are incompatible with one another.

A prominent example of projects disabling RTTI is LLVM/Clang.

Annotating trait and struct

It would be possible to simply annotate the trait and struct with which RTTI should be used with a #[rtti] annotation. For future-proofing, it might be interesting to allow the selection of which RTTI to embed: reflection would require significantly more information than mere cross-casting.

Unfortunately this requires the author of the library to pre-emptively annotate. This author fears that this would lead to a fragmented eco-system, and ultimately require a switch for clients to selectively enable/disable the feature if they wish to do without the binary bloat.

Handling non object-safe traits

It would be possible to handle non object-safe traits differently, for example storing the value 1 in the associated v-pointer value would allow making the distinction between “this trait is not implemented” and “this traits is implemented but non object-safe”.

On the other hand, would it really be useful?

Perfect Hashing?

The presented layout tries to maximize cache-locality in the TypeId to make searching more efficient, but may not be optimal.

This author would oppose that it does not matter. As a part of Rust’s unstable ABI it may be changed at will, so it is easier to start simple with a binary search.

Unresolved questions

Modules and naming

The names of items and where to find them is a bikeshed discussion that can wait until it is decided whether this functionality is worth having or not.

This author expects that it will boil down to core::raw and intrinsics for some obvious parts, and a new core::rtti module would make sense for the functionality built on top.

Changes Log


Comments, criticisms, and feedback welcome.

Of these three goals, I personally want Rust not to have the first two, and don't know what the use case is for the third.

I think runtime casting between interfaces throws away most of the typing guarantees that Rust's parametric polymorphism provides. Your program will remain 'well typed,' but the type system is no longer helping you ensure that being well typed means doing what you want it to do. If you fail to implement some trait, you're just accidentally following the None branch now. If you fail to account for some trait in your casting branches, your types are just being processed on a different branch from what you expect. What I love about Rust is that these kinds of bugs are caught at compile time.

What's worse is that downcasting, reflection, and so on is the default polymorphism strategy in most object oriented language. This means that if this feature exists, many users will reach for it first, instead of learning about the advantages that a system based on constrained type variables can offer.

To me this is a damning drawback that is very difficult to ovecome, but you don't even mention it in the drawbacks section of the RFC. :-\ Features have downsides other than 'complexity', even if they're 'zero-cost'.


I believe that there are patterns that Rust does not enable right now, but I don't know enough about them. Could you rewrite the motivation section so that it could be informative and accessible to someone who does not know C++? I can't really understand what the motivating example means right now.

1 Like

I’d like to see an explanation for usecases of the visitor pattern where specialization fails to offer a solution.

As for actual RTTI, I already wrote my thoughts down a few weeks ago.

Regarding the motivation section, would a link to a page such as this one describing what the Acyclic Visitor Pattern is be helpful?

I thought about writing it in Java too, but it made the section longer and I did not see the benefit of exposing implementations in multiple languages. I'd rather improving the text description, if possible, but I am afraid that the benefits of the Acyclic Visitor upon the Visitor pattern seem so obvious to me that I'm failing to explain it better :sweat:


I am afraid that you are rowing against the current here. Demand for downcasting has been quite serious because it's a solution to real-world problems that users have today, and while we may argue about the form that we wish such support to take, I am afraid that it is necessary.

I would also point that Rust already today supports it (albeit in a limited way) with Any.

As for the usage of the third point: it's for controlling memory. Off the top of my head:

  • if you have a Vec<&Trait> that stores uniform objects, you may want to instead have a (*mut (), Vec<*mut ()>) which consumes only half the memory.
  • if you want to have "thin" pointers, you may morph a fat pointer into a thin pointer by gluing its v-pointer and data-pointer differently... and then reconstitute the fat pointer temporarily to use it.

Well, is that any different from any run-time check in general?

I mean, I can perfectly write a function that accepts Option<T> but does nothing in the None case. I certainly understand the feeling, I am quite the type freak myself, however in the end functions have a run-time behavior which is (hopefully) consistent.

I think however that there is a misunderstanding here. The caller has to pass a way to retrieve the KnownImpledTraits, which means that the callee must provide a way to pass this piece of information. An example would be to require the Any bound.

By requiring such a way to retrieve this information, the callee clearly signals that down-casting/cross-casting will occur: this is not a wild-wild-west where it can occur any time, anywhere.

Yes, users may overuse it. They may also create stringly-typed interfaces (don't laugh, I've had the horror to see it in C++...) and stick to primitive types, ...

... should we refrain from changes because they can be abused? I don't think so. I follow the "protect yourself from Murphy not Machiavelli" guideline, as users can always find "clever" way of subverting anything.

To me this is a damning drawback that is very difficult to ovecome, but you don't even mention it in the drawbacks section of the RFC. :-\ Features have downsides other than 'complexity', even if they're 'zero-cost'.

I am not opposed to including it, but would not want to phrase it in a patronizing way either. Suggestions are welcome.

Note that I am talking about the Acyclic Visitor pattern here, which is a variant of the Visitor pattern, but quite different.

The Visitor pattern would have a trait knowing all of the possible Visitee:

trait Visitor {
    fn visit(apple: &Apple);
    fn visit(orange: &Orange);
}

And therefore Apple knows about Orange (because it knows about Visitor) and Orange knows about Apple (because it knows about Visitor).

The Acyclic Visitor pattern is about cutting away those dependencies, to allow adding elements to the visited hierarchy from another module/library/crate (which is impossible in the Visitor pattern).

I am afraid that I fail to see how it could be implemented with specialization: the FruitVisitor has no associated function in my example because it knows no concrete fruit, where would you apply specialization?

I didn’t mean “add specialization naively” but rather w/ generics. Also, entire OOP hierarchies end up being awkward in Rust, do you have a concrete example that’s not reliant on “is-a” relationships?

It’s still not entirely clear to me which kind of visitor pattern usecases you’re interested in, but I can try (playpen):

trait Visitor {}

trait Accept<V: Visitor> {
    fn accept(&self, v: &mut V);
}

impl<V: Visitor, T> Accept<V> for T {
    default fn accept(&self, _: &mut V) {}
}

struct Apple;

struct MyAppleVisitor;
impl Visitor for MyAppleVisitor {}
impl Accept<MyAppleVisitor> for Apple {
    fn accept(&self, v: &mut MyAppleVisitor) {
        /* do something with */ v;
    }
}

It’s possible that the usecase you have in mind requires intersection impls, but until I understand more (and not in OOP terms, which for a language different enough like Rust tend to create lots of XY problems).

1 Like

How is this supposed to work if a crate turns an struct into a downcastable pointer, i.e. gets the list of known implied traits, then passes it to a different crate which declares and impls its own trait for that type? (That includes any trait which is generically impled for all types or all types impling a common trait, etc. Similar situations arise with generic traits.) It would be very surprising if downcasting silently failed, which is what would happen if the compiler generated the trait lists at the time it generates all other code/data, i.e. on-demand on a per-crate basis. In theory the compiler could add logic to notice when it’s building a ‘final’ binary like an executable or a cdylib, and generate the necessary runtime information at that point, but that’s a big deal. It’s not just a matter of merging lists from the dependent crates, at least in some cases: often an impl of a trait (or trait specialization) from one crate applies to a type from another, where neither crate depends on the other, and the impl isn’t otherwise needed - so the final link has to generate the code for it, which might involve bringing in more traits and more types in the form of specializations used by the body of the impl, and so on. This is also essentially O(n^2) as you have to match every trait against every type. And sometimes the result of a trait match is “stack overflow” rather than “yes” or “no”, necessarily so since trait matching is Turing complete; you could error out in that case, but there would be a bit of a pitfall if the author wasn’t expecting that trait to be matched against that particular type. Ditto for the handful of features that allow generic fns to error out at monomorphization time. And while loading code at runtime isn’t really supported today, how would that work with this feature in the future?

1 Like

Do you get why statements like this are not convincing?

You could replace the word "downcasting" in this pargraph with just about any language feature - "Garbage collection;" "exceptions;" "higher kinded polymorphism;" "named arguments" - and nothing would change. You've just asserted that we must have this feature, you haven't made any real case for it.

My opinion is that downcasting is rather like try/catch exceptions, honestly - it can be useful, but it has a much greater chance of being a footgun. Rust, of course, has catch_unwind today, so it's possible that a footgun feature that is needed in rare cases could be added with enough warning symbols around it. But you need to make a convincing case that it is, as you say, "necessary."

I do think Rust has a problem in this area, but I think of it rather differently. Users try to encode patterns they are used to from languages with class based inheritance, like C++ and Java, and can't, because we don't have downcasting between virtual objects. When they request assistance, our response is meager - we basically just tell them not to do that, because they can't. But we don't have a positive statement of what they can do instead. We need to develop the patterns of Rust, so that we have alternative patterns to assist users when they try to import patterns that don't work in this language.

@eddyb's question about specialization is on point here; I believe we can find patterns in that feature that are valid alternatives for adding this feature.

4 Likes

Maybe this would help clarify the motivating example - How is the acyclic visitor pattern differently from the patterns encoded by serde’s serialization / deserialization? These traits seem to have the benificial property you describe the acyclic visitor pattern as having.

https://docs.serde.rs/serde/index.html

1 Like

I was relying for this on the property that the implementation lies either in the struct crate and/or the trait crate. This is why two intrinsics are required:

  • one giving the known traits implemented for a struct or enum or primitive, ...
  • one giving the known struct/enum/primitives implementing a trait

I now see, indeed, that there is an issue arising from the blanket impls which does not arise at compile-time (because at compile-time both the struct and trait are in scope).

I did not anticipate this case, at all. I do not see a special instrumentation of the final binary as a solution either, unless it is compatible with DLLs.

I feel stumped at the moment, I'll sleep on it.

The Acyclic Visitor is akin to:

// Crate 1
trait FruitVisitor {}

trait Fruit {
    fn accept(&self, v: &mut FruitVisitor);
}

// Crate 2
struct Apple;

trait AppleVisitor: FruitVisitor {
    fn visit(&mut self, _: &Apple);
}

impl Fruit for Apple {
    fn accept(&self, v: &mut FruitVisitor) {
        if let v: &mut AppleVisitor = v {
            v.visit(self);
        }
    }
}

// Crate 3
struct MyAppleVisitor;

impl AppleVisitor for MyAppleVisitor {
    fn visit(&mut self, _: &Apple) {
        println!("I got an apple!");
    }
}

// Crate 4
fn apply(fruit: &Fruit, visitor: &FruitVisitor) {
    fruit.accept(visitor);
}

The key point is that Crate 4 has no knowledge of Apple (or other fruits) and any other crate can supply new fruits (each with their own visitor interface) and new visitors for existing fruits.

It is also possible for a visitor to implement multiple interfaces at once (AppleVisitor, OrangeVisitor and LemonVisitor) which allows it to visit those 3 fruits (but no other).


Maybe a better example would be a composite (much like a DOM):

trait Element {
    fn accept(&self, v: &mut ElementVisitor);
}

And then, some elements will have a Vec<Box<Element>> internally:

struct OrderedList {
    elts: Vec<Box<Element + Any>>,
}

trait OrderedListVisitor: ElementVisitor {
    fn start_visit(&mut self, ol: &OrderedList);
    fn end_visit(&mut self, ol: &OrderedList);
}

impl Element for OrderedList {
    fn accept(&self, v: &mut ElementVisitor) {
        if let Some(ref mut v) = v.downcast_mut::<OrderedListVisitor>() {
            v.visit(self);
        } else {
            for e in self.elts {
                e.accept(v);
            }
        }
    }
}

The OrderedList does not know what kind of ElementVisitor it receives, and does not know what concrete elements elts contains.

Note also that we cannot change to impl Element<V> for OrderedList where V: OrderedListVisitor: the elements in elts are of an unspecified type and the ordered list has no a priori knowledge of what specific visitors they require… so could not declare them as additional constraints on V at compile-time.

The scheme is fully open-ended, any downstream crate may come and add both an element and its visitor.

I do not see how specialization could assist here, since the resolution of specialization is performed at compile-time and the information known at compile-time is incomplete here.

1 Like

I have tried to further clarify what the Acyclic Visitor was about above. Its run-time nature is essential and cannot, as far as I know, be supported by specialization today. I refer you to the OrderedList example above.

The Serialize and Serializer traits of serde differ in that the Serializer has a fully known interface, the behavior of the visited element (implementing Serialize) does not vary based on the dynamic type of the Serializer.

I am open to alternative solutions, certainly. Let me present a problem then:

// Crate 0
trait Element {
    // whatever you wish here
}

// Crate 1 (dep: Crate 0)
struct OrderedList {
    elts: Vec<Box<Element>>,
}

// Crate 2 (dep: Crate 0)
struct UnorderedList {
    elts: HashMap<u64, Box<Element>>,
}

// Crate 3 (dep: Crate 0)
struct Button {
    text: String, 
}

Those 3 elements all implement Element, they may be arbitrarily nested.

The frame:

  • you are given a starting point of type &Element
  • it may (or may not) be the root of a tree of elements
  • this tree may (or may not) contain elements that are not known to you (coming from other crates)

The actions:

  • you are allowed to introduce traits, and methods on those traits, as necessary
  • you may assume that the elements from other crates will follow guidelines from Crate 0 (notably, in being transparent and forwarding the visitor to child elements)
  • you may not assume that the other crates know about Crate 1, 2 or 3; they only depend on Crate 0

The challenge:

  • How do you print the message of all buttons, in depth-first, breadth-first?
  • How easy is it to extend this to another functionality?

I am open to solutions, however the key-point is really the ability to extend the functionality nearly arbitrarily: this is the point of the visitor, after all.

2 Likes

Thanks for these posts, its a lot more data with which to consider the problem. :slight_smile:

Thanks for all your remarks, it just seems so obvious to me that I have some issues understanding what others need to see the problem as clearly as I do so it helps being prodded in the right direction!

This bit particularly doesn't even sound like it needs specialization. I mean, that's literally what you would encode in bounds as V: Visitor<T> or T: Accept<V>.
It's how the trait system in its open-ended orthogonality is superior to OOP in most cases. Some niches are not fully covered yet, but a bit of XY problem rewriting can steer away from them.

However:

The visitor pattern as usually abused in OOP contexts can be replaced with pattern matching, or where needed, static multidispatch, the exception being open-ended hierarchies that require dynamic dispatch for reasons such as heterogeneous collections/graphs.

If you design a system to require dynamic multidispatch, well, then you need dynamic multidispatch and you either craft it yourself, use a library or get it into the language (the latter allowing much more flexibility).

My personal opinion is that we need tooling for building non-trivial graphs with static dispatch where possible and minimal RTTI - based on enums like in LLVM, most likely, elsewhere.
Effectively "solve" the open set problem by making it a closed set automatically. I have my doubts this belongs in a systems language, but it's a more appealing compromise than OOP either way.

EDIT: Went looking for query_interface and got into an unrelated rabbit hole, would've posted sooner.

1 Like

I think there is a problem here, which is that the way that serde expresses the visitor pattern results in non-object-safe traits. This means, for example, you can’t have a heterogenous collection of serializeable objects, which is frustrating when you’re trying to implement a schema which requires it (speaking from experience here).

I think the solution to this problem might be higher ranked type parameters in trait bounds, so that you could have a definition of Serialize like this, which is object safe if you know the Serializer:

trait Serialize<S: Serializer> {
    fn serialize(&self, serializer: &mut S) -> Result<(), S::Error>;
}

The problem with this definition is that it leaks the serializer parameter, but with higher rank trait bounds you could solve this with for<S: Serializer> Serialize<S>.

1 Like

After the issue brought forth by @comex, I have arrived at a similar conclusion.

Given the presence of “wildcard” implementations, the creator of the Any box has to advertise which traits are available. It could (perhaps) be automated, but this would be sugar. Instead, I would envisage something closer to the query_interface library you linked (nice library!): the caller would build a list of traits implemented by its type and then point Any to this list.

This also has the advantage that the caller explicitly picks which traits to advertise (and might be willing to narrow down the selection), which greatly diminishes the issues @withoutboats raised about down casting “breaking” the signature contract.

At this point, the main features the language need provide to support this scheme seem to be const fn (to have the ability to store this trait ID -> vptr table in .rodata).

And best of all, the user is fully in control of the memory footprint added by the scheme :heart:

I suppose, however, that at some point this should be migrated in the standard library, to promote interoperability… but since we first need experience with those, this is a RFC that will have to wait for a while.

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.