Unfortunately, generic impls trump the particular scheme exposed below, and suggest that the user should instead advertise exactly which traits they wish to expose. This in turn is (partially?) implemented by the query_interface crate, as noticed by @eddyb below. Let crates implement this functionality then!
- Start Date: (fill me in with today’s date, YYYY-MM-DD)
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)
Summary
Provides barebones and lightweight cross-casting abilities:
- the ability to determine at run-time whether a given struct or enum implements a given trait, and if so retrieving the virtual pointer.
- using this virtual pointer, the ability to cross-cast from one trait reference to another, or form a new trait reference from scratch.
Motivation
The ability to query the Run-Time Type Information (hereafter abbreviated RTTI) and use this information to cross-cast is essential in a number of situations to minimize dependencies.
An interesting example is the Acyclic Visitor pattern, which improves upon the well-known Visitor pattern by cutting away the dependencies, making it possible to add new elements to be visited without requiring modifying the existing elements.
In C++ (similarly to Java, or others) it would be implemented thus:
// Empty base: concrete visitors are discovered via dynamic_cast.
class FruitVisitor {
public:
    virtual ~FruitVisitor() {}
};
class Fruit {
public:
    virtual void accept(FruitVisitor const& v) const = 0;
    virtual ~Fruit() {}
};
And adding a new fruit is done by providing a dedicated visitor with it:
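class Apple;
// Declared before Apple so that Apple::accept can cast to it.
class AppleVisitor: public FruitVisitor {
public:
    virtual void visit(Apple const& a) const { ... }
};
class Apple: public Fruit {
public:
    virtual void accept(FruitVisitor const& v) const {
        // Cross-cast: succeeds only if the visitor also handles apples.
        if (AppleVisitor const* av = dynamic_cast<AppleVisitor const*>(&v)) {
            av->visit(*this);
        }
    }
};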
This pattern, today, cannot be expressed in Rust as far as I know.
Detailed design
Goals
A quick check-list of intended goals:
- Allow querying, at run-time, the traits implemented by a piece of data.
- Allow cross-casting from one trait reference to another, not necessarily related, trait.
- Allow forming a trait reference from an opaque piece of data.
Constraints:
- Lightweight: “You don’t pay for what you don’t use”
- Uncompromising performance: “You could not code it yourself (much) faster”
Framework
Before we start, however, this author would like to first establish a framework which RTTI extensions to the language should follow.
It is paramount, to remain relevant in the embedded world or other memory-constrained environments, that Rust implementations strictly adhere to the “You don’t pay for what you don’t use” philosophy.
RTTI extensions are no exception to this guideline, and therefore this author firmly recommends that they avoid imposing a memory overhead (binary bloat) when unused. Going further, this author recommends that even when used, those extensions should only contribute toward memory footprint for those items with which they are used.
The implementation proposed below honours this quite simply by being lazy: the necessary information on which the run-time querying is done is captured by using a compiler intrinsic before type-erasure.
The main advantage is that the compiler may only emit this information for the types for which the intrinsic is called: it is lazy. Furthermore, it may use unnamed/mergeable symbols so that multiple emissions can be folded together by the linker.
RTTI
The core of the implementation requires 2 intrinsics and 2 types:
get_known_impled_traits<T>() -> KnownImpledTraits
get_known_impling_data<T: ?Sized>() -> KnownImplingData
The two types are opaque, and contain the following methods:
impl KnownImpledTraits {
    pub fn get_data_id(&self) -> TypeId;
    pub fn get_vtable(&self, id: TypeId) -> Option<*mut ()>;
}
impl KnownImplingData {
    pub fn get_trait_id(&self) -> TypeId;
    pub fn get_vtable(&self, id: TypeId) -> Option<*mut ()>;
}
It is suggested that the underlying representation be akin to:
struct KnownImpledTraitsImpl {
    data_id: TypeId,
    traits: [u64],
}
pub struct KnownImpledTraits {
    ptr: &'static KnownImpledTraitsImpl,
}
Where traits is an array containing the TypeId and virtual pointer (*mut ()) of the traits known to be implemented by this struct, leading to the following in-memory representation stored (hopefully) in .rodata:
+--------+--------+--------+--------+-...-+--------+--------+-...
| TypeId | N | TypeId | TypeId | ... | *mut() | *mut() | ...
+--------+--------+--------+--------+-...-+--------+--------+-...
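As an illustration only, the layout above can be mimicked in ordinary library code. The sketch below assumes the trait TypeIds are kept sorted so that a binary search applies; the name TraitTable is hypothetical, plain integers stand in for the virtual pointers, and the length of the vector plays the role of N.

```rust
use std::any::TypeId;

/// Hypothetical stand-in for the table the compiler would emit for one type.
struct TraitTable {
    data_id: TypeId,
    /// Sorted TypeIds of the traits known to be implemented.
    trait_ids: Vec<TypeId>,
    /// `vtables[i]` belongs to `trait_ids[i]`; fake addresses, not real v-pointers.
    vtables: Vec<usize>,
}

impl TraitTable {
    fn new(data_id: TypeId, mut entries: Vec<(TypeId, usize)>) -> Self {
        // Sort once so that look-ups can use a binary search.
        entries.sort_by_key(|&(id, _)| id);
        let (trait_ids, vtables): (Vec<TypeId>, Vec<usize>) = entries.into_iter().unzip();
        TraitTable { data_id, trait_ids, vtables }
    }

    fn get_data_id(&self) -> TypeId {
        self.data_id
    }

    fn get_vtable(&self, id: TypeId) -> Option<usize> {
        // The ids are stored contiguously, apart from the pointers they map to.
        self.trait_ids.binary_search(&id).ok().map(|i| self.vtables[i])
    }
}
```

Keeping the ids in one contiguous run, separate from the pointers, is what gives the search its cache locality: only the final hit touches the pointer half of the table.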
RTTI: Handling non object-safe traits
Given that the only purpose of RTTI is to deal with run-time polymorphism, it is proposed that:
- only the TypeId of object-safe traits appear in KnownImpledTraits
- that KnownImplingData be (TypeId, 0) for non object-safe traits
This makes it possible to handle all traits uniformly.
Obtaining a virtual pointer for a given struct and trait
From experience, this author estimates it is faster starting from the struct:
- a struct probably implements fewer traits than a trait has implementers, especially for common traits such as Display or Debug.
- a struct is probably more likely to know its traits.
This is something that can be tweaked on an individual basis, obviously, however this author proposes the following look-up:
fn get_vtable(data: KnownImpledTraits, trt: KnownImplingData) -> Option<*mut ()>
{
    // Ask the struct first, then fall back to what the trait knows.
    data
        .get_vtable(trt.get_trait_id())
        .or_else(|| trt.get_vtable(data.get_data_id()))
}
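The look-up order can be exercised with mock tables. This is a minimal sketch rather than the proposed intrinsics: the two structs below are hypothetical stand-ins with the same method names, a linear scan replaces whatever search the compiler would emit, and fake integer addresses stand in for virtual pointers.

```rust
use std::any::TypeId;

/// Hypothetical mock: what a type knows about the traits it implements.
struct KnownImpledTraits {
    data_id: TypeId,
    traits: Vec<(TypeId, usize)>, // (trait id, fake vtable address)
}

/// Hypothetical mock: what a trait knows about the types implementing it.
struct KnownImplingData {
    trait_id: TypeId,
    data: Vec<(TypeId, usize)>, // (data id, fake vtable address)
}

/// Linear scan shared by both mocks.
fn find(entries: &[(TypeId, usize)], id: TypeId) -> Option<usize> {
    entries.iter().find(|&&(k, _)| k == id).map(|&(_, v)| v)
}

impl KnownImpledTraits {
    fn get_data_id(&self) -> TypeId { self.data_id }
    fn get_vtable(&self, id: TypeId) -> Option<usize> { find(&self.traits, id) }
}

impl KnownImplingData {
    fn get_trait_id(&self) -> TypeId { self.trait_id }
    fn get_vtable(&self, id: TypeId) -> Option<usize> { find(&self.data, id) }
}

/// Same order as in the text: struct side first, trait side as fallback.
fn get_vtable(data: &KnownImpledTraits, trt: &KnownImplingData) -> Option<usize> {
    data.get_vtable(trt.get_trait_id())
        .or_else(|| trt.get_vtable(data.get_data_id()))
}
```

The fallback matters when the implementation was only visible from the trait's side, for instance when the table for the struct was emitted in a crate that did not know about the trait.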
Cross-casting
Transforming one trait into another simply requires swapping the virtual pointer for the desired one. Using TraitObject it would boil down to two transmutes. However TraitObject cannot be used with traits formed over DSTs, where more than a pointer’s worth of data is stored. This calls for two utilities:
unsafe fn switch_vtable<U: ?Sized, T: ?Sized>(t: &T, vtable: *mut ()) -> &U;
unsafe fn switch_vtable_mut<U: ?Sized, T: ?Sized>(t: &mut T, vtable: *mut ()) -> &mut U;
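To make the idea of swapping the virtual pointer concrete without relying on the layout of real trait objects, the sketch below hand-rolls both the fat pointer and the v-tables. Everything in it (RawObject, GreetVTable, LegsVTable, Dog and the free function switch_vtable) is hypothetical and only mimics what the proposed utilities would do on genuine trait references.

```rust
struct Dog { name: String, legs: usize }

/// Hand-written v-tables: one table of function pointers per "trait".
struct GreetVTable { greet: unsafe fn(*const ()) -> String }
struct LegsVTable { legs: unsafe fn(*const ()) -> usize }

// Safety contract for both functions: `p` must point to a live `Dog`.
unsafe fn dog_greet(p: *const ()) -> String {
    let dog = unsafe { &*(p as *const Dog) };
    format!("{} says woof", dog.name)
}
unsafe fn dog_legs(p: *const ()) -> usize {
    unsafe { (*(p as *const Dog)).legs }
}

static DOG_AS_GREET: GreetVTable = GreetVTable { greet: dog_greet };
static DOG_AS_LEGS: LegsVTable = LegsVTable { legs: dog_legs };

/// Hand-rolled fat pointer: data pointer plus type-erased v-table pointer.
#[derive(Clone, Copy)]
struct RawObject { data: *const (), vtable: *const () }

/// Cross-casting keeps the data pointer and replaces only the v-table.
fn switch_vtable(obj: RawObject, vtable: *const ()) -> RawObject {
    RawObject { data: obj.data, vtable }
}

fn demo() -> (String, usize) {
    let dog = Dog { name: "Rex".to_string(), legs: 4 };
    let as_greet = RawObject {
        data: &dog as *const Dog as *const (),
        vtable: &DOG_AS_GREET as *const GreetVTable as *const (),
    };
    let as_legs = switch_vtable(as_greet, &DOG_AS_LEGS as *const LegsVTable as *const ());
    // Sound here because each v-table pointer is cast back to the type it
    // was created from, and `dog` outlives both calls.
    unsafe {
        let greeting = ((*(as_greet.vtable as *const GreetVTable)).greet)(as_greet.data);
        let legs = ((*(as_legs.vtable as *const LegsVTable)).legs)(as_legs.data);
        (greeting, legs)
    }
}
```

The unsafety is inherent: nothing checks that the new v-table matches the data behind the pointer, which is why the proposed utilities are unsafe and are meant to be called only with a pointer obtained from the look-up.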
Extending Any
With this facility, it is trivial to switch the TypeId embedded in Any for the strictly more powerful KnownImpledTraits.
Once done, it is possible to add cast_ref and cast_mut methods:
pub trait Any: Reflect + 'static {
    fn get_type_id(&self) -> TypeId;
    fn get_known_impled_traits(&self) -> KnownImpledTraits;
}
impl<T: Reflect + 'static + ?Sized> Any for T {
    fn get_type_id(&self) -> TypeId {
        TypeId::of::<T>()
    }
    fn get_known_impled_traits(&self) -> KnownImpledTraits {
        intrinsics::get_known_impled_traits::<T>()
    }
}
impl Any {
    fn cast_ref<T: ?Sized>(&self) -> Option<&T> {
        get_vtable(self.get_known_impled_traits(), get_known_impling_data::<T>())
            // switch_vtable takes two type parameters; the source type is inferred.
            .map(|vtable| unsafe { switch_vtable::<T, _>(self, vtable) })
    }
    fn cast_mut<T: ?Sized>(&mut self) -> Option<&mut T> {
        get_vtable(self.get_known_impled_traits(), get_known_impling_data::<T>())
            .map(|vtable| unsafe { switch_vtable_mut::<T, _>(self, vtable) })
    }
}
As well as adding a cast method to Box<Any>:
impl Box<Any> {
    fn cast<T: ?Sized>(self) -> Result<Box<T>, Box<Any>>;
}
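For comparison, the behaviour of cast_ref can be approximated today in library code if every (type, trait) pair is registered by hand. The sketch below is such an emulation under stated assumptions and is not the mechanism proposed here: Registry, Caster, Edible, Apple and Rock are hypothetical names, and the explicit registration is precisely the cost the intrinsics are meant to remove.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

/// Caster from `&dyn Any` to `&T`, where `T` is typically a trait object type.
struct Caster<T: ?Sized + 'static>(fn(&dyn Any) -> Option<&T>);

/// Hypothetical run-time registry keyed by (concrete type, target trait).
#[derive(Default)]
struct Registry {
    casters: HashMap<(TypeId, TypeId), Box<dyn Any>>,
}

impl Registry {
    /// Records how to view a `D` as a `T`.
    fn register<D: 'static, T: ?Sized + 'static>(&mut self, f: fn(&dyn Any) -> Option<&T>) {
        self.casters
            .insert((TypeId::of::<D>(), TypeId::of::<T>()), Box::new(Caster(f)));
    }

    /// Library-level analogue of `cast_ref`: `None` if the pair is unknown.
    fn cast_ref<'a, T: ?Sized + 'static>(&self, obj: &'a dyn Any) -> Option<&'a T> {
        // `Any::type_id` on a `&dyn Any` yields the id of the concrete type.
        let key = (Any::type_id(obj), TypeId::of::<T>());
        let caster = self.casters.get(&key)?.downcast_ref::<Caster<T>>()?;
        (caster.0)(obj)
    }
}

trait Edible { fn calories(&self) -> u32; }

struct Apple;
impl Edible for Apple { fn calories(&self) -> u32 { 95 } }

struct Rock; // implements nothing of interest

fn apple_as_edible(obj: &dyn Any) -> Option<&(dyn Edible + 'static)> {
    let apple = obj.downcast_ref::<Apple>()?;
    Some(apple)
}

fn demo() -> (Option<u32>, Option<u32>) {
    let mut registry = Registry::default();
    registry.register::<Apple, dyn Edible>(apple_as_edible);
    let apple: &dyn Any = &Apple;
    let rock: &dyn Any = &Rock;
    (
        registry.cast_ref::<dyn Edible>(apple).map(|e| e.calories()),
        registry.cast_ref::<dyn Edible>(rock).map(|e| e.calories()),
    )
}
```

Calling demo() yields a value for the apple and None for the rock, which is the contract cast_ref is expected to honour.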
Drawbacks
Complexity
This adds extra complexity to the language, and further RTTI extensions (such as reflection) would add further complexity.
On the other hand, it is opt-in, and if unused, cost-free.
Incompleteness
The scheme is incomplete and only handles a subset of the desired “inheritance” functionality. It does not address thin pointers or virtual fields.
On the other hand, it is a building block that can be added immediately, and it is minimal enough to be reusable in whole or in part by any further scheme.
The compiler changes simply provide the minimum of functionality necessary for libraries to build upon.
Alternatives
Enriching the V-Table
This is the approach unfortunately taken by most C++ implementations. The v-table of each virtual type embeds the RTTI about this type, making it available to any user of the class.
The major drawback is binary bloat. All classes with a virtual function, whether they end up relying on RTTI or not, must pre-emptively embed this information.
As a result of the binary bloat, common C++ compilers have a switch available to disable the generation of RTTI. Unfortunately, the usage of this switch fractures the community, and libraries designed with or without RTTI are often incompatible with one another.
A prominent example of projects disabling RTTI is LLVM/Clang.
Annotating trait and struct
It would be possible to simply annotate the trait and struct with which RTTI should be used with a #[rtti] annotation. For future-proofing, it might be interesting to allow the selection of which RTTI to embed: reflection would require significantly more information than mere cross-casting.
Unfortunately this requires the author of the library to pre-emptively annotate. This author fears that this would lead to a fragmented eco-system, and ultimately require a switch for clients to selectively enable/disable the feature if they wish to do without the binary bloat.
Handling non object-safe traits
It would be possible to handle non object-safe traits differently. For example, storing the value 1 in the associated v-pointer value would allow making the distinction between “this trait is not implemented” and “this trait is implemented but non object-safe”.
On the other hand, would it really be useful?
Perfect Hashing?
The presented layout tries to maximize cache-locality in the TypeId to make searching more efficient, but may not be optimal.
This author would counter that it does not matter. As a part of Rust’s unstable ABI it may be changed at will, so it is easier to start simple with a binary search.
Unresolved questions
Modules and naming
The names of items and where to find them is a bikeshed discussion that can wait until it is decided whether this functionality is worth having or not.
This author expects that it will boil down to core::raw and intrinsics for some obvious parts, and a new core::rtti module would make sense for the functionality built on top.
Changes Log
Comments, criticisms, and feedback welcome.