This Pre-RFC now lives on GitHub, where it is easier to evolve: rust-poly/doc/0000-disjoint-polymorphism.md at master · matthieu-m/rust-poly
Last edited 2015-05-15 (9 posts in the thread), see Change Log at the end
I have followed the various RFCs for adding "inheritance" in the language (as defined by Servo's extended requirements), but their invasive nature always bugged me.
I very much appreciate the disjoint nature of Rust's trait and struct, which neatly separate payload from behaviour. It is ultimately more flexible than traditional OO-like code. Servo's requirements add a new dimension, usage, but unfortunately most existing RFCs (as referenced in Summary of efficient inheritance RFCs) seem to end up mixing those properties. A notable exception is Fat Object, and this RFC turned out somewhat similar to it.
I have thus taken a stab at it, and I am satisfied enough with the current result that I am here soliciting feedback.
Alright, enough talking.
Note: if you are looking for a revolution, you are going to be thoroughly disappointed; an explicit goal was to integrate seamlessly in the language and after several iterations I ended up with a mostly-library solution. I believe it to be a good thing.
Note: this RFC is fairly long, at over 35,000 characters; the impatient can search for the word DOM to get a quick feel for it.
- Start Date: (fill me in with today's date, YYYY-MM-DD)
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)
Summary
Solving the Servo DOM design requirements (and beyond), while integrating smoothly with the already existing trait mechanism.
This RFC provides:
- Data Polymorphism
- Trait Polymorphism
- Thin Polymorphic Pointer/References Support
- Runtime Type Information (to handle safe down-cast)
Motivation
Rust currently supports polymorphism through its traits; however, experience in Servo has raised a number of requirements which are not fulfilled.
A summary of those requirements is given here:
- cheap field access from internal methods;
- cheap dynamic dispatch of methods;
- cheap down-casting;
- thin pointers;
- sharing of fields and methods between definitions;
- safe, i.e., doesn't require a bunch of transmutes or other unsafe code to be usable;
- syntactically lightweight or implicit upcasting;
- calling functions through smartpointers, e.g. fn foo(JSRef, ...);
- static dispatch of methods.
There have already been a number of proposals (see Summary of Efficient Inheritance RFCs).
This RFC is similar in nature to Fat Objects and tries not to focus on building independent bricks, but instead focus on maximizing integration with the existing code and avoid splitting the Rust landscape into two incompatible run-time polymorphism paradigms, which would hurt re-usability.
This RFC thus designs two disjoint polymorphism paths (one for data, one for interfaces), and a couple of core structs to bridge the two, giving ad-hoc inheritance to those who wish for it in a manner that lets them use all pre-existing structs, traits and functions defined around them. Likewise, the structs and traits they create can be used without "inheritance".
It manages to do so with minimal compiler support, thus opening the door to other library schemes.
Detailed design
This RFC is rather long, as there is a lot to cover, and therefore it is separated into multiple subsections, each building (potentially) upon the preceding ones. Some sections (or parts) could potentially be used independently of the others.
In keeping with Rust tradition, this RFC preserves the orthogonality of defining data structures (in struct) and defining interfaces (in trait). By doing so, it maximizes the opportunity to mix and match "object-like" polymorphism and trait polymorphism.
As usual, all names are subject to discussion.
Transmutable, Coercible, Convertible
First things first, let us introduce a couple terms.
Transmutable
A type U is said to be transmutable to a type T if the compiler allows std::mem::transmute to convert from U to T. Doing so might lead to undefined results, of course, which is why the function is unsafe.
As an example, Box<T> is transmutable to *const T, irrespective of whether it is sensible to do so or not.
This relationship is (probably) symmetric.
Coercible
A type U is said to be coercible to a type T when transmuting U to T "works". This is deliberately vague, and sufficient for this RFC.
As an example, &str is coercible to &[u8], however the reverse is not true since not every byte sequence is a valid UTF-8 sequence.
This relationship is therefore asymmetric. It is also transitive.
The following intrinsic traits core::mem::{Coerce<T>,CoerceRef<T>,CoerceRefMut<T>} are introduced to allow querying this relationship:
unsafe trait Coerce<Target> {
    fn coerce(self) -> Target { unsafe { mem::transmute(self) } }
}
unsafe trait CoerceRef<Target> {
    // `self` is already a `&Self`; transmuting `&self` would reinterpret a `&&Self`.
    fn coerce_ref(&self) -> &Target { unsafe { mem::transmute(self) } }
}
trait CoerceRefMut<Target>: CoerceRef<Target> {
    fn coerce_ref_mut(&mut self) -> &mut Target { unsafe { mem::transmute(self) } }
}
The following blanket implementation is provided too:
unsafe impl<A, B> CoerceRef<B> for A
where A: Coerce<B>
{
}
It is expected that the compiler then allows implicit coercion from A to B when A: Coerce<B>, from &A to &B when A: CoerceRef<B> and from &mut A to &mut B when A: CoerceRefMut<B>.
Note: if the compiler chases down coercions (as they are transitive) it must take care to avoid running in circles.
Note: mutable references allow modifying the inner part and therefore while str: CoerceRef<[u8]> is viable, str: CoerceRefMut<[u8]> is not because it would allow mutating individual bytes and the resulting str would not necessarily hold a valid UTF-8 sequence.
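The asymmetry between &str and &[u8] can already be observed with today's standard library. The following is only a sketch of the idea, not the proposed Coerce machinery: the helper names upcast and checked_downcast are made up for illustration, while str::as_bytes and std::str::from_utf8 are existing std functions.

```rust
/// `&str` to `&[u8]` always succeeds: every `str` is a valid byte sequence.
/// (Hypothetical helper name, standing in for a `CoerceRef`-style coercion.)
fn upcast(s: &str) -> &[u8] {
    s.as_bytes()
}

/// The reverse direction needs a run-time check, because arbitrary bytes
/// are not necessarily valid UTF-8.
fn checked_downcast(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}
```

The first direction is free and infallible, which is what "coercible" is meant to capture; the second has to return an Option.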
Convertible
A type U is said to be convertible to a type T if there exists a valid transformation from U to T.
As an example, &'a String is convertible to &'a str, however the reverse is not true: while a String could be produced, it would not necessarily have the lifetime 'a.
This relationship is therefore asymmetric. It is also transitive.
Data Polymorphism
Data Polymorphism concerns itself with making it possible to safely access part of a &U via a &T.
Syntax
Grammar:
struct StructName: [pub] StructName (+ [pub] StructName)* {
// other attributes
}
Example:
struct FirstParent;
struct SecondParent {
    a: isize
}
impl SecondParent {
    fn increment(&mut self) { self.a += 1; }
}
struct Child: FirstParent + pub SecondParent {
    // other attributes
}
// `increment` takes `&mut self`, so a mutable reference is required.
fn usage(d: &mut Child) {
    d.increment(); // calls SecondParent::increment()
    println!("{}", d.a);
}
Semantics
By deriving from another struct, the derived struct embeds its parents' fields. However, due to encapsulation, it can only access fields that it could access if the parent was an attribute.
In essence, the previous example could be rewritten:
struct Child {
    _super_first: FirstParent, // 0 bytes, but mentioned anyway
    pub _super_second: SecondParent,
    // other attributes
}
fn usage(child: &mut Child) {
    child._super_second.increment();
    println!("{}", child._super_second.a);
}
Guarantees
In addition to the syntactic sugar, a number of guarantees are made:
- Child is said to derive from Parent if Parent is an immediate base of Child or if any base of Child derives from Parent.
- &Child is convertible to &Parent, if Child derives from Parent.
- &Child is coercible to &Parent, if Child derives from Parent and Parent is an empty struct.
- &Child is coercible to &Parent, if Child derives from Parent and Parent is the first non-empty base of Child.
- &Child is coercible to &Parent, if, T being the first non-empty base of Child, &T is coercible to &Parent.
For ease of use, the compiler should perform implicit upcasts as necessary, either to access the fields or methods of the Parent. Down-casts will be dealt with in a later chapter.
As shown in the semantics section, pub affects whether a parent is publicly exposed or not. &Child is only safely coercible to &Parent in a given lexical scope if that relationship is accessible from this scope.
Note: the compiler should automatically implement the Coerce and CoerceRef traits, when it makes sense, between Parent and Child. It should never, however, implement the CoerceRefMut trait on its own as it might allow violating some of the child invariants (which need to be stricter than the parent's per the Liskov substitution principle); this implementation is left to the user.
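To give a feel for why the first non-empty base is special, here is a rough emulation in today's Rust, using plain composition and #[repr(C)] in place of the proposed struct Child: Parent syntax. The field name parent and the helper functions are assumptions made for this sketch; the only fact relied upon is that #[repr(C)] places the first field at offset 0.

```rust
#[derive(Debug, Default)]
#[repr(C)]
struct SecondParent {
    a: i32,
}

impl SecondParent {
    fn increment(&mut self) {
        self.a += 1;
    }
}

// Stand-in for `struct Child: pub SecondParent { extra: u8 }`.
#[derive(Debug, Default)]
#[repr(C)]
struct Child {
    parent: SecondParent, // first non-empty base, placed at offset 0
    extra: u8,
}

/// Safe "conversion": borrow the embedded parent.
fn as_parent(child: &Child) -> &SecondParent {
    &child.parent
}

/// Checks that the parent shares its address with the child, which is the
/// property that makes a pointer reinterpretation ("coercion") possible.
fn parent_is_at_offset_zero(child: &Child) -> bool {
    let child_addr = child as *const Child as *const u8;
    let parent_addr = as_parent(child) as *const SecondParent as *const u8;
    std::ptr::eq(child_addr, parent_addr)
}
```

A second non-empty base would live at a non-zero offset, so reaching it requires a pointer adjustment (a conversion) rather than a plain reinterpretation.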
Common Ancestor
This RFC defines that () (the unit type) is a common ancestor to all structs:
- For any T, &T is coercible to &().
- () being an empty struct, it does not prevent coercibility to any other parent.
Disambiguation
If an ambiguity arises due to conflicting field names or method names, disambiguation requires explicitly casting to the appropriate reference type or simply naming the parent:
fn usage(child: &mut Child) {
SecondParent::increment(child);
println!("{}", (child as &SecondParent).a);
}
Trait Polymorphism
Trait Polymorphism concerns itself with making it possible to safely access part of the interface of a &U via a &T.
State of the art
This is already possible in Rust today:
trait FirstBase {
fn first_method(&self);
}
trait SecondBase {
fn second_method(&self);
}
trait Derived: FirstBase + SecondBase {
fn derived_method(&self);
}
Static Dispatch
FIXME (2015-05-15): this section may still require a review
It is possible to guarantee static dispatch of a trait method, by explicitly selecting the enum or struct for which it was implemented.
As a reminder:
struct Child: FirstParent + SecondParent { ... }
impl FirstBase for FirstParent { ... }
impl Child {
fn first_method(&self) { ... }
}
fn doit(child: &Child) {
<Child as FirstBase>::first_method(child);
}
A more complicated case may also involve a struct ambiguity; it is resolved similarly:
impl SecondBase for FirstParent { ... }
impl SecondBase for SecondParent { ... }
// Note: SecondBase not implemented for Child
fn doit(child: &Child) {
<SecondParent as SecondBase>::second_method(child);
}
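The disambiguation syntax itself already exists in current Rust. A minimal, self-contained sketch (the Greet trait and Widget type are invented for this example) shows an inherent method shadowing a trait method of the same name, and the fully qualified form selecting the trait implementation statically:

```rust
trait Greet {
    fn name(&self) -> &'static str;
}

struct Widget;

impl Widget {
    // Inherent method with the same name as the trait method.
    fn name(&self) -> &'static str {
        "inherent"
    }
}

impl Greet for Widget {
    fn name(&self) -> &'static str {
        "trait"
    }
}

/// Method-call syntax picks the inherent method first.
fn inherent_name(w: &Widget) -> &'static str {
    w.name()
}

/// Fully qualified syntax selects the trait implementation, statically.
fn trait_name(w: &Widget) -> &'static str {
    <Widget as Greet>::name(w)
}
```

Both calls are resolved at compile time; no v-table is involved.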
Trait implementation re-use
Much like a struct may access its parent's field or method unambiguously, so may a trait.
By default, should a parent of a struct implement a trait, it is not necessary for the struct to implement any method, though it still has to explicitly declare the trait implementation:
impl FirstBase for FirstParent {
//
}
impl FirstBase for Child {} // automatic, re-use FirstParent's implementation
In the case that multiple parents of a struct could provide the trait implementation, the implementation requires disambiguation. A manual implementation must be provided; it may delegate using Static Dispatch as appropriate:
impl FirstBase for Child {
fn first_method(&self) {
<SecondParent as FirstBase>::first_method(self);
}
}
A macro could be provided to perform the boring job of forwarding the arguments appropriately, although this case should probably be rare enough in practice that it might just not be worth it.
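As a rough idea of what such a forwarding macro might look like in today's Rust, here is a sketch using composition instead of the proposed parent syntax. The macro name forward_first_base, the field names, and the u32 return value (added only so the result is observable) are all assumptions of this example.

```rust
trait FirstBase {
    fn first_method(&self) -> u32;
}

struct FirstParent;
impl FirstBase for FirstParent {
    fn first_method(&self) -> u32 {
        1
    }
}

struct SecondParent;
impl FirstBase for SecondParent {
    fn first_method(&self) -> u32 {
        2
    }
}

// Both embedded "parents" implement FirstBase, so the child must pick one.
struct Child {
    first: FirstParent,
    second: SecondParent,
}

/// Hypothetical helper: implements FirstBase for `$child` by forwarding
/// every call to the field `$field`.
macro_rules! forward_first_base {
    ($child:ty, $field:ident) => {
        impl FirstBase for $child {
            fn first_method(&self) -> u32 {
                FirstBase::first_method(&self.$field)
            }
        }
    };
}

forward_first_base!(Child, second);
```

A real version would have to be generated per trait (or written as a procedural macro), which supports the remark that it may not be worth the trouble.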
core::rtti
In order to provide all the necessary functionality, a new module is introduced: core::rtti, re-exported as std::rtti as part of the std facade.
TypeInfo, VTable and VRef
The intrinsic types core::rtti::{TypeInfo,VTable} and the library type core::rtti::VRef<Trait> are introduced:
struct TypeInfo {
size: usize,
align: usize, // Note: log2(align) could be packed in unused bytes of size.
trait_id: TypeId,
struct_id: TypeId,
// ... others ?
down_cast_trait: fn (TypeId) -> Option<&'static VTable>,
down_cast_struct: fn (TypeId) -> bool,
}
struct VTable {
type_info: &'static TypeInfo,
}
struct VRef<Trait> {
vtable: &'static VTable,
}
impl<Trait> VRef<Trait> {
fn get_type_info(&self) -> &'static TypeInfo;
fn drop(&self, it: &mut ()) {
unsafe {
// only valid for below layout on x86/x86_64
let vptr = self.vtable as *const isize;
let drop = vptr.offset(1);
let drop: fn (&mut ()) -> () = mem::transmute(*drop);
drop(it)
}
}
fn get_vtable(&self) -> &'static VTable;
// ...
}
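Part of TypeInfo can be approximated with what the standard library offers today. The sketch below is an assumption-laden stand-in, not the proposed intrinsic type: it keeps only size, align and struct_id, and the constructor of and the query is are names invented here. std::any::TypeId, size_of and align_of are existing std items.

```rust
use std::any::TypeId;
use std::mem::{align_of, size_of};

/// Reduced, library-only stand-in for the proposed `TypeInfo`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TypeInfo {
    size: usize,
    align: usize,
    struct_id: TypeId,
}

impl TypeInfo {
    /// Collects the layout and identity of `S` at compile time.
    fn of<S: 'static>() -> Self {
        TypeInfo {
            size: size_of::<S>(),
            align: align_of::<S>(),
            struct_id: TypeId::of::<S>(),
        }
    }

    /// Cheap check for the most-derived type: a single `TypeId` comparison.
    fn is<S: 'static>(&self) -> bool {
        self.struct_id == TypeId::of::<S>()
    }
}
```

What cannot be expressed this way are the down_cast_trait and down_cast_struct entries, which need knowledge of the whole hierarchy and are the reason compiler support is requested.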
The methods of the VTable are not explicitly mentioned because their number depends on the particular trait being implemented, so instead a raw memory layout similar to the following is expected:
trait A: Clone + B + E {}
trait B { fn bar(&self); }
trait E {}
//
// Memory Layout of the VTable of A
//
// Each cell [n] represents a pointer-sized memory cell (on x86/x86_64)
// where n is the compile-time index giving access to its content.
//
// VRef<A> VRef<Clone> VRef<B> VRef<E>
// | | | |
// +-+ +-+ | +-+
// [0] VTable (A) + + + + | + +
// + + + + | + +
// [1] Drop::drop + + + + | + +
// + + + + | +-+
// [2] Clone::clone + + + + |
// + + + + |
// [3] Clone::clone_from + + + + |
// + + +-+ +-+
// [4] VTable (B) + + + +
// + + + +
// [5] Drop::drop + + + +
// + + + +
// [6] B::bar + + + +
// +-+ +-+
//
// Both VTable (A) and VTable (B) refer to TypeInfo (A).
//
The following guarantees are made:
- VRef<Derived> is convertible to VRef<Base> if Base is a super-trait of Derived.
- VRef<Base> is dynamically convertible to VRef<Derived> if Base is a super-trait of Derived; the conversion may fail if the current object does not implement Derived though.
- VRef<Derived> is coercible to VRef<Base> if Base is an empty trait.
- VRef<Derived> is coercible to VRef<Base> if Base is the first non-empty super-trait of Derived.
- VRef<Derived> is coercible to VRef<Base> if, T being the first non-empty super-trait of Derived, VRef<T> is coercible to VRef<Base>.
The design of TypeInfo and VTable was done with simplicity in mind, as a proof of concept. Their final implementation might vary wildly depending on target architecture, compilation options and benchmark results.
Class and DynClass
The library types core::rtti::{Class<Trait, Struct>, DynClass<Trait, Struct>} are introduced:
#[repr(C)] // Need to ensure layout compatibility with DynClass
struct Class<T, S>
where S: T
{
vref: VRef<T>,
data: S,
}
#[repr(C)] // Need to ensure layout compatibility with Class
struct DynClass<T, S> {
vref: VRef<T>,
data: S,
_: [u8], // Attempt to prevent `DynClass` from being accidentally used by value
}
impl<T, S> DynClass<T, S> {
fn as_trait(&self) -> &T;
fn as_trait_mut(&mut self) -> &mut T;
fn as_struct(&self) -> &S;
fn as_struct_mut(&mut self) -> &mut S;
}
It should be noted that a Class is just a package and does not in itself allow any polymorphism. Polymorphism is delegated to DynClass, which can be created from a Class:
- Class<T, S> is convertible to DynClass<T, S>
- &Class<T, S> is coercible to &DynClass<T, S>
- &mut Class<T, S> is coercible to &mut DynClass<T, S>
DynClass itself implements type erasure and thus offers polymorphic behaviour. Based on the coercible properties of VRef and struct, we get:
- &DynClass<D, C> is coercible to &DynClass<B, C> if VRef<D> is coercible to VRef<B>.
- &mut DynClass<D, C> is coercible to &mut DynClass<B, C> if VRef<D> is coercible to VRef<B>.
- &DynClass<D, C> is coercible to &DynClass<D, P> if &C is coercible to &P.
- &mut DynClass<D, C> is coercible to &mut DynClass<D, P> if &mut C is coercible to &mut P.
Note: unlike Class, DynClass does not require that S implements T. This is intentional; it allows flexibility in the coercion of references. It is still safe because it can only be created from Class so that the actual type stored (most-derived type) does implement T.
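The "v-table pointer first, data second" packaging can be emulated by hand in today's Rust, which may help in picturing what Class and a thin reference to it would amount to. This is a sketch under several assumptions: the Shape trait, the single-slot VTable, the ThinRef type (standing in for &DynClass) and the per-type statics are all invented for the example, and the v-table is written manually where the RFC expects the compiler to generate it.

```rust
use std::marker::PhantomData;
use std::ptr::NonNull;

trait Shape {
    fn area(&self) -> f64;
}

/// Hand-written v-table with a single slot.
struct VTable {
    area: unsafe fn(NonNull<()>) -> f64,
}

/// V-table reference first, payload second, as in the proposed `Class`.
#[repr(C)]
struct Class<S> {
    vtable: &'static VTable,
    data: S,
}

/// Recovers the concrete `Class<S>` behind an erased pointer.
/// Safety: `ptr` must point to a live `Class<S>`.
unsafe fn area_shim<S: Shape>(ptr: NonNull<()>) -> f64 {
    let class: &Class<S> = unsafe { ptr.cast::<Class<S>>().as_ref() };
    class.data.area()
}

struct Circle { radius: f64 }
impl Shape for Circle {
    fn area(&self) -> f64 { std::f64::consts::PI * self.radius * self.radius }
}

struct Square { side: f64 }
impl Shape for Square {
    fn area(&self) -> f64 { self.side * self.side }
}

static CIRCLE_VTABLE: VTable = VTable { area: area_shim::<Circle> };
static SQUARE_VTABLE: VTable = VTable { area: area_shim::<Square> };

// Typed constructors keep the v-table and the payload type in sync.
impl Class<Circle> {
    fn new(data: Circle) -> Self { Class { vtable: &CIRCLE_VTABLE, data } }
}
impl Class<Square> {
    fn new(data: Square) -> Self { Class { vtable: &SQUARE_VTABLE, data } }
}

/// One-pointer-wide, type-erased reference to some `Class<S>`.
#[derive(Clone, Copy)]
struct ThinRef<'a> {
    ptr: NonNull<()>,
    _borrow: PhantomData<&'a ()>,
}

impl<'a> ThinRef<'a> {
    fn new<S: Shape>(class: &'a Class<S>) -> Self {
        ThinRef { ptr: NonNull::from(class).cast(), _borrow: PhantomData }
    }

    fn area(self) -> f64 {
        unsafe {
            // #[repr(C)] puts the v-table reference at offset 0.
            let vtable: &'static VTable = *self.ptr.cast::<&'static VTable>().as_ptr();
            (vtable.area)(self.ptr)
        }
    }
}
```

The sketch is only sound as long as every Class is built through the typed constructors; enforcing that pairing is precisely the job the RFC assigns to Class and the compiler.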
Dyn
While one can always use () as a common ancestor in the above types, this RFC introduces a type alias core::rtti::Dyn<Trait> as a short-hand.
type Dyn<T> = DynClass<T, ()>;
DownCast, DownCastRef and DownCastRefMut
In order to control down-casts, the traits core::rtti::{DownCast<Target>, DownCastRef<Target>, DownCastRefMut<Target>} are introduced.
trait DownCast<Target> {
fn down_cast(Self) -> Result<Target, Self>;
unsafe fn unchecked_down_cast(Self) -> Target;
}
trait DownCastRef<Target> {
fn down_cast_ref(&Self) -> Option<&Target>;
unsafe fn unchecked_down_cast_ref(&Self) -> &Target;
}
trait DownCastRefMut<Target> {
fn down_cast_ref_mut(&mut Self) -> Option<&mut Target>;
unsafe fn unchecked_down_cast_ref_mut(&mut Self) -> &mut Target;
}
The unsafe functions are provided for speed: they perform the necessary runtime adjustments (if any) without checking whether the down-cast is valid or not.
In contrast, the safe functions perform the check, and indicate the cast failure thanks to their more elaborate return type.
Note: there is no blanket implementation of DownCastRef or DownCastRefMut based on DownCast because the latter allows more operations (offset adjustments) than the former.
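The return types chosen here mirror what std::any::Any already offers for concrete types, which gives a runnable point of comparison. In the sketch below, TextNode, ImageNode and the two helper functions are made up for illustration; downcast_ref and Box<dyn Any>::downcast are existing std methods. Unlike the proposed traits, Any can only reach the exact concrete type, never an intermediate parent.

```rust
use std::any::Any;

#[derive(Debug)]
struct TextNode {
    text: String,
}

#[derive(Debug)]
struct ImageNode {
    src: String,
}

/// By-reference down-cast: `None` signals a failed cast.
fn text_of(node: &dyn Any) -> Option<&str> {
    node.downcast_ref::<TextNode>().map(|t| t.text.as_str())
}

/// By-value down-cast: on failure the original box is handed back,
/// just like `Result<Target, Self>` in the proposed `DownCast`.
fn into_text(node: Box<dyn Any>) -> Result<Box<TextNode>, Box<dyn Any>> {
    node.downcast::<TextNode>()
}
```

Returning the original value in the error case matters for owned pointers: the caller can try another target type without losing the object.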
Casting
It would be better if the compiler could perform the up-casts implicitly, much like with Deref; there seems to be little point in having the user fiddle with as over and over.
The type DynClass implements the DownCast and Coerce traits as necessary.
impl<T, S> Drop for DynClass<T, S> {
fn drop(&mut self) {
unsafe {
let data: &mut () = core::mem::transmute(&mut self.data);
self.vref.drop(data); // `drop` is defined on VRef, not on TypeInfo
}
}
}
unsafe impl<T, S, B, P> Coerce<Box<DynClass<B, P>>> for Box<DynClass<T, S>>
where VRef<T>: Coerce<VRef<B>>,
S: P, S: CoerceRef<P>
{
}
unsafe impl<T, S, B, P> CoerceRef<DynClass<B, P>> for DynClass<T, S>
where VRef<T>: Coerce<VRef<B>>,
S: P, S: CoerceRef<P>
{
}
impl<T, S, B, P> CoerceRefMut<DynClass<B, P>> for DynClass<T, S>
where VRef<T>: Coerce<VRef<B>>,
S: P, S: CoerceRefMut<P>
{
}
impl<T, S, D, C> DownCast<Box<DynClass<D, C>>> for Box<DynClass<T, S>>
where D: T,
C: S, C: CoerceRef<S>
{
// ...
}
impl<T, S, D, C> DownCastRef<DynClass<D, C>> for DynClass<T, S>
where VRef<D>: Coerce<VRef<T>>,
C: S, C: CoerceRef<S>
{
// ...
}
impl<T, S, D, C> DownCastRefMut<DynClass<D, C>> for DynClass<T, S>
where VRef<D>: Coerce<VRef<T>>,
C: S, C: CoerceRefMut<S>
{
// ...
}
Interaction with Box, Rc, ...
There is no built-in way in Box::new to select the desired size and alignment, so dedicated helper functions should be provided to convert in and out of DynClass:
use core::mem;
fn make_dynamic<T, S>(original: Box<Class<T, S>>) -> Box<DynClass<T, S>> {
unsafe { mem::transmute(original) }
}
fn make_static<T, S>(original: Box<DynClass<T, S>>) -> Result<Box<Class<T, S>>, Box<DynClass<T, S>>> {
let allowed = original.vref.get_type_info().trait_id == TypeId::of::<T>() &&
original.vref.get_type_info().struct_id == TypeId::of::<S>();
if allowed { Ok(unsafe { mem::transmute(original) }) }
else { Err(original) }
}
fn up_cast<T, S, B, P>(original: Box<DynClass<T, S>>) -> Box<DynClass<B, P>>
where T: B, S: P, S: CoerceRef<P>;
Sanity Check
Let us verify how each of the requirements was addressed in this RFC.
- cheap field access from internal methods
  - provided via monomorphization (of trait methods) or via &Struct conversions.
- cheap dynamic dispatch of methods
  - the call is performed like with regular traits, this RFC only changes the packaging.
- cheap down-casting
  - down-casting to the most-derived trait/struct may be cheap, a simple comparison of TypeId.
  - down-casting to an intermediate trait/struct may be slightly more costly, as a few more comparisons will be necessary.
- thin pointers
  - provided, though lack of integrated support makes them slightly awkward.
- sharing of fields and methods between definitions
  - provided by struct and trait polymorphism.
- safe, i.e., doesn't require a bunch of transmutes or other unsafe code to be usable
  - unsafe will be present in the core library, however user code should not need it, though some check bypasses are provided.
- syntactically lightweight or implicit upcasting
  - it is expected that the compiler will perform the upcasting by itself (through tight integration of Coerce and CoerceRef), otherwise calling the methods will be necessary.
- calling functions through smart pointers, e.g. fn foo(JSRef, ...)
  - through Deref
- static dispatch of methods
  - provided by the regular procedural syntax call (see Static Dispatch)
Implementing the DOM according to requirements
Let us now see what a simple DOM would look like given those facilities, as it is the reference example used by the existing RFCs.
trait Node {}
struct NodeData {
parent: Rc<DynClass<Node, NodeData>>,
first_child: Rc<DynClass<Node, NodeData>>,
}
impl Node for NodeData {}
struct TextNode: NodeData {
}
impl Node for TextNode {}
trait Element: Node {
fn before_set_attr(&mut self, key: &str, val: &str) { ... }
fn after_set_attr(&mut self, key: &str, val: &str) { ... }
}
struct ElementData: NodeData {
attrs: HashMap<String, String>,
}
// Note: private access to ElementData::attrs, ensuring invariants;
// also, this method is always statically dispatched and thus inlinable.
impl ElementData {
fn set_attribute(&mut self, key: &str, value: &str) {
self.before_set_attr(key, value);
// update
self.after_set_attr(key, value);
}
}
impl Element for ElementData {}
struct HTMLImageElement: ElementData {
}
impl Node for HTMLImageElement {}
impl Element for HTMLImageElement {
fn before_set_attr(&mut self, key: &str, value: &str) {
if key == "src" {
// remove cached image
}
<ElementData as Element>::before_set_attr(self, key, value);
}
}
struct HTMLVideoElement: ElementData {
cross_origin: bool,
}
impl Node for HTMLVideoElement {}
impl Element for HTMLVideoElement {
fn after_set_attr(&mut self, key: &str, value: &str) {
if key == "crossOrigin" {
self.cross_origin = (value == "true");
}
<ElementData as Element>::after_set_attr(self, key, value);
}
}
fn process_any_element<'a>(element: &'a DynClass<Element, ElementData>) { ... }
fn main() {
let video_element: Rc<Class<Element, HTMLVideoElement>> =
Rc::new(Class::new(HTMLVideoElement::new(...)));
process_any_element(&*video_element);
let node = video_element.first_child.clone();
if let Some(element): Option<&DynClass<Element, ElementData>> = node.down_cast_ref() {
// ...
} else if let Some(text): Option<&DynClass<Node, TextNode>> = node.down_cast_ref() {
// TextNode derives from NodeData and implements Node (not Element), so it is tested along that axis
// ...
} else {
// ...
}
}
Wrapping up
This RFC proposes an extremely lightweight way to add polymorphism in terms of compiler integration: only 2 intrinsic types (TypeInfo and VTable) and 2 intrinsic traits (Coerce, CoerceRef) are necessary, with all the functionality then being implemented in library code.
Furthermore, it supplements already existing Rust polymorphism without duplicating its functionality, therefore guaranteeing to the user that libraries using this functionality or not can still be hooked up together with little to no boilerplate.
Yet, despite being lightweight and rusty, it is quite possible to translate traditional object hierarchies with a one-to-one mapping, use type aliases to mask the novelty and type inference to avoid ever mentioning it directly.
Drawbacks
- no sugar around box: unfortunately, the story about box is still unclear, so we can only provide a helper method; this makes creating/cloning smart pointers awkward.
- heavier syntax (&T vs &Dyn<T>): it is expected that the need for such bundling be rare, and it is possible to convert to &T immediately (so that only struct code and not fn code be affected).
- opinionated: not many building blocks here.
Alternatives
Comparison to existing RFCs
There are many other RFCs, as already mentioned:
- #9: Fat Objects
- #11: Extending Enums
- #223: Trait Based Inheritance
- #250: Associated Field Inheritance
This RFC emphasizes flexibility and a clean separation of concerns between payload (struct), behaviour (trait) and usage (&T or &Dyn<T>). The same struct or trait can freely be shared in situations where thin pointers are desirable or not.
This RFC can be seen as a refined version of Fat Objects (#9), proposing a more fully fleshed out implementation and simplifying the implementation of non-virtual methods by simply adding them to the struct rather than creating an extraneous trait. It could actually benefit from a better integration of custom DSTs in the language, as DynClass is not too well integrated.
Compared to ... (#11), this RFC does not require distinguishing between enums that can be extended and enums that cannot (mixes payload and usage). This distinction already exists today in C++ (inheriting a class without a virtual destructor) and has proven to be a pain point of the language; it introduces a split in the language ecosystem between those structs that can be extended and those that cannot. On the contrary, this RFC emphasizes that every existing trait and struct can be reused, and no foresight is necessary when designing new ones. It is also significantly less ambitious, and does not attempt any large scale changes to the language beyond fulfilling the given requirements.
Compared to Extend (#223) or associated fields (#250), this RFC does not inject data in traits (mixes payload and behaviour). I consider this its main advantage. It neatly sidesteps the issue of splitting the ecosystem into stateful traits and stateless traits, and therefore guarantees that traits can be shared between any library, in any direction.
Compared to the associated fields (#250), this RFC's approach to fields is both cheaper than the indirect fields approach (with its required offset in v-table per field) and less constrained than the #[repr(fixed)] approach (which precludes implementing two fixed traits with contradicting requirements). It also does not require the compiler to try and guarantee the non-aliasing of fields. On the other hand, it is obviously less flexible given its conservative choice (no renaming/re-arrangement).
Compared to the associated fields (#250), this RFC's approach does not require that common fields be public, which is a violation of encapsulation. The struct can define methods with exclusive access to its fields, guaranteeing the invariants of its choice, and because those methods are not polymorphic they can be easily inlined. Still, if desired, its fields can be public.
Compared to the Internal Vtable (#250), this RFC once again avoids enforcing that a struct or trait only be usable in a particular way (mixes payload and usage). This once again allows using either the struct or trait in other contexts, where this particular representation would be less attractive (it is known that LLVM has issues with devirtualizing calls through internal v-pointers, for example).
Explicit Coercions
It is possible to require explicit coercions, however it seems strange in light of the Deref precedent and may lead to a usability hit because of the extra verbosity. Still, inference might make it sufficiently lightweight.
No disambiguation in Data Polymorphism
It is possible to remove the necessity for disambiguation in parent's field or method selection by simply forbidding ambiguities, after all this can be checked easily enough.
Given that the necessity for disambiguation already arises when a single struct implements multiple traits with the same method name, it seems little effort to allow it.
On the other hand, the potential usability blow in forbidding ambiguities would be significant. There is a reason it is allowed for traits: requiring two 3rd party libraries not to impinge on each other's names is reminiscent of the C days, when there were no namespaces.
Alternative Syntax for Data Polymorphism and disambiguation
Another syntax could be considered to provide data polymorphism, for example by using an attribute on a data-member (as #230).
Similarly, another syntax could be considered to provide disambiguation. C++ for example uses child.Parent::method().
The current syntaxes, however, were selected by mirroring the syntax of trait polymorphism and trait disambiguation (respectively), increasing their accessibility and trying to avoid making the learning curve even steeper than it already is.
Still, as with naming, this is open to debate.
Public/Private Data Polymorphism
As proposed, a struct may derive from another either publicly struct A: pub B or privately struct A: B with the latter being the default (as is usual in Rust: more capability is opt-in).
This RFC would be pretty much as useful without this distinction, in which case there are two alternatives:
- only public polymorphism: the diamond inheritance problem becomes bothersome, as the work-around today relies on privacy to enforce data coherence.
- only private polymorphism: in which case the user can publicize the relationship by implementing Coerce and CoerceRef manually; those traits are unsafe though.
I would lean toward the latter solution as it offers more encapsulation. It is in line with fields (private by default) and not with traits (extended traits are public), and therefore since it adds fields it seems logical to follow their precedent.
Dyn?
The Dyn alias is optional. It is purely added for convenience.
No common ancestor ()
There does not seem to be any disadvantage in having () be a common ancestor to all structs given that it has no capability and no overhead.
Still, if it were rejected, the Dyn alias could still be implemented, though as a full-blown struct instead.
DynMultiClass?
In order to navigate through an inheritance tree, more information might be required than just a thin pointer, so as to be able to point within the data.
It is possible to arrange the VTable structure such that navigating is easy, as demonstrated, by encoding a back pointer to the most derived type TypeInfo at the start of the VTable before each subtype's inlined list of methods. It might or might not be desirable, the cost being relatively minor as a single instance of a given VTable exists, and this instance is tightly packed in ROM.
This strategy, however, is probably less desirable for struct, since those are destined to be massively instantiated. In this case, a dedicated struct DynMultiClass could be added to core::rtti, which would maintain a third field (compared to DynClass): an offset in the struct.
This RFC seems complicated enough as it is, without inviting more trouble, so this is tabled for now.
Note: the signature of down_cast_struct would become fn (TypeId, isize) -> Option<isize> for example.
Drop
In this proposal, when rtti is enabled and a trait has multiple non-empty bases, the drop glue is duplicated at the start of the VTable of each non-empty base.
An alternative would be, in this case, to hoist it inside TypeInfo. Of course, it would then require two dereferences instead of one to access drop, which might cause performance penalties.
However, one more pointer in the VTable is not that costly (there are only so many of them), therefore this RFC promotes duplication, at least as the default case.
down_cast_trait/down_cast_struct unification?
For clarity, two methods were provided.
Having two methods can be beneficial, as each is significantly simpler, however it can also lead to extra overhead (two indirect calls) when down-casting on both the trait and struct axes. A unified method could be implemented, either as replacement or in addition.
Diamond inheritance handling
Diamond inheritance in traits is already handled: the caller must disambiguate which super-trait she wishes to call.
Diamond inheritance in structs is however much more complicated: there is a significant difference between having one field or two. A cursory search reveals that few languages have attempted to provide a method to share fields, most being content enough to be able to disambiguate.
Furthermore, a user-land solution to inconsistency exists: if parents are private (as the default), then it is up to the child struct to re-export their methods, and in doing so it can easily enough ensure the consistency of both data. It is, of course, somewhat onerous and error-prone.
It seems that any language solution to this problem could be added backward compatibly.
This RFC seems complicated enough as it is, without inviting more trouble, so this is tabled for now.
Unresolved questions
- trait impl cannot be private today, should we try to keep the Child/Parent relationship secret and if so how? Wiring privacy into C: P checks?
- How to guarantee that VRef only takes a trait parameter? The answer is probably linked to how to get VRef to obtain its &'static VTable pointer.
Change Log
2015-05-15
- Introduce CoerceRefMut: &str is coercible to &[u8] but making &mut str coercible to &mut [u8] may violate its soundness. Thus CoerceRefMut is now opt-in.
- Introduce DownCastRefMut for the same reason.
- Refactor Coerce and co to avoid associated types, as it seems that this prevents them from being implemented multiple times (with different targets) for the same type. In hindsight, it kind of feels obvious.
- Refactor DownCast and co for the same reasons.
- Remove any talk about tail-padding, as per cmr comment; it's left as an implementation detail.
- Remove any talk about cfg attributes, it's distracting.
- Expand Static Dispatch section
- DynClass cannot implement multiple Deref, so move to explicit methods