OOP inheritance adapted to Rust


#1

There is an explosion of different proposals for single inheritance, each with a different solution to the DOM problem. However, I was wondering how a typical implementation of inheritance from mainstream languages would work with Rust, which is highly focused on safety and performance and embraces the principle of least surprise.

I started with the most low-level implementation out there, which can be found in C++. Restricted to single inheritance, it is implemented by composition. The derived class contains the base class as its prefix, and all new fields are placed after that prefix. This way an upcast to the base class is trivial. Methods of the base class can also be called on derived instances, because the layout of the relevant fields is the same.
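The prefix layout can be imitated in today's Rust with plain composition. The following is only a sketch: `Base`, `Derived` and the helper functions are hypothetical names, and `#[repr(C)]` is used because it guarantees that the first field sits at offset 0.

```rust
// Hypothetical types used only for illustration.
#[repr(C)] // fixes the field order, so `base` is at offset 0
struct Base {
    id: u32,
}

#[repr(C)]
struct Derived {
    base: Base, // the base "prefix"
    extra: u64, // new fields follow the prefix
}

impl Base {
    fn id(&self) -> u32 {
        self.id
    }
}

/// Upcast by borrowing the prefix; no pointer adjustment is needed.
fn upcast(d: &Derived) -> &Base {
    &d.base
}

/// True when the base prefix starts at the same address as the whole object.
fn prefix_at_offset_zero(d: &Derived) -> bool {
    let whole = d as *const Derived as usize;
    let prefix = &d.base as *const Base as usize;
    whole == prefix
}

fn main() {
    let d = Derived { base: Base { id: 7 }, extra: 42 };
    assert!(prefix_at_offset_zero(&d));
    // A base method works on the derived instance through the prefix.
    assert_eq!(upcast(&d).id(), 7);
    println!("extra = {}", d.extra);
}
```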

Object slicing

The biggest problem of inheritance in C++ is called slicing. It can have two different forms - slicing on values and slicing on references. The former would look like this:

// Value extends BaseValue, both define method foo()
let v = Value;
let bv = v.clone() as BaseValue; // explicit upcast
let bw : BaseValue = v.clone();  // implicit upcast
bv.foo(); // calls BaseValue::foo()
bw.foo(); // calls BaseValue::foo()

The problem is that only the BaseValue part of the value is moved. But what should we do with the remaining fields that were not moved and are about to be destroyed? For derived types without Drop, the remaining fields can simply be dropped. But for types implementing the Drop trait the compiler should not allow partial moves. The best rule is probably to forbid implicit upcasting of values completely.
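Today's Rust already behaves this way for composition, which gives a feel for the rule. In this sketch (all type and function names are hypothetical) the base part can be moved out of a plain struct, while a struct implementing Drop has to be sliced by an explicit copy:

```rust
#[derive(Clone, Debug, PartialEq)]
struct BaseValue {
    name: String,
}

#[derive(Clone)]
struct Value {
    base: BaseValue,
    payload: Vec<u8>,
}

/// Explicit "upcast" by value: moves the base part out.
fn slice_by_move(v: Value) -> BaseValue {
    v.base // the remaining field `v.payload` is simply dropped
}

// A derived type with a destructor.
struct Guarded {
    base: BaseValue,
}

impl Drop for Guarded {
    fn drop(&mut self) {
        println!("dropping Guarded({})", self.base.name);
    }
}

/// For a `Drop` type the base part has to be copied out explicitly.
fn slice_by_clone(g: &Guarded) -> BaseValue {
    // Moving `base` out of an owned `Guarded` is rejected by the
    // compiler with error E0509, so we clone instead.
    g.base.clone()
}

fn main() {
    let v = Value {
        base: BaseValue { name: "a".to_string() },
        payload: vec![1, 2, 3],
    };
    println!("payload length = {}", v.payload.len());
    let bv = slice_by_move(v.clone());
    assert_eq!(bv, v.base);

    let g = Guarded { base: BaseValue { name: "g".to_string() } };
    let bg = slice_by_clone(&g);
    assert_eq!(bg.name, "g");
}
```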

Object slicing can happen in assignments, when passing parameters to functions, and when passing self to methods. This is a problem for library interfaces, because derived types would be passed incorrectly to functions taking parameters by value. Such functions would need to be generic to take any derived type by value correctly. This also requires that type parameter bounds can express that a type has to be derived from some base type. Since it is not feasible to design public interfaces on a large scale using only generics for the sake of hypothetical derived types, we would not lose much if we disallowed extending types from an external crate. In any case, the rules can be relaxed in the future.
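A rough approximation of such a bound in current Rust is a trait bound like `AsRef<Base>`. This is an assumption made for illustration, not part of any proposal, and `Base`, `Derived` and `register` are hypothetical names:

```rust
struct Base {
    id: u32,
}

struct Derived {
    base: Base,
    tag: &'static str,
}

// "Is a Base or is derived from Base", expressed as a trait bound.
impl AsRef<Base> for Base {
    fn as_ref(&self) -> &Base {
        self
    }
}

impl AsRef<Base> for Derived {
    fn as_ref(&self) -> &Base {
        &self.base
    }
}

/// Takes any "derived" value whole, so nothing is sliced off.
/// The value is handed back to show that it survived intact.
fn register<T: AsRef<Base>>(value: T) -> (u32, T) {
    let id = value.as_ref().id;
    (id, value)
}

fn main() {
    let (id, d) = register(Derived { base: Base { id: 5 }, tag: "x" });
    assert_eq!(id, 5);
    assert_eq!(d.tag, "x");

    let (id, _b) = register(Base { id: 9 });
    assert_eq!(id, 9);
}
```

The cost is the one named above: every function taking a base type by value has to become generic.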

Reference slicing

Objects can also be passed to functions by reference. This leads us to the second form of the slicing problem, which manifests itself in the following example:

// Value extends BaseValue, both define method foo()
let v = Value;
let baseref = &v as &BaseValue;
baseref.foo(); // ???

Which method is called on the last line? It depends on the exact implementation of method dispatch. In C++ it would be either BaseValue::foo or Value::foo, depending on how BaseValue::foo is declared. If it is declared virtual it will be Value::foo; otherwise it will be BaseValue::foo, which is most likely not what the programmer intended to call. For performance reasons virtual methods are not popular in C++, because they are dynamically dispatched, which is more costly than static dispatch and also prevents inlining. Java, however, makes instance methods virtual by default for the sake of correctness.
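Both outcomes can be reproduced in current Rust, with composition standing in for inheritance. This is a sketch only: the trait `Foo` and the two helper functions are hypothetical names introduced for the example.

```rust
trait Foo {
    fn foo(&self) -> &'static str;
}

struct BaseValue;

struct Value {
    base: BaseValue,
}

impl Foo for BaseValue {
    fn foo(&self) -> &'static str {
        "BaseValue::foo"
    }
}

impl Foo for Value {
    fn foo(&self) -> &'static str {
        "Value::foo"
    }
}

/// Static dispatch through a reference to the base part:
/// the call is resolved from the static type `&BaseValue`.
fn call_static(v: &Value) -> &'static str {
    let baseref: &BaseValue = &v.base;
    baseref.foo()
}

/// Dynamic dispatch through a trait object:
/// the call is resolved from the vtable of the actual type.
fn call_dynamic(v: &Value) -> &'static str {
    let obj: &dyn Foo = v;
    obj.foo()
}

fn main() {
    let v = Value { base: BaseValue };
    assert_eq!(call_static(&v), "BaseValue::foo");
    assert_eq!(call_dynamic(&v), "Value::foo");
}
```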

Rust is focused on safety as well as performance. If the safety measures of Java are applied, structs could be declared virtual, and all methods of such structs and of structs derived from them would be dynamically dispatched. Alternatively, methods could be declared virtual individually, the same way they are in C++. In the latter case a struct would be called virtual if there is at least one virtual method in its implementation. Virtual structs would have a hidden field that contains a pointer to a vtable for all virtual methods.

With the introduction of the vtable pointer, a new risk of object slicing appears. When only part of the object is copied, the vtable pointer has to be updated to match the new type. Otherwise calls to virtual methods will access fields that no longer exist, causing crashes and compromising memory safety.

To eliminate the dangers of reference slicing completely, shadowing of non-virtual methods in derived structs should be forbidden. This rule is problematic with respect to public library interfaces, because adding new methods to a struct can make some downstream code fail to compile. That will prompt library authors to declare structs virtual just to stay out of the way of their users, which is not a good practice. Parameters passed by value are also still problematic, and libraries with virtual structs will try to switch to reference passing or generic implementations. The same measure as with object slicing can be used: forbid inheritance between crates.

One dynamic dispatch mechanism

So far we have seen that virtual methods can solve some of the problems with slicing. Unfortunately, virtual methods form a mechanism very similar to the trait system already present in Rust. It is undesirable to have two parallel dynamic dispatch mechanisms, for several reasons. The most important is the risk of an ecosystem schism, where some code will use traits and other code virtual methods, and interoperability will suffer. Even if inter-crate inheritance is forbidden, this is a bad situation.
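For comparison, the existing trait mechanism keeps the vtable pointer in the reference rather than in the struct. A small sketch (the names `Tr` and `S` are placeholders) that checks the sizes involved:

```rust
use std::mem::size_of;

trait Tr {
    fn value(&self) -> u32;
}

struct S {
    x: u32,
}

impl Tr for S {
    fn value(&self) -> u32 {
        self.x
    }
}

fn main() {
    // The struct holds only its own field; there is no hidden vtable pointer.
    assert_eq!(size_of::<S>(), size_of::<u32>());
    // A plain reference is a single (thin) pointer.
    assert_eq!(size_of::<&S>(), size_of::<usize>());
    // A trait object reference is fat: data pointer plus vtable pointer.
    assert_eq!(size_of::<&dyn Tr>(), 2 * size_of::<usize>());

    let s = S { x: 3 };
    let obj: &dyn Tr = &s;
    assert_eq!(obj.value(), 3);
}
```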

Therefore we would like to unify virtual structs with traits. This way we can even overcome many of the downsides of forbidding inter-crate inheritance. So we restrict struct impls to non-virtual methods, forbid extending structs defined in external crates, and forbid shadowing of methods in derived types. This way we avoid the slicing problems and the problems with library interfaces. The virtual part of inheritance will be resolved using traits. There are different ways to do this.

  1. Structs can “inherit” from traits or structs, but not from both at the same time.

    trait Tr { … }
    struct S : Tr { … }
    impl Tr for S { … }
    struct S2 : S { … }

Struct S will contain a vtable pointer, similar to the typical implementation in other languages. The downside is that new virtual methods cannot be added in derived structs without also defining a new trait that extends the base trait. Alternatively, we can allow a struct to extend a struct and a trait at the same time, provided that the trait extends the trait that the base struct extends. The following would be allowed:

trait Tr2 : Tr { ... }
struct S3 : S + Tr2 { ... }  // this is allowed

  2. Allow binding traits to objects independently of struct declarations (like in the RFC #9). Suppose that there exists some type FatObject<S, T> where S is a struct or enum type and T is a trait. FatObject<S, T> will internally store a pointer to the vtable of the implementation of trait T for S, together with the actual value of type S. It is bit for bit the same representation as in 1). The advantage is that different combinations of T and S can be constructed easily and that S itself is not tied to any particular trait. In fact such a FatObject can be used with no inheritance at all to solve real problems easily and effectively. A FatObject<S, T> can be coerced to &T, so fat pointers can be passed around the program or to library interfaces. There would also be a dynamically sized type Fat<T>, constructible from FatObject<S, T>, that can be used to pass around thin pointers to instances with type erasure. The tricky part is the conversion from Box<FatObject<S, T>> to Box<Fat<T>>, which would be very helpful but is outside the scope of this text. Bounds on type parameters that can restrict types by inheritance from some base type would also be nice.

  3. Some other scheme that binds trait implementations to nodes of the inheritance hierarchy.
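The FatObject idea from the list above can be loosely approximated in current Rust. The sketch below is only a stand-in, not the proposed type: a function pointer plays the role of the stored vtable pointer, the trait is named through its trait object type `dyn Shape`, and `Shape`, `Square`, `view`, `as_dyn` and `square_as_shape` are all hypothetical names.

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Square {
    side: f64,
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

/// Stand-in for the proposed FatObject<S, T>.
/// `T` is a trait object type such as `dyn Shape`.
struct FatObject<S, T: ?Sized> {
    // Plays the role of the vtable pointer: it knows how to view an S as a T.
    view: fn(&S) -> &T,
    // The value itself is stored inline, next to the "vtable" word.
    value: S,
}

impl<S, T: ?Sized> FatObject<S, T> {
    fn new(value: S, view: fn(&S) -> &T) -> Self {
        FatObject { view, value }
    }

    /// Coerce to a fat reference `&T` that can be passed to library code.
    fn as_dyn(&self) -> &T {
        (self.view)(&self.value)
    }
}

/// The binding of trait `Shape` to struct `Square`.
fn square_as_shape(s: &Square) -> &(dyn Shape + 'static) {
    s
}

/// A library-style function that only knows about the trait.
fn total_area(shapes: &[&dyn Shape]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    let obj = FatObject::<Square, dyn Shape>::new(Square { side: 3.0 }, square_as_shape);
    assert_eq!(obj.as_dyn().area(), 9.0);
    assert_eq!(total_area(&[obj.as_dyn()]), 9.0);
}
```

Note that `Square` itself carries no reference to `Shape`; the binding lives only in the FatObject value, which is the property the second option is after.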

Conclusion

Because of the infamous slicing problems, any typical inheritance implementation will be rather limited with respect to usage in library interfaces. In my opinion inheritance can be used without any risk to performance and memory safety only inside a single crate, and this can be enforced by language rules. Therefore there is no need to make inheritance un-ergonomic. Binding inheritance to traits will ensure that traits are still the only viable option for library interfaces. Besides modeling problems in an OOP fashion, inheritance can be a good tool for code reuse and can reduce the size of code in many situations.

It is quite clear that inheritance and dynamic dispatch are largely independent problems, albeit historically solved together in many existing object-oriented languages, where dynamic dispatch also helps to solve slicing problems. Rust is not object-oriented and has a highly independent dynamic dispatch mechanism, so there is an opportunity to make inheritance and dynamic dispatch truly independent features.


#2

I would suggest looking into the approach chosen by Go and adapting it to the explicit trait implementations of Rust instead of the implicit implementations of Go.

The idea is not to have inheritance at all, but instead introduce a way to say "Implement trait T for struct X by delegating all methods not explicitly defined to member y". Something like:

trait Tr { ... }
struct S { ... }
impl Tr for S { ... }
struct S2 {
    s: S,
    ...
}
#[delegate(S::s)]
impl Tr for S2 { /* nothing needed here, but overrides are permitted */ }

Any case that would be a problem for the borrow checker would be prohibited. One I can think of is when S2 is Drop, S is not Copy, and Tr has a method that takes self by value (so it moves s out, but then the drop implementation would get a partially invalid object). But that should be pretty rare; most traits take self by reference.

Note that whenever polymorphism is static (using generics) the compiler can work out statically what to call, and for trait references the dispatch is dynamic either way, so delegation does not add any performance penalty.
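The delegation described in this post can be written by hand today, which shows roughly what the attribute would have to generate. This is only a sketch: the methods `name` and `greet`, the field `label`, and the two helper functions are hypothetical and stand in for the elided parts above.

```rust
trait Tr {
    fn name(&self) -> String;
    fn greet(&self) -> String;
}

struct S {
    id: u32,
}

impl Tr for S {
    fn name(&self) -> String {
        format!("S#{}", self.id)
    }
    fn greet(&self) -> String {
        format!("hello from {}", self.name())
    }
}

struct S2 {
    s: S,
    label: &'static str,
}

impl Tr for S2 {
    // Explicit override.
    fn name(&self) -> String {
        format!("S2[{}]", self.label)
    }
    // Not overridden, so it is forwarded to member `s`.
    fn greet(&self) -> String {
        self.s.greet()
    }
}

/// Static dispatch: resolved at compile time for each `T`.
fn greet_static<T: Tr>(t: &T) -> String {
    t.greet()
}

/// Dynamic dispatch: resolved through the trait object's vtable.
fn greet_dynamic(t: &dyn Tr) -> String {
    t.greet()
}

fn main() {
    let s2 = S2 { s: S { id: 1 }, label: "x" };
    assert_eq!(s2.name(), "S2[x]");
    // The forwarded method runs entirely on `s`, so it sees S::name.
    assert_eq!(greet_static(&s2), "hello from S#1");
    assert_eq!(greet_dynamic(&s2), "hello from S#1");
}
```

One difference from inheritance is visible here: the forwarded greet calls S::name, not the override in S2, because the delegated method only ever sees the member s.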


#3

Here is an article about adding traits with required fields to a Java-like language. It doesn’t mention memory layout, but the notation looks very neat. Perhaps it can be an inspiration.

Bettini et al, “Implementing Software Product Lines using Traits” (2010) http://www.di.unito.it/~damiani/papers/oops10.pdf

EDIT: This is basically “Composition by multiple struct inheritance”.