Summary: This post proposes a mechanism for type-erased generics by restoring parametricity of types in a limited fashion.
Rust’s primary mechanism for generics is through monomorphization, following the same path as C++. This is generally good for performance and optimization, but can be bad for binary size, but more importantly it does not offer the full expressiveness of higher-rank types (not to be confused with higher-kinded types nor higher-rank trait bounds). It’s also worse for compile time.
The lack of expressivity is evidenced by the fact that trait objects do not support generics, since you can’t exactly monomorphize at run time. In a way, monomorphization-based generics simply aren’t “first-class” in Rust: you can’t store them in variables without fixing all the type parameters.
What would be nice is if the compiler could distinguish between generics that need to be monormophized vs generics that are erasable. Consider this generic:
fn swap_refs<'a, T>(x: &mut &'a T, y: &mut &'a T) { std::mem::swap(x, y) }
Monomorphizing this kind of function is quite wasteful – the representation of references (or pointers) is independent of what they point to, so all specializations are identical in implementation. But it’s not sufficient to only look at the type signature of the function itself, because types in Rust are not parametric with the introduction of specializations. The debugit crate exemplifies this.
To make this idea work, it is necessary to have an explicit way to mark a parameter as parametric. One approach would be to introduce a new OIBIT named, say, Concrete
. By default all type parameters today are implicitly Concrete
, but one can opt out by writing T: ?Concrete
. Intuitively, Concrete
type is one that is known at compile time and can thus be monomorphized.
To preserve parametricity, the compiler must forbid usage of trait methods that could possibly be specialized if the type is ?Concrete
.
Every variable must have a Concrete
type, and every function must have Concrete
arguments and Concrete
return values. For most data types, we expect an implicit rule like:
impl<T: Concrete + ?Sized> Concrete for MyStruct<T> {}
The exceptions to this rule should be the pointer-like types, &T
, &mut T
, *const T
, and *mut T
, as well as aggregates of such. DSTs complicate this idea somewhat: the representation of pointers depends on whether the pointee type is Sized
. Therefore, we are forced to limit ourselves to Sized
types only:
impl<'a, T: ?Concrete> Concrete for &'a T {}
impl<'a, T: ?Concrete> Concrete for &'a mut T {}
impl<T: ?Concrete> Concrete for *const T {}
impl<T: ?Concrete> Concrete for *mut T {}
impl<T: ?Concrete> Concrete for Box<T> {}
impl<T: ?Concrete> Concrete for Rc<T> {}
impl<T: ?Concrete> Concrete for Arc<T> {}
Note that std::mem::size_of
mustn’t be generalized, because it only makes sense if T: Concrete
.
Now, here’s the critical rule for generics: if a generic contains only type parameters that have no bounds other than marker traits (“trivial bounds”), then the function can be type-erased. In other words, ?Concrete
type parameters behave just like lifetime parameters today: they are guaranteed to vanish before translation.
For example, this generic can be type-erased because T
only has a marker trait bound (Sized
):
fn swap_refs<'a, T: ?Concrete>(x: &mut &'a T, y: &mut &'a T) { /* ... */ }
From here, it follows that if a trait has something like this for a method, e.g.
trait Blah {
fn blah_swap_refs<'a, T: ?Concrete>(&self, x: &mut &'a T, y: &mut &'a T);
}
the generic would no longer cause object-unsafety.
We could push this idea further: the generic swap_refs
can be turned into a bona fide function pointer:
let f: for<'a, T: ?Concrete> fn(&mut &'a T, &mut &'a T) = swap_refs;
Here for<'a, T: ?Concrete>
does not designate a higher-rank trait bound, but more of a higher-rank type. This shows that type-erased generics can be treated as ordinary objects, just like function pointers.
One peculiar feature that type-erased generics can have is the ability to call them without fixing a particular T
. In other words, an example such as the following will not complain with “type annotations needed”,
fn foo<T: ?Concrete>() {}
fn main() { foo(); }
because parametricity ensures that it doesn’t really matter what you T
pick.
At this point, I’ve probably relied on the simple swap_refs
example a bit too much. You might think “What’s the point of this nonsense? You can only quantify over type parameters with trivial bounds!” To that, I present a more elaborate example:
// if you want more flexibility, you can use Box<Fn> instead of just fn
struct AnimalVtable<T: ?Concrete> {
new: fn(&'static str) -> FatBox<T>;
name: fn(&T) -> &'static str;
clone: fn(&T) -> FatBox<T>;
}
// this function is type-erased despite being a generic!
// be sure to pass &ANIMAL_VTABLE_FOR_SHEEP as argument
fn make_dolly<T: ?Concrete>(animal_vtable: &AnimalVtable<T>) -> (FatBox<T>, FatBox<T>) {
let dolly1 = animal_vtable.new("Dolly");
let dolly2 = animal_vtable.clone(dolly1.as_ref());
assert_eq!(animal_vtable.name(dolly1.as_ref()), animal_vtable.name(dolly2.as_ref()));
(dolly1, dolly2)
}
struct Sheep { name: &'static str }
impl Sheep {
fn new(name: &'static str) -> FatBox<Self> {
Box::new(Sheep { name: name }) as _
}
fn name(&self) -> &'static str {
self.name
}
fn clone(&self) -> FatBox<Self> {
self.into_inner().clone() as _
}
}
const ANIMAL_VTABLE_FOR_SHEEP: AnimalVtable<Sheep> = AnimalVtable {
new: Sheep::new,
name: Sheep::name,
clone: Sheep::clone,
};
(See this post for more info on FatBox and why is needed.)
You see, you can encode interesting behavior even with trivial bounds like T: ?Concrete
. Using Rust today, it’s not possible to write the make_dolly
function, at least without compromising type safety and usability.
There’s another use case for type-erased generics: since higher-ranked types are possible for them, it follows that one can implement sound unchecked indexing without using lifetimes as a kludge.
Comments and feedback are highly appreciated (especially if you find a flaw in the logic here).
Edit: Corrected example in response to matklad.