Premonomorphization

My thread on generic externs ("Generic" externs for high-level FFI - #12 by mcy) got me thinking about another C++ feature that we don't currently have an equivalent for: explicit instantiation.

In C++, I can write:

// foo.h
template <typename T>
void Foo(T t) { ... }

// Forward-declare a specific instantiation of the template:
extern template void Foo<int>(int);

// foo.cc

// Explicitly instantiate the template.
template void Foo<int>(int);

extern template makes it so that calling code that happens to instantiate Foo<int> will not generate code for it, and will instead simply emit a relocation (sound familiar?). The template ... item in the .cc file then provides the definition.

Like in Rust, all templates in C++ are implicitly inline, but you can explicitly "outline" specific common instantiations. I can see this being useful for certain commonly-monomorphized generic functions to help cut down on build times.

The way I imagine it working in Rust (please, please, please do not bikeshed the syntax; it's a strawman) is like so:

struct Foo { ... }
impl Foo {
  fn bar<T>(t: T) { ... }
}

monomorphize!(Foo::bar::<i32>);

This would result in:

  • An instantiation of bar for T = i32, included as an exported linker symbol in the resulting .rlib.
  • Some kind of metadata in the .rlib saying that this specific instantiation is "externed".

Dependents can then skip instantiation if they see this metadata and emit a symbol reference as with any ordinary function.
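For comparison, here is a minimal sketch of how this can be approximated by hand today, without any compiler support. The names foo_bar_i32 and the Debug bound are assumptions made up for illustration; they are not part of the proposal. A non-generic wrapper forces the T = i32 instantiation to be compiled once in the defining crate, and dependents that call the wrapper get an ordinary symbol reference:

```rust
use std::fmt::Debug;

pub struct Foo;

impl Foo {
    // Generic function: by default, each crate that uses `bar::<i32>`
    // may generate its own copy of the instantiation.
    pub fn bar<T: Debug>(t: T) -> String {
        format!("{:?}", t)
    }
}

// Hypothetical hand-written stand-in for `monomorphize!(Foo::bar::<i32>)`.
// Because this function is not generic, it is compiled in this crate;
// `#[inline(never)]` asks the compiler to keep it as an out-of-line call.
#[inline(never)]
pub fn foo_bar_i32(t: i32) -> String {
    Foo::bar::<i32>(t)
}
```

The drawback, and the motivation for the macro, is that callers have to know to call foo_bar_i32 instead of Foo::bar::<i32>. With the proposed metadata, the compiler would make that substitution itself.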

Extra stuff:

  • We should allow monomorphize!(#[inline] Foo::bar::<i32>), which causes LLVM bitcode to be included for the instantiation, as would happen for a "normal" inline function (vs. how each crate would otherwise generate its own bitcode for the instantiation).
  • It might be worthwhile to use this mechanism to skip going through the frontend for re-instantiations of generics. For example, if a crate causes Foo::bar::<i32> to be instantiated into LLVM bitcode, it should advertise it to downstream crates for re-use (as if by monomorphize!(#[inline])). If we already make this optimization, awesome! We can hide this behind a -Z flag to opt in/out if we're worried about disk usage, as with -Cembed-bitcode.
  • I don't think we need to worry about orphan rules here, because the result of instantiation is the same across all crates.

Given that this is a pure optimization, I don't think it's worthwhile to invent novel syntax like C++ does. Just add a compiler-provided macro or attribute and don't think about it further.

IIUC, this is implemented as -Zshare-generics and is (or at least was at one point) on by default for unoptimized (and size-optimized) builds, but off for fully optimized builds.

That is still the current state.

Ah. That is somewhat different from what I would propose, since I wanted to pipe bitcode through to make the shared instantiations inlineable. I suspect there is some work to be done here so that we do not lose perf benefits for -O2, as discussed on the CL.