Premonomorphization

mcy · July 21, 2021, 8:56pm

My thread on generic externs ("Generic" externs for high-level FFI - #12 by mcy) got me thinking about another C++ feature that we don't currently have an equivalent for: explicit instantiation.

In C++, I can write:

// foo.h
template <typename T>
void Foo(T t) { ... }

// Forward-declare a specific instantiation of the template:
extern template void Foo<int>(int);

// foo.cc

// Explicitly instantiate the template.
template void Foo<int>(int);

extern template makes it so that calling code that happens to instantiate Foo<int> will not actually generate code for it, and will simply emit a relocation instead (sound familiar?). The template ... item in the .cc file then provides the definition.

Like in Rust, all templates in C++ are implicitly inline, but you can explicitly "outline" specific common instantiations. I can see this being useful for certain commonly-monomorphized generic functions to help cut down on build times.

The way I imagine it working in Rust (please, please, please do not bikeshed the syntax; it's a strawman) is like so:

struct Foo { ... }
impl Foo {
  fn bar<T>(t: T) { ... }
}

monomorphize!(Foo::bar::<i32>);

This would result in:

An instantiation of bar for T = i32, included as an exported linker symbol in the resulting .rlib.
Some kind of metadata in the .rlib saying that this specific instantiation is "externed".

Dependents can then skip instantiation if they see this metadata and emit a symbol reference as with any ordinary function.
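
For comparison, here is a rough sketch of how something similar can be approximated by hand today: a non-generic wrapper around one instantiation. Because the wrapper is not generic and not marked #[inline], it is compiled once in the defining crate, and dependents call it through an ordinary symbol reference (absent LTO). The wrapper name foo_bar_i32, the Debug bound, and the body of bar are made up for illustration. This is not the proposed monomorphize! macro, and direct calls to Foo::bar::<i32> from other crates are still instantiated as usual.

```rust
pub struct Foo;

impl Foo {
    // Generic function: by default, each crate that calls it with a given
    // `T` may generate its own copy of the instantiation.
    pub fn bar<T: std::fmt::Debug>(t: T) -> String {
        format!("{:?}", t)
    }
}

// Hypothetical manual stand-in for `monomorphize!(Foo::bar::<i32>)`.
// A non-generic function is compiled in the crate that defines it, so the
// `T = i32` instantiation it uses is generated here rather than downstream.
// `#[inline(never)]` keeps the wrapper itself from being inlined into callers.
#[inline(never)]
pub fn foo_bar_i32(t: i32) -> String {
    Foo::bar::<i32>(t)
}

fn main() {
    // Callers opt in by using the wrapper instead of the generic function.
    println!("{}", foo_bar_i32(42));
}
```

The obvious drawback is that callers have to opt in by name, which is exactly what the metadata-based approach above would avoid.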

Extra stuff:

We should allow monomorphize!(#[inline] Foo::bar::<i32>), which causes LLVM bitcode to be included for the instantiation, as would happen for a "normal" inline function (vs. how each crate would otherwise generate its own bitcode for the instantiation).
It might be worthwhile to use this mechanism to skip going through the frontend for re-instantiations of generics. For example, if a crate causes Foo::bar::<i32> to be instantiated into LLVM bitcode, it should advertise it to downstream crates for re-use (as if by monomorphize!(#[inline])). If we already make this optimization, awesome! We can hide this behind a -Z flag to opt in/out if we're worried about disk usage, as with -Cembed-bitcode.
I don't think we need to worry about orphan rules here, because the result of instantiation is the same across all crates.

Given that this is a pure optimization, I don't think it's worthwhile to invent novel syntax like C++ does. Just add a compiler-provided macro or attribute and don't think about it further.

CAD97 · July 21, 2021, 9:38pm

IIUC, this is implemented as -Zshare-generics and is (or at least was at one point) on by default for unoptimized (and size optimized) builds, but off for full optimization builds.

wesleywiser · July 22, 2021, 1:39pm

That is still the current state.

mcy · July 22, 2021, 3:09pm

Ah. That is somewhat different from what I would propose, since I wanted to pipe bitcode through to make the shared instantiations inlineable. I suspect there is some work to be done here so that we do not lose perf benefits at -O2, as discussed on the CL.

