Eagerly instantiating generic types to improve build speed?

EDIT: s/specialize/instantiate/g

We have some code which defines a large generic type (a parser), which is instantiated in different ways to generate different kinds of output.

The crate dependency structure is (where a <- b is "b depends on a"):

parser  <-  instantiation1  <- usercrate
    ^------ instantiation2  <--/
    ^------ instantiation3  <--/
    ^------       ...       <--/

That is, there's a fan-out in the middle of the dependency graph.

Each instantiation crate is basically:

use parser::Parser;
use something::{Something, Else};

type SomethingParser<'a> = Parser<'a, Something, Else>;

I.e. it just defines a type alias for the instantiation.

usercrate brings this all together, and makes use of all the instantiations.

The problem is that usercrate has a long build time, and a significant amount of that is from the instantiations, which have been deferred to their use-sites.

If we modify the instantiationN crates to make a dummy invocation of the parser (via a public but unused function which invokes the parser on dummy input), then the instantiation moves into the instantiationN crates. This still takes CPU time, but it parallelizes well and the overall wallclock time is reduced. In particular, the wallclock time for usercrate goes down significantly.
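A minimal sketch of that workaround, with a tiny stand-in Parser defined inline so the example is self-contained. Parser, Something, Else and force_instantiation here are placeholders rather than the real code; in the real layout Parser would live in the parser crate and everything from the type alias down in an instantiationN crate.

```rust
use std::marker::PhantomData;

// Stand-in for the large generic parser (hypothetical, heavily simplified).
pub struct Parser<'a, T, E> {
    input: &'a str,
    _marker: PhantomData<(T, E)>,
}

impl<'a, T: From<&'a str>, E: Default> Parser<'a, T, E> {
    pub fn new(input: &'a str) -> Self {
        Parser { input, _marker: PhantomData }
    }

    // Generic over T and E, so code for it is only generated where
    // concrete types are supplied and the function is actually used.
    pub fn parse(&self) -> Result<T, E> {
        if self.input.is_empty() {
            Err(E::default())
        } else {
            Ok(T::from(self.input))
        }
    }
}

// Placeholder output and error types.
#[derive(Debug, PartialEq)]
pub struct Something(pub String);

impl<'a> From<&'a str> for Something {
    fn from(s: &'a str) -> Self {
        Something(s.to_owned())
    }
}

#[derive(Debug, Default, PartialEq)]
pub struct Else;

// The alias on its own generates no code.
pub type SomethingParser<'a> = Parser<'a, Something, Else>;

// Public, non-generic and not #[inline]: compiling this function requires
// the concrete `new` and `parse` instances, so that work happens in the
// crate that defines it. Nothing needs to call it.
pub fn force_instantiation() -> bool {
    SomethingParser::new("dummy").parse().is_ok()
}
```

Whether downstream crates then reuse those instances or generate their own copies depends on the compiler version and flags, so treat this as an illustration of the shape of the trick rather than a guarantee.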

We're using codegen-units, of course, but this shows that it is not enough to compensate for completely independent rustc invocations. Perhaps more intelligent scheduling of work within rustc could help improve this.

This also suggests that if there were multiple usercrateNs then doing it eagerly would do it once, but lazily would defer the work to be done independently in each user crate.

I'm also not sure if this is a specific side effect of using type, i.e. just a type alias, which doesn't really do anything. Would using a newtype wrapper help? But that would be awkward because it would also need a lot of forwarding methods.
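For a sense of what the newtype route costs, here is a rough sketch with a deliberately tiny stand-in Parser (all names are hypothetical). Every method that users need has to be forwarded by hand. Each forwarding method is a plain non-generic function, so it would be expected to be compiled in the crate that defines the wrapper, but that is an assumption about the effect, not something measured here.

```rust
// Stand-in for the generic parser; hypothetical and much smaller than the real one.
pub struct Parser<T> {
    items: Vec<T>,
}

impl<T: Clone> Parser<T> {
    pub fn new() -> Self {
        Parser { items: Vec::new() }
    }

    pub fn push(&mut self, item: T) {
        self.items.push(item);
    }

    pub fn last(&self) -> Option<T> {
        self.items.last().cloned()
    }

    pub fn len(&self) -> usize {
        self.items.len()
    }
}

// Newtype instead of `type U32Parser = Parser<u32>;`.
pub struct U32Parser(Parser<u32>);

// One hand-written forwarding method per method of the wrapped type.
impl U32Parser {
    pub fn new() -> Self {
        U32Parser(Parser::new())
    }

    pub fn push(&mut self, item: u32) {
        self.0.push(item)
    }

    pub fn last(&self) -> Option<u32> {
        self.0.last()
    }

    pub fn len(&self) -> usize {
        self.0.len()
    }
}
```

With a real parser the forwarding impl grows with every public method, which is the awkwardness mentioned above.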

But alternatively, I was wondering if it would be worth adding an attribute to (something) to indicate that eager instantiation would be useful. I'm not sure what it would be added to; I guess either:

  • a generic type definition
  • a specific instantiation (like type above)

There's also the question of whether this only applies to full instantiations, or whether there's some benefit to eagerly partially instantiating. In that case it might also make sense to annotate specific type parameters as being good for instantiation.

I'm thinking of this as a pure hint, along the lines of #[inline], where there's no semantic change, and the compiler is free to ignore it.

Thoughts? Has anyone run into this before?


For reference, C++ has a similar feature in the form of extern template, though it's not used terribly often.

What would this look like, exactly? Some mechanism that says "please force-instantiate this concrete type?" I suppose that in Rust you can actually get away with it being a pure hint, since instantiation is not side-effecting like it is in C++.

I'm thinking something like:

#[specialize_eagerly]
type MyParser<'a> = Parser<'a, MyParams>;

would cause everything that is now non-generic to be emitted into this crate. Something that directly depends on this crate would be able to use that specialization without having to do it for itself. (Lifetimes can be ignored here since they don't affect the generated code or memory layouts.)

Since there's no concept of a coherence rule here, there are no special crates where a specialization should be found, so it all has to be done in an opportunistic way. The "check first-order dependencies" rule is just a heuristic. It could be "check entire dependency graph" or something else, and it could be changed without changing the semantics.

This doesn't really help with the other problem, of common types being instantiated all over the place (eg Vec<String>), but one could imagine libstd having some common types pre-instantiated.


(Note that specialization is a confusing word in this context, since it clashes with impl specialization; instantiation or monomorphization are probably better.)


Could we just always do that for publicly-exported non-generic types? Is there a place where that would be undesirable?

One could imagine a crate which has lots of such types, but only a small subset is used at a time.


Sure, but that can also happen with types that aren't instantiations of generic types, where we're currently willing to compile all of them.


Isn't this the kind of thing that DCE and LTO are supposed to deal with anyway? I think having the semantics of

pub type T<'a..> = K<'a.., U..>;

be "always instantiate the RHS of T and generate all relevant code and metadata" is perfectly fine.

Especially since this would just be a hint anyway, this feels like something that rustc could and probably should just do anyway.

The only downside is making the compile time of crates with lots of concretizing type aliases that aren't used longer, but since it equivalently lowers the compile time of every dependent using that type (theoretically), it seems like a worthwhile trade-off.

The tricky part is actually teaching rustc/LLVM to take advantage of it. Currently type aliases are just thin aliases IIUC, so using MyParser<'_> is equivalent to writing Parser<'_, MyParams>. This means rustc needs to figure out somehow that an upstream crate has already done the monomorphization for us. IIRC, even when a type alias is used and fully monomorphized in an upstream crate, each downstream crate still re-monomorphizes it.

I'd love to be wrong, but this improvement would "just" be noticing and reusing upstream monomorphization work along with making concretized type aliases do a complete monomorphization for use in downstream crates.


Can this be solved more naturally with incremental compilation? That is, if there is a cached instantiation query, will all downstream crates share it? (Edit: this would still result in a speedup as long as somebody typically wins the race to populate the cache.) And further, can we teach incremental compilation to wait for in-progress queries?


This entire thread (as far as I've seen, apologies if I missed something) talks about "instantiating types" as if the type were the granule of code generation. But the thing being monomorphized is code, and so the units of monomorphization are functions (either free, in inherent impls, or in trait impls). It is hard to precisely define what set of functions should be instantiated to "instantiate the type Foo<Bar>", but no matter what, it'll probably instantiate too much for many use cases.
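One way to see why "the set of functions belonging to Foo<Bar>" is hard to pin down: a small sketch with a made-up Foo, using u32 in place of Bar. Some methods are fully determined once the type parameter is fixed, while others carry their own generic parameters and cannot be instantiated from the type alone.

```rust
// Hypothetical generic type, for illustration only.
pub struct Foo<T> {
    pub value: T,
}

impl<T: Clone> Foo<T> {
    // Fully determined once T is known: Foo<u32>::get is one concrete function.
    pub fn get(&self) -> T {
        self.value.clone()
    }

    // Still generic after T is fixed: there is one instance per choice of
    // U and closure type, so "instantiating Foo<u32>" cannot cover this.
    pub fn map<U>(&self, f: impl Fn(T) -> U) -> Foo<U> {
        Foo { value: f(self.value.clone()) }
    }
}

// Trait impls add yet more functions tied to the type, wanted or not.
impl<T: Clone> Clone for Foo<T> {
    fn clone(&self) -> Self {
        Foo { value: self.value.clone() }
    }
}

// A monomorphic entry point pins down exactly the functions it reaches:
// here Foo<u32>::get and nothing else.
pub fn get_u32(foo: &Foo<u32>) -> u32 {
    foo.get()
}
```

An eager scheme would have to decide whether get, clone and friends are all included, and would have to skip map entirely.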

For example, in the motivating Parser use case, users probably only care about one entry point to the parser, let's call it fn parse. Other code used from fn parse must also be instantiated in turn, but that's already how rustc works today. On the other hand, probably nobody wants the reflexive From impl (i.e., <Parser<..> as From<Parser<..>>>::from) to be compiled.

By the way, with today's rustc behavior this effect can be achieved by exporting a monomorphic, non-#[inline] function that wraps parse:

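// `pub` so that it is exported; monomorphic and not #[inline],
// so it is compiled in the crate that defines it.
pub fn parse_something(input: &str) -> Result<Something, Else> {
    Parser::new().parse(input)
}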