As a C++ user, I’ve had some experience fighting template-induced code bloat in the past, and there are definitely “patterns” emerging. I thought it could be helpful to share the techniques we know about, whatever the original language, as they may translate to Rust patterns… or even better rustc optimizations.
Here are 3 that I can think of, off the top of my head.
Hoisting
As already mentioned, one pattern is to “hoist” the definition of items outside of the main template code to reduce the number of instantiations. For example, in:
template <typename T, std::size_t N>
class fixed_capacity_vector {
public:
    /**/
private:
    std::size_t size = 0;
    std::aligned_storage_t<sizeof(T) * N, alignof(T)> data;
};
Many of the methods will NOT depend on N, and neither will the iterator and const_iterator types. In this case, a “mixin” is the natural fit: define a base class which holds everything that does not depend on N!
Note: LLVM’s SmallVector is an example of this strategy.
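To make the base-class idea concrete, here is a minimal sketch, not a production container. The name fixed_capacity_vector_base and its members are my own illustration (they are not taken from LLVM), and the storage is a plain aligned byte array rather than std::aligned_storage_t. Only the tiny derived class depends on N; every method lives in the base and is instantiated once per T.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

// Hypothetical base class: everything here depends on T only, so it is
// instantiated once per element type instead of once per (T, N) pair.
template <typename T>
class fixed_capacity_vector_base {
public:
    using iterator = T*;
    using const_iterator = T const*;

    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }

    iterator begin() { return data_; }
    iterator end() { return data_ + size_; }
    const_iterator begin() const { return data_; }
    const_iterator end() const { return data_ + size_; }

    T& operator[](std::size_t index) {
        assert(index < size_);
        return data_[index];
    }

    void push_back(T value) {
        assert(size_ < capacity_);
        // Construct the new element in the raw storage owned by the derived class.
        ::new (static_cast<void*>(data_ + size_)) T(std::move(value));
        ++size_;
    }

    void clear() {
        // Destroy elements in reverse order of construction.
        while (size_ > 0) {
            --size_;
            data_[size_].~T();
        }
    }

    fixed_capacity_vector_base(fixed_capacity_vector_base const&) = delete;
    fixed_capacity_vector_base& operator=(fixed_capacity_vector_base const&) = delete;

protected:
    fixed_capacity_vector_base(T* data, std::size_t capacity)
        : data_(data), capacity_(capacity) {}
    ~fixed_capacity_vector_base() = default;

private:
    T* data_;
    std::size_t size_ = 0;
    std::size_t capacity_;
};

// The only part that depends on N: the inline storage itself.
template <typename T, std::size_t N>
class fixed_capacity_vector : public fixed_capacity_vector_base<T> {
    static_assert(N > 0, "capacity must be positive");

public:
    fixed_capacity_vector()
        : fixed_capacity_vector_base<T>(reinterpret_cast<T*>(storage_), N) {}
    ~fixed_capacity_vector() { this->clear(); }

private:
    alignas(T) unsigned char storage_[sizeof(T) * N];
};
```

With this layout, fixed_capacity_vector<int, 4> and fixed_capacity_vector<int, 8> share a single copy of push_back, clear and friends, and a function can take fixed_capacity_vector_base<int>& to accept either. The price is one extra pointer and one extra size_t per object.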
As already mentioned by @kornel with AsRef, this can also be applied to functions where a “monomorphic” core can be extracted from a “generic” sandwich.
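A rough C++ counterpart of that function-level trick, assuming C++17 and using std::string_view in the role AsRef plays in Rust. The names count_words and count_words_impl are hypothetical, picked only for illustration: the template is a one-line conversion, and all the real work sits in a non-template function that is compiled once.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Monomorphic core: compiled exactly once, whatever the callers pass in.
// In a real project this would live in a .cpp file.
std::size_t count_words_impl(std::string_view text) {
    std::size_t count = 0;
    bool in_word = false;
    for (char c : text) {
        bool const is_space = (c == ' ' || c == '\t' || c == '\n');
        if (!is_space && !in_word) {
            ++count;  // first character of a new word
        }
        in_word = !is_space;
    }
    return count;
}

// Generic shell: instantiated per argument type, but it only performs the
// conversion, so each instantiation is tiny and trivially inlined.
template <typename S>
std::size_t count_words(S const& text) {
    return count_words_impl(std::string_view(text));
}
```

Calling count_words with a std::string, a string literal or a std::string_view produces three instantiations of the shell, yet the loop itself exists only once in the binary.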
Shims
When there is no immediate monomorphic core, it may be possible to create one by introducing lightweight type erasure.
The best C++ example I have here is a printf-like replacement:
template <typename Writer, typename... Args>
void printf(Writer& writer, char const* format, Args const&... args);
Writing all the formatting code there leads to a large generated function, which is repeated for each slight variation of the argument types (and each permutation!). Furthermore, the actual meat of the formatting is likely to be expensive in itself (formatting an integer as ASCII has a cost, etc.), so the small performance penalty of type erasure would be unnoticeable. Instead, the template can be reduced to a thin shim over a non-template implementation:
void printf_impl(
    IWriter& writer,
    char const* format,
    std::initializer_list<IArgument const*> args
) __attribute__((flatten));

template <typename Writer, typename... Args>
void printf(Writer& writer, char const* format, Args const&... args) {
    // Wrap the concrete writer in its type-erased interface.
    auto iWriter = make_iwriter(writer);
    printf_impl(
        iWriter,
        format,
        // The temporary shims live until the end of the full expression,
        // i.e. until printf_impl has returned.
        { &static_cast<IArgument const&>(make_iargument(args))... }
    );
}
The code can still be inlined, if only via LTO, but the bloat is considerably reduced.
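For readers who want something that compiles, here is one possible way to fill in the pieces the snippet above leaves undefined. Everything in it is an assumption on my part: the shape of IWriter and IArgument, the WriterShim and ArgumentShim templates standing in for make_iwriter and make_iargument, the "{}" placeholder syntax, and the requirement that the writer has a std::string-like append(char const*, std::size_t). The entry point is called print rather than printf to stay clear of the C function.

```cpp
#include <cstddef>
#include <initializer_list>
#include <string>
#include <string_view>

namespace demo {

// Assumed type-erased interfaces; the post does not spell them out.
struct IWriter {
    virtual void write(std::string_view text) = 0;
    virtual ~IWriter() = default;
};

struct IArgument {
    virtual void format(IWriter& writer) const = 0;
    virtual ~IArgument() = default;
};

// Non-template formatters: this sketch supports integers and strings only.
inline void format_value(IWriter& writer, long long value) {
    writer.write(std::to_string(value));
}
inline void format_value(IWriter& writer, std::string_view value) {
    writer.write(value);
}

// Shims: the only code instantiated per type, each a few lines long.
template <typename Writer>
struct WriterShim final : IWriter {
    explicit WriterShim(Writer& w) : inner(w) {}
    void write(std::string_view text) override {
        inner.append(text.data(), text.size());
    }
    Writer& inner;
};

template <typename T>
struct ArgumentShim final : IArgument {
    explicit ArgumentShim(T const& v) : value(v) {}
    void format(IWriter& writer) const override { format_value(writer, value); }
    T const& value;
};

// Monomorphic core: each "{}" in the format is replaced by the next argument.
inline void print_impl(IWriter& writer, char const* format,
                       std::initializer_list<IArgument const*> args) {
    std::string_view const fmt(format);
    auto next = args.begin();
    std::size_t pos = 0;
    while (pos < fmt.size()) {
        std::size_t const hole = fmt.find("{}", pos);
        if (hole == std::string_view::npos || next == args.end()) break;
        writer.write(fmt.substr(pos, hole - pos));
        (*next++)->format(writer);
        pos = hole + 2;
    }
    writer.write(fmt.substr(pos));  // trailing text after the last placeholder
}

// Generic sandwich: builds the shims, then hands over to the core.
template <typename Writer, typename... Args>
void print(Writer& writer, char const* format, Args const&... args) {
    WriterShim<Writer> iWriter(writer);
    print_impl(iWriter, format,
               { &static_cast<IArgument const&>(ArgumentShim<Args>(args))... });
}

}  // namespace demo
```

With a std::string as the writer, demo::print(out, "x={} name={}!", 42, "bob") appends "x=42 name=bob!" to out. Each new combination of argument types only instantiates another copy of the small print template plus any ArgumentShim it has not seen before; the parsing loop is never duplicated.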
Total Type Erasure
More extreme still is laying a thin layer of strongly typed code over a void* core. This obviates the need to instantiate shims, although it obviously provides only very limited functionality. Still, qsort does work with void*, so not all is lost.
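A small sketch of what that can look like, using a stack of non-owning pointers. The names pointer_stack_core and pointer_stack are hypothetical, and the wrapper assumes T is a non-const object type so that T* converts implicitly to void*. The core is an ordinary class compiled once; the typed layer contains nothing but casts, so it should inline away entirely.

```cpp
#include <cstddef>
#include <vector>

// Untyped core: ordinary non-template code, compiled once.
class pointer_stack_core {
public:
    void push(void* item) { items_.push_back(item); }

    // Precondition: the stack is not empty.
    void* pop() {
        void* item = items_.back();
        items_.pop_back();
        return item;
    }

    bool empty() const { return items_.empty(); }
    std::size_t size() const { return items_.size(); }

private:
    std::vector<void*> items_;
};

// Typed veneer: restores type safety at the boundary and adds no logic.
template <typename T>
class pointer_stack {
public:
    void push(T* item) { core_.push(item); }
    T* pop() { return static_cast<T*>(core_.pop()); }

    bool empty() const { return core_.empty(); }
    std::size_t size() const { return core_.size(); }

private:
    pointer_stack_core core_;
};
```

The limitation mentioned above is visible here: the core can move pointers around, but it can never copy, compare or destroy the pointees, because it knows nothing about them.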