Add back polymorphization

Previous proposals such as this one revolve around new syntax to indicate when to use polymorphization.

I would propose instead that enabling and configuring where polymorphization is applied should be a compile-time flag and not be configurable by crates.

This solves the problem for existing crates that make heavy use of generics, often for good reason. When trying to compile these crates for wasm, the developer can manually discover monomorphizations with twiggy and mark them to be polymorphized via a JSON file (or perhaps some more automated tool). Instead of the crate author having to decide between performance and binary size, that decision should belong to the end developer actually shipping the binary for their particular environment and hardware.

Ideally static dispatch could be replaced with dynamic dispatch, where desired, as well.

7 Likes

polymorphization seems like it could be especially helpful for reducing code size when using parsing code that takes a generic reader.
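For concreteness, here is a minimal sketch of the kind of code I mean; the parser itself is made up, but the shape is common:

use std::io::Read;

// A parser generic over its reader: today every reader type gets a full copy
// of `parse`, even though most of the body doesn't depend on `R` at all.
fn parse<R: Read>(mut reader: R) -> std::io::Result<usize> {
    let mut buf = Vec::new();
    reader.read_to_end(&mut buf)?;
    // ... lots of reader-independent parsing logic over `buf` ...
    Ok(buf.len())
}

fn main() -> std::io::Result<()> {
    // Two monomorphizations end up in the binary: one for `&[u8]`, one for `Empty`.
    println!("{}", parse(&b"hello"[..])?);
    println!("{}", parse(std::io::empty())?);
    Ok(())
}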

perhaps it could even be combined with PGO to polymorphize functions that are not bottlenecks.

Another thing, though I'm not sure whether it's related, is the binary size overhead of serde. Miniserde and serde-lite improve on this, but they require adjusting existing crates to use a different trait. Would polymorphization help here?

1 Like

Polymorphization as in the previously existing analysis and optimization is not a tradeoff, it's an objectively good thing. No performance is lost.

It seems like you're asking for something different, like replacing generics with dynamic dispatch when possible.

2 Likes

for reference, here is the MCP that removed polymorphization

1 Like

The confusing thing about this is that they link a tracking issue that contains info about a redesign:

There are a few more details about the current state and history in the feature's tracking issue, including a link to a sketch of a redesign: rust-lang/rust#124962

But it was closed as completed and then as not planned. This tracking issue should be reopened, right?

I keep thinking that a flag that enabled more aggressive polymorphization (going much further than the previous implementation did) could enable Rust to produce smaller binaries and have faster compile times, at a (probably) modest runtime cost. This could help Rust's adoption both in areas where fast iteration and hot reloading are important, and in areas where small binary size is paramount (like web frontend).

Ok, well then I'd love to see that topic revisited!

Yeah, sorry, I'm probably asking for multiple things here; I'm not familiar with rustc's internals. I guess, in addition to re-opening the previous discussion, I would also want to see compiler flags allowing granular control over:

  • additional polymorphization that may have negative runtime performance effects, but reduces binary size
  • replacing static dispatches with dynamic dispatches (again, for existing crates, without any code edits)

For context: I'm building a WASM-based library, and of course binary size is very important for bundle size and bandwidth consumption reasons. I've reached the limits of what's possible to cut without rewriting major parts of common libraries, e.g. serde or alloy. But I'm hoping compiler improvements may one day remove the need to replace libraries with more binary-friendly versions.

There are two separate things:

  • Removing unused generics. This is what the polymorphization option did. It should never hurt runtime performance (see the sketch after this list).
  • Replacing static dispatch with dynamic dispatch. This can hurt runtime performance (or improve it when it reduces instruction cache pressure). This is not always possible, but could be beneficial to support. Doing it unconditionally whenever possible can actually increase code size, however, because it makes optimizations less effective.
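To illustrate the first point, a minimal sketch (the functions are made up) of the kind of unused generic parameter the old analysis detected:

fn log_call<T>() {
    // `T` is never used here, so every instantiation compiles to identical code.
    println!("called");
}

fn process<T>(value: T) -> T {
    // Without polymorphization, `log_call::<u32>` and `log_call::<&str>` are
    // separate copies; the unused-generics analysis could share a single one.
    log_call::<T>();
    value
}

fn main() {
    process(1u32);
    process("hello");
}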
5 Likes

Oh, I didn't know polymorphization just removed unused generics. When I was thinking "aggressive polymorphization" I thought about replacing some generics with dynamic dispatch. Like, undoing monomorphization.

One awful thing about Rust is that if I want to switch back and forth between generics and trait objects, I need to change APIs, code, and types in a viral way (similar to changing between sync code and async/await, but worse, because it infects types in a more involved way). It makes it hard to get an honest assessment of what the performance penalty of dynamic dispatch is in real-world code, and of how compile size is affected, especially if the relevant code is in public APIs used by many unrelated programs.

Maybe if traits had a #[may_polymorphize] or #[may_turn_into_dynamic_dispatch] or something like it, it could guide the compiler into deciding whether some generic code should be turned into trait objects automatically. (Of course that's either an error or a warning for traits that aren't dyn compatible, unless the compiler is willing to do some involved transformations automatically like what is done manually by erased-serde)

One of the strengths of C++ in this area is that switching between dynamic and static dispatch is just adding or removing a virtual keyword.

TBH I rarely found that to be the case, and just adding virtual tended to not actually work in C++ either as things just got sliced off when the type was treated like a normal value.

(What you describe is more true in C# where you're usually paying all the allocation and indirection costs anyway even for things where you're using static dispatch.)

I rather like the per-callsite choice you have in Rust: if something takes impl Iterator<Item = String>, you as the caller have the choice between passing your iterator as a generic or type-erasing it to pass &mut dyn Iterator<Item = String> so that what you're calling will only have one instance in the binary.
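A small sketch of that per-callsite choice (the function is made up):

fn consume(it: impl Iterator<Item = String>) {
    for s in it {
        println!("{s}");
    }
}

fn main() {
    let words = || ["a", "b"].into_iter().map(String::from);

    // Static dispatch: a copy of `consume` is generated for this concrete iterator type.
    consume(words());

    // Type erasure at the call site: every caller that does this shares the single
    // `&mut dyn Iterator<Item = String>` instantiation of `consume`.
    let mut erased = words();
    consume(&mut erased as &mut dyn Iterator<Item = String>);
}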

2 Likes

In Rust the difference is often just replacing impl with dyn or vice versa:

use std::fmt::Display;

fn foo(x: &dyn Display) {
    println!("dynamic {x}");
}

fn bar(x: &impl Display) {
    println!("static {x}");
}

fn main() {
    foo(&123);
    bar(&123);
}

Whereas in C++ I don't know how you'd even start approaching the same problem.

4 Likes

I am ok with an annotation that sometimes turns things into virtual dispatch, and sometimes does nothing (provided there is a warning)

This suggests that APIs should be written using generics anyway, and the user decides whether they want dynamic dispatch or static dispatch. This could work... if there weren't so many API calls (which can span multiple crates) that would need to be changed just to test whether dynamic dispatch buys you anything. If your crate is published on crates.io, it may become very hard to get your consumers to test whether dynamic dispatch is worth it: overall, Rust programmers default to generics even when it is not appropriate. So in practice people may have difficulty changing back and forth, unless there are just a few call sites.

Worse, this is only applicable to generics in argument position. If you are dealing with generic APIs (which also include APIs that return a generic parameter), often you want to store a generic type in a struct. Then the struct itself becomes generic, which is an annoying viral change when it is deeply embedded into many structs. The net result is that it becomes even harder to actually test the effect of changing the struct to store a trait object instead. There is no impl Trait for storing things in structs. (The trouble here is syntactical - the syntax for trait objects is too far from the syntax of generics, because generics in structs require a type parameter.)
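To make the comparison concrete, a sketch (struct and field names made up) of the two shapes:

use std::fmt::Display;

// Generic storage: `Holder` itself becomes generic, and that parameter spreads
// virally to every struct that embeds a `Holder`.
struct Holder<T: Display> {
    value: T,
}

// Trait-object storage: no type parameter, at the cost of a box and dynamic dispatch.
struct DynHolder {
    value: Box<dyn Display>,
}

fn main() {
    let a = Holder { value: 42 };
    let b = DynHolder { value: Box::new(42) };
    println!("{} {}", a.value, b.value);
}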


But if we stick to the simpler situation where we have generic parameters in argument position, maybe each such function or method could have a macro that turns a generic argument into a trait object. Something like this

#[dyn]
fn f(x: impl Iterator<Item = String>) {
    for i in x {
        // things
    }
}

Gets expanded to this

// The public signature is unchanged, and the shim should inline away.
#[inline(always)]
fn f(mut x: impl Iterator<Item = String>) {
    #[inline(always)]
    fn f_wrapper(x: &mut dyn Iterator<Item = String>) {
        f_inner(x);
    }

    // The real body: still generic, but only ever instantiated with
    // `&mut dyn Iterator<Item = String>`, so it is compiled once.
    fn f_inner(x: impl Iterator<Item = String>) {
        for i in x {
            // things
        }
    }

    // Erase the concrete iterator type before calling the body.
    f_wrapper(&mut x);
}

(If one wants to make this configurable by the crate user, they could write #[cfg_attr(feature = "something", dyn)] or something.)

This macro may at first glance seem pointless (can't whoever is calling f just create the trait object themselves?), but now nobody needs to change anything at the call site: they continue to call f(x) just as before, and there is only one place in the code that decides whether f is being monomorphized for each different iterator type or not. Adding #[dyn] to a function or removing it is not even a breaking change.

Now, if we enable this kind of automatic dynamic dispatch, the users of f lose the ability to ask for static dispatch when it makes sense. So some syntax could be provided to force static dispatch anyway (something like mono!(f(x)), or even f(mono!(x)) to monomorphize just one parameter). With this, we have kind of flipped the default: rather than f normally being statically dispatched with dynamic dispatch as an opt-in, dynamic dispatch is now the default, and it can sometimes be opted out of.

And, well, my point is that this kind of transformation (and even more aggressive transformations) is in the realm of compiler optimizations (though maybe this one specifically is hard to do automatically, since it turns a by-value generic parameter into a &mut, but still).

This is absolutely not the case. In Rust you switch between impl and dyn. In C++ the equivalent change would be between templates and interfaces[1]. That is a much larger change in C++, requiring a large refactoring.


  1. C++ doesn't really have interfaces as a language feature, but you can emulate them with an abstract base class that contains no data and only pure virtual methods.

2 Likes

Well, you can't store a DST inline in a struct (except in tail position). Why? Because you don't know its size at compile time.

But you should be able to pass a Box<dyn Trait> in place of a T: Trait, though I forget if you also need to add an impl Trait for Box<dyn Trait> (and, being on my phone, I can't easily test at the moment).

1 Like

You do need that impl, as Box is not that special.
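For illustration, the forwarding impl in question would look something like this (the trait is made up):

trait Speak {
    fn speak(&self) -> String;
}

struct Dog;

impl Speak for Dog {
    fn speak(&self) -> String {
        "woof".to_string()
    }
}

// The forwarding impl: without it, `Box<dyn Speak>` does not itself satisfy
// a `T: Speak` bound.
impl Speak for Box<dyn Speak> {
    fn speak(&self) -> String {
        (**self).speak()
    }
}

fn greet<T: Speak>(t: &T) -> String {
    t.speak()
}

fn main() {
    let boxed: Box<dyn Speak> = Box::new(Dog);
    // `Box<dyn Speak>` can now be passed where `T: Speak` is expected.
    println!("{}", greet(&boxed));
}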

1 Like

(Somewhat off-topic, sorry)

I am also building a wasm thing in Rust and am also worried about binary size. But I've at least come to accept that automatic polymorphization (probably) isn't going to happen.

I think it's much more reasonable to focus the rust-wasm community's efforts into things like code splitting, see here for a (stalled) effort: Splitting into multiple lazily-loaded modules · Issue #3939 · rustwasm/wasm-bindgen · GitHub

I implore anyone with the know-how and bandwidth to work on that instead.

1 Like

One thing you can do to get a hacky version of polymorphization is to use Cargo's [patch] section to modify dependencies to use &dyn instead of &impl.

I wonder how far that would take you.
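For what it's worth, a sketch of the kind of signature change such a patch would make in a dependency (the function is made up):

use std::fmt::Write;

// Before patching: one copy of `render` per writer type.
fn render(out: &mut impl Write) -> std::fmt::Result {
    writeln!(out, "hello")
}

// After patching: a single instantiation shared by all callers.
fn render_dyn(out: &mut dyn Write) -> std::fmt::Result {
    writeln!(out, "hello")
}

fn main() -> std::fmt::Result {
    let mut s = String::new();
    render(&mut s)?;
    render_dyn(&mut s)?;
    print!("{s}");
    Ok(())
}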