Auto-generate crate-agnostic anonymous sub-traits for `dyn Foo + ...`

It is not possible to have a dynamically dispatched type implement multiple traits (except for marker traits), because a trait object is a fat pointer of (*data, *vtable). The workaround suggested by the compiler is to create a super-trait that combines all the traits the trait object ought to implement, but I've always wondered why the compiler is not able to do this by itself. It seems trivial to me to turn something like:

struct Baz;

trait Foo {}
trait Bar {}

impl Foo for Baz {}
impl Bar for Baz {}

fn qux() -> Box<dyn Foo + Bar> {
    Box::new(Baz)
}

into:

struct Baz;

trait Foo {}
trait Bar {}

impl Foo for Baz {}
impl Bar for Baz {}

trait FooBarWithCompilerMagicOnTop: Foo + Bar {}

impl<T> FooBarWithCompilerMagicOnTop for T where T: Foo + Bar {}

fn qux() -> Box<dyn FooBarWithCompilerMagicOnTop> {
    Box::new(Baz)
}

When dealing with public traits in foreign crates, there will be name conflicts to resolve, sure; but isn't there a way? That's why FooBar has CompilerMagicOnTop. If two crates make the same dynamic trait "combination", we just transparently deal with it to make them interoperable (compiler magic), but when the trait bounds are all local (i.e., from the same crate) it is very simple to codegen.

3 Likes

That's the hard bit, and why it's taking so long to get a performant design for this. Check out past discussions for details. If you have a new idea, it would be great to see it.

2 Likes

I will. I guess I should mainly search the Rust GitHub repo, right? Edit: Going through Zulip is probably a good idea too.

1 Like

I think I got it. This is not easy, but we could solve it basically overnight with something like Higher-Kinded Traits. We just need to define a blanket impl generic over the trait, not just the type; something semantically equivalent to "FooBarWithCompilerMagicOnTop is implemented for any type that implements a trait that, in turn, implements Foo + Bar".

The compiler would then codegen different traits inside every crate boundary, but they would all be compatible with each other thanks to that blanket impl. Then, in the link or pre-link stage, we just clean up the mess of traits we left behind; that is, if we want to go with the easy solution for getting rid of the repeated traits (it might take more compile time than cleaning it all up during MIR normalization).

Syntax bikeshed #1

impl<T> FooBarWithCompilerMagicOnTop for T
where
    for<A: Foo + Bar> T: A
{}

Syntax bikeshed #2

impl<T> FooBarWithCompilerMagicOnTop for T
where
    for<trait A: Foo + Bar> T: A
{}

Syntax bikeshed #3

impl<trait T> FooBarWithCompilerMagicOnTop for T where T: Foo + Bar {}

The syntax bikesheds are ordered by my preference, from (IMO) better to worse.

Edit: I just realized that if this ever gets implemented, we basically get the last thing we need to turn Rust into a fully capable language of Functional Programming, having it be a Haskell that doesn't require a PhD in applied mathematics and allows mutation. I. WANT. THIS. NOW.

How would your scheme handle function pointers? Note that each function may come from a different crate. For example:

// crate A
fn get_stuff() -> fn() -> &(dyn Foo + Bar) {
    ...
}

// crate B
fn get_other_stuff() -> fn() -> &(dyn Foo + Bar) {
    ...
}

// crate C
fn use_stuff(b: bool) -> fn() -> &(dyn Foo + Bar) {
    if b { crate_a::get_stuff() }
    else { crate_b::get_other_stuff() }
}

The tricky bit here is that we can't really inject any code in crate C to reconcile the two potentially different vtable layouts produced by the other two crates.

We'd need to make vtable generation deterministic between compilation runs: when the function pointers are the same and come from exactly the same traits, the resulting vtable must be the same. This is not currently guaranteed, but it is not that difficult to build some sort of deterministic, perfect function-pointer sorter that would make every vtable identical whenever the function pointers it contains are the same. The Rust ABI isn't stable, but we can still get away with this: if we were to link code generated by different compiler versions or runs, it would still produce UB, if it linked in the first place. I'm not talking about ABI stability; this is deterministic code generation.

To illustrate what I'm saying:

A vtable of functions

  • clone()
  • drop()
  • foo()
  • bar()

that comes from exactly the traits (no less, no more) Clone + Drop + Foo + Bar will produce the same vtable across all crates, given a deterministic, perfect function-pointer sorter.

Is this viable, or am I being stupid?

Not just across multiple compiler runs; even within one run. Let me clarify the example: crate C depends on crates A and B, but A and B don't have any connection. Both A and B will try to generate a vtable, but each may pick a different layout, and crate C cannot in general reconcile this difference. So how would you ensure that the layout of &(dyn Foo + Bar) (including the vtable) is consistent across all crates and dependencies for a single cargo build?

Thanks for the clarification; I actually had some trouble understanding your example. What I'm trying to say is that when the compiler generates the vtable for the invisible sub-trait of Foo + Bar, we have to force it to always generate the same vtable (within a compiler run), with no changes to the layout or the function-pointer order; i.e., the vtable must be deterministic. However, if we add another trait bound to the dyn, it becomes incompatible, because the layout is no longer guaranteed to be the same (the same would happen with a manual implementation). In other words, we can't downcast a dyn Foo + Bar into a dyn Foo or a dyn Bar, at least not without magic. Maybe we can swap the vtable pointer, but I don't know to what extent that's viable.

For named super-traits this is an unstable feature now:

I would assume it should Just Work™ for anonymous supertraits too (EDIT: actually, re-reading the description, I guess these are anonymous subtraits; Foo + Bar has the supertraits Foo and Bar).

2 Likes

I believe a better approach is to use "fatter" pointers. dyn Foo + Bar would be represented as (data_ptr, foo_vtable_ptr, bar_vtable_ptr) (with a separate optimization possible for marker traits). It immediately resolves all the questions around deterministic compilation, potential bloat issues, and casting between dyn A + B + C and dyn B, and it makes the associated overhead easy to see when reading source code. If you do not want to pay for the additional pointer, you can always create a super-trait.

That doesn't really scale at all. Perhaps I didn't express myself clearly enough at the start, but one of the major points of this feature, which I'm now renaming to anonymous sub-traits, is that it allows code across crate boundaries to work flawlessly without worrying about implementing all the different sub-traits that mean the same thing under the hood. Fatter pointers are a sub-par solution that would force this feature to be turned back into a manual sub-trait when the requirements of a project eventually and inevitably increase, creating an API change that is also inevitably a breaking one. It'd be a ticking bomb waiting to explode; doing it this way would be setting up a technical-debt trap disguised as nicer syntax.

That's great, actually; it seems the only real blocker is implementing Higher-Kinded Traits (a.k.a. Ruskell), which is a beast of its own. I will try to get this onto the next T-lang meeting's agenda; I'll see if I can get there through Zulip.

In my practice I very rarely need more than 2-3 combined traits used with dynamic dispatch. If your project's needs "increase", I highly doubt you will work with abominations like dyn Foo + Bar + Baz + Zoo + Zaa in your code; you will introduce a wrapper around the combination either way. And note that I am not saying you have to implement super-trait wrappers manually; hopefully we will get trait aliases eventually, which would allow you to write something like this:

trait MySuperTrait = Foo + Bar + Baz + Zoo + Zaa;

// `dyn MySuperTrait` will use 2 pointers under the hood, not 6
fn f(a: &dyn MySuperTrait) { ... }

You are being overly dramatic. Syntax-wise there is absolutely no difference; the only difference is how the feature is implemented under the hood and the associated technical trade-offs.

With nicer syntax, I meant eliding the manual sub-trait, but you're on point that this is all an implementation detail. The problem is that ideally we'd want both to behave the same (best) way, so that a manual sub-trait is never necessary; ideally we wouldn't make any trade-offs. This feature is as much about crate interoperability as it is about saving a few lines of code, and I don't think we should throw anyone under the bus on the one aspect that's not just cosmetic because their requirements are odd. If you look at Tokio, for example, some of their dynamically dispatched traits are massive. I would still consider your approach because it avoids Higher-Kinded Traits, though.

This is still a great idea and on the table, but it still doesn't scale, and I'd see it as an antipattern in disguise; the price you pay for every trait you add is enormous, because someday someone will add another trait thinking that the cost grows linearly (which, in memory, it does), but suddenly your dyn pointers will stop fitting into registers, and that's not a linear cost to pay when it comes to performance. Dyn is slow enough already, and I wouldn't want to add something like this when we can do better.

"The best way" implemented implicitly has very non-trivial consequences for the compiler. The most basic one: is dyn A+B used in one crate the same as dyn A+B from another crate? What about dyn A+B and dyn B+A? Think carefully. For a naive implementation of your proposal the answer is no, since both crates can be compiled in parallel, so each will create its own separate anonymous wrapper trait.

How is it different from adding a field to a struct passed by value? Everyone who cares about stack spilling will know about the associated costs. Again, I highly doubt that in practice people will use more than 2-3 traits in one dyn combination, simply because it makes signatures unwieldy.

I will link you both of my responses to that, as was said earlier, but TL;DR: we only have to make it deterministic within a single compiler run, not between runs, because our ABI isn't stable anyway.

It isn't, but that is not my point. My point is that we can make these extra trait bounds codegen to something as fast as a manual implementation, which is what Rust has always done (Zero-Cost Abstractions), or we can take the easy route. I don't like the precedent this approach would set, because it wouldn't be a ZCA over manual sub-traits (I know we wouldn't be creating them in the first place; they would be different things).

It's not about ABI, it's about the addresses of generated vtables. dyn A+B from different crates will point to different vtables with the same function pointers inside. De-duplicating those vtables will require either non-trivial changes to linking or undermining parallel compilation of crates. And I haven't even touched the issues with downcasting dyn A+B to dyn A or dyn B.

It's a common misunderstanding of the ZCA principle. Under your interpretation, your proposal is not zero-cost either; at the very least, you generate additional vtables which bloat the resulting binary. I can name a number of Rust features which are not "zero-cost" under such an interpretation (e.g. there is triple indirection with boxed futures). ZCA is about not paying for features you do not use. If you don't explicitly use dynamic trait dispatch combinations, you will not pay for the fatter pointers.

The combination of fatter pointers and trait aliases gives programmers an explicit choice. Arguably, it is more in the spirit of Rust as a systems programming language.

I said at the start of the discussion that we can deal with that while linking, or even during MIR normalization, and I would like more insight on the matter. Would the changes be so dramatic that they're a no-go from the beginning?

I already said that it's problematic, and IMO we really should get someone from T-compiler to explain our options here, if any. I would like to hear your take, though.

Touché, I will give you that.

1 Like

This is true with named traits today: you can trivially get duplicate vtables for the exact same `&() as &dyn Debug` expression compiled in different crates.

Shouldn't this already be supported by linkers? After all, they need to merge definitions for C++ to comply with the ODR. How hard would it be for Rust to use the same mechanism?