Pre-RFC: returning automatically generating impl Trait

See also https://github.com/rust-lang/rfcs/issues/2414. (Its thread is long, but as far as I know, it contains the most specific discussion on this topic.)


Another option would be something more internal to the method. A sketch:

fn create_or_generate(condition: bool) -> impl Iterator<Item = Foo> {
    enum match condition {
        true => create_iterator(),
        false => generate_iterator(),
    }
}

Because the fact that it's becoming an enum is local to the particular branching expression -- once the value of that type exists, it doesn't need anything special at the signature level. And this way you could make a local variable of such a type too.

(Plenty of gaps in that sketch too, like doing it with early return, what it would look like in nested-ifs, etc.)
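
For comparison, here is a minimal sketch of what one has to write by hand today to get the same effect. The Either enum and the concrete iterator types are assumptions made for illustration; the proposal itself would generate an anonymous equivalent, and nothing here is part of it.

```rust
// Hypothetical hand-written stand-in for the anonymous enum that
// `enum match` would generate. `Either` is an illustrative name only.
enum Either<A, B> {
    Left(A),
    Right(B),
}

// Delegate `Iterator` to whichever variant is active.
impl<A, B> Iterator for Either<A, B>
where
    A: Iterator,
    B: Iterator<Item = A::Item>,
{
    type Item = A::Item;

    fn next(&mut self) -> Option<Self::Item> {
        match self {
            Either::Left(a) => a.next(),
            Either::Right(b) => b.next(),
        }
    }
}

// Two functions returning different concrete iterator types
// (placeholders for the `create_iterator` / `generate_iterator` of the sketch).
fn create_iterator() -> std::ops::Range<u32> {
    1..4
}

fn generate_iterator() -> std::vec::IntoIter<u32> {
    vec![10, 20].into_iter()
}

// The enum stays an implementation detail: callers only see `impl Iterator`.
fn create_or_generate(condition: bool) -> impl Iterator<Item = u32> {
    if condition {
        Either::Left(create_iterator())
    } else {
        Either::Right(generate_iterator())
    }
}
```

Note that the wrapping happens at the branch, not in the signature, which is the point made above.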


I should have said explicitly that I'm aware of the previous discussion about enum impl Trait. At one point it appeared that enum impl Trait was filling two needs:

  • creating an opaque type from two or more (possibly opaque) types that all implement the same trait (this proposal). This is especially useful for creating iterators from iterators of different types
  • creating an anonymous closed set of known types (a bit like a C++ variant). This is especially useful for error handling when one wants to specify exactly the possible error types instead of using a catch-all enum for the whole crate. This proposition is not about that part.

This is a very valid question. I would argue that a marker is not needed, because:

  • if this proposition is rejected the code would not compile anyway.
  • it's already explicit that some overhead can be added because of the impl Trait in the return type of the function
  • there is no way to write a better version by hand than what the compiler could generate

So I think that adding syntax should be motivated, not the reverse.


With or without this proposal, the code cannot compile because Add<T: Magma> is not implemented for Magma.

This proposal would only automatically generate the boilerplate in get_magma() needed to make this very function compile. Rust's semantics are not changed, and there is no way to see from the outside how get_magma() is implemented.

I didn't realize that my proposition could be both extended and have its rule simplified at the same time. Thanks for drawing my attention to this specific detail.

Currently there are multiple constructs that require unifying types: if, if let, loop with multiple break values, and functions (I don't think I forgot one, please correct me if I'm wrong). Currently, to be unified, all the branches must have the same (possibly opaque) type. With this proposition, every construct that has multiple branches would create a value whose type is one of:

  • the common type of all branches (this is what is done currently)
  • an opaque type that implements all traits implemented by all the branches

EDIT: this rule is too permissive and would lead to bad error messages. I rewrote those examples in a later post. The trait implemented by the opaque type should be made explicit.

Let's try with some examples:

trait Trait {}
fn foo() -> f32;
fn bar() -> f32;
fn baz() -> impl Trait;
impl Trait for f32 {}

let value = if condition {
    foo()
} else {
    bar()
}; // shared type for all branches, unifies as f32

let value = if condition {
    foo()
} else {
    baz()
}; // no shared type, but a shared trait for all branches, unifies as Trait

trait Other {}
fn other() -> impl Other;

let value = if condition {
    baz()
} else {
    other()
}; // no shared type, and no common trait, no unification possible, doesn't compile

I also forgot to talk about Sized and unsafe traits.

  • For Sized types, the unification is trivial to do using the mechanism presented in the first post, but I don't see how to do it for !Sized types. This initial version would only extend the current rules for Sized types and would not try to unify !Sized types (resulting in a compile error).
  • In the previous discussion around enum impl Trait, some concerns were raised about automatically implementing unsafe traits (it's not sound). This means that this proposition should obviously exclude unsafe traits (unless put inside an unsafe block). However, I didn't see any issue raised against automatically implementing safe traits.

While that would be true with this proposal, today using impl Trait is no slower than returning the concrete type, and that's one of its advantages -- using it lets you tweak your semver promises without changing the concrete code that executes.

Of course the version of this that already works today is Box::new(x) as Box<dyn Trait>, which adds extra overhead and is explicit about doing it.

That does make me think that another way to do this would be some sort of bounded-size dyn Trait holder -- if it's only going to work for dyn-capable traits anyway, does it really need to go through enum dispatch instead of just using a fat pointer and dispatching that way? Another way to phrase this is "less general Box<dyn Trait> so I don't have to pay for allocation".
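
To make the comparison concrete, here is a small sketch of the two vtable-based options that already work today: the boxed version mentioned above, and a stack-only variant that borrows one of two pre-declared locals. The function names and the concrete iterators are assumptions made for illustration. The second form avoids the allocation but cannot be returned from the function, which is the gap a bounded-size dyn Trait holder would fill.

```rust
// Works today: heap-allocate and erase the type behind a fat pointer.
fn boxed(condition: bool) -> Box<dyn Iterator<Item = u32>> {
    if condition {
        Box::new(1..4u32) as Box<dyn Iterator<Item = u32>>
    } else {
        Box::new(vec![10u32, 20].into_iter()) as Box<dyn Iterator<Item = u32>>
    }
}

// Also works today, without allocation: both candidates get a stack slot,
// only one is initialized, and we dispatch through `&mut dyn Iterator`.
// The borrow cannot outlive this function, so the iterator must be
// consumed here instead of being returned.
fn sum_on_stack(condition: bool) -> u32 {
    let (mut created, mut generated);
    let iter: &mut dyn Iterator<Item = u32> = if condition {
        created = 1..4u32;
        &mut created
    } else {
        generated = vec![10u32, 20].into_iter();
        &mut generated
    };
    iter.sum()
}
```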


"This feature is okay because without it the code wouldn't have compiled" is not a universal argument. Sometimes you don't want the code to compile. The compiler's job is often to tell you that your code doesn't compile, rather than trying to find some meaning that allows compiling your code.


Sorry, my bad. In the last line I meant to call op, and not add. The corrected example:

trait Magma {
    fn op(self, other: Self) -> Self;
}

impl Magma for usize { /*…*/ }
impl Magma for String { /* … */ }

fn get_magma(condition: bool) -> impl Magma {
    if condition {
        String::from("bla")
    } else {
        3
    }
}

fn main() {
    let m1 = get_magma(true);
    let m2 = get_magma(false);
    // what should happen here?
    let result = m1.op(m2);
}

This is a bad idea in general because it allows arbitrarily unifying types in any branch expression as long as they have some trait in common. Given that almost all types implement e.g. Debug (or to say something even more drastic, think Sync or Send), this would effectively lead to dynamic typing.


Fair point. I see horrible error messages from the compiler coming.

There is a second possibility (in current Rust) that also returns an impl Trait: the one that I used in the first post.

There are effectively two possibilities for this issue:

  • a "fat" impl Trait: wrapping all values in an enum (union + discriminant): this proposal, dispatching using the discriminant
  • a "fat" dyn Trait: wrapping all values in a union + vtable pointer, dispatching using the vtable

That's interesting and needs to be explored further, but I personally think that the former is more interesting. I don't think that both should exist, since they seem to fill the same need.

Definitely! However, I wanted to emphasize (but failed to) that "it's OK to not add syntax" because all three points are true at the same time.

It's exactly the same issue. With or without this proposal this code doesn't compile because you didn't provide:

impl <T: Magma> Magma for T { /* … */ }

This proposal only generates boilerplate inside a function and doesn't change anything about how the function can be used from the outside. Since this function returns an opaque impl Magma, you can only use constructs that accept T: Magma.

In the initial post, this was not an issue, because the only place where the unification could occur was in an expression that is directly returned. The type of this expression was already constrained by the return type of the function (impl SomeTrait). However, since I later extended this idea to any expression whose branches have incompatible types, this could definitely lead to horrible error messages. I think that it's relatively easy to fix: an expression can be promoted to an opaque impl Trait if and only if the type of that expression is explicit:

trait Trait {}
fn foo() -> f32;
fn bar() -> f32;
fn baz() -> impl Trait;
impl Trait for f32 {}

let value: impl Trait = if condition {
    foo()
} else {
    bar()
}; // shared types for all branches, unifies as f32, but can only be manipulated as an opaque type

let value: impl Trait = if condition {
    foo()
} else {
    baz()
}; // no shared type, but an explicit shared trait for all branches, unifies as Trait (using an enum as proposed)

let value = if condition {
    foo()
} else {
    baz()
}; // will not compile since the compiler doesn't have the right to do unification unless the unified type is explicit

fn abc() -> impl Trait {
    if condition {
        foo()
    } else {
        baz()
    }
} // no shared type, but an explicit shared trait for all branches, unifies as Trait (using an enum as proposed)

trait Other {}
fn other() -> impl Other;

let value: impl ??? = if condition {
    baz()
} else {
    other()
}; // no shared type, and no common trait, no unification possible, doesn't compile

This is an example of the weakness of type Foo = impl Trait; as the syntax for named existentials. This could be a feature of the -> impl Trait syntax, but not a feature of the abstract type Foo: Trait; syntax, so that users who want to guarantee it's a single type can use that syntax. But it feels much more arbitrary to restrict it when they both use impl Trait syntax.

This split feels right though: my feeling is that unifying all branches when the return type is anonymous is usually desirable, whereas unifying all branches when you've named an existential is usually not.

My point is that without the proposal it does compile. Check out this playground: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=f3ed2aace4889dbeff49fb642f6b8a81 Why should I provide impl <T: Magma> Magma for T { /* … */ }? Do you want to reimplement Magma for types that already implement Magma?

I am aware that you only want to generate boilerplate within the function. But the question is how you would generate the boilerplate for this trait? I reckon that's not possible at all. Therefore we'd need some rules for enum-safe traits. I think these rules would need to be almost the same as the ones for object-safe traits. Except that they could allow for associated types, as long as the associated types of all the types that should be wrapped in an enum are the same.
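
Here is a sketch of why the boilerplate for Magma cannot be generated. The StringOrUsize enum and the try_op helper are hypothetical names written by hand for illustration, and the op bodies are one arbitrary choice since the original leaves them elided. The matching-variant cases delegate cleanly, but the mixed case has no value to return, so the method cannot have the signature Magma::op requires.

```rust
trait Magma {
    fn op(self, other: Self) -> Self;
}

// Illustrative implementations; the thread leaves the bodies elided.
impl Magma for usize {
    fn op(self, other: Self) -> Self {
        self + other
    }
}

impl Magma for String {
    fn op(self, other: Self) -> Self {
        self + &other
    }
}

// Hand-written version of the enum the compiler would have to generate.
#[derive(Debug, PartialEq)]
enum StringOrUsize {
    Str(String),
    Num(usize),
}

impl StringOrUsize {
    // This is deliberately NOT `Magma::op`: `op` must return `Self`,
    // but for mixed variants there is nothing sensible to return.
    // The hypothetical helper reports that case as `None`.
    fn try_op(self, other: Self) -> Option<Self> {
        match (self, other) {
            (StringOrUsize::Str(a), StringOrUsize::Str(b)) => {
                Some(StringOrUsize::Str(a.op(b)))
            }
            (StringOrUsize::Num(a), StringOrUsize::Num(b)) => {
                Some(StringOrUsize::Num(a.op(b)))
            }
            // String vs. usize: no implementation can be derived.
            _ => None,
        }
    }
}
```

Any generated impl of Magma for the enum would have to panic or invent a value in that last arm, which is why some "enum-safe" rule along the lines described above seems necessary.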

Also, since enum impl Trait is not possible for every trait, I think it should definitely have a keyword, to prevent people from relying on it without being aware of that limitation.

You did mean fn baz() -> impl Trait; here, right?

I hope that I understand your point correctly. Please correct me if I'm wrong. let val: impl Trait = expression can mean two things:

  • “Dear compiler, could you please check that expression returns a type that implements the trait Trait, then hide the concrete type, but don't modify it in any way”
  • “Dear compiler, could you please create an opaque type that implements the trait Trait by unifying all the branches of expr, and if needed add whatever is needed to make it compile” (this proposal)

Since the former is definitely a valid need, the latter would need some syntax to differentiate itself from the former.

My bad. My mental compiler probably had an ICE! That's indeed right. This means that such a coercion could only work for traits that don't take Self (or any variation with references and pointers) in the arguments of any function (except for self itself, obviously). I need to think about it, since it feels like a major limitation.
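
For contrast, a trait that mentions Self only as the receiver delegates without any problem. The sketch below uses a hypothetical Shape trait and hand-written enum (none of these names come from the proposal) to show the kind of boilerplate that could be generated mechanically under that restriction.

```rust
// Hypothetical trait: `Self` appears only as the receiver.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

// What the compiler would generate, written by hand.
enum SquareOrRect {
    Square(Square),
    Rect(Rect),
}

// Every method is a plain match-and-forward; no mixed-variant case can arise.
impl Shape for SquareOrRect {
    fn area(&self) -> f64 {
        match self {
            Self::Square(s) => s.area(),
            Self::Rect(r) => r.area(),
        }
    }
}

fn get_shape(condition: bool) -> impl Shape {
    if condition {
        SquareOrRect::Square(Square(2.0))
    } else {
        SquareOrRect::Rect(Rect(2.0, 3.0))
    }
}
```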


Thanks, fixed


Btw, thanks all. I got exactly the kind of feedback that I was hoping for!

Note that with continuation-passing style, especially after sugaring, we get to write:

#[with]
fn create_or_generate (condition: bool)
  -> &'self mut (dyn Iterator<Item = Foo>)
{
    if condition {
        &mut create_iterator()
    } else {
        &mut generate_iterator()
    }
}

Doing so, besides the CPS constraints, replaces an enum / discriminant-based dynamic dispatch with a virtual method one.

But it has the advantage of not being magic: we are using the classic &mut impl Iterator<...> as &mut dyn Iterator<...> vtable-based type erasure coercion.

  • For those skeptical of that #[with] and 'self stuff from CPS, know that that is just a library-based way of implementing the unsized_locals feature, whereby we would have been able to otherwise write:

    fn create_or_generate (condition: bool) -> dyn Iterator<Item = Foo>
    

    I am personally more fond of things with actual implementations, even with slightly hindered usability, than just ranting about a feature that may take too long to happen. Hence my mention of #[with] ... -> &'self mut dyn ...


In that regard, it now becomes more intuitive / obvious that we should have a way to express the coercion from an impl Iterator to an enum dyn Iterator type, thus relating it back to the linked / mentioned "anonymous / auto- enum" proposals.

It also makes the following suggested idea especially resonate:

  • I initially wasn't that fond of the enum as a (contextual) keyword modifier for expressions, and started suggesting a coercion syntax (explicit, such as as enum dyn ..., or implicit, such as with type ascription or a typed binding).

    But then I realised that enum dyn ... was not acceptable, since each so-created anonymous enum dyn type was gonna create ad-hoc, different, unnameable types, such as with closures and futures (and generators).

    So an expression-based modifier is indeed the most sensible option, so I've backtracked since and realize that something such as enum match makes the most sense, actually.


There is one thing remaining though, which is:

How to enable enum dyn-dispatch for arbitrary traits

It definitely should not be automatic, since traits can express arbitrary properties, not just method-based APIs. For instance, let's imagine layout-related marker traits, such as ReprC, FromBytes, Layout. These should definitely not carry through.

Also for retro-compatibility reasons: adding unexpected implementors of some traits can cause logic bugs or even unsoundness (cf. these layout-based unsafe traits). So it needs to be something opt-in, for each and every trait.
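
A tiny sketch of the layout problem, using a made-up ZeroSized marker trait (an assumption for illustration, not one of the traits named above): both variants satisfy the property, yet the enum wrapping them cannot, because it needs room for a discriminant.

```rust
use std::mem::size_of;

/// Hypothetical layout marker: implementors promise to occupy zero bytes.
/// It is `unsafe` because other code could rely on that promise.
#[allow(dead_code)]
unsafe trait ZeroSized {}

struct A;
struct B;

unsafe impl ZeroSized for A {}
unsafe impl ZeroSized for B {}

// Hand-written stand-in for the auto-generated enum. Implementing
// `ZeroSized` for it would break the promise: it has two inhabited
// variants, so it must store which one is active.
#[allow(dead_code)]
enum AOrB {
    A(A),
    B(B),
}

// Returns the sizes of `A`, `B`, and the wrapping enum, in that order.
fn sizes() -> (usize, usize, usize) {
    (size_of::<A>(), size_of::<B>(), size_of::<AOrB>())
}
```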

Having a special trait attribute then seems to be the most straightforward approach:

#[bikeshed_name]
trait Iterator { ... }
// e.g.
#[auto_enum]
trait Iterator { ... }

Another approach, but which would require variadic generics, or genericity over traits, would be to have enum match ... expression not have a fully anonymous type, but only an anonymous type parameter within some known wrapper type:

  • Variadic generics (+ magical AutoEnum def)

    #[lang(auto_enums)]
    enum AutoEnum<Ts...> { Ts... } // super fictional syntax
    

    And then people would opt-in using:

    impl<Item, Ts... : Iterator<Item = Item>> Iterator
        for AutoEnum<Ts...>
    {
        type Item = Item;
    
        /* built-in impl for _methods_ ? */
    }
    
  • Genericity over traits

    #[lang(auto_enums)]
    struct AutoEnum<trait Trait, __>  { ... } // __ represents the unnameable auto-generated type
    
    impl<Item, __> Iterator
        for AutoEnum<trait Iterator<Item = Item>, __>
    {
        type Item = Item;
        /* auto-generated for methods */
    }
    
    • Note that for this use case, genericity over traits can be hacked / emulated using a ?Sized type parameter that would then be instantiated with dyn Trait. Despite its appealing simplicity, doing so would intermix different notions (for instance, an enum dyn Trait could have methods taking self by value, whereas a classic dyn Trait cannot (unless unsized_locals were available)).

Note that the AutoEnum wrapper type has the advantage of no longer truly requiring the enum match, by making it possible to go back to my initial idea of a type coercion:

  • match condition {
        | true => create_iterator(),
        | false => generate_iterator(),
    } as AutoEnum<impl Iterator, impl Iterator>
    
  • match condition {
        | true => create_iterator(),
        | false => generate_iterator(),
    } as AutoEnum<trait Iterator, _>
    

All in all, I find the whole implementation far from trivial, and wonder if #[with] ... StackBox<'self, dyn Iterator> / unsized_locals or manually written ad-hoc Either-like enums don't suffice to solve real use-case needs :thinking:

I would expect this to instead be

match condition {
    | true => create_iterator() as AutoEnum<impl Iterator, impl Iterator>,
    | false => generate_iterator(),
}

An as cast after the match does not currently influence Box coercions, so I don't think it would influence any stack box either. This then gets very close to something that can be implemented by user code (other than needing variadic generics to be able to define it for an arbitrary arity).


@Nemo157 oh sure, I was hinting at what the most aesthetic (because symmetrical) situation would be; but if "back-tracking" type inference can only go so far (IIRC, it does work for things such as &mut dyn Iterator...), then so be it.

Aside

Oh indeed, that's kind of what I had in mind, but we do hit the issue for both an automated way to opt into delegation for custom traits, and the lack of variadic generics, although the latter is not a real problem, provided we are given enough impls, such as your Z Y X ... (nit: I would rename those to _25 ... _2 _1 _0, since I find sum_types::_2 to be more readable than sum_types::C :wink:)

But maybe you can publish that crate, "now" that proc-macros are stable?


It's always* sound to have an incorrect implementation of a trait which is not unsafe. Thus it makes sense that "auto enum dyn" be limited to safe traits, since the generated implementation may be incorrect w.r.t. unsafe invariants on the trait.

* modulo traits (ab)using privacy barriers to require a closed set of implementations. So, add another requirement: it must be possible to implement the trait at the point where the "auto enum dyn" is constructed. This is even more complicated a property than the "constructability" constraint for the Safer Transmute RFC.
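
For readers unfamiliar with the privacy trick in the footnote, here is a minimal sketch of the usual "sealed trait" pattern. The module and type names are assumptions made for illustration. Outside the units module nobody can name private::Sealed, so no new implementor of Unit can be written there, and a compiler-generated enum would have to respect the same barrier.

```rust
mod units {
    // `Sealed` is public, but its module is private, so code outside
    // `units` cannot name it and therefore cannot implement `Unit`.
    mod private {
        pub trait Sealed {}
    }

    pub trait Unit: private::Sealed {
        fn symbol(&self) -> &'static str;
    }

    pub struct Meters;
    pub struct Feet;

    impl private::Sealed for Meters {}
    impl private::Sealed for Feet {}

    impl Unit for Meters {
        fn symbol(&self) -> &'static str {
            "m"
        }
    }

    impl Unit for Feet {
        fn symbol(&self) -> &'static str {
            "ft"
        }
    }
}

use units::{Feet, Meters, Unit};

// Outside the module, type erasure through `dyn Unit` still works,
// because it adds no new implementor. An auto-generated enum would
// need its own `impl Unit`, which is not possible from here.
fn pick(metric: bool) -> Box<dyn Unit> {
    if metric {
        Box::new(Meters)
    } else {
        Box::new(Feet)
    }
}
```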

Given that:

  • anyway an explicit marker seems to be needed
  • it's already possible to create this construction using user code

Then I also think that experimenting with a crate is the right way to advance forward (a bit like try! that evolved into ?).


Even proc-macro implementations can make it work without explicit markers in many cases by analyzing the AST.

#[auto_enum]
fn foo(x: i32) -> impl Iterator<Item = i32> {
    match x {
        0 => 1..10,
        _ => vec![5, 10].into_iter(),
    }
}

(The linked example uses the Iterator argument, but it can be detected automatically by analyzing the return type.)
