Extending `impl Trait` to allow multiple return types

While I overall agree with your post, I don't think this part is necessary.

If I have a method that returns impl Iterator<Item=u8> today, I can totally return an Either<Iter1, Iter2> from inside it without needing to put an enum keyword in the signature. So I don't see why it ought to be explicit in the signature.

One could have the marker inside the body instead, like

enum if b { it.step_by(2) } else { it.rev() }
1 Like

Sure, it can also go in the function body somwhere. That keeps the information local as well. Even more local and it documents the intended "merge point", so I'm all for that alternative.

The original CPS idea is very cool, but I’d be afraid to use it in practice — that’s just asking for exponential growth of code when multiple such calls are combined.

And the lesser variant based around a regular enum, but automagically inserted in all places, doesn’t add any functionality that isn’t available already (without the syntax sugar), so it isn’t as compelling.

1 Like

First, the questions around backwards compatibility, which I believe is still preserved:

This is not actually an issue for this proposal. impl Trait is, itself, something like ∃ T. T: Trait => T. So at the type-checking level, the types of returns_impl_trait(1) and returns_impl_trait(2) unify.

Each time returns_impl_trait returns, you have the potential to enter a differently-monomorphized piece of the caller, which knows the new concrete type. (And the stack frame can stay the same because x already has to reserve enough space for any of returns_impl_trait's concrete return types.)

@ExpHP's example is harder, but not impossible either. The idea of forbidding it in the 2018 epoch is appealing in some ways, as Vec<impl Debug> is not a type we support today. But the full interpretation can be made to work, of impl Trait as a full existential type with its witness "folded into the instruction pointer."

The key is that, while Vec inherently and intentionally "strips away" the control flow around the values it contains, we can instead convert the existential's witness from its encoding as the instruction pointer to a value that goes in the Vec.

Then, as part of println!, std::slice::Iter<impl Debug>'s Iterator::next method can use the same mechanism as any other function that returns impl Trait- the caller (in this case <[impl Debug] as Debug>::fmt, or rather its indirect callees in the display plumbing) monomorphizes its continuation for each of the concrete types that the original foo call returns, and the callee returns to the appropriate one based on the value stored in the Vec.

And finally, @scalexm's example---this is also doable, though it's a more interesting question of whether it should be allowed. At one level, this interpretation means one impl Debug is the same type as any other impl Debug, satisfying the constraint that x and y have the same type.

Without any further constraints on T, it doesn't actually matter- we could monomorphize bar to give x and y the same size regardless of what g returns. And if you add T: Display or whatever trait g returns, bar could be monomorphized as if it were bar<T, U>(x: T, y: U) while again giving x and y the same size. But at another, is that actually what bar meant? Does it break anything important?


And the questions about functionality and syntax:

From what I understand, there aren't any plans to allow putting impl Trait in structs anyway. That's to be provided by abstract type instead, which also provides a name. What this does disallow is things like this, which try to retroactively name an existing impl Trait type:

abstract type T: Trait;

fn f() -> T { g() }

fn g() -> impl Trait { .. }

This is exactly the same scale as normal generic functions- "roughly exponential" in the number of type parameters, i.e. the number of eliminated dyn Traits. And as @earthengine points out, combinator-based futures code would run different logic in each monomorphized continuation, leading to approximately the same number of "states" as an equivalent async fn.

It's certainly worth considering, but without benchmarking things I don't believe we can really say that the outcome is any worse than pervasive use of type parameters. We still need good tools like cargo bloat regardless.

These are intertwined---the answer to the first depends on (among other things) the answer to the second. One of the issues with enum impl Trait, which makes people want yet-another-syntax for it instead of just reusing impl Trait, is that it's basically a new dynamic dispatch mechanism. A major draw of this proposal, on the other hand, is that it works the same as argument-position impl Trait, without a new dynamic dispatch mechanism.

So while as it stands I believe this is backwards compatible, we can't really separate those two questions without at least committing to forward compatibility with it until (1) is decided. And as a general response to the enum impl Trait bikeshedding going on here, I would say any new syntax in the type essentially defeats the proposal's purpose (of making APIT and RPIT equivalent).

I do think a marker in the callee function body, like @scottmcm suggests, might make sense. It would be a helpful call-out that the following code is using "static existentials," and isn't just a type error.

This would work, except overhead of heap allocation:

fn foo() -> dyn Trait {
   Box::new(if random {
       TraitOne()
   } else {
       TraitTwo()
   })
}

So how about adding an alternative to Box that does the magical enum wrapping?

fn foo() -> MagicEnumWrapper<dyn Trait> {
   MagicEnumWrapper::new(if random {
       TraitOne()
   } else {
       TraitTwo()
   })
}

With compiler’s help the MagicEnumWrapper<T> would make new enums for its T as necessary, and the newtype would remain stack-allocated.

1 Like

MagicEnumWrapper is essentially “unsized rvalues”- which use a vptr instead of an enum, and allows different sizes so it can support things like slices.

For this proposal to be viable, we would definitely want to consider requiring annotations to be present where monomorphisation can occur. For polymorphic functions, we effectively have this in terms of type parameters (here including argument-position impl Trait) — assuming users understand what monomorphisation is.

However, if we blindly allow inference of variables with type impl Trait, which is happening implicitly in the examples here, then we’re effectively inferring a polymorphic type, with no indication this is happening. As such, without looking at function signatures, we might have no idea monomorphisation is occurring.

fn g(b: bool) -> impl Display {
    if b { 42 } else { "hello" }
}

// This version of `f`...
fn f() {
    println!("{}", g(true));
}

// Is equivalent to this version:
fn f() {
    // We're inferring a generic type!
    let x<T: Display>: T = g(true); // Alternatively `let x: impl Display = g(true);`.
    println!("{}", x);
}

If we require type annotations for generic types, then this proposal seems more plausible from a “no hidden costs” point of view. (This also achieves the “callee body marker” functionality @scottmcm suggested.)

With this alteration, not specifying a generic type would force return-position impl Trait to be monomorphic (or otherwise cause an error), as it currently is today, so it entirely preserves the expected behaviour.

I'm not sure I understand what you mean by "whether it should be allowed". It already compiles, so if you meant "whether it should compile", then yes it should in order to preserve backward compatibility.

I read the posts you linked, there are some points but the mathematical facts are not presented in a very rigorous way and eventually I don't buy the "APIT and RPIT should be equivalent" thing. For example,

fn foo(x: impl Debug)

may always be considered as a type constructor while

fn bar() -> impl Debug

will never be. At best, we may know that the return type of bar takes values from a specific finite set which highly depends on bar, whereas the input type of foo may take values from an infinite set. Anyway I guess that's not the interesting point of this topic.

Certainly, if we decided to make it stop working it would have to be in the new edition. I was speaking more in terms of whether it makes sense from the perspective of this proposal---that is, the perspective of impl Trait being equivalent to ∃ T. T: Trait => T regardless of its position.

fn foo(x: impl Debug) treats it this way, giving foo the type fn(∃ T. T: Debug => T) -> (), which can only be considered a type constructor because it's isomorphic to the type ∀ T. T: Debug => fn(T) -> ().

fn bar() -> impl Debug does not treat it this way. The problem is not that there's no isomorphism to a universal quantifier, but that the type of bar is something other than fn() -> (∃ T. T: Debug => T). It is instead ∃ T. T: Debug => fn() -> T.

Why is the quantifier in a different location? Under the existential types interpretation, there's no good answer other than "because we implemented something else, for other reasons." I'm not sure how much more rigorous we can make this---our options are a) keep things as they are, with an inconsistent meaning for impl Trait, b) change the interpretation of impl Trait but not the functionality (this is the inference interpretation), or c) extend the functionality of impl Trait but not the interpretation (this is this proposal).

1 Like

At the risk of derailing the conversation, I'd like to comment on this:

If, by "very rigorous way", you mean formalised, then no, they're not (the posts were supposed to be reasonably precise, but primarily accessible). I do think they're more explicit than any other explanation of the existential semantics of impl Trait, and if you've spotted a flaw in the reasoning, I'd definitely want to address it.

A type constructor is a function from a type to a type. fn foo(impl Debug) is a function from value to values. So I don't understand your point here.

The return type of bar is a single type (impl Debug). The return type of the function never changes — under the existential interpretation, it's always an existential type (it doesn't "take on" the type it's quantifying). The type over which the return type is quantified, though, could be any type in the universe though (assuming it meets the trait condition). The same goes for the input type.

The difference between argument-position and return-position impl Trait, as discussed in the post, is the quantifier position, which entirely describes the difference in functionality.

As you wrote somewhere, fn foo(x: impl Debug) is isomorphic to fn foo<T: Debug>(x: T). When you declare such a function, you are indeed declaring a type constructor T -> fn(T). The fact that you cannot name this constructor in actual Rust is something else. Anyway, my point is that impl Debug in arg position does not depend foo, while impl Debug in return position does depend on bar.

Edit: this is indeed what I was meaning by « not very rigorous », sorry I should have been more clear. Also I’m not a type theorist, but the point that is very unclear to me is when you somehow « map » a logical implication to a function. Apart from sharing the -> symbol, I don’t see how this is done, and I think that could have been explained better.

This is true, but irrelevant. The (in)consistency we're talking about between return-position and argument-position impl Trait is not "does this make the function generic," but "can you view impl Trait as the same existential type." In argument position, you can- it can hold any concrete type. In return position, you cannot- it can only hold one type. (In that case the whole function is existential, or in other words the quantifier scope includes the function, as we've said before.)

(The way you map logical implication to a function is via the Curry-Howard correspondence. But you don't really need to do that to grasp the "existential scope" stuff.)

1 Like

Well, I probably don't have enough knowledge in that field of mathematics to debate on this anyway.

Yes I understand this statement, and this is the feature for which I left a :+1: (this is also why I do want my above example to compile). Thanks to your explanations, I now understand your point with fixing the inconsistency between APIT and RPIT. However I'm really not sure that this would be less confusing than the existing situation and why we would want to do this other than for optimizing speed vs code size.

It's not that I spotted any flaw, but what confused me is that the original RPIT RFC was clearly defined as:

"there exists some type T: Debug such that bar has the type fn() -> T" (*)

So now what you propose is that we replace that with:

"bar has the type fn() -> (∃ T. T: Debug)"

which is indeed different from statement (*), and which would be consistent with impl Trait in arg position. Do I understand it correctly?

If you wanted return-position impl Trait to be consistent with argument-position impl Trait, described in terms of existential types, then you'd want bar to have the type fn() -> (∃ T. T: Debug) — but this would of course change the semantics of impl Trait.

The first post was intended to illustrate this difference between APIT and RPIT, but wasn't intended to be a proposal to change this behaviour. (The technique described in this thread would make this change, so it'd be a (backwards-compatible) extension of the original RPIT RFC.)

2 Likes

The OP proposal is a bit too magical for me and I am not sure if we should do it (though it’s indeed a cool concept), so I would like to comment a bit on enum based approach.

Personally I don’t like enum impl Trait syntax and think that it will make function signatures over-cluttered. I would like to suggest an alternative aproach: why not introduce a macro which will construct anonymous enum and will wrap returns? Code could look like this:

fn foo(flag: bool) -> impl Display {
    if flag {
        enum_impl!(1)
    } else {
        enum_impl!("foo")
    }
}

It will be lowered into:

fn foo(flag: bool) -> Self::AnonEnum {
    enum AnonEnum {
        B1(u32),
        B2(&'static str),
    }
    impl std::fmt::Display for AnonEnum { .. }

    if flag {
        AnonEnum::B1(1)
    } else {
        AnonEnum::B2("foo")
    }
}

This approach is quite explicit and does not clutter signatures though enum_impl! is a bit magical and I am not sure if it will be possible to implement it as not-builtin, but then question is do we need enum_impl! or should it just work.

1 Like

I’d be happy if there was a macro to explicitly make a named enum implement a trait by delegating to its variants.

2 Likes

Ran into this related Cranelift issue: https://github.com/CraneStation/cranelift/issues/553, which references this paper: http://www.ccs.neu.edu/home/shivers/papers/mrlc-jfp.pdf.

This is probably more usefully-applied to returning enums into matches and ?, but combined with the “caller monomorphization” trick it’s basically the idea proposed in this thread. This brings the enum impl Trait concept much closer, because they can share the return control flow optimization.

2 Likes

I made a prototype macro that implements a portion of this for use with futures: https://crates.io/crates/future-union

I could definitely see it being extended to non-futures by having it generate an either type that delegates on whatever impl trait the user specifies.

@djmcgill, have you seen https://crates.io/crates/auto_enums ? I think it fills similar needs but is more general.

1 Like