[lang-team-minutes] stabilization and future of impl Trait

In the meeting we discussed the general state of impl Trait, with a particular focus on “what are the open questions that would prevent us from stabilizing?”

General status

@eddyb completed the implementation of the “most basic form” of impl Trait. In particular, you can use impl Trait in free function return types. This has already raised a couple of interesting issues that we didn’t uncover in the RFC process itself (covered below).

Looking forward, there are also a number of (relatively) straightforward extensions that we’d like to see:

  • impl Trait in trait method return types
    • this would effectively desugar to an associated type in the trait
    • but this associated type would need to be generic, along the lines of RFC 1598
  • impl Trait in argument position
    • this would be sugar for a generic argument
      • fn foo(x: impl FnMut()) becomes fn foo<T: FnMut()>(x: T)
    • this raises syntactic questions (see below)
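As a minimal sketch of the argument-position sugar, assuming a compiler that accepts impl Trait in argument position, the two spellings below should behave the same. The names call_twice, call_twice_generic, and count_calls are made up for illustration.

```rust
// Sugared form: `impl FnMut()` in argument position.
fn call_twice(mut f: impl FnMut()) {
    f();
    f();
}

// Explicit form: the same function with a named type parameter.
fn call_twice_generic<F: FnMut()>(mut f: F) {
    f();
    f();
}

// Hypothetical helper: runs both forms against one counter.
fn count_calls() -> u32 {
    let mut count = 0;
    call_twice(|| count += 1);
    call_twice_generic(|| count += 1);
    count
}
```

Both callers pass a closure without naming its type; the only difference is whether the function signature names the type parameter.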

And other possible extensions that one could imagine, but where the desirability and semantics are less clear:

  • impl Trait for local variables
  • impl Trait in structs

Question: what should be captured by an impl Trait?

One of the more interesting questions that arose had to do with “capturing”. If you have a generic function, the current design allows the “hidden type” described by the impl Trait to make use of any type variables it wishes. It turns out that this is not always what you want.

Let’s explore through an example. Consider https://is.gd/Cn6iAN. Here we have a function foo() that returns a pair (A, B), though the caller only sees impl SomeTrait:

fn foo<A: Debug, B: Debug>(a: A, b: B) -> impl SomeTrait {
    (a, b)
}

The caller main() can invoke foo like so:

fn main() {
    let mut i = 0;
    let mut j = 0;
    let k = foo(&i, &j);
    // Are these legal?
    // i += 1; 
    // j += 1;
    k.print();
}

You’ll find that if you uncomment i += 1 or j += 1, the program will not compile. This is because k could potentially be using &i and &j, so we have to keep those two variables locked so long as k is in scope. (And, in this case, it is.)

But of course the same is true regardless of the body of foo. So if we update foo to only use the first argument:

fn foo<A: Debug, B: Debug>(a: A, b: B) -> impl SomeTrait {
    (a, ) // look ma, no `b`
}

we will still get the same errors in main (and rightly so; we don’t want to leak internal impl details).
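For anyone who wants to try this without the playground link, here is a self-contained sketch of the same situation. The SomeTrait definition, its print method returning a String, and the demo function are assumptions made for illustration; they are not taken from the linked code.

```rust
use std::fmt::Debug;

// Hypothetical stand-in for the trait used in the example.
trait SomeTrait {
    fn print(&self) -> String;
}

impl<A: Debug, B: Debug> SomeTrait for (A, B) {
    fn print(&self) -> String {
        format!("{:?} {:?}", self.0, self.1)
    }
}

impl<A: Debug> SomeTrait for (A,) {
    fn print(&self) -> String {
        format!("{:?}", self.0)
    }
}

// The hidden type is `(A,)`, but the signature does not reveal that `_b` is unused.
fn foo<A: Debug, B: Debug>(a: A, _b: B) -> impl SomeTrait {
    (a,)
}

fn demo() -> (String, i32) {
    let i = 1;
    let mut j = 2;
    let printed = {
        let k = foo(&i, &j);
        // j += 1; // rejected here: `k` may still hold `&j`
        let out = k.print();
        out
    };
    j += 1; // accepted: `k` has gone out of scope
    (printed, j)
}
```

The mutation of j is only accepted once k is gone, even though the body of foo never touches its second argument.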

But sometimes we will want to take “transient” arguments that are not used in the return type! And this is particularly evident when those arguments are lifetime parameters. Imagine if instead of foo we had something like this (lifetimes made explicit for clarity):

impl SomeType {
    fn iter<'a,'b>(&'a self, config: &'b Config) -> impl Iterator<Item=Blah> { ... }
}

Now if you call x.iter(&config), this means that config will be locked during the entire iteration. This is because we assume that the return type may “capture” 'b.

Note the analogy to lifetime elision. If you have the same function signature, but you have a reference in the result:

impl SomeType {
    fn iter(&self, config: &Config) -> &SomeConcreteIteratorType { ... }
}

Here lifetime elision would expand the return type to use the same lifetime as &self. This is because experimentally we found that this is what you want most of the time. Note that we do not expand to the same lifetime in all three positions:

impl SomeType {
    fn iter<'a>(&'a self, config: &'a Config) -> &'a SomeConcreteIteratorType { ... }
}

If we did so, then this would be more analogous to our impl Trait behavior – i.e., this would mean that, by default, config is locked as long as the result is used, just as is the case with impl Trait.
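A small sketch of the elision rule described above, under the assumption of made-up types: Config, the items field, pick, pick_explicit, and demo are all hypothetical. It shows that the elided signature ties the result to &self only, so the config borrow ends as soon as the call returns.

```rust
// Hypothetical types for illustration.
struct Config {
    index: usize,
}

struct SomeType {
    items: Vec<u32>,
}

impl SomeType {
    // Elided form: the result borrows from `self` only.
    fn pick(&self, config: &Config) -> &u32 {
        &self.items[config.index]
    }

    // The same signature with the elided lifetimes written out.
    fn pick_explicit<'a, 'b>(&'a self, config: &'b Config) -> &'a u32 {
        &self.items[config.index]
    }
}

fn demo() -> u32 {
    let s = SomeType { items: vec![10, 20, 30] };
    let picked = {
        let config = Config { index: 1 };
        let r = s.pick(&config);
        r
    }; // `config` is dropped here, yet `picked` stays valid
    *picked
}
```

If the result were instead tied to a single lifetime shared by all three positions, the inner block in demo would not compile, because config would have to outlive picked.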

So, the first thing we noted is that whatever we do, we will probably need some form of explicit syntax (just as with named lifetime parameters). No default will be right 100% of the time. So we propose the syntax impl<'a,B> Trait where the 'a and B refer to lifetime or type parameters in scope that may be captured. If you don’t have any <> (e.g., impl Trait), then this applies the default (yet to be determined). If you have an empty <> (e.g., impl<> Trait), this implies no capturing at all. The fact that an empty <> has a different meaning than no <> gives some mild discomfort.

Next point is that we need to gather up data from what really happens in practice to determine the best default. It may be that the current behavior is best, but maybe not – or maybe we want a default that is different for type vs lifetime parameters. Thoughts?

Question: how to extend to argument position?

Most everyone in the lang team would like to see impl Trait usable in argument position as a shorthand for declaring a type parameter. This is for several reasons. First, it’s a way to lighten notation significantly. Second, it may allow one to teach traits earlier without going into the details of explicit parameters and parametric polymorphism. However, it does raise some questions. Because impl Trait would be expanded differently in argument position, is it proper to use one keyword?

Many have argued for a distinction where the current impl Trait (in return position) would be some Trait, indicating that a specific type is being returned, even if it is not explicitly named, whereas the impl Trait in argument position would be any Trait. For example @withoutboats is a proponent.

On the other hand, it’s unclear how important this distinction will be in practice, and it could be confusing. An alternative would be to have impl Trait that is a contextual shorthand, and then have explicit named parameter notation for both return position (which we lack today) and argument position (which we have). I prefer this, at least at present. =) This version does open the door that we might someday change the meaning of Trait, though that opens up another can of worms best left out of this thread.

It’s worth noting that there are cases where one might want some Trait even in argument position (and if we only had impl Trait, we would not handle said cases correctly). An example would be fn foo(x: any FnMut(some Display)), which is saying “give me a closure that is prepared to handle a type T: Display but I’m not telling you what T is”. @wycats has described this using the keyword “my”, which I think gives a better intuition: fn foo(x: any FnMut(my Display)), meaning "foo gets to decide the type of Display value you are given".
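Closures cannot be generic over their argument type, so the any FnMut(some Display) case cannot be written directly with closures. As a rough sketch of the same idea, assuming a hand-written trait with a generic method in place of the closure, the callee below chooses the Display types and the caller's handler has to accept all of them. DisplayHandler, Collect, and demo are hypothetical names.

```rust
use std::fmt::Display;

// Hypothetical trait standing in for "a closure that accepts any `T: Display`".
trait DisplayHandler {
    fn handle<T: Display>(&mut self, value: T);
}

// Hypothetical handler that records what it was given.
struct Collect {
    seen: Vec<String>,
}

impl DisplayHandler for Collect {
    fn handle<T: Display>(&mut self, value: T) {
        self.seen.push(value.to_string());
    }
}

// `foo` picks the concrete `Display` types; the handler must cope with each one.
fn foo(x: &mut impl DisplayHandler) {
    x.handle(42);
    x.handle("forty-two");
}

fn demo() -> Vec<String> {
    let mut c = Collect { seen: Vec::new() };
    foo(&mut c);
    c.seen
}
```

Here the outer impl plays the role of any (the caller picks the handler) while the generic method plays the role of some or my (foo picks the value type).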

Clearly, before we stabilize impl Trait notation, we should settle on whether we will want to use the keyword some (or, no pun intended, some other alternative) instead.

For the record, I don’t like the keyword some in particular because I think it is begging for confusion with the Option variant Some, also a new concept to many early Rust users.

What should block stabilization?

It seems clear that we have to settle on a keyword and capture semantics before we could stabilize, since any changes there would break code. But what about implementation status? How “complete” does the story have to be?

For example, @aturon felt like he would want at least argument position working before stabilizing, but that we didn’t have to have things working in traits. Otherwise the feature “feels too incomplete”. @nrc felt like he wants impl Trait yesterday and would rather stabilize bits and pieces as they come into being. This was the area where the meeting ended without a clear consensus.

Focus time!

So the key questions are:

  • What do we need to fully settle before stabilizing
    • default capture semantics
    • keyword we want
    • anything else?
  • What do we need to implement before stabilizing
    • obviously capture semantics must be right
    • in particular, do we need argument position?
    • trait position?
    • anything else?

To reduce discomfort, why not default impl Trait to bind no type params as impl<> Trait as long as we can while the feature is still unstable?

Why don’t we use lifetime elision for this - impl Trait unannotated lives for the LUB of the input lifetimes, while impl Trait + 'a lives for 'a? Trait objects work like that.

The final paragraph is why I am in favor of two keywords instead of one (personally I'm coming around to any/my for the reason you mention elsewhere in your post). Basically, once we've advanced one rank (or whatever), I don't think either semantics is predominantly obvious.

Over the long arc, and recognizing the huge logistical challenges of a breaking change, I think moving trait objects to virtual Trait & providing elisions of any/my in the most obvious cases while still enabling expressing them explicitly seems most desirable to me.

EDIT

Discourse recommended that I not make a third reply in a row, so this is an edit.

I realized that I did exactly the wrong thing and got down into the weeds instead of responding to the real question:

I think we need to work through all the places we'd want this shorthand to figure out if the syntax makes sense, but it's not so important to have actually implemented them as long as we're compatible with making the extensions.


I also like this syntax, but it doesn’t work as well for specifying type parameters.

What if I call foo(Default::default())? Will I get an error message because it couldn't infer template parameter _? Isn't that too confusing for beginners?
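To make the inference question concrete, here is a small sketch, assuming foo takes an impl Display argument (the signature and the demo helper are my guess for illustration, not the code being replied to):

```rust
use std::fmt::Display;

// Hypothetical signature for the `foo` under discussion.
fn foo(x: impl Display) -> String {
    x.to_string()
}

fn demo() -> String {
    // foo(Default::default()); // rejected: the compiler cannot infer the argument type
    foo(u32::default()) // accepted: the type is named at the call site
}
```

The commented-out call has nothing to pin down the anonymous type parameter, which is the situation the question is about.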

What about this code? How can I reproduce it if I change bar to have no template parameter? Will impl Trait in argument position create two kinds of functions? Ones that can be used with foo and ones that can't?

Also the fact that it creates yet another way to be generic. We already have two different syntaxes (fn foo<T: Trait>() and fn foo<T>() where T: Trait) that are redundant. Being friendly for beginners-not-coming-from-C++ seems to be the focus over language purity, but in this situation it's both a wart in language purity and confusing for beginners.

Same kind of problems here. What if I try Box<Foo>? Will I get an error because I didn't specify the type of the "virtual" associated type? Or will Foo automatically not be box-able (even though it could)?

I think that these additions will most likely create more problems than they solve, which means that these should definitely not be features to implement right before stabilization. impl Trait in return position is still not entirely solved, as there's the problem of lifetime elision and the poisoning problem that I describe here.


Instead of capturing parameters, couldn’t we somehow extract their lifetimes? That is:

fn foo<'a, B ~ 'a>(b: B) -> impl Trait + 'a { ... }

Instead of:

fn foo<B>(b: B) -> impl<B> Trait { ... }

I don’t really know what ~ would be. Maybe 'a: B (outlives)?

Is there some other reason to capture parameters?


Does this path also allow explicit associated types that are impl Trait?

One of the suggestions when I tried to add IntoIterator for arrays was that it might be feasible once we have impl Trait, since we could hide the actual iterator type. (That type was undesirable to make public because it revealed flaws in not having integer generics.) So in that case, I would need to write that IntoIterator::IntoIter in an abstract way.

impl<T> IntoIterator for [T; $N] {
    type Item = T;
    type IntoIter = impl Iterator<Item=T>; // like this?
    fn into_iter(self) -> Self::IntoIter // referencing the abstract type
    { ... }
}

Having considered this a bit more, it now gives me quite significant discomfort:

  • it is a pain for macro authors
  • it is a rule without precedent
  • it is a subtle and somewhat implicit distinction

I'm not saying we definitely should not take this approach, but I'm much colder than I was at the meeting.


I think we need to work through all the places we'd want this shorthand to figure out if the syntax makes sense, but it's not so important to have actually implemented them as long as we're compatible with making the extensions.

I basically agree with this. In particular it seems clear we must have a fairly solid plan for argument types before we can stabilise, though I continue to think we don't need to implement.

I think we do have to have implemented the capture stuff in order to ensure we have the right defaults.

I don't think we need to block on trait methods, but question - is there some suggestion that the design there would somehow affect the impl Trait on bare functions design? It doesn't seem like it would to me, but perhaps there are ways. Seems important to be sure.

Also the fact that it creates yet another way to be generic. We already have two different syntaxes (fn foo<T: Trait>() and fn foo<T>() where T: Trait) that are redundant. Being friendly for beginners-not-coming-from-C++ seems to be the focus over language purity, but in this situation it's both a wart in language purity and confusing for beginners.

This argument and the some/any issue are making me less and less happy about argument position.

I'd love to see a large-ish code example with argument-position impl Trait to get a better feel for what it looks like. As a person pretty comfortable with generics, I keep going back and forth on how useful this will be.

I really don't like the some/any distinction; I fear it is way too subtle a distinction for most programmers to understand. I think if anything I'd like to see impl Trait mean any for top-level args, and some for nested args - if you want any for nested args you can always use explicit generics, but not vice-versa.

Are there examples of use cases for nested any?

I expect that it is minimizing one kind of discomfort (impl Trait being distinct from impl<> Trait) while increasing another -- in particular, I think that it will be common to want to use data involving at least type parameters and the self reference.

Ah, this is actually a third possible extension. It raises some interesting questions (to me) about the “scope” of impl Foo-like types. Here it seems clear that two items (IntoIter and into_iter()) both know what type is being hidden here. This seems ok, we just want to tread a bit carefully in this area – there may be some interactions with specialization and the ability to refine individual items, as well.

I'm not clear on what you mean by "nested" here, can you give an example?

I don't quite follow. It seems to me that impl (Trait+'a) (using parens to clarify precedence) already has a meaning: it requires that the hidden type outlives 'a. (Note that this may in turn influence inference; clearing up the inference impl issue is actually another thing I should have brought up in the lifetime meeting.) However, it seems orthogonal to the problem at hand. In particular, it doesn't provide a way to talk about the type parameters that get captured (or not captured, as the case may be.)


One thing is that parameters don't, in general, have a single bounding lifetime. (Something like Foo<'a, 'b>, for example, has two bounding lifetimes, at least in some cases; similarly Trait + 'a + 'b.) You could imagine extracting a set, or introducing a notion of a lifetime 'x that is something like intersect('a, 'b), maybe. I guess the question comes down to what we can explain best.

Another possible interaction in the future is with variance inference. In this case, it is important to know that e.g. A is captured.

In general I think we should try to pursue the notion of expressing impl Trait in terms of a desugaring to specialization, as I elaborated earlier. It seems to be a (nearly) perfect fit, and I don't think that this is an accident (the only sticking point is that default types do not currently work the same way w/r/t auto traits as impl Trait, which I think we should change).

So in this case something like:

pub fn foo<'a,'b>(x: &'a [u32], y: &'b [u32]) -> impl Iterator<Item=u32> {...}

desugars to a sort of synthetic trait:

trait FooReturn<'a,'b> {
    type Type: Iterator<Item=u32>;
}

impl<'a,'b> FooReturn<'a,'b> for () {
    default type Type = XXX; // here XXX represents the hidden type; note the `default`
}

pub fn foo<'a,'b>(x: &'a [u32], y: &'b [u32]) -> <() as FooReturn<'a, 'b>>::Type { ... }

The main problem here is that impl Trait introduces a kind of "link" between foo and the impl that we can't express. But in any case this suggests that impl Trait can be thought of as a projection like <() as FooReturn<'a, 'b>>::Type. In this case, the impl<'a, 'b> notation is basically defining the type parameters that would appear on the trait.

If we do pursue this, then a lot of thorny questions just "fall out" from how we treat projections (notably variance, outlives relations, etc). For the time being, it is likely that this analogy will remain in our heads as a guiding principle, but I would love it if ultimately we could use it to simplify the compiler internally as well.
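To see the projection shape without specialization, here is a sketch that should compile on a stable compiler. It drops the default keyword and picks a concrete hidden type (a chain of two copied slice iterators), both of which are my assumptions for illustration. Without default the projection can be normalized, so callers can see the concrete type and nothing is actually hidden; only the shape of the desugaring is shown.

```rust
use std::iter::{Chain, Copied};
use std::slice::Iter;

// Synthetic trait whose parameters are the lifetimes the return type may use.
pub trait FooReturn<'a, 'b> {
    type Type: Iterator<Item = u32>;
}

impl<'a, 'b> FooReturn<'a, 'b> for () {
    // No `default` here, so this type is visible to callers rather than hidden.
    type Type = Chain<Copied<Iter<'a, u32>>, Copied<Iter<'b, u32>>>;
}

// The return type is written as a projection instead of `impl Iterator<Item = u32>`.
pub fn foo<'a, 'b>(x: &'a [u32], y: &'b [u32]) -> <() as FooReturn<'a, 'b>>::Type {
    x.iter().copied().chain(y.iter().copied())
}
```

Dropping 'b from the trait parameters would make the impl unable to mention y's lifetime at all, which is the role the impl<'a, 'b> notation plays in the analogy.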

Isn’t this just an artifact of the fact that apart from impl Trait, projections of default types are the only other way we currently have to express module-level existential types? This is why I was trying to advocate for a long time for having actual abstract type declarations, which would express the same thing in a more direct & less shoehorned way (no need for the dummy () type to serve as Self, and so on).

That example would desugar to:

abstract type FooReturn<'a, 'b>: Iterator<Item=u32> = XXX; // here XXX represents the hidden type

pub fn foo<'a,'b>(x: &'a [u32], y: &'b [u32]) -> FooReturn<'a, 'b> { ... }

Yes. But if we were to add those, I would probably think of them as desugaring into projections of defaulted types too. =)

It's kind of off-topic, but I still feel quite good about not resolving default even if we theoretically could. It seems to have obvious "semver" benefits -- it means you can add specializations later without breaking people's code. It's not obvious to me that abstract is orthogonal here -- that is, if I have some existing impl for some set of types X, and I reserve the right to specialize it (and hence change associated types) for some subset of X -- that is basically the same as reserving the right to change it for all of X (which is what abstract as distinct from default would be).

At least in my use case, the type itself is easily nameable (not involving closures or the like), just not something we want to publicize. So maybe that could be written something like:

    type IntoIter = IntoIter<T, [T; $N]> as impl Iterator<Item=T>; 

… where the real type is still explicitly stated, but hidden from users. Maybe that’s yet another extension.

Presumably there will eventually be an expanded form for existentials that could express this, right?

I think a good comparison here is covariance/contravariance. In languages with both higher order functions and subtyping, this can be quite a confusing problem. For us it's sort of worse - we can't tell whether a parameter in a trait bound is universal or existential. I don't know if being implicit makes this easier or harder. :-\

An example would be Into.

fn foo(arg: Into<ToString>) -> String {
    arg.into().to_string()
}
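For comparison, here is one way to spell the universal reading of that signature with explicit generics, as a sketch only. The names foo_universal and demo and the parameter order are my own choices. In this reading the caller picks the intermediate type, and it has to be written out because nothing in the argument determines it.

```rust
// Universal reading: the caller picks both `T` and the intermediate type `U`.
fn foo_universal<U: ToString, T: Into<U>>(arg: T) -> String {
    let value: U = arg.into();
    value.to_string()
}

fn demo() -> String {
    // `U` cannot be inferred from the argument alone, so it is spelled out.
    foo_universal::<String, _>("hello")
}
```

The existential reading, where foo itself picks the intermediate type, has no equally direct spelling with explicit generics, which is part of why the shorthand is ambiguous.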