Lifetime Elision for Associated Types (Unbaked idea)


#1

I was reading this code in one of the 2018 preview discussions (thank you again, @CryZe):

impl<'a> IntoIterator for &'a SegmentHistory {
    type Item = &'a (i32, Time);
    type IntoIter = Iter<'a, (i32, Time)>;

and thinking, hmm, removing the <'a> there doesn’t really make much of a difference.

But then the elision rules jumped into mind. Associated types are kinda like return types from traits-as-functions, so you can think of that code like fn IntoIterator<'a>(self: &'a SegmentHistory) -> (&'a (i32, Time), Iter<'a, (i32, Time)>). But of course you wouldn’t write it like that, since it could fully elide.

That would make the original example into just

impl IntoIterator for &SegmentHistory {
     type Item = &(i32, Time);
     type IntoIter = Iter<'_, (i32, Time)>;

To me, at least, that’s beautiful. It’s way better than just removing the <'a>: all the 'as are gone, though of course the Iter type still has its '_ as a reminder that there’s a lifetime, like it would when used as a return type. And it’s still clear what it should mean, since there’s only one input lifetime available, the same way it’s clear in the function signature case.

Any traps here that I missed? I’d probably start with just the “there’s only one lifetime in the arguments” elision rule. There might prove to be a good equivalent of the “well, take it from self if there’s a self” rule (since there’s a Self), but that can wait.

(If you’re looking at the original code, you’ll be noticing that there’s another 'a I didn’t talk about, but it doesn’t cause a problem because -> Iter<'a, (i32, Time)> can just be replaced by -> Self::IntoIter.)
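For comparison, here is a minimal sketch of what compiles on stable Rust today, next to the function-signature analogy. The definitions of `Time` and `SegmentHistory` are hypothetical stand-ins (the real ones are not shown in this thread), and `into_iterator` is just an illustrative name for the "trait as function" view.

```rust
use std::slice::Iter;

// Hypothetical stand-ins; the real types are not shown in the thread.
#[derive(Debug, PartialEq)]
struct Time(f64);

struct SegmentHistory(Vec<(i32, Time)>);

// Today the lifetime has to be named, because the associated types mention it.
impl<'a> IntoIterator for &'a SegmentHistory {
    type Item = &'a (i32, Time);
    type IntoIter = Iter<'a, (i32, Time)>;

    // Writing the return type as `Self::IntoIter` avoids repeating `'a` here.
    fn into_iter(self) -> Self::IntoIter {
        self.0.iter()
    }
}

// The analogy as an ordinary function: one input lifetime, so both
// "outputs" elide. `Iter<'_, _>` keeps a visible reminder of the borrow.
fn into_iterator(history: &SegmentHistory) -> (Option<&(i32, Time)>, Iter<'_, (i32, Time)>) {
    (history.0.first(), history.0.iter())
}

fn main() {
    let history = SegmentHistory(vec![(1, Time(12.5)), (2, Time(11.0))]);

    let ids: Vec<i32> = (&history).into_iter().map(|(id, _)| *id).collect();
    assert_eq!(ids, vec![1, 2]);

    let (first, rest) = into_iterator(&history);
    assert_eq!(first, Some(&(1, Time(12.5))));
    assert_eq!(rest.count(), 2);
}
```

The proposal amounts to letting the impl block read like the signature of `into_iterator`.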


#2

Interesting idea! I agree it produces beautiful code.

I would write this as:

impl IntoIterator for &'a SegmentHistory {
     type Item = &'a (i32, Time);
     type IntoIter = Iter<'a, (i32, Time)>;
}

or alternatively:

impl IntoIterator for &'_ SegmentHistory {
     type Item = &'_ (i32, Time);  // did you miss the '_ here?
     type IntoIter = Iter<'_, (i32, Time)>;
}

Have you thought about the interactions with GATs? For example, if we say:

impl Foo for &'alpha Bar {
    type Baz<'beta> = &'_ Quux;
}

Does the lifetime '_ refer to 'alpha or 'beta here, and why?
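To make the ambiguity concrete, here is a sketch with both candidates written out explicitly, using GATs as they exist on stable Rust. The trait shape, the `Long`/`Short` names, and the `Bar`/`Quux` definitions are assumptions made up for illustration; only the 'alpha/'beta naming comes from the example above.

```rust
// Hypothetical types; only the lifetime names mirror the example above.
struct Quux(u32);
struct Bar {
    quux: Quux,
}

trait Foo {
    type Long<'beta>
    where
        Self: 'beta;
    type Short<'beta>
    where
        Self: 'beta;

    fn long<'beta>(&'beta self) -> Self::Long<'beta>;
    fn short<'beta>(&'beta self) -> Self::Short<'beta>;
}

impl<'alpha> Foo for &'alpha Bar {
    // Candidate 1: the elided lifetime would mean the impl's 'alpha.
    type Long<'beta> = &'alpha Quux where Self: 'beta;
    // Candidate 2: the elided lifetime would mean the GAT's own 'beta.
    type Short<'beta> = &'beta Quux where Self: 'beta;

    fn long<'beta>(&'beta self) -> Self::Long<'beta> {
        let bar: &'alpha Bar = *self; // copy the inner reference out
        &bar.quux
    }

    fn short<'beta>(&'beta self) -> Self::Short<'beta> {
        &self.quux
    }
}

fn main() {
    let bar = Bar { quux: Quux(7) };

    // `long` is tied to `bar` itself, so it may outlive the local reference `r`.
    let long = {
        let r: &Bar = &bar;
        let q = r.long();
        q
    };
    assert_eq!(long.0, 7);

    // `short` is tied to the borrow of `r`, so it must stay inside `r`'s scope.
    let r: &Bar = &bar;
    let short = r.short();
    assert_eq!(short.0, 7);
}
```

Both readings type-check, which is why a bare '_ in that position would need a rule (or an error) to pick between them.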


#3

I was debating about this when implementing the code to permit elision in impls. I agree it is the analogous and obvious thing – however, I figured we could wait and assess the impact of in-band lifetimes before taking this step.


#4

I would expect an error if there is more than one input lifetime in scope (including 'beta here). In particular, there is nothing quite akin to &self to privilege, I think.
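For reference, this is how the function-signature rules behave with more than one input lifetime today; a sketch with made-up names (`Bar`, `label`, `first_word`). With `&self` present the output lifetime is taken from `self`; without it, elision is rejected (error E0106) and the lifetime has to be named.

```rust
struct Bar {
    name: String,
}

impl Bar {
    // Two input lifetimes, but `&self` is privileged: the elided output
    // lifetime is tied to `self`, not to `prefix`.
    fn label(&self, prefix: &str) -> &str {
        self.name.strip_prefix(prefix).unwrap_or(self.name.as_str())
    }
}

// No `self` and two reference inputs: `fn first_word(text: &str, sep: &str) -> &str`
// would be rejected, so the output lifetime is named explicitly.
fn first_word<'a>(text: &'a str, sep: &str) -> &'a str {
    text.split(sep).next().unwrap_or(text)
}

fn main() {
    let bar = Bar { name: String::from("seg-1") };

    // The result borrows only from `bar`, so it can outlive `prefix`.
    let label = {
        let prefix = String::from("seg-");
        let l = bar.label(&prefix);
        l
    };
    assert_eq!(label, "1");

    assert_eq!(first_word("hello world", " "), "hello");
}
```

An impl header has no counterpart to `&self`, so the `first_word` situation is the closer analogue for associated types.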


#5

PS What is this &'_ business? I think there is no need for '_ there…the & already informs you that a lifetime is present. =)


#6

That seems reasonable; what about this?:

impl Foo for Bar {
    type Baz<'beta> = &'_ Quux;
}

// one could ostensibly write:

impl Foo for Bar {
    type Baz<'_> = Quux<'_, '_>;
}

As you said that you expected an error in the previous case, one can consider this case of GATs independently I think.

EDIT: possibly type Baz<'_> really means “bring '_ into scope”.

Oh right; we introduced '_ for non & types… I should know this, I wrote an edition guide section on it :wink:
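A small sketch of that distinction in ordinary function signatures (the function names are illustrative only): on a reference the '_ is redundant, while on a type like `Iter` it is the only visible hint that something is borrowed.

```rust
use std::slice::Iter;

// `&T` already signals a lifetime; these two signatures mean the same thing.
fn first(v: &[i32]) -> Option<&i32> {
    v.first()
}

fn first_explicit(v: &'_ [i32]) -> Option<&'_ i32> {
    v.first()
}

// For a non-reference type, `'_` marks the otherwise hidden borrow.
fn iter(v: &[i32]) -> Iter<'_, i32> {
    v.iter()
}

fn main() {
    let data = vec![3, 1, 2];
    assert_eq!(first(&data), Some(&3));
    assert_eq!(first_explicit(&data), first(&data));
    assert_eq!(iter(&data).copied().sum::<i32>(), 6);
}
```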


#7

Interesting. I guess I would expect elided/anonymous lifetimes to map to 'beta in that case, yeah. Kind of neat that one could do type Baz<'_> = ... – except that this is expanding the role of '_. Thus far, it is not permitted in a generics listing.

That said, I would like it if you could do:

struct Foo<'_> {
    x: &u32
}

I feel like this pattern of a “struct with one lifetime parameter” comes up a lot for me (e.g., when writing iterators).


#8

That is interesting; I guess this mostly pays off if there are more fields than x or more places that expect lifetimes; otherwise you’ve not elided much and this is already pretty ergonomic:

struct Foo<'a> {
    x: &'a u32
}
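As a sketch of the iterator case on stable Rust today (changing `x` to a slice is an assumption, made only so there is something to iterate over): the name 'a ends up being needed twice more in the impl, once in the header and once in the associated type, which is where the elision idea from the top of the thread would apply.

```rust
// Hypothetical slice-backed variant of `Foo`, so it can act as an iterator.
struct Foo<'a> {
    x: &'a [u32],
}

// The lifetime must be named in the header because `Item` mentions it.
impl<'a> Iterator for Foo<'a> {
    type Item = &'a u32;

    fn next(&mut self) -> Option<Self::Item> {
        // Peel off the first element and keep the rest for the next call.
        let (first, rest) = self.x.split_first()?;
        self.x = rest;
        Some(first)
    }
}

fn main() {
    let data = [1u32, 2, 3];
    let foo = Foo { x: &data };
    let doubled: Vec<u32> = foo.map(|n| n * 2).collect();
    assert_eq!(doubled, vec![2, 4, 6]);
}
```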

#9

Somehow giving a name to that lifetime annoys me quite a bit. It might be because of this pattern that I find: typically, there is one lifetime that kind of corresponds to “the struct itself” – that is, it is just used for “random references the struct needs to hold on to”. This lifetime often has no sensible name – it isn’t the lifetime of some piece of data you are referring to. This is the case for the lifetime on an iterator – it corresponds sort of to the “lifetime of the iteration”.

Then there are sometimes more parameters – these arise when mutability comes into play, and sometimes for other reasons. e.g. in the compiler we have 'tcx that corresponds to the lifetime of a particular arena of memory that persists for the entire compilation (well, sort of).

So maybe I would even want to intermix '_ with named regions, e.g., struct Foo<'_, 'tcx> { .. } (NB, I don’t actually think this is a good idea).

(Indeed, in the very early days of Rust, all structs had a single lifetime parameter, and you didn’t even have to declare it. That turned out to be horribly confusing and terrible. But I do think we were onto something there – maybe there is a way to resurrect this notion that is not as confusing.)


#10

That’s an interesting observation!

Hmm, that syntax doesn’t seem that strange tbh; I read it as: “define the structure Foo with a lifetime name I don’t care about and 'tcx.” and then the “don’t care about” lifetime becomes the lifetime of things where you use '_ or where you write &Bar.

The nice thing about using '_ in the quantification there is that you can control the order it appears in, so you could ostensibly move things around with:

struct Foo<'tcx, '_, 'gcx> { ... }

Otherwise, you could also enforce the rule that you may only quantify '_ if it is the sole lifetime quantified; but perhaps that is too arbitrary a restriction.

This idea of using '_ in the parameter list can also be used for impls:

impl<'tcx, 'gcx, '_> Foo<'tcx, 'gcx, Thing<'_>> {
    type Bar = &Baz; // this is referring to '_ in impl<..>.
    type Quux = Wibble<'_>;
}

#11

That was how I meant it. I was scared off however by the thought of trying to explain to people the many roles of '_. =)

“Well, in a struct definition, it acts like any other lifetime parameter – just an anonymous one. In a fn signature or impl header, it acts like a fresh name. In a return type or associated type value, it identifies one of the input lifetimes.”

Maybe… that’s ok. It’s sort of the DWIM region. =)
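Two of those roles already exist on stable Rust and can be seen side by side in a short sketch (`Wrapper` and `wrap` are made-up names). The struct-definition role is the proposed one, so the struct below still declares a named 'a.

```rust
use std::fmt;

// Struct definition: today the lifetime still has to be declared and named.
struct Wrapper<'a> {
    text: &'a str,
}

// Impl header: `'_` acts as a fresh, anonymous lifetime.
impl fmt::Display for Wrapper<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "[{}]", self.text)
    }
}

// Return type: `'_` identifies the single input lifetime (that of `text`).
fn wrap(text: &str) -> Wrapper<'_> {
    Wrapper { text }
}

fn main() {
    assert_eq!(wrap("hi").to_string(), "[hi]");
}
```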


#12

Oh, this is an important point too – it seems not great if '_ in a struct and '_ in an impl behave so very differently.

(Note that declaring lifetime parameters is in my mind quite gauche at this point, what with in-band lifetimes. :wink:)


#13

Yes; I think that it’s sorta straightforward. It is already contextual based on location; what’s one more context? ^,-


#14

I’m puzzled by that, since in-band lifetimes help basically not at all for associated types. I would think the opposite, and that we should improve elision before doing in-band lifetimes, since in-band can fundamentally only help in cases where they weren’t elided, which ought to be the majority.

See, I think that naming lifetime parameters is what’s gauche, what with '_, but think that when they need names, having a separate declaration is still the right choice :wink:

There was a post on URLO recently that I liked, talking about a style that suggests loop when using break/continue as a heads-up that there’s something different from the normal coming up. In that way, one could think of elision as for/while, and the <'a> as a “warning, nuanced lifetime use upcoming”.

:+1: