[lang-team-minutes] Elision 2.0

So I just had a conversation with @wycats. We elaborated on “the syntax table”, and we also had the insight that – as a general rule – if you have an &Foo, we can still completely allow eliding any lifetime parameters of Foo, since there is already an indication of borrowing from the outer & (this was the same argument that I used to claim we would not need to ever write &Foo&).

// Not eliding 'nested lifetimes' like `'a` in `&Foo<'a>`
Foo'     Foo'<T>     &Foo'        &Foo'<T>
Foo&     Foo&<T>     &Foo&        &Foo&<T>
Foo<'>   Foo<', T>   &Foo<'>      &Foo<', T>
Foo<&>   Foo<&, T>   &Foo<&>      &Foo<&, T>
Foo<ref> Foo<ref, T> &Foo<ref>    &Foo<ref, T>

// Eliding 'nested lifetimes' like `'a` in `&Foo<'a>`
Foo'     Foo'<T>     &Foo         &Foo<T>
Foo&     Foo&<T>     &Foo         &Foo<T>
Foo<'>   Foo<', T>   &Foo         &Foo<T>
Foo<&>   Foo<&, T>   &Foo         &Foo<T>    // my preference
Foo<ref> Foo<ref, T> &Foo         &Foo<T>

When I look at these syntaxes, particularly the second set, I feel like Foo<&> stands out to me as the best choice:

  • we have the consistency of & as the “elidable lifetime”
    • when first learning, you can probably get quite far this way
  • using the angle brackets, like Foo<&>, strongly reads to me as “struct with references in it”, which is precisely what it is
    • that is, we do not have the &Foo vs Foo& confusion
  • we do not have the “floating single tick” syntax (') that is very surprising and provokes strong reaction in some people
    • we also avoid “doubling down on the tick”

The only case in that table that kind of “looks heavy” to me is Foo<&, T> – but it’s also one of the cases that I personally get confused about today (because you can elide the &). (Well, I sometimes get confused about &Foo<T>, but I think the ongoing work on improved lifetime errors will solve this.) It feels like this comes up rarely to me in practice; it’d be interesting to try and measure it.

I agree that there is a nice continuity between Foo<', T> and Foo<'a, T>, but it feels like it is compensated for by the twin downsides of:

  • single tick alone is very jarring to some people
  • lack of continuity between &T and Foo<&>

Could you elaborate, what is the logic here? I just... don't see any connection. What difference does the outer & make? If Foo<'a> is "good" and Foo is "bad", then why is &Foo good again? What is the purpose of &/' if not "highlighting" types containing lifetimes?

The way I see it, if you have &Foo<&>, but you do not write the "inner" & (i.e., you write &Foo), then you already have a signal that the Foo is borrowed, and you would not expect to be able to (e.g.) take ownership of it, at least not without knowing more details about the type. Moreover, if you see it in return position, then you would already know that the lifetime of self is extended to cover the use of the return value.

In short, I want to ensure that there is a & or 'x somewhere when there are borrowed values floating about, but I'm not sure they have to be everywhere. The case that I think is actively confusing is fn foo(x: Ref<T>) or fn foo(&self) -> Ref<T>, where there is no sign at all that the argument contains borrowed data. If you see fn foo(x: &Ref<T>) or fn foo(&self) -> &Ref<T>, I think that is much better. (But feel free to tell me you disagree.)
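To make the "no sign at all" case concrete, here is a minimal sketch using the standard library's std::cell::Ref. The function names peek_elided and peek_explicit are made up for illustration. Both signatures mean the same thing, but only the second one shows that the result borrows from the argument.

```rust
use std::cell::{Ref, RefCell};

// Elided form: nothing in the return type hints that it borrows from `cell`.
// (Some compiler versions may warn about the hidden lifetime here.)
fn peek_elided(cell: &RefCell<i32>) -> Ref<i32> {
    cell.borrow()
}

// The same signature with the lifetime written out: the guard is tied to `'a`.
fn peek_explicit<'a>(cell: &'a RefCell<i32>) -> Ref<'a, i32> {
    cell.borrow()
}
```

While either return value is alive, the RefCell counts as borrowed, so a call like cell.try_borrow_mut() returns an error until the guard is dropped.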

That said, I am not sure if the same logic applies when there are named lifetimes involved. For example, if you have these signatures, there is more room for surprise:

impl<'a> Foo<'a> {
    fn foo(&self) -> &'a Ref<T> { ... }
    // becomes: fn foo<'b>(&'b self) -> &'a Ref<'b, T> { ... }
    fn bar(&self, x: &'a Ref<T>) { ... }
    // becomes: fn bar<'b>(&self, x: &'a Ref<'b, T>) { ... }
}

This perhaps falls under the general heading of not mixing named and elided lifetimes, although neither is quite the case I had in mind (since the named lifetime is in a different "layer" of the type).

The input lifetime case could be solved with a syntax highlighter that colors structs with references differently, or underlines them or something. A "sufficiently smart IDE" is usually a bad argument, but in this case it's a simple IDE feature that avoids a breaking change to syntax.

EDIT: My wording is incorrect. As Niko points out below, this is not a breaking syntax change, only a syntax deprecation, but the point still stands.

So just to be clear, your preferred solution would also excise the tick from named lifetimes (so Foo<'a, T> would become Foo<&a, T> if you want the lifetime named), right?

To be clear: nobody is proposing a breaking change. All existing code would continue to work, though some patterns might be deprecated (and perhaps automatically transitioned by a rustfix-like tool, but that's a separate thread). I am not sure how "active" this deprecation has to be, i.e., should we actively warn, or just teach it another way?

No, if you use a named lifetime, you would use 'a just as today. The & would just be for the case where you elide things. And this is the one downside, I think: that for &T, you "add" the 'a to get &'a T, but for Foo<&>, you "replace" the & with 'a to get Foo<'a>. Writing Foo<&'a> would be ambiguous in any case.

To be clear, I see the appeal of the ', and there is something nice to "just elide the name", but I don't think it's an open-and-shut case as to which is better. They each have their advantages, as @glaebhoerl pointed out earlier. (Among other things, using ' also has a kind of discontinuity, since you don't write &' T when you elide the lifetime from &T, but you do write Foo<', T>.)

To me, Foo<&> is more implying “parameterized by a reference”, but I already said as much way up there.

Some other downsides:

  • Currently, I see & in a signature and I expect that to be a type.
  • I could see learnability problems wrt. & vs &mut if the former is repurposed.

On the other hand, this makes the rule for having to write <&> context-dependent. That's going to cause confusion on its own, and I am not sure that's worth it here. I feel things are more consistent when adding or removing an & doesn't also magically change which annotations you have to (not) make "below" this type operator.

That said, I find myself somewhat disagreeing with the entire premise of the <&> proposal. I was actually positively amazed when I realized that lifetime elision "just works" in fn foo(&self) -> Ref<T>. I never considered this confusing. After all, it says Ref right there, which is a pretty good indication of a reference, isn't it? Of course, API designers could make this confusing by picking bad names, but that is not something a compiler can fix. So, I would be somewhat sad if this got deprecated. But maybe I just didn't run into this issue yet because I didn't work with enough code that puts lifetimes in structs in interesting ways.

That made me think – is the Foo<&> syntax forward compatible with higher kinded types? Imagine that Foo had the kind lifetime -> type -> type; then Foo<&> could be a valid use of the & type constructor. I'm in favour of Foo<&> meaning elided lifetimes, but I just want to make sure it won't create syntactic ambiguities with other proposed features.

That's fair. I don't know for sure if it is or not. I think what I would probably prefer is that I can type whatever I want, but rustfmt fixes it up for me to the canonical form (right now we often do this sort of thing with lints that "steer" you the right way, and this might be a good place for that too, at least until we get a smarter rustfmt).

You can memorize Ref, but there are many other types with lifetimes. What about MutexGuard? Imagine you don't really know how a mutex works... in that case, there is nothing in this signature that would tell you that invoking lock() is going to make the mutex be considered borrowed for the lifetime of the return value:

fn lock(&self) -> MutexGuard<T>

Add to that things like vec::Iter (borrows) but vec::IntoIter (does not).
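For comparison, here is a small sketch of how those standard library types look once the lifetimes are written out. The wrapper functions borrowing, owning, and lock_explicit are hypothetical names; note that Vec::iter actually returns std::slice::Iter, while Vec::into_iter returns std::vec::IntoIter.

```rust
use std::slice::Iter;
use std::sync::{Mutex, MutexGuard};
use std::vec::IntoIter;

// `Vec::iter` returns `std::slice::Iter<'a, T>`: the lifetime shows the borrow.
fn borrowing<'a>(v: &'a Vec<i32>) -> Iter<'a, i32> {
    v.iter()
}

// `Vec::into_iter` returns `std::vec::IntoIter<T>`: no lifetime, nothing borrowed.
fn owning(v: Vec<i32>) -> IntoIter<i32> {
    v.into_iter()
}

// Spelled out, the guard is visibly tied to the mutex it came from.
fn lock_explicit<'a>(m: &'a Mutex<i32>) -> MutexGuard<'a, i32> {
    m.lock().unwrap()
}
```

With the lifetimes elided, the first and third signatures would look just as "owned" as the second one, which is the confusion being described.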

Then you get into projects like the compiler, which make extensive use of references. When you see a signature like this:

fn cause(&mut self, code: traits::ObligationCauseCode) { ... }

doesn't it seem a bit surprising that you can't (e.g.) return this code out from the function, or make a struct and store it? In contrast:

fn cause(&mut self, code: traits::ObligationCauseCode<&>) { ... }

makes it pretty clear that you won't be able to do that (in that particular case, I admit, it's most common in the compiler to see ObligationCauseCode<'tcx>, but it's still fairly common for me to find signatures that elide any reference to the lifetime because it doesn't happen to matter in that particular spot). This can be particularly confusing because the compiler undergoes regular refactorings, and things that used to have references sometimes change so that they no longer do, or vice versa.
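A rough sketch of that situation, using made-up stand-ins (CauseCode, Collector, inspect) rather than the real compiler types: when the lifetime is named, the argument can be stored, and when it is hidden, the argument gets a fresh lifetime and can only be used locally.

```rust
// Hypothetical stand-ins for the compiler types mentioned above.
struct CauseCode<'tcx> {
    note: &'tcx str,
}

struct Collector<'tcx> {
    codes: Vec<CauseCode<'tcx>>,
}

impl<'tcx> Collector<'tcx> {
    // Naming `'tcx` ties the argument to the collector, so storing it is allowed.
    fn cause(&mut self, code: CauseCode<'tcx>) {
        self.codes.push(code);
    }

    // With the lifetime hidden, the argument gets a fresh lifetime unrelated
    // to `'tcx`; pushing it into `self.codes` here would be rejected.
    fn inspect(&self, code: CauseCode) -> usize {
        code.note.len()
    }
}
```

Nothing in the signature of inspect tells the reader that CauseCode carries a borrow at all, which is where a marker like <&> would help.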

I see. I mean, the compiler will of course tell you instantly, but I can see how it is nice to be alerted of such subtleties already in the "planning" stage, before actually writing any code.

I was picking back up on this topic today, and together with folks on #rust-lang, had a couple of realizations.

First up: the idea of e.g. Ref<&, T> has some issues around impl Trait. The proposal was to use impl& Trait, but the fact that &impl Trait (and hence &impl& Trait) is a thing, with a somewhat subtly different meaning, seems worrisome.

But in digging into this, @mbrubeck raised a really interesting idea.

Part of the last merged RFC on impl Trait talks about the interaction with lifetimes. The key question is how to understand signatures like the following:

fn iter1(&self) -> impl Iterator<Item = i32>
fn iter2(&self) -> impl Iterator<Item = &i32>
fn transform(iter: impl Iterator<Item = u32>) -> impl Iterator<Item = u32>

In particular, each of the return types will actually be some concrete, underlying types. What lifetimes are allowed to appear in that type? Can iter1 produce an iterator that borrows from self? Can transform produce an iterator that mentions lifetimes if its argument did?

The RFC makes a key assumption here, which I want to argue is false:

There should be an explicit marker when a lifetime could be embedded in a return type

This is in the context of the general regret over elided lifetimes in type constructors having no marker that they occur (making it hard to know when borrowing is happening), which has been discussed throughout this thread.

Here's the thing: we can achieve the same goal by using reversed defaults! That is, the general assumption can be that impl Trait allows for borrowing according to the usual elision rules, making impl much like & when scanning for borrowing. When elision isn't allowed, or when you want to override these rules, you can impose lifetime bounds. That is:

// these two are equivalent:
fn iter1(&self) -> impl Iterator<Item = i32>
fn iter1<'a>(&'a self) -> impl Iterator<Item = i32> + 'a

// by analogy to the following:
fn iter1(&self) -> &SomeStruct // the lifetime here is tied to self's

// similarly, these two are equivalent:
fn iter2(&self) -> impl Iterator<Item = &i32>
fn iter2<'a>(&'a self) -> impl Iterator<Item = &'a i32> + 'a
// finally, these two are equivalent:
fn transform(iter: impl Iterator<Item = u32>) -> impl Iterator<Item = u32>
fn transform<'a, T>(iter: T) -> impl Iterator<Item = u32> + 'a
    where T: Iterator<Item = u32> + 'a

In cases where elision rules don't apply, you have to disambiguate:

// This is not allowed:
fn no_elision(x: &Foo, y: &Bar) -> impl SomeTrait

// Instead, write e.g.
fn no_elision<'a>(x: &Foo, y: &'a Bar) -> impl SomeTrait + 'a

The point is that, unlike with custom type constructors, with impl you assume borrowing (based on elision) unless otherwise stated. That still makes borrowing fully apparent from the signature, but with what is likely a better default -- and it eliminates the need for something like impl& to flag that borrowing might be happening.
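For reference, the fully explicit forms can be written out as compilable code. This is only a sketch: the Numbers struct and the function bodies are invented for illustration, and the signatures follow the expanded versions listed above.

```rust
struct Numbers {
    items: Vec<i32>,
}

impl Numbers {
    // The iterator yields owned `i32`s but still borrows `self.items`,
    // which the `+ 'a` bound states explicitly.
    fn iter1<'a>(&'a self) -> impl Iterator<Item = i32> + 'a {
        self.items.iter().map(|x| *x * 2)
    }

    // Both the iterator and its items borrow from `self`.
    fn iter2<'a>(&'a self) -> impl Iterator<Item = &'a i32> + 'a {
        self.items.iter()
    }
}

// No `&` in the argument, but `T` may itself contain borrows; `'a` bounds them.
fn transform<'a, T>(iter: T) -> impl Iterator<Item = u32> + 'a
where
    T: Iterator<Item = u32> + 'a,
{
    iter.map(|x| x + 1)
}
```

Under the reversed default, the short signatures would be read as shorthand for exactly these.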

What do you think?

But how do you "state otherwise"? That is, if the following are equivalent...

...but the return value doesn't borrow from self, is there a way to indicate that the return value doesn't need the "+ 'a" and that it may outlive self?

Just like Box?

fn iter1(&self) -> impl Iterator<Item=i32> + 'static
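A small sketch of what that buys you, with a made-up Counter type: because the returned range is built from a copied i32, nothing borrows self, and the 'static bound lets the caller keep the iterator after the original value is gone.

```rust
struct Counter {
    limit: i32,
}

impl Counter {
    // The range is built from a copied `i32`, so nothing borrows `self`;
    // `+ 'static` records that in the signature.
    fn iter1(&self) -> impl Iterator<Item = i32> + 'static {
        0..self.limit
    }
}
```

A caller can then write let it = c.iter1(); drop(c); and still consume it afterwards.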

Whoops, I forgot about 'static.

I like this! However, this is in conflict with the plans to make trait objects use dyn and not have a keyword for "impl Trait". When/if that ever happens, given a signature like fn foo(&self) -> ReturnType, one has to know whether ReturnType is a trait or a type to know whether it can borrow from self. That doesn't seem any better than knowing whether a type takes a lifetime parameter.
