[lang-team-minutes] Elision 2.0

nikomatsakis · June 12, 2017, 9:08pm

So I just had a conversation with @wycats. We elaborated on “the syntax table”, and we also had the insight that – as a general rule – if you have an &Foo, we can still completely allow eliding any lifetime parameters of Foo, since there is already an indication of borrowing from the outer & (this was the same argument that I used to claim we would not need to ever write &Foo&).

// Not eliding 'nested lifetimes' like `'a` in `&Foo<'a>`
Foo'     Foo'<T>     &Foo'        &Foo'<T>
Foo&     Foo&<T>     &Foo&        &Foo&<T>
Foo<'>   Foo<', T>   &Foo<'>      &Foo<', T>
Foo<&>   Foo<&, T>   &Foo<&>      &Foo<&, T>
Foo<ref> Foo<ref, T> &Foo<ref>    &Foo<ref, T>

// Eliding 'nested lifetimes' like `'a` in `&Foo<'a>`
Foo'     Foo'<T>     &Foo         &Foo<T>
Foo&     Foo&<T>     &Foo         &Foo<T>
Foo<'>   Foo<', T>   &Foo         &Foo<T>
Foo<&>   Foo<&, T>   &Foo         &Foo<T>    // my preference
Foo<ref> Foo<ref, T> &Foo         &Foo<T>

When I look at these syntaxes, particularly the second set, I feel like Foo<&> stands out to me as the best choice:

- we have the consistency of & as the “elidable lifetime”
  - when first learning, you can probably get quite far this way
- using the angle brackets, like Foo<&>, strongly reads to me as “struct with references in it”, which is precisely what it is
  - that is, we do not have the &Foo vs Foo& confusion
- we do not have the “floating single tick” syntax (') that is very surprising and provokes strong reaction in some people
  - we also avoid “doubling down on the tick”

The only case in that table that kind of “looks heavy” to me is Foo<&, T>, but it’s also one of the cases that I personally get confused about today (because you can elide the &). (Well, I sometimes get confused about &Foo<T>, but I think the ongoing work on improved lifetime errors will solve this.) It feels like this comes up rarely to me in practice; it’d be interesting to try and measure it.

I agree that there is a nice continuity between Foo<', T> and Foo<'a, T>, but it feels like it is compensated for by the twin downsides of:

- single tick alone is very jarring to some people
- lack of continuity between &T and Foo<&>

petrochenkov · June 12, 2017, 11:19pm

Could you elaborate, what is the logic here? I just... don't see any connection. What difference does the outer & make? If Foo<'a> is "good" and Foo is "bad", then why is &Foo good again? What is the purpose of &/' if not "highlighting" types containing lifetimes?

nikomatsakis · June 13, 2017, 12:36am

The way I see it, if you have &Foo<&>, but you do not write the "inner" & (i.e., you write &Foo), then you already have a signal that the Foo is borrowed, and you would not expect to be able to (e.g.) take ownership of it, at least not without knowing more details about the type. Moreover, if you see it in return position, then you would already know that the lifetime of self is extended to cover the use of the return value.

In short, I want to ensure that there is a & or 'x somewhere when there are borrowed values floating about, but I'm not sure they have to be everywhere. The case that I think is actively confusing is fn foo(x: Ref<T>) or fn foo(&self) -> Ref<T>, where there is no sign at all that the argument contains borrowed data. If you see fn foo(x: &Ref<T>) or fn foo(&self) -> &Ref<T>, I think that is much better. (But feel free to tell me you disagree.)
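To make the "no sign at all" case concrete, here is a minimal sketch using std::cell::Ref as a stand-in for the Ref<T> mentioned above. The Counter type and its method names are hypothetical and exist only for illustration. Both methods have the same meaning; only the second spells out the borrow.

```rust
use std::cell::{Ref, RefCell};

// Hypothetical wrapper type, used only for illustration.
struct Counter {
    value: RefCell<i32>,
}

impl Counter {
    // Elided form: nothing in the return type hints that it borrows from `self`.
    fn get(&self) -> Ref<i32> {
        self.value.borrow()
    }

    // Fully explicit form of the same signature.
    fn get_explicit<'a>(&'a self) -> Ref<'a, i32> {
        self.value.borrow()
    }
}

fn main() {
    let counter = Counter { value: RefCell::new(7) };

    let guard = counter.get();
    // While `guard` is alive, the cell is still borrowed through `counter`.
    assert!(counter.value.try_borrow_mut().is_err());
    drop(guard);

    // Once the guard is gone, a mutable borrow succeeds again.
    assert!(counter.value.try_borrow_mut().is_ok());
    assert_eq!(*counter.get_explicit(), 7);
}
```

A reader who only sees the first signature has to know that Ref carries a lifetime parameter in order to predict this behavior.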

That said, I am not sure if same logic applies when there are named lifetimes involved. For example, if you have these signatures, there is more room for surprise:

impl<'a> Foo<'a> {
    fn foo(&self) -> &'a Ref<T> { ... }
    // becomes: fn foo<'b>(&'b self) -> &'a Ref<'b, T> { ... }
    fn bar(&self, x: &'a Ref<T>) { ... }
    // becomes: fn bar<'b>(&self, x: &'a Ref<'b, T>) { ... }
}

This perhaps falls under the general heading of not mixing named and elided lifetimes, although neither is quite the case I had in mind (since the named lifetime is in a different "layer" of the type).

leodasvacas · June 13, 2017, 2:32am

The input lifetime case could be solved with a syntax highlighter that colors structs with references differently, or underlines them or something. A "sufficiently smart IDE" is usually a bad argument, but in this case it's a simple IDE feature that avoids a breaking change to syntax.

EDIT: My wording is incorrect. As Niko points out below, this is not a breaking syntax change, only a syntax deprecation. The point still stands.

KasMA1990 · June 13, 2017, 7:56am

So just to be clear, your preferred solution would also excise the tick from named lifetimes (so Foo<'a, T> would become Foo<&a, T> if you want the lifetime named), right?

nikomatsakis · June 13, 2017, 12:52pm

To be clear: nobody is proposing a breaking change. All existing code would continue to work, though some patterns might be deprecated (and perhaps automatically transitioned by a rustfix-like tool, but that's a separate thread). I am not sure how "active" this deprecation has to be-- i.e., should we actively warn, or just teach it another way?

No, if you use a named lifetime, you would use 'a just as today. The & would just be for the case where you elide things. And this is the one downside, I think: that for &T, you "add" the 'a to get &'a T, but for Foo<&>, you "replace" the & with 'a to get Foo<'a>. Writing Foo<&'a> would be ambiguous in any case.

To be clear, I see the appeal of the ', and there is something nice to "just elide the name", but I don't think it's an open-and-shut case as to which is better. They each have their advantages, as @glaebhoerl pointed out earlier. (Among other things, using ' also has a kind of discontinuity, since you don't write &' T when you elide the lifetime from &T, but you do write Foo<', T>.)

phaylon · June 13, 2017, 1:43pm

To me, Foo<&> is more implying “parameterized by a reference”, but I already said as much way up there.

Some other downsides:

- Currently, I see & in a signature and I expect that to be a type.
- I could see learnability problems wrt. & vs &mut if the former is repurposed.

RalfJung · June 13, 2017, 10:50pm

On the other hand, this makes the rule for having to write <&> context-dependent. That's going to cause confusion on its own, and I am not sure that's worth it here. I feel things are more consistent when adding or removing an & doesn't also magically change which annotations you have to (not) make "below" this type operator.

That said, I find myself somewhat disagreeing with the entire premise of the <&> proposal. I was actually positively amazed when I realized that lifetime elision "just works" in fn foo(&self) -> Ref<T>. I never considered this confusing. After all, it says Ref right there, which is a pretty good indication of a reference, isn't it? Of course, API designers could make this confusing by picking bad names, but that is not something a compiler can fix. So, I would be somewhat sad if this got deprecated. But maybe I just didn't run into this issue yet because I didn't work with enough code that puts lifetimes in structs in interesting ways.

krdln · June 13, 2017, 11:53pm

That made me think – is the Foo<&> syntax forward compatible with higher kinded types? Imagine that if Foo had a kind: lifetime -> type -> type, then the Foo<&> syntax could be valid usage of & type constructor. I'm in favour of the Foo<&> meaning elided lifetimes, but I just want to make sure it won't create syntactic ambiguities with other proposed features.

nikomatsakis · June 14, 2017, 1:21am

That's fair. I don't know for sure if it is or not. I think what I would probably prefer is that I can type whatever I want, but rustfmt fixes it up for me to the canonical form (right now we often do this sort of thing with lints that "steer" you the right way, and this might be a good place for that too, at least until we get a smarter rustfmt).

You can memorize Ref, but there are many other types with lifetimes. What about MutexGuard? Imagine you don't really know how a mutex works... in that case, there is nothing in this signature that would tell you that invoking lock() is going to make the mutex be considered borrowed for the lifetime of the return value:

fn lock(&self) -> MutexGuard<T>

Add to that things like vec::Iter (borrows) but vec::IntoIter (does not).
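As a rough illustration of the Iter versus IntoIter contrast, here is a sketch with hypothetical helper functions (borrowing, owning, make_owned). Note that in today's standard library the borrowing iterator over a Vec is std::slice::Iter. Neither signature below shows a lifetime, yet only one of the two return types holds a borrow.

```rust
use std::slice::Iter;
use std::vec::IntoIter;

// Borrows: the returned iterator is tied to the slice behind `v`,
// although the elided signature does not show it.
fn borrowing(v: &[i32]) -> Iter<i32> {
    v.iter()
}

// Does not borrow: the iterator takes ownership of the vector.
fn owning(v: Vec<i32>) -> IntoIter<i32> {
    v.into_iter()
}

// An owning iterator can leave the scope that created the vector.
// Returning `borrowing(&v)` here would be rejected, since it would outlive `v`.
fn make_owned() -> IntoIter<i32> {
    let v = vec![1, 2, 3];
    owning(v)
}

fn main() {
    let v = vec![1, 2, 3];
    let total: i32 = borrowing(&v).sum();
    assert_eq!(total, 6);
    assert_eq!(make_owned().collect::<Vec<i32>>(), vec![1, 2, 3]);
}
```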

Then you get into projects like the compiler, which make extensive use of references. When you see a signature like this:

fn cause(&mut self, code: traits::ObligationCauseCode) { ... }

doesn't it seem a bit surprising that you can't (e.g.) return this code out from the function, or make a struct and store it? In contrast:

fn cause(&mut self, code: traits::ObligationCauseCode<&>) { ... }

makes it pretty clear that you won't be able to do that (in that particular case, I admit, it's most common in the compiler to see ObligationCauseCode<'tcx>, but it's still fairly common for me to find signatures that elide any reference to the lifetime because it doesn't happen to matter in that particular spot). This can be particularly confusing because the compiler undergoes regular refactorings, and things that used to have references sometimes change so that they no longer do, or vice versa.

RalfJung · June 14, 2017, 5:25pm

I see. I mean, the compiler will of course tell you instantly, but I can see how it is nice to be alerted of such subtleties already in the "planning" stage, before actually writing any code.

aturon · August 9, 2017, 11:23pm

I was picking back up on this topic today, and together with folks on #rust-lang, had a couple of realizations.

First up: the idea of e.g. Ref<&, T> has some issues around impl Trait. The proposal was to use impl& Trait, but the fact that &impl Trait (and hence &impl& Trait) is a thing, with a somewhat subtly different meaning, seems worrisome.

But in digging into this, @mbrubeck raised a really interesting idea.

Part of the last merged RFC on impl Trait talks about the interaction with lifetimes. The key question is how to understand signatures like the following:

fn iter1(&self) -> impl Iterator<Item = i32>
fn iter2(&self) -> impl Iterator<Item = &i32>
fn transform(iter: impl Iterator<Item = u32>) -> impl Iterator<Item = u32>

In particular, each of the return types will actually be some concrete, underlying types. What lifetimes are allowed to appear in that type? Can iter1 produce an iterator that borrows from self? Can transform produce an iterator that mentions lifetimes if its argument did?

The RFC makes a key assumption here, which I want to argue is false:

There should be an explicit marker when a lifetime could be embedded in a return type

This is in the context of the general regret over elided lifetimes in type constructors having no marker that they occur (making it hard to know when borrowing is happening), which has been discussed throughout this thread.

Here's the thing: we can achieve the same goal by using reversed defaults! That is, the general assumption can be that impl Trait allows for borrowing according to the usual elision rules, making impl much like & when scanning for borrowing. When elision isn't allowed, or when you want to override these rules, you can impose lifetime bounds. That is:

// these two are equivalent:
fn iter1(&self) -> impl Iterator<Item = i32>
fn iter1<'a>(&'a self) -> impl Iterator<Item = i32> + 'a

// by analogy to the following:
fn iter1(&self) -> &SomeStruct // the lifetime here is tied to self's

// similarly, these two are equivalent
fn iter2(&self) -> impl Iterator<Item = &i32>
fn iter2<'a>(&'a self) -> impl Iterator<Item = &'a i32> + 'a

// finally, these two are equivalent:
fn transform(iter: impl Iterator<Item = u32>) -> impl Iterator<Item = u32>
fn transform<'a, T>(iter: T) -> impl Iterator<Item = u32> + 'a
    where T: Iterator<Item = u32> + 'a

In cases where elision rules don't apply, you have to disambiguate:

// This is not allowed:
fn no_elision(x: &Foo, y: &Bar) -> impl SomeTrait

// Instead, write e.g.
fn no_elision<'a>(x: &Foo, y: &'a Bar) -> impl SomeTrait + 'a

The point is that, unlike with custom type constructors, with impl you assume borrowing (based on elision) unless otherwise stated. That makes borrowing fully apparent based on the signature, but is likely a better default -- and it eliminates the need for something like impl& to flag that borrowing might be happening.
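For reference, the explicit spellings from the equivalences above can be written out as a small compilable sketch. The Numbers type and the function bodies are hypothetical. Whether the elided spellings compile as-is depends on which lifetimes the compiler lets an impl Trait return type capture by default, so the sketch sticks to the explicit forms.

```rust
// Hypothetical container, used only for illustration.
struct Numbers {
    items: Vec<i32>,
}

impl Numbers {
    // Explicit form of `fn iter1(&self) -> impl Iterator<Item = i32>`
    // under the proposed default: the iterator may borrow from `self`.
    fn iter1<'a>(&'a self) -> impl Iterator<Item = i32> + 'a {
        self.items.iter().copied()
    }

    // Explicit form of `fn iter2(&self) -> impl Iterator<Item = &i32>`.
    fn iter2<'a>(&'a self) -> impl Iterator<Item = &'a i32> + 'a {
        self.items.iter()
    }
}

// Explicit form of `transform`: the result may mention any lifetime
// that the argument type `T` mentions.
fn transform<'a, T>(iter: T) -> impl Iterator<Item = u32> + 'a
where
    T: Iterator<Item = u32> + 'a,
{
    iter.map(|x| x + 1)
}

fn main() {
    let n = Numbers { items: vec![1, 2, 3] };
    assert_eq!(n.iter1().sum::<i32>(), 6);
    assert_eq!(n.iter2().count(), 3);

    let bumped: Vec<u32> = transform(vec![1u32, 2].into_iter()).collect();
    assert_eq!(bumped, vec![2, 3]);
}
```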

What do you think?

edmccard · August 16, 2017, 12:27pm

But how do you "state otherwise"? That is, if the following are equivalent...

...but the return value doesn't borrow from self, is there a way to indicate that the return value doesn't need the "+ 'a" and that it may outlive self?

kennytm · August 16, 2017, 2:57pm

Just like Box?

fn iter1(&self) -> impl Iterator<Item=i32> + 'static
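A minimal sketch of that opt-out, assuming a hypothetical UpTo type and a hypothetical require_static helper that only accepts values satisfying a 'static bound:

```rust
// Hypothetical type, used only for illustration.
struct UpTo {
    limit: i32,
}

impl UpTo {
    // `+ 'static` states that the returned iterator holds no borrow of `self`:
    // the limit is copied out and the range owns it.
    fn iter1(&self) -> impl Iterator<Item = i32> + 'static {
        0..self.limit
    }
}

// Hypothetical helper: accepts only values whose type satisfies `'static`.
fn require_static<T: 'static>(value: T) -> T {
    value
}

fn main() {
    let up_to = UpTo { limit: 3 };
    let it = require_static(up_to.iter1());
    assert_eq!(it.collect::<Vec<i32>>(), vec![0, 1, 2]);
}
```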

edmccard · August 16, 2017, 8:52pm

Whoops, I forgot about 'static.

RalfJung · August 20, 2017, 3:58pm

I like this! However, this is in conflict with the plans to make trait objects use dyn and not have a keyword for "impl Trait". When/if that ever happens, given a signature like fn foo(&self) -> ReturnType, one has to know whether ReturnType is a trait or a type to know whether it can borrow from self. That doesn't seem any better than knowing whether a type takes a lifetime parameter.

nikomatsakis · March 25, 2019, 8:28am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
