[lang-team-minutes] Elision 2.0

The good news is that we're also making progress on this front; @pnkfelix has been hard at work laying the foundations for NLL and other region checking improvements, and it's an explicit part of our ergonomics roadmap plan. We don't have to choose between these priorities.

Re: effort of typing code, etc, that's not how the lang team, at least, is thinking about these questions -- and note that part of the proposal here includes asking for a greater degree of explicitness (for things like fn foo(&self) -> Ref<T>, where it's not clear that the return value is extending the lifetime of the argument). Our general thinking about these kinds of questions is laid out here.

Finally, I should say that the thinking here isn't just about syntax per se, but about how to build toward a coherent mental model for borrowing that is easy to learn and applies consistently throughout the language.
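For concreteness, here is a minimal sketch of the fn foo(&self) -> Ref<T> case mentioned above, using std::cell::Ref. The Counter type and both method names are made up for illustration. The first signature is accepted today even though nothing in Ref<i32> hints that the result keeps self borrowed; the second spells the same relationship out.

```rust
use std::cell::{Ref, RefCell};

// Hypothetical type, used only to illustrate the signatures.
struct Counter {
    value: RefCell<i32>,
}

impl Counter {
    // Accepted today: the lifetime inside `Ref` is elided entirely, so the
    // signature does not show that the result borrows from `self`.
    fn get_hidden(&self) -> Ref<i32> {
        self.value.borrow()
    }

    // The same signature with the borrow spelled out.
    fn get_explicit<'a>(&'a self) -> Ref<'a, i32> {
        self.value.borrow()
    }
}

fn main() {
    let c = Counter { value: RefCell::new(7) };
    assert_eq!(*c.get_hidden(), 7);
    assert_eq!(*c.get_explicit(), 7);
}
```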

10 Likes

Allow lifetimes to be elided in the body of a type with just one lifetime parameter.

This is nice to have. Though, I do find the example a bit jarring to read:

struct Foo<'a> {
    t: &i32 // assumed to be `&'a i32`
}

Here, 'a is explicitly declared, yet it is not used anywhere in the body! It would be more consistent to have either 'a appear everywhere or not appear at all.

Allow structs with a single lifetime parameter to use an "anonymous" syntax.

Among these two, I prefer the former. Even though Foo might have lifetimes, it need not contain any non-phantom references at all, so Foo& is a bit misleading.

That being said, is there a reason why the status quo (Foo without any decorations) is not even considered?

We would then deprecate the current elision rules when being applied to a struct with lifetime parameters unless the new syntax is used.

How much impact would this have on existing code? Deprecating stable syntax just to improve readability seems overkill. Maybe a Clippy lint would be more suitable?

I concur with @kornel in that a large part of this could be resolved with Rustdoc. So perhaps the ideas should be test-driven on Rustdoc first? There are a lot of ways the docs could be improved, e.g.

  • Mousing over a type could show a snippet of the type definition.
  • Elided signatures could be expanded using a button.
  • Elided lifetimes could be represented as Foo<'…> in the documentation, but the <'…> would be uncopyable (via e.g. user-select: none or some other trick). The use of the Unicode ellipsis makes it obvious that this is not real code.

Permit referencing the name of a parameter instead of declaring a lifetime.

It may be better to address this by adding an alternate syntax. The idea of having an undeclared lifetime that looks just like any other lifetime except with a magical name introduces yet another inconsistency that users will have to learn (not to mention the backward compatibility concerns). Consider perhaps a different syntax like ''data or '(data)?

All in all, I do think the ideas presented are good. However, the proposed implementation introduces lots of deprecations and inconsistencies and I do not think the benefits are strong enough to justify them.

3 Likes

In general, I think being able to make things more explicit like that is good. I guess my experience is that reading code that uses the current elisions is harder than code that uses explicit lifetimes (IMO the clippy warning is counterproductive here), so I don't look forward to reading code that uses even more elisions.

Yes, actually, I don't like the example either. In fact, later on I said I would prefer to deprecate having "elided" notation that referred to a named lifetime. So probably I would only allow you to elide types in a struct (or enum, etc) if the lifetime is anonymous (struct Foo<'> or whatever).

Yes. Quoting myself (from the section "Prefer for this same "anonymous" syntax to be used in references to a type"):

I do agree that there may be a lot of code affected, but I don't know how much. I was thinking about trying to write a tool to try and gather some numbers from crates.io.

I don't however feel this is a minor thing -- I consider it pretty important to steer people in the right direction when it comes to lifetime conventions; if people are following them consistently, it makes it much easier to read code. This seems (to me) to be at least as important as removing unused mut declarations, and way more important than unused imports, two things that we lint for by default today.

Another possibility might be to provide a tool to automatically transition code to the new format. This would be a non-trivial thing to write, but if we do it right, it could be very useful in the future if we aim to make more changes of this kind -- for example, the same tool might allow us to transition code away from explicit ref bindings.

Thanks for bringing this up! I wanted to mention that as a possibility. That said, I don't think I've seen a syntax I prefer yet. My expectation is that actually giving explicit names to lifetime parameters will become quite rare, so I would want the syntax to be very lightweight; this seems to rule out (for example) ''x. '(x) is a possibility, I suppose, but it still feels like it biases the wrong way.

1 Like

As @aturon said, this is not an "either or" thing. In any case, for reference, this topic of nested method calls has been recently discussed in these two threads (and I agree it's a priority):

Some of the suggestions here I find unconvincing in isolation because they seem to only reduce the number of keystrokes without either simplifying the mental model or making the code clearly more readable. I think we need clearer motivation in terms of the overall mental model we want to implement before spending much more time on the syntax bikeshed. But I'm going to throw out some other thoughts before getting back to that point.

To me the clearest example of this is the single-reference struct case. On its own, I just don't see the benefit here.

struct Foo<'a> {
    t: &'a i32
}
struct Foo<'> {
    t: &i32
}

The second version doesn't prevent the user from having to think about the implications of a type having lifetime parameters/constraints, nor does it make it any more obvious what those implications are. I also can't think of any user errors that could be given a better error message with the latter syntax. It seems like the only advantage to this sugar is (slightly) reducing the number of keystrokes. If this syntax somehow magically scaled to cases with multiple lifetimes I'd probably like it more, but obviously it can't do that. I would probably be against doing this were it not for the argument I bring up later in this post.

In contrast, I am extremely strongly in favor of inferring T: 'a constraints. These are technical details the programmer almost never cares about because it's always either objectively correct or objectively incorrect to add such a constraint. The decision to include or exclude such a constraint never represents a meaningful decision about a type's semantics and public API contract; it only indicates the presence or absence of a compiler error. Being able to omit such constraints also allows the actually meaningful information about what lifetimes exist and who has what lifetime to stand out more.
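To illustrate the kind of constraint in question, here is a small sketch with a hypothetical Holder type. The T: 'a bound is written out explicitly, but it adds nothing a reader could not work out from the field type &'a T, which is why inferring it seems harmless.

```rust
// Hypothetical type for illustration. The `T: 'a` bound says that `T` must
// outlive the borrow `'a`, which the field type `&'a T` already requires.
struct Holder<'a, T: 'a> {
    item: &'a T,
}

impl<'a, T: 'a> Holder<'a, T> {
    // Hands back the stored reference with its full lifetime `'a`.
    fn get(&self) -> &'a T {
        self.item
    }
}

fn main() {
    let n = 42;
    let h = Holder { item: &n };
    assert_eq!(*h.get(), 42);
}
```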

Function signatures are more interesting, because I actually agree there's a problem here, but I disagree with the proposed solution. It is unfortunate that fn foo(x: &i32, ...) -> Foo<the lifetime of x> can only be expressed today by adding an explicit lifetime parameter in three separate places. The examples at the beginning of this thread appear to be proposing we reduce that to two separate places. But if we're going to improve this at all, why leave it at two? Why not just one, i.e. fn foo(x: &i32, ...) -> Foo<'x>? That totally eliminates the distracting redundancy and is more consistent with the cases where Foo<'> works.
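As a point of reference, this is roughly what the "three separate places" version looks like in today's syntax. The Foo struct and the extra _label parameter are assumptions added so the sketch compiles; the second reference parameter is there because with two reference inputs the current elision rules cannot pick a lifetime for the return type.

```rust
// Hypothetical struct holding a borrowed integer.
struct Foo<'a> {
    t: &'a i32,
}

// `'a` appears three times: in the declaration `<'a>`, on the parameter `x`,
// and in the return type `Foo<'a>`.
fn foo<'a>(x: &'a i32, _label: &str) -> Foo<'a> {
    Foo { t: x }
}

fn main() {
    let n = 10;
    let f = foo(&n, "ignored");
    assert_eq!(*f.t, 10);
}
```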

Tangent: Someone suggested Foo<'x> is a little too magical and we should have a different syntax. For totally unrelated reasons, we're going to want typeof in the language someday. Assuming epochs eventually allow us to add new keywords, we could do lifetimeof, which would give us fn foo(x: &i32) -> Foo<lifetimeof(x)>. I have no idea whether I actually want this, but it's a thought I had.

Now for my serious suggestion: I would like to be able to "explain lifetimes" to a Rust user like this:

In function signatures, you usually don't have to worry about lifetimes because Rust will infer them for you.

// x has a lifetime, but there's no reason to mention it
fn foo(x: &i32) { ... }

In structs, you usually only care about distinguishing structs which do not contain any references/lifetimes at all from structs which do contain at least one reference/lifetime.

// The return value of foo() is NOT constrained by the lifetime of x
struct Foo { ... }
fn foo(x: &i32) -> Foo { ... } 
// The return value of bar() IS constrained by the lifetime of x
struct Bar<'> { ... }
fn bar(x: &i32) -> Bar<'> { ... } 

You only need to explicitly name lifetimes when there are multiple references involved and you need a lifetime constraint to apply to some but not all of those references. For example:

// Foo must not outlive x or s
fn foo(x: &i32, s: &str) -> Foo<'> { ... }
// Foo must not outlive s, but it is allowed to outlive x
fn foo(x: &i32, s: &str) -> Foo<'s> { ... }

This usually becomes useful when writing generic code or libraries where your clients might pass arguments of varying types and lifetimes (in particular, some 'static arguments and some non-'static).
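In today's syntax, the "allowed to outlive x" case above might be sketched like this. The Foo struct with a text field and the body of foo are assumptions made for the example; the point is only that the result is tied to s and not to x.

```rust
// Hypothetical struct that borrows a string slice.
struct Foo<'a> {
    text: &'a str,
}

// Only `s` carries the named lifetime, so the result borrows from `s` alone.
// `x` is read during the call but is not borrowed by the return value.
fn foo<'s>(x: &i32, s: &'s str) -> Foo<'s> {
    let len = (*x as usize).min(s.len());
    Foo { text: &s[..len] }
}

fn main() {
    let s = String::from("hello");
    let result;
    {
        let x = 2;
        result = foo(&x, &s);
    } // `x` is gone here, but `result` is still usable.
    assert_eq!(result.text, "he");
}
```

If the return type were tied to both parameters, the use of result after the inner block would be rejected.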

Although I believe struct Foo<'> syntax has little benefit on its own, I also believe not having that syntax would make this sort of explanation more cluttered and less self-evident, and that's probably good enough motivation.

I have very little experience with non-toy Rust code, so I've probably gotten something horribly wrong in this hypothetical future tutorial, but I think our goal should be figuring out what we want this future tutorial to look like (emphasis on tutorial, not exhaustive reference). Or equivalently, what the overall mental model for lifetimes should become (as opposed to whatever it is now). I don't think we're at the point where it makes sense to be debating whether Foo<&> is more intuitive than Foo<'> in isolation from the rest of lifetime syntax/elision.

P.S. Except that I would like to echo the sentiment that Foo& looks way too much like "Foo reference".

18 Likes

I assume the trailing ampersand shouldn't be on this line? It's not something I've seen before, and I've double checked I can't get it to parse.

I've briefly noted my thoughts on closure capture before. In short, it's less about working around limitations and more about it being tricky to reason about what gets moved around - sometimes it's nice to be able to pin down what's going on (as it happens, @nikomatsakis kindly pointed out a trick you can use in the following comment). Anyway, here's a relatively simple example:

    let (x, y, z) = (X::new(), X::new(), X::new());
    dosomething(|| {
        let x = x;
        x.abc();
        {y}.abc();
        z.abc();
    });
    let refs = (&x, &y, &z);

How quickly would a newcomer to Rust be able to identify why two of these fail (especially if they were isolated examples, without the others alongside - note that to the untrained eye, the problem could be inside or outside the closure)? The error message is essentially "value is moved into closure, can't take reference" and it's not obvious how to look past the magic to figure out why.

Maybe I'd just like better error messages (I'm reminded of a cannot move out of captured outer variable in an `FnMut` closure error when I was working with futures that I was scratching my head over, since the error was pointing to a FnOnce closure - as it happens, the FnOnce was inside a FnMut, and Rust (sensibly) refused to pull the variables from outside the FnMut all the way inwards), but getting good error messages for all the strange corner cases is a long term goal and optional explicitness is a reasonable step forward.

(sorry, this was off topic)

1 Like

I think this is a great idea. It's clear, nothing is implicit (at least, not more so than now), it's just eliminating redundancy.

This also makes sense to me, although overloading the backtick to mean lifetimeof as you suggested above seems better to me.

3 Likes

This is actually what @nikomatsakis intended, but I think he made a mistake in the very first example of the feature and also wrote an explicit lifetime.

In any case, that's certainly what we discussed at the lang team meeting, and I personally feel like this particular expansion of elision is pretty much a slam dunk.

5 Likes

So, regarding my first example, I was trying to separate out the proposal into little bits that I introduced one at a time, and hence I wanted to add "elision in struct bodies" as a separate item. However, I think that in general, I would like to have a rule that says "if a lifetime is given an explicit name, you can only refer to it via that name, and not via an anonymous form" (as I mentioned later). This would presumably apply equally to struct Foo<'a> { x: &i32 } and fn foo<'a>(&'a self) -> &i32. The latter is legal today, but I'd prefer to deprecate it.

That said, if I understood @lxrec correctly, they were suggesting rather that we could forgo the existing elision completely, and simply deprecate anonymous forms -- or at least deprecate them in return position (presumably the same would apply to struct declarations, which is the "first example" I think you are referring to). Instead, we would encourage people to use names (fn foo(x: &i32) -> Foo<'x>). I see the appeal of this, there is a measure of added clarity! But it would mean that methods like fn elements(&self) -> &[T] become fn elements(&self) -> &'self [T] and so forth. Personally I find this overrotates on "clarity for beginners" -- that is, I think this is clearer at first read, but becomes tedious over time. The current rules seem to largely hit a sweet spot for me in this respect.
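For reference, here is a sketch of the three spellings under discussion as they are accepted today, on a hypothetical Stack type. All three have the same meaning; the last one is the mixed named/elided form that the previous paragraph suggests deprecating (a compiler or linter may warn about it).

```rust
// Hypothetical container used only to compare the signatures.
struct Stack<T> {
    items: Vec<T>,
}

impl<T> Stack<T> {
    // Elided form: the return value borrows from `self`.
    fn elements(&self) -> &[T] {
        &self.items
    }

    // Fully explicit form of the same signature.
    fn elements_explicit<'a>(&'a self) -> &'a [T] {
        &self.items
    }

    // Mixed form: `'a` is named on `self` but elided in the return type.
    // Legal today, and equivalent to the two signatures above.
    fn elements_mixed<'a>(&'a self) -> &[T] {
        &self.items
    }
}

fn main() {
    let s = Stack { items: vec![1, 2, 3] };
    assert_eq!(s.elements(), &[1, 2, 3]);
    assert_eq!(s.elements_explicit(), s.elements());
    assert_eq!(s.elements_mixed(), s.elements());
}
```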

I am definitely in favor of this, and this is very much on my mind with this proposal. I know that @wycats has long advocated for teaching lifetimes, at least initially, using the idea of "borrows from x" -- i.e., fn foo<'x>(foo: &'x Foo) -> &'x Bar can be thought of as "the return value borrows from the argument foo". This seems to be greatly aided by having a shorthand to refer to parameters.

I think if I were teaching I would probably introduce things in roughly this order:

  • fn foo(&self) -> &'self [T] and fn foo(&self) -> Ref<'self, T>
    • first, show beginners using an explicit name, referring to a parameter.
  • fn foo(&self) -> &[T] and fn foo(&self) -> Ref<', T>
    • then, explain the elision rules in terms of that
    • probably we have to explain the <'a> syntax too so it's at least not startling, but I'd not go into details; basically, a longer form that is intended for use when you have multiple categories of references you want to distinguish within one type
  • time passes
  • fn foo<'a, 'tcx>(&self, tcx: TyCtxt<'a, 'tcx>)
    • later, when explaining multiple lifetime parameters on one type, cover explicit lifetime names
1 Like

Hm, I have fully settled on visual explanations. The problem with lifetimes is that they communicate a local view of outside conditions, which is hard to explain with just text at hand.

Sorry, to put this into context, I usually start with "the caller" side. On that side, visual explanations work well. But eventually you want to teach the callee side, and then you have to explain the annotations (I have tried to find a visual way to present that too, not sure how successful it was -- haven't had time to make into an Into Rust tutorial either :frowning2:)

As a relatively recent Rust user but long-time C++ user, I am fairly confident in saying that a trailing & would permanently impede my ability to quickly comprehend Rust code. Almost any other symbol would be a better choice, since & conflates two really different concepts, and it is also extremely important to distinguish references from values.

I really like lxrec's desired-tutorial. The closer we can get to that, the better.

Also, regarding parameters-as-lifetimes, every parameter has a lifetime anyway, so being able to talk about it without separately naming it seems like a win in all cases, even in mixed cases. Whether it is a raw 'data or something else is less important than that you needn't re-name the-lifetime-of-x if you already can talk about x. Only when you want x to have something other than its default lifetime would you need to use a name for that lifetime.

That's really the spirit of elision, isn't it? If there's one obvious way things should be, don't bother mentioning it.

7 Likes

Interestingly, I have a visualisation that makes no such distinction, but talks about the annotations in terms of the boundary.

Sadly, I usually derive that on a whiteboard and have no picture at hand. I'll make a couple of pictures next week at the next workshop.

2 Likes

I don't think @lxrec was suggesting to forgo the current elision. @lxrec was just suggesting that we use variable names as lifetimes and (usually) avoid explicit lifetime declarations/definitions.

Re-reading their comment, I agree, I guess I got confused. In any case, that was part of the original proposal, and I'm definitely in favor =)

I've never really liked the tick sigil for lifetimes in general (my brain thinks string without closing delimiter!), and even with that in mind, I'm still in favor of using ' over &, both for consistency and to avoid the specter of complicated C++ usage. I think it also makes introducing lifetimes a bit easier by starting out with anonymous <'> syntax (or even Foo' syntax) and then saying, "Hey, now let's look at explicitly naming our lifetimes, it's almost the same!"

I also want to say that parameter lifetime elision feels too confusing as is. Especially the inability to use some names when explicitly naming lifetimes, and being unable to mix named and parameter lifetimes… very, very confusing. What if, instead of overloading lifetime naming, we just added a special "parameter lifetime reference" syntax? fn foo(&self, data: &Foo) -> &'{data} Bar or maybe &'(data) Bar or even &'''data Bar. I think this might be less confusing? It also might (or maybe not, just throwing it out there) be useful in other contexts, like maybe big structs, or between borrowed parameters, not just return borrows? But in all cases, having it be different from a named lifetime syntax should be essential imo.

5 Likes

I'm not sure what you mean by this -- can you give an example?

I think the let x = &x; technique is rather elegant. It doesn't come up much (for me), and when it does it just makes use of the general rules for scoping and Rust moves. I think it's much nicer than the syntactic junk that C++ throws into lambdas to achieve the same thing.

I'm not sure that adding more syntax to move is really going to pay for itself here. But I suppose it depends a lot on what you're working with - I haven't tried to use Rayon seriously, so perhaps it comes up every time you try to use a closure…
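A minimal sketch of the let x = &x; technique mentioned above, with made-up names (run, names, owned): shadowing a variable with a reference just before a move closure means the closure moves only the reference, while other captures are still moved by value.

```rust
// Hypothetical helper that just calls the closure it is given.
fn run<F: FnOnce() -> usize>(f: F) -> usize {
    f()
}

fn main() {
    let names = vec![String::from("a"), String::from("b")];
    let owned = String::from("moved");

    let count = {
        // Shadow `names` with a reference: the `move` closure below then
        // captures the reference, not the vector itself.
        let names = &names;
        // `owned` is not shadowed, so it is moved into the closure.
        run(move || names.len() + owned.len())
    };

    assert_eq!(count, 7);
    // `names` is still usable here; `owned` is not.
    assert_eq!(names.len(), 2);
}
```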

2 Likes