[lang-team-minutes] Elision 2.0

I don't mind it so much either, but:

  • it's not obvious.
  • it's annoying if you have a lot of variables.

Another similarly non-obvious trick is the following, for when you wish to capture a clone of y (and not y itself):

let x = {
    let y = y.clone();
    move || use(y)
};

Maybe just documenting these things as expected patterns would suffice.
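
As a concrete illustration of the clone-then-move pattern above, here is a minimal sketch that compiles on current Rust. The helper name `total_len_in_thread` and the sample data are made up for the example; only the inner block mirrors the snippet from the post.

```rust
use std::thread;

// Hypothetical helper: spawns a thread that owns a clone of `names`
// and returns the total length of the strings it saw.
fn total_len_in_thread(names: &Vec<String>) -> usize {
    let handle = {
        // Shadow `names` with an owned clone. Only the clone is moved
        // into the closure, so the caller's vector stays usable.
        let names = names.clone();
        thread::spawn(move || names.iter().map(|s| s.len()).sum::<usize>())
    };
    handle.join().expect("worker thread panicked")
}

fn main() {
    let names = vec![String::from("foo"), String::from("quux")];
    let total = total_len_in_thread(&names);
    // `names` is still owned here because the closure captured a clone.
    println!("{} names, {} bytes", names.len(), total);
}
```

The shadowing keeps the clone's name local to the block, so nothing outside it needs a new identifier per captured variable.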


I later realized that a semi-conscious goal with the “hypothetical tutorial” in my previous post was avoiding lifetime parameters until absolutely necessary, because lifetime parameters are kind of weird. I think it’s worth expanding on that issue.

How exactly are lifetime parameters “weird”?

  1. We can receive lifetime parameters, but we can’t pass them, unlike all other kinds of parameters.
struct S<X: Debug> { // receiving a type parameter (presumably const parameters will be similar)
    x: X
}
fn main() {
    let s: S<i32>; // passing a type parameter (presumably const parameters will be similar)
    ...
}

struct R<'a> { // receiving a lifetime parameter
    r: &'a i32
}
fn main() {
    let x = 42i32;
    let r: R<???>; // no way to pass "the lifetime of x" to R here
    ...
}
  2. Lifetime parameters aren’t really lifetimes.

Most users either assume or are taught that a lifetime parameter is “the lifetime” of a certain variable, block, scope, borrow, or whatever from the call site. Usually that’s good enough, but in some cases (as in the “Opposite of &'static” thread) that quickly breaks down and you have to start thinking of lifetime parameters as a set of borrow checker constraints.

This is obviously closely related to #1, since part of the reason you can’t “pass” a lifetime parameter is that we don’t have a syntax for writing “borrow checker constraints” in Rust source code.

As far as I know, type parameters are types and const parameters will be const values.
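
To make the “constraints, not lifetimes” reading concrete, here is a small sketch in current Rust. The function `longest` and the variable names are illustrative assumptions, not something taken from the thread. The caller never names `'a`; the compiler only has to find some region in which both borrows are valid.

```rust
// `'a` is not the scope of `x` or of `y`. It stands for any region
// during which both borrows are valid, chosen by the compiler.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() >= y.len() { x } else { y }
}

fn main() {
    let outer = String::from("long-lived");
    let result_len;
    {
        let inner = String::from("short");
        // The region picked for `'a` has to fit inside the scopes of
        // both `outer` and `inner`, so `result` cannot leave this block.
        let result = longest(&outer, &inner);
        result_len = result.len();
    }
    println!("{}", result_len);
}
```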

“Direct” Solutions / Strawman Syntax Ideas

A lot of the elision ideas already discussed attack this problem by simply making lifetime parameters not appear in the source code at all, which does help a lot, but since we want a comprehensive rethink and not just an incremental improvement we should talk about more “direct” solutions as well.

I think direct solutions would generally fall into one of two categories. Warning: lots of strawman syntax here.

  1. Make lifetime parameters act like parameters: You can pass them and they are the lifetime of something.
struct R<'a> {
    ref: &'a i32
}
fn main() {
    let x = 1;
    {
        let y = 2;

        let r1 = R<lifetimeof y>{ ref: &x }; // OK, x is alive at least as long as y is
        let r2 = R<lifetimeof x>{ ref: &y }; // ERROR, y does not live as long as x does
    }
}
  2. Replace “lifetime parameters” with some non-parameter syntax(es?).
fn foo(a: &str, b: &str) -> &'a str {
// where 'a is the constraint "has the same lifetime as a"
    ...
}

#[lifetime_constraint(a, same_as(b), different_from(c))]
struct Foo<'> {
    a: &i32,
    b: &i32,
    c: &str,
}

This overlaps a lot with making lifetime parameters “not appear in the source code”, but I think it’s important for the mental model that we decide whether it should be conceptually correct to view this sort of change as “hiding lifetime parameters” or “expressing lifetime constraints”. We all seem to support being able to return &'arg1 str, but is that 'arg1 a “constraint” in and of itself or is it merely sugar for fn foo<'a>(arg1: &'a str) -> &'a str ?
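
For reference, the explicit form that the question treats as the possible desugaring already compiles today. The sketch below is only an illustration: the function name `first_word` and the second parameter are assumptions added to show that the result is tied to `arg1` alone.

```rust
// Explicit form that a hypothetical `-> &'arg1 str` could expand to.
// `_arg2` gets its own, unrelated elided lifetime.
fn first_word<'a>(arg1: &'a str, _arg2: &str) -> &'a str {
    arg1.split_whitespace().next().unwrap_or("")
}

fn main() {
    let text = String::from("hello world");
    let word;
    {
        let scratch = String::from("temporary");
        // The result borrows only from `text`, so it may outlive `scratch`.
        word = first_word(&text, &scratch);
    }
    println!("{}", word);
}
```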

Conclusion

My current feeling is that we should lean towards making “lifetime parameters” act like lifetime parameters, and treat all “non-parameter syntaxes” as syntactic sugar that can always be expanded to use explicit lifetimes at both the “call site” and the “receiver”.

The reason I lean in that direction is that “explicitly passing lifetimes” would be a huge win for teaching Rust, and to a lesser extent for debugging lifetime issues and helping tricky lifetime code be more self-documenting.


P.S. Regarding closures, I do think “just documenting” the current tricks is sufficient, and capture lists wouldn’t really pull their weight (unlike in C++).

Regarding explicit lifetime parameters and teachability, I remember the plan was to be able to specify lifetimes as labels, so your example would look something like:

struct R<'foo> {
    ref: &'foo i32
}

fn main() {
    a: let x = 1;
    {
        b: let y = 2;
        let r1 = R<'b>{ ref: &x};
        let r2 = R<'a>{ref: &y};
    }
}

Regarding closures, I second just documenting the patterns.

You can pass lifetimes (as you do in your struct R, which passes one to &i32), you can even pass a lifetime in main - but only the one concrete lifetime that we allow you to name right now: 'static. The problem you're describing is that we don't allow you to construct concrete lifetimes except for 'static, so in a function like main that's the only lifetime you can name. As @yigal100 showed, we could possibly do something with labels on expressions.
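
A short sketch of that point in current Rust: `'static` can be written out at the use site, while the lifetime of a borrow of a local can only be left to inference. The `ANSWER` static and the `make_static` helper are assumptions for the example, and the `'_` placeholder postdates this discussion.

```rust
struct R<'a> {
    r: &'a i32,
}

// A `static` item lives for the whole program, so borrowing it
// yields a `&'static i32`.
static ANSWER: i32 = 42;

fn make_static() -> R<'static> {
    R { r: &ANSWER }
}

fn main() {
    // Passing the one nameable concrete lifetime explicitly.
    let r: R<'static> = make_static();

    let x = 7i32;
    // For a local there is no name to write; `'_` asks the compiler
    // to infer a suitable lifetime.
    let local: R<'_> = R { r: &x };

    println!("{} {}", r.r, local.r);
}
```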

I think this lack is clearly confusing, since even many advanced users have difficulty articulating exactly how a lifetime parameter relates to a concrete lifetime. We should definitely add a syntax like this as a teaching tool. But users won't use it most of the time, whereas elisions are intended to be used frequently.

One angle I would like to use when designing the syntax for lifetime parameter elision: Leaving out the lifetime of &'a T looks like &T. The & indicates the presence of a lifetime, but the entire 'a is what gets elided.

From this perspective, I’m not sure which of the proposed shorthands makes the most sense. T<'> introduces a new variant on what characters to leave out, but T<&> adds a new character that isn’t even there in the un-elided case. T'/T& are rather unfortunate in that they widen the difference between &'a T->&T and T<'a>->T&.

The closest would be T<> but that doesn’t help the case where there are already type parameters (and doesn’t really imply “reference” either). Maybe T<'_> or even T<'..>? Maybe that’s even worse.


One thing I think was a slam dunk for fns that I haven’t seen for structs is “Permit referencing the name of a parameter instead of declaring a lifetime”.

// Should be legal
struct Foo<'a, 'data, 'z> {
    a: &i32,
    data: &i32,
    z: &str,
}

Thoughts?


AFAIK named lifetimes are really necessary (i.e. can’t be easily inferred by the compiler) only when multiple fields in a struct share the same lifetime.

I am strongly in favor of adding the ability to "declare" lifetimes. However, I think that the lifetimeof keyword is not what you want. In particular, that's not the model that the compiler has internally.

For example here:

fn foo() {
    let x: i32 = 1;       // --- scope of x ------+
    let y: &'a i32 = &x;  // --- lifetime 'a --+  |
} //                   <-----------------------+  | 
  //               <------------------------------+

To be honest, I'm not crazy about the term lifetime. It's kind of hard to change it now, but I think there is a dangerous confusion that occurs. The "lifetime of x", or at least what I think you meant by that, corresponds to the span of code that begins when x is allocated (i.e., the let where it is pushed on the stack) and ends where x is popped (i.e., the exit from the enclosing block). I usually try to call that the scope of x.

In contrast, the lifetime of a reference corresponds to the region or span of the code where the reference is used. This is almost always shorter than the lifetime of the variable itself, as you can kind of see in the diagram above -- in this case, the lifetime 'a would end infinitesimally before the scope of x.

What I would prefer is if we can label blocks and expressions and then use those names as lifetimes in the code. This would permit us to explain the mechanisms of the type system with more clarity:

fn main() {
    let x: i32 = 1;
    'a: {
        // explicitly give this reference the lifetime `'a`,
        // corresponding to the labeled block. The lifetime must
        // be the label of some enclosing block or loop.
        let y: &'a i32 = &'a x;
    }
}

This isn't perfect, since internally the compiler has a whole range of lifetimes that are not blocks. Basically every statement (e.g., let y = &x), expression and subexpression has its own lifetime, corresponding to the duration of time in which they execute. Once we move to NLL, then we'll have an even larger set of lifetimes, corresponding to arbitrary sets of paths through the control-flow graph. But at least being able to label blocks might help to communicate that the lifetime of a reference is not, in fact, tied to a variable, but rather it's just a region of the code (which must be some subregion of the scope of the owner).
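
The scope-versus-lifetime distinction can be observed directly on current Rust, where NLL has since landed. The following is a minimal sketch (the function name `bump_after_read` is made up): the borrow held by `y` ends at its last use, well before the scope of `x` ends, so the later mutation is accepted.

```rust
fn bump_after_read() -> i32 {
    let mut x: i32 = 1; // scope of `x` runs to the end of the function
    let y: &i32 = &x;   // a borrow of `x` starts here
    let seen = *y;      // last use of `y`: the borrow's lifetime ends here
    x += 1;             // allowed, because no borrow of `x` is still live
    seen + x
}

fn main() {
    println!("{}", bump_after_read());
}
```

If `y` were used again after `x += 1`, the borrow would have to cover the mutation and the program would be rejected.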


Would it be consistent to allow a lifetime to be declared “on the item”? The following would be sweet and simple if it could work

trait MyTrait {

    // fn foo<'a,'b>(&'a self, data: &'b[i32]) -> &'b[i32] { } 
    fn foo(&self, data: &[i32]) -> &'data [i32] { }                  

    // fn bar<'a,'b>(&'a self, data: &'b[i32]) -> &'a[i32] { } 
    fn bar(&self, data: &[i32]) -> &'self [i32] { } 

    //  fn baz<'a,'b,'c:'a+'b>(&'a self, data: &'b[i32]) -> &'c[i32] { } 
    fn baz(&self, data: &[i32]) -> &'data+'self [i32] { }
}
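
For comparison, here is a sketch of how the three signatures can be spelled in Rust as it exists today. The `Store` type and the method bodies are assumptions added so the example runs. The `baz` spelling shown, giving both inputs one shared lifetime, is just one way to express “borrows from both” and is not necessarily what the proposed `'data+'self` would mean.

```rust
struct Store {
    items: Vec<i32>,
}

impl Store {
    // Today's spelling of the proposed `-> &'data [i32]`.
    fn foo<'b>(&self, data: &'b [i32]) -> &'b [i32] {
        &data[..data.len().min(1)]
    }

    // Today's spelling of `-> &'self [i32]`: plain elision already
    // ties the result to `self`.
    fn bar(&self, _data: &[i32]) -> &[i32] {
        &self.items[..]
    }

    // One way to say "may borrow from either input": a shared lifetime.
    fn baz<'a>(&'a self, data: &'a [i32]) -> &'a [i32] {
        if data.is_empty() { &self.items[..] } else { data }
    }
}

fn main() {
    let store = Store { items: vec![1, 2, 3] };
    let data = [9, 8];
    println!("{:?} {:?} {:?}", store.foo(&data), store.bar(&data), store.baz(&data));
}
```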

How about the following syntax for structs?

struct Foo<'self> {
   a: &i32
}

This syntax isn’t much shorter than the present syntax, but hopefully it is more googleable and easier to grasp for new users. In general I think that keywords are more googleable than sigils.

An assortment of embarrassingly random thoughts in no particular order:

  • In the const generics thread we were floating the possibility of removing ticks, and using lifetime a as the syntax to introduce lifetimes, and just a to refer to them (much like we’d have const X: Foo resp. just X for constants). This idea is in tension with adding more ticks, in <'>.

  • Recall that the spark for adding lifetime elision in the first place was a discussion some C++ programmers (might’ve been Chrome folk?) were having on a different forum, about Rust, which got shared to one of the Rust forums, where they were essentially WTFing over Rust’s noisy and verbose explicit lifetime syntax (fn foo<'a>(a: &'a Foo) -> &'a Bar was the only option at the time). Seeing non-Rustaceans having that reaction convinced us that it was in fact an actual problem and that we should do something about it. (I actually don’t remember where this pre-RFC discussion took place, and couldn’t find it just now, does anyone else?)

    Anyway, the point I’m getting around to is that for non-Rustaceans, Foo<'> also has the risk of coming across as line noise and leading to “what is this I can’t even”-style reactions. You need to already know a lot about Rust to even be able to guess at the meaning of an unmatched apostrophe standing on its own, there.

  • I feel like our thought process here is roughly:

    1. Lifetime elision not being apparent from the function signature for user-defined types is a problem.
    2. Should we fix it? Yes. Yes we should.
    3. Okay, so what syntax should we use?
    4. *surveys available options*
    5. It seems like all of these are pretty bad, but we said we were going to solve the problem, so I guess we have to choose one of them?

    Point being that we should at least consider the possibility that the cure could be worse than the disease. If all of the options for fixing the problem would result in ghastly syntax that would end up repelling people from the language on sight, it might be less bad to resign ourselves to continue living with the problem, as we’ve been doing so far.

    (Even better, of course, would be to find a non-ghastly syntax. Provided that we can.)

  • If we allow punning the name of a variable for the name of a lifetime associated with it, as also proposed, then the elided fn foo(x: &Foo) -> Bar<'> does not have that much of an advantage over the non-elided fn foo(x: &Foo) -> Bar<'x>, any more. Relative to the “current baseline”, the elided syntax has gotten more verbose, and the non-elided syntax has gotten less so. Could we live with deprecating elision for user-defined types outright, without a direct replacement syntax, and just have people use the name-punning syntax instead?

  • (Incidentally, if I remember correctly, a couple of years ago the ability to use function parameter names as lifetime names was proposed kind of frequently, and it was always shot down with the reasoning that while it would indeed be convenient, it’s founded on a misunderstanding of how lifetimes work and would cause people to form misleading mental models. Fast forward to the present, and the lang team itself is now proposing the change. Does anyone involved happen to remember when/why/how your thinking changed?)

  • I feel like a nice thing about the lifetime elision syntax when actual references are involved is how you can just visually match up the & symbols to see what is borrowing from what: fn foo(x: &Foo, y: Blah, z: Zzz) -> HashMap<int, &Bar>. I prefer one of the syntaxes involving an & symbol for this reason, most likely Foo<&>, if the plain postfix Foo& is a non-starter due to the potential for confusion w.r.t. C++.

  • If we ever add “full” HKTs, allowing us to abstract over type constructors, and to refer to & itself as a type constructor (of kind type<lifetime, type> using my preferred kind syntax, or Lifetime -> * -> * in the Haskell notation most people are familiar with), then Foo<&> could be valid syntax, with a meaning that conflicts with the aforementioned one. This could conceivably be worked around in a number of ways, like introducing a type alias type Ref = &; and writing Foo<Ref>, or requiring & to be written as <&> like we currently do when explicitly referencing associated items (<&T>::Foo), or potentially others. (I don’t think this is a significant issue, it’s just a random thought.)

  • What if instead of decorating the types, we were to decorate the function arrow itself to indicate “borrowing is taking place across this function call”? Like, fn foo(x: &T) &-> Foo, or ->&, or something along those lines. I’m not remotely sure that I like this idea (it’s also a bit cryptic and syntaxy), just putting it out there.

  • We also have a ref keyword, which we might incidentally be phasing out with the "match ergonomics" improvements. Maybe we could use that somehow?


Perhaps something like:

fn foo(x: &Foo) -> Bar ref x

Less noisy? Less intimidating for non-Rustaceans?

I’ve been running into this a little, and I’ve been thinking about let ... in syntax again:

...map(let foo = foo.clone() in move |x| foo.thing(x))

It’s equivalent to

...map({ let foo = foo.clone(); move |x| foo.thing(x) })

but when the closure spans many lines and is more complex, it’s nice to not have to close the {}.

It’s also more general than adding new syntax to closures.

That said, last time I floated the idea it was generally rejected as not adding enough new value.


This is an interesting use case for let in! I’ve always been sort of surprised Rust doesn’t have this feature since:

  • All the keywords are already reserved.
  • I think it’s much more natural to scope lifetimes with let y = &x in { ... } than { let y = &x; ... }.
  • It doesn’t seem hard to me to figure out what it means when you see it (but I could be wrong I guess).

Of course this use case doesn’t scale super well to capturing multiple refs since you have to use tuples:

.map(let (foo, bar) = (foo.clone(), bar.clone()) in move |x| foo.thing(x, bar))

Since the general syntax is let <binding> in <expr>, and it is itself an expr, I was assuming you could cascade:

let foo = foo.clone() in
let bar = bar.clone() in
move |x| ...

Yes, that would work, though I’m not sure it’s better than the tuple form.

Either falls directly out of the syntax, so it just becomes a matter of using the form that suits the situation. The tuple form works well if you want to simultaneously alias multiple things from the outer scope:

let (foo, bar) = (bar.thingy(foo), foo.frob(bar)) in ...
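
Since let ... in is only a proposal, the multi-capture case has to be written with a block expression today. Below is a minimal sketch of the tuple form in that style. The `describe` function, its parameters, and the use of `Rc` are assumptions chosen to keep the clones cheap and the example self-contained.

```rust
use std::rc::Rc;

// Hypothetical helper: wraps each value in a prefix and suffix that
// are shared through `Rc` handles.
fn describe(values: &[i32], prefix: &Rc<String>, suffix: &Rc<String>) -> Vec<String> {
    values
        .iter()
        .map({
            // Tuple form: clone both captures in a single `let`, then
            // move the clones into the closure.
            let (prefix, suffix) = (Rc::clone(prefix), Rc::clone(suffix));
            move |v: &i32| format!("{}{}{}", prefix, v, suffix)
        })
        .collect()
}

fn main() {
    let prefix = Rc::new(String::from("<"));
    let suffix = Rc::new(String::from(">"));
    println!("{:?}", describe(&[1, 2], &prefix, &suffix));
}
```

The cascaded form corresponds to writing two separate `let` statements inside the same block.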

There is another use case that I think is also important, which is when a user-defined type appears in a parameter list. Basically, I think it should be visually evident when a type has references, regardless of where it appears:

// This version makes it clear that `x` and `y` 
// contain references, without having to consult
// the struct definition.
fn foo(x: Foo<'>, y: Foo<'>) { ... }

// This version, accepted today, does not.
fn foo(x: Foo, y: Foo) { ... }

I find that I rely frequently on the ability to visually scan for references and things in order to estimate whether refactorings will work, etc. For me this is an extended version of the principle that it's good to have a (lightweight) visual indicator of when borrowing / ownership transfer are at play.
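
As a point of comparison, current Rust accepts the `'_` placeholder in exactly this position, which gives the visual marker without naming a lifetime. The sketch below assumes a made-up `Foo` with a single borrowed field. The allow-by-default `elided_lifetimes_in_paths` lint can additionally be enabled to flag the unmarked form.

```rust
struct Foo<'a> {
    value: &'a i32,
}

// Accepted today, but nothing in the signature shows that `Foo`
// holds a borrow.
fn sum_hidden(x: Foo, y: Foo) -> i32 {
    *x.value + *y.value
}

// Also accepted today: `'_` marks each elided lifetime.
fn sum_marked(x: Foo<'_>, y: Foo<'_>) -> i32 {
    *x.value + *y.value
}

fn main() {
    let (a, b) = (1, 2);
    println!("{}", sum_hidden(Foo { value: &a }, Foo { value: &b }));
    println!("{}", sum_marked(Foo { value: &a }, Foo { value: &b }));
}
```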

I was debating about Foo<ref> as well earlier, though I don't think I ever floated it on the thread. It seems not entirely implausible. I was nervous because we are backing away from it in match, although I think that the two things aren't necessarily in conflict.

Crazy thought: what if instead of writing lifetime (a term that I do not like anymore, for reasons I've already enumerated in this thread), we used ref to introduce named lifetimes?

struct Foo<ref a> { // new version of 'a
    x: &a i32, // no need to write ' here
}

fn use_foo<ref a>(f: Foo<a>)

fn get_foo<ref a>(&a self) -> Foo<a>

Then we would be saying that this is the shorthand:

struct Foo<ref> {
    x: &i32
}

fn use_foo(f: Foo<ref>)

fn get_foo(&self) -> Foo<ref>

If you really wanted to go crazy, you'd replace the & and &mut type constructors with ref :), so that we write fn get_foo(ref self) -> Foo<ref>. But this would then motivate one to introduce ref a.b.c as an expression. At that point, you have a bit of a problem because ref P patterns are ... well ... already taken (this tension being what motivated us to introduce ref binding mode in the first place).
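
To make the comparison easier to follow, here is the same trio written in today's syntax, with the proposed `ref` spelling shown in comments. This is only a side-by-side sketch: the `Owner` type and the function bodies are assumptions added so that `get_foo` has a `self` to borrow from.

```rust
struct Foo<'a> {   // proposed: struct Foo<ref a>
    x: &'a i32,    // proposed: x: &a i32
}

// proposed: fn use_foo<ref a>(f: Foo<a>)
fn use_foo<'a>(f: Foo<'a>) -> i32 {
    *f.x
}

// Hypothetical owner type, not part of the original post.
struct Owner {
    value: i32,
}

impl Owner {
    // proposed: fn get_foo<ref a>(&a self) -> Foo<a>
    fn get_foo<'a>(&'a self) -> Foo<'a> {
        Foo { x: &self.value }
    }
}

fn main() {
    let owner = Owner { value: 5 };
    println!("{}", use_foo(owner.get_foo()));
}
```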


That's... crazy indeed. I think I like it quite a bit! It certainly looks a lot cleaner than the current lifetime syntax, not to mention the other proposed new syntaxes for elision.

(And ref is also much shorter than lifetime would've been, which is a definite plus.)

We'd still need something to actually call them informally though? Presumably we'd be phasing out "lifetime" (or no?), and I assume we wouldn't actually call them "refs" or "references" to avoid confusion with &. Did you have anything in mind?

I thought of this too :) I'm not sure if I like it, but it seems plausible. & is both more convenient and has a lot of cultural precedent in systemsy languages across the board (C and C++, but also Go, Swift...), quite unlike '. On the other hand, ref might be more self-describing and less intimidating for people who aren't coming from other systems languages. (So I kind of like them both equally, I guess; which suggests staying with the status quo.)

If we also phase out ref in patterns as part of the "match ergonomics" effort, which I think would be a good thing to do on its own merits, then I don't think this would be an actual problem. It's not backwards-incompatible (so we don't even need epochs), because the old meaning only exists in patterns and the new meaning only exists outside of them, so they can co-exist if need be; and it's also not a significant issue in terms of explanation or mental models, because new and idiomatic code as well as documentation etc. would only have the new meaning in it, not the old one.

(ref in patterns would just be kept around to keep old code compiling, at some point maybe with a deprecation warning, or eventually phased out completely with epochs if we want to, etc.)

Using ref for this seems confusing to me. Even if match loses the possibility of specifying ref, other patterns will still have it. My first intuition when seeing get_foo<ref a> is “a const parameter turned into a reference”.

I’d also suggest that while some might find ' ugly, the lifetime-parameters concept is quite novel, so having it stand out makes sense to me both from an educational and readability standpoint, and I’d be quite sad to lose that. I certainly feel that hiding lifetimes too much will make it harder to use them. I’d also be worried about things like quickly seeing the lifetimes involved in complex error messages.

One further note: I can’t prove it, but I’m quite certain that outdated documentation will be an issue. My reasons: it was already an issue pre-1.0, and the Rust community is rich in blog articles, talk videos and slides that aren’t going to get updated.
