[lang-team-minutes] Elision 2.0

Just “parameter name instead of declaring a lifetime” doesn’t help with declaring a struct containing lifetimes or writing impl blocks for it (although, re-reading the OP and very briefly scanning the thread I don’t see that mentioned at all).

What might be useful is to separate discussion of the two proposals; they seem to be mostly independent of each other (other than the fact they’re both aiming to help “make the ‘easy things easier’ in the lifetime system”). And I think having a shorthand for declaring and referring to an existing anonymous lifetime is less controversial (and less bikesheddy) than adding new syntax for explicitly anonymising existing named lifetimes.


Expanding the idea I had about allowing the “anonymous” syntax to be used in impl blocks, I think it would be consistent to allow using the anonymous lifetime marker in an inherent impl block or an impl Trait for block where Trait does not contain any lifetimes itself, e.g.

struct Foo<'a>(&'a [u8]);

impl<'a> Foo<'a> {
    fn new(buf: &[u8]) -> Foo {
        Foo(buf)
    }

    fn get(&self, i: usize) -> Option<&u8> {
        self.0.get(i)
    }
}

impl<'a> AsRef<[u8]> for Foo<'a> {
    fn as_ref(&self) -> &[u8] {
        &self.0
    }
}

could be rewritten as (using the <'> contender as that’s my preferred choice out of what’s been proposed so far)

struct Foo<'>(&[u8]);

impl Foo<'> {
    // Note that this is closer to writing
    //     fn new(buf: &[u8]) -> Foo<'a>
    // above, which currently results in "expected Foo<'a>, found Foo<'_>"
    // Hopefully it would be allowable to unify anonymous lifetimes and have this work as expected
    fn new(buf: &[u8]) -> Foo<'> {
        Foo(buf)
    }

    fn get(&self, i: usize) -> Option<&u8> {
        self.0.get(i)
    }
}

impl AsRef<[u8]> for Foo<'> {
    fn as_ref(&self) -> &[u8] {
        &self.0
    }
}

There may be some scope for expanding this to some cases of Trait containing lifetimes, but I can’t think of what cases those would be right now.
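
For comparison, here is a minimal sketch of the same type written with the anonymous lifetime spelled '_, which as far as I know is accepted in impl headers on current stable Rust. It is not the <'> syntax proposed above, and the struct declaration itself still has to name its lifetime, so treat it only as an approximation of the idea:

```rust
// Sketch: the anonymous lifetime is spelled '_ here instead of the proposed <'>.
// The struct declaration still needs a named lifetime.
struct Foo<'a>(&'a [u8]);

impl Foo<'_> {
    // Ordinary function elision: the '_ in the return type is tied to `buf`.
    fn new(buf: &[u8]) -> Foo<'_> {
        Foo(buf)
    }

    fn get(&self, i: usize) -> Option<&u8> {
        self.0.get(i)
    }
}

impl AsRef<[u8]> for Foo<'_> {
    fn as_ref(&self) -> &[u8] {
        self.0
    }
}
```

Note that new here relies on the usual function-level elision, so the lifetime in its return type comes from buf rather than from the impl header.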

Good point. For struct declarations I don't think elision buys much. For impl headers we could keep the current (unimplemented) elision rules, which say that impl<'a> Foo<'a> becomes impl Foo. Requiring even an anonymous lifetime for a struct in input position is excessive, I think.

This was really helpful for me to better understand your point of view. I am trying to decide what I think about it and what implications it has for syntax. =)

I think there are many times when it is legit to view "reference" as a broader category than just the &T type -- e.g., the Ref<'a, T> returned by a RefCell is kind of a "reference". This seems to fit into the mental model you describe here, though you are using the word reference more narrowly.

There are some other use cases where I think the best thing is to think of Foo<'a> as a "struct with references in it". This is particularly true when you have multiple lifetimes. In the compiler, for example, we have a number of "context" structs that look like Context<'a, 'tcx> or something like that. In this case, the context collects a number of references pointing at data that lives at various points on the stack, and those multiple lifetimes let us distinguish those cases. (e.g., the data with lifetime 'tcx lives in an enclosing arena, whereas the 'a data tends to be some fleeting references like the duration of a loop).

It is certainly true that it is possible to have structs with lifetimes that fit into neither of those cases, on a technical level, but generally I feel like there is still a "reference" conceptually. e.g., the Guard<'a, T> that is returned by locking a lock (to use your example) is basically a reference with lifetime 'a that points at data protected by a lock (I think in fact that it probably holds a reference to the lock itself, but you could perhaps imagine that it might not).

I think that what @sanxiyn meant was that lifetimes are properties of references (as types), not objects (values). In this sense, references, which are types, are not a subset of objects, which are values.

Foo<'a> means that Foo probably contains somewhere a reference &'a X, which points to an object (of type X) that outlives* the lifetime 'a. This 'a is a property of the type (e.g. Foo<'a>) that limits the set of values it can point to. So basically, Foo<&> would mean "warning: Foo contains some references".

*) In some sense, the value also has a lifetime (the span of time it's alive), but it's not what the lifetime in 'a notation means.

Different parts of this proposal have distinct motivations. My feeling remains that it is important that we make the notation light but not too light, and this is reflected in the collection of proposals here. I guess that the meaning of light is obvious (less to type) but let me elaborate a bit on "not too light".

I think it is really helpful to have a consistent, lightweight visual cue that indicates when lifetimes are present. I am not necessarily targeting "beginners" here (though I think it may help there); I am also thinking of advanced users. That is, I really appreciate being able to quickly scan type signatures and figure out which structs contain embedded references and which do not, as well as to identify when the return type may "extend" a borrow and when it will not.

I agree that people are not requesting Foo<'> in argument position (well, I don't really know, but I can believe that). I do not necessarily think this means it would not be helpful to them. That is, I think that having a consistent visual cue could well help people to get a better understanding of how lifetimes work.

Note that it is easy to take this too far! As I've mentioned before, in Ye Olden Days of Rust, we allowed you to elide lifetimes everywhere. We even went so far as to infer whether a struct might have references in it globally. This was mega-confusing. You would get wacky errors and have to trace through layers of type definitions to figure out what was going on. The reaction to this was to make all lifetimes named and explicit. I see the elision proposal (and this one) as part of an ongoing effort to step back from that and figure out just the right places to make lifetimes explicit (or semi-explicit).

I agree it's annoying to have to adjust trait signatures, but I am not convinced that the right solution is to allow you to obscure the fact that lifetimes are present. I do think there is a need to make it possible to write code that is more tolerant of how many lifetimes are present. For example, one scenario that we don't have a solution for just now is that sometimes I want to specify just some of the lifetimes on a type.

In the compiler, this often comes up when I want to specify 'tcx (the "big" lifetime that is often floating around) but I don't care about the 'a (the "transient lifetime" that often shows up):

impl<'tcx> Foo<'tcx> {
    fn bar<'a>(tcx: TyCtxt<'a, 'tcx>) -> Ty<'tcx> { ... }
    //     ^^              ^^
    // it is annoying that I have to give `'a` a name here
}

I think a good convention (which we have not consistently adopted in the compiler; I've been debating about refactoring things so that we do) is to list lifetimes in "increasing" order. So e.g. it'd be TyCtxt<'a, 'tcx, 'gcx> ('a is transient, 'tcx is the current inference session, and 'gcx is the global compilation session). In that case, I think that you almost always just want to specify a suffix of the lifetimes. So you could imagine some scheme that lets you elide "leading" lifetimes. But it's kind of ... wacky. '_ helps here but I'm not keen on it otherwise.
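
To make the '_ remark concrete, here is a rough sketch of naming only 'tcx and leaving the transient lifetime anonymous. The TyCtxt and Ty types below are hypothetical stand-ins with made-up fields, not the compiler's real definitions, and first_ty is an invented helper:

```rust
// Hypothetical stand-ins for the compiler's context types; the fields are invented.
struct TyCtxt<'a, 'tcx> {
    scratch: &'a str,   // transient data
    arena: &'tcx [i32], // long-lived data
}

struct Ty<'tcx>(&'tcx i32);

// Only 'tcx is named; the transient lifetime is left anonymous with '_.
fn first_ty<'tcx>(tcx: TyCtxt<'_, 'tcx>) -> Ty<'tcx> {
    let _ = tcx.scratch; // the transient part is not needed for the result
    Ty(&tcx.arena[0])
}
```

The returned Ty depends only on the arena, so it can outlive whatever the transient reference pointed at.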

Anyway, leaving all that aside, I feel like another potentially good solution to the problem of having to refactor types is that IDEs should be able to do this refactoring. I know that's sort of dodging the question, but in my view there is a real readability tradeoff here. (Alternatively, of course, you could say that IDEs can add visual indicators when references might be present.)

So, the original elision RFC actually specified that one should be able to elide lifetimes in impls, but it was never implemented. I've been meaning to write up mentoring instructions for some time. I definitely agree we should permit that. However, there is one aspect of your post that I'm not sure about:

I would have expected the Foo<'> here to expand to the lifetime of the buf parameter. So basically fn new<'a>(buf: &'a [u8]) -> Foo<'a> {... }. As it happens, this would probably work out just fine in this particular case, but it is a distinct lifetime from the one that decorates the impl, and in some cases that might matter.
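
A small sketch of the two readings, written in today's explicit syntax so the difference is visible. The names new_elided and new_tied are made up for illustration:

```rust
struct Foo<'a>(&'a [u8]);

impl<'a> Foo<'a> {
    // Elided: the returned Foo borrows for the lifetime of `buf`, a fresh
    // lifetime that is independent of the impl's 'a.
    fn new_elided(buf: &[u8]) -> Foo<'_> {
        Foo(buf)
    }

    // Explicit: the argument and the result both use the impl's 'a.
    fn new_tied(buf: &'a [u8]) -> Foo<'a> {
        Foo(buf)
    }
}
```

For a simple constructor both behave the same at the call site; the distinction only shows up when something else pins down the impl's 'a.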

It'd be sort of nice if we had a way to "refer" to the lifetimes in the impl without declaring them. One thought that has been bandied about from time to time is having some ability to "declare" lifetimes and types in the impl header in a more lightweight way. For example (not an actual proposal):

impl Foo<let 'a> { // equivalent to `impl<'a> Foo<'a>`
    fn new(buf: &'a [u8]) -> Foo<'a> { ... }
}

One could then also do things like impl Option<let T> instead of impl<T> Option<T>. Kind of a "crazy cross-cutting proposal", though, and obviously introducing a new convention here would affect a lot of code (also, I feel like to do this right, we would want the type/trait declarations to match, so one would write enum Option<let T>).

Good question. I shouldn't have excluded it, I think.

I agree with the "lexically offensive" part -- that's a good way to put it :slight_smile: I don't know if there are proper terms for this, but basically, I would call & a full-sized character with its own "identity". It can stand alone. This is in contrast with ' which is something you attach to something else: normally you use it as an apostrophe or you can use it as a single quote (normally paired with a matching closing quote).

Please note that I'm new to Rust and I have to admit that I'm still trying to figure out what the proposal is really about. So I'm not the best to evaluate how this will influence the language -- all I can say is that it looks very out of place in an otherwise pretty elegant language.

Looking at the pre-RFC and at the meeting minutes, I fail to see lots of examples that show the advantages of a lifetime shorthand. It would be helpful just to make it crystal clear what the proposal is about :slight_smile: I see lots of examples discussing the new syntax and I see examples that are deprecated.

The examples I hoped for are like the one from the first post here:

// Before:
struct Foo<'a> {
    t: &'a i32,
}

// After:
struct Foo<'> {
    t: &i32,
}

Is that the whole proposal? If so, it seems one saves 3-4 keystrokes in exchange for an inconsistency: only the a disappeared from the struct Foo<'a> line, but the whole 'a part disappeared in the struct body.

Also, as I understand it, this is only for the case where you have a single lifetime? So the following struct would remain unchanged?

struct Foo<'a, 'b> {
    t: &'a i32,
    s: &'b str,
}

This makes me feel that the syntax doesn’t generalize.


Has it been discussed to simply elide the lifetime completely? That is, allow

// Before:
struct Foo<'a> {
    t: &'a i32,
}

// After:
struct Foo {
    t: &i32,
}

and also

// Before:
struct Foo<'a, 'b> {
    t: &'a i32,
    s: &'b str,
    r: &'b Vec<i32>,
}

// After:
struct Foo<'a> {
    t: &'a i32,
    s: &str,
    r: &Vec<i32>,
}

That is, allow the user to remove a single lifetime? Fields that have no explicit lifetime all get the same lifetime.

Given my limited knowledge of all this, I believe this would mean that you end up with much fewer lifetimes and in particular, you end up with something where the “default” or “obvious” choice is the choice the compiler makes.

But perhaps it was seen as important to signal in the struct Foo<'a> line that the type contains references? If yes, I suppose that is because of nested structs? With the proposal above, you could write:

struct Bar {
    a: Foo,
}

Now Bar would also have an elided lifetime because of the inner Foo. You can only know this if you go look at the definition of Foo. I’m not sure if that’s a good or a bad thing :slight_smile:

As a new user, it seems obvious that Bar values will have a lifetime associated with them – I expect all values to have a lifetime. Since Bar contains a Foo, I would expect the lifetime to be determined by Foo. So the nested struct above would give me the same kind of mental image as the more explicit:

struct Bar<'a> {
    a: Foo<'a>,
}

That is, the <'a> parts above don’t seem to tell me anything.

The above is probably terribly naive, so I look forward to hearing where I missed things :smiley:

And several decades of using lisp in emacs have trained me to expect U+0027 to mean that what comes next isn't evaluated, but I also have no problem associating it with lifetimes.

I don't really feel strongly either way, but if there is a need to distinguish between "contains reference with a specified lifetime" and "contains reference with inferred lifetime", I have no problem with &.

Could be plausible if you always know what the fields are, but when using a struct from another crate the docs will say: pub struct Foo { /* fields omitted */ } and now you have no idea that the struct has a lifetime parameter. So we either leak private details or introduce new notation, which is what the anonymous lifetimes proposal does.

All values have a lifetime, which is the scope in which they are accessible, but struct Foo is a type, not a value, and struct Foo<'a> is a type generic over a lifetime parameter. Not all types have lifetime parameters: a value of type Foo<'a> cannot outlive the 'a lifetime, while a value of type Foo has no restriction on its lifetime.

To give a concrete example of why your intuition here is wrong, given this API:

struct Foo(i32);

struct Bar(&i32);
struct Baz(i32);

impl Foo {
     fn bar(&self) -> Bar { Bar(&self.0) }
     fn baz(&self) -> Baz { Baz(self.0) }
}

// Invalid.
// This would be moving a reference into "Foo(0)" out of this function.
fn make_bar() -> Bar {
    Foo(0).bar()
}

// Valid.
// All of the data moved out of this function is owned.
fn make_baz() -> Baz {
    Foo(0).baz()
}

Nothing in the API distinguishes Bar and Baz from one another, but they are very different from one another. What this parameter distinguishes, in high-level terms, is whether the type owns all of its data or is only borrowing some of it. If it has a lifetime parameter, it is bound to the lifetime of another value which it is only borrowing, so its lifetime has to be shorter than that value's.

This is why we feel it's very important to document this fact in the API.

Wow, thank you very much for that example! This community is great -- always patient and with great explanations. It makes me feel very welcome :slight_smile:

Just for completeness, the API you described above was in my proposed syntax. In real Rust, I believe it would look like this:

struct Foo(i32);

struct Bar<'a>(&'a i32);
struct Baz(i32);

impl Foo {
     fn bar<'a>(&'a self) -> Bar<'a> { Bar(&self.0) }
     fn baz(&self) -> Baz { Baz(self.0) }
}

// Invalid.
// This would be moving a reference into "Foo(0)" out of this function.
fn make_bar<'a>() -> Bar<'a> {
    Foo(0).bar()
}

// Valid.
// All of the data moved out of this function is owned.
fn make_baz() -> Baz {
    Foo(0).baz()
}

The make_bar function is now more clearly invalid since there's nothing with a lifetime of 'a going into it. It cannot simply conjure up and return a value of limited lifetime. (Why not? One reason, I believe, is that it wouldn't be clear who owns the borrowed value inside the Bar value. Without an owner, there's nobody to free the borrowed memory.)

It would instead have to be written like this (if the Bar is to be derived from a Foo):

fn make_bar<'a>(foo: &'a Foo) -> Bar<'a> {
    foo.bar()
}

This is basically correct with one amendment: the borrowed value is owned by the Foo(0) constructed inside make_bar, which is freed when the function concludes (so the borrow is outliving the owned value). If you know C, this is the same error as returning a pointer to a value allocated on the stack during that function call.
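
Putting the pieces together, here is a sketch of the whole example in a form that should compile, with the rejected function left as a comment. The '_ spelling in the return types is just the current shorthand for the elided lifetime; nothing else is new relative to the code above:

```rust
struct Foo(i32);

struct Bar<'a>(&'a i32); // borrows from some Foo
struct Baz(i32);         // owns its data

impl Foo {
    fn bar(&self) -> Bar<'_> {
        Bar(&self.0)
    }

    fn baz(&self) -> Baz {
        Baz(self.0)
    }
}

// Rejected by the borrow checker: the Foo(0) temporary is dropped when the
// function returns, so the Bar would point at freed stack memory.
// fn make_bar<'a>() -> Bar<'a> {
//     Foo(0).bar()
// }

// Accepted: the caller owns the Foo, and the Bar cannot outlive it.
fn make_bar(foo: &Foo) -> Bar<'_> {
    foo.bar()
}

// Accepted: everything returned is owned.
fn make_baz() -> Baz {
    Foo(0).baz()
}
```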

I like this a lot! What about a'?

If the elided form ends up being as or more verbose than the unelided form, one may as well just write out everything using the regular unelided syntax because the latter has the benefit of being consistent and more flexible.

I realized we are thinking of different things. I agree that parameters as lifetimes will end up being too verbose. I’m thinking of introducing syntax for undeclared lifetimes that work just like explicit lifetimes but you don’t declare them, and they can appear only in an fn signature, as in: fn foo(&self, data: &a' [i32]) -> Foo<a'> which is a small win over: fn foo<'a>(&self, data: &'a [i32]) -> Foo<'a>

Am I guessing correctly that the second example should just be fn foo(&self, data: &[i32]) -> Foo<'data> { ... } without the first 'data?

Rereading the original proposal, I noticed I was getting it wrong. I thought that an anonymous lifetime would match another one, and the following would be valid:

fn foo(x: &Bar, y: Foo<'>) -> Foo<'>

But that is not the case: an anonymous lifetime is merely a marker for an elided lifetime, so it can only make the syntax longer and never shorter. I had a different interpretation (which I find more useful) that an anonymous lifetime was just like any other except that it had a special syntax and didn’t have to be declared.

We could reinterpret the anonymous lifetime syntax as something I’ll call undeclared lifetimes. They would be just like explicit lifetimes but they are not declared and may appear only in fn signatures. They must be unambiguous with explicit lifetimes, for example '0, '1 could be the syntax for undeclared lifetimes.

This is how the above example would look:

// Leaving &Bar elided.
fn foo(x: &Bar, y: Foo<'0>) -> Foo<'0>
// Giving &Bar a throwaway lifetime.
fn foo(x: &'1 Bar, y: Foo<'0>) -> Foo<'0>

In comparison to parameter as lifetime:

// Parameter as lifetime.
fn foo(&self, data: &[i32]) -> Foo<'data>
// Throwaway lifetimes.
fn foo(&self, data: &'0 [i32]) -> Foo<'0>
// Parameter as lifetime cannot handle this case.
fn foo(&self, data: &[&'0 i32]) -> Foo<'0>
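
For reference, a sketch of how that last case has to be written with a declared lifetime today. Holder and first are hypothetical names chosen for the example; the point is only that the result is tied to the inner references, not to the slice or to self:

```rust
struct Foo<'a>(&'a i32);

// Hypothetical receiver type, only here so the method has a `&self`.
struct Holder;

impl Holder {
    // 'a names the lifetime of the references stored inside the slice,
    // which no parameter name could stand for.
    fn first<'a>(&self, data: &[&'a i32]) -> Foo<'a> {
        Foo(data[0])
    }
}
```

Because the result only mentions 'a, the returned Foo can outlive the vector or slice that was passed in, as long as the pointed-to integer is still alive.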

But most of the time we are only using '0, so can we get sugar for that? Here we may repurpose the anonymous lifetime syntax to be sugar for an undeclared lifetime, possibly only when it is the only unelided lifetime in the signature. Now the original example is valid:

// These could work.
fn foo(x: &Bar, y: Foo<'>) -> Foo<'>

fn foo(&self, data: &' [i32]) -> Foo<'>

impl<'a> Bar<'a> {
    fn foo(&self, data: &'0 [i32]) -> Foo<'0, 'a>
}

// This style got more verbose.
fn foo(x: Foo<'0>, y: Foo<'1>)

// We probably don't want to mix things too much.
fn foo(&'0 self, data: &' [i32]) -> Foo<'0, '>

What do you think?

Yes.