[Idea] Two lifetimes on unique references

adamAndMath · July 22, 2019, 1:44pm

I've seen some conversations about wanting to downgrade &mut to a & through a function, in such a way that the original value can still be referenced.

struct Foo<T>(T, usize);

impl<T> Foo<T> {
    pub fn new(val: T) -> Self {
        Foo(val, 0)
    }

    pub fn inner(&mut self) -> &T {
        self.1 += 1;
        &self.0
    }
}

The problem is that the unique reference must live as long as the returned shared reference. To solve this we don't need the shared reference to outlive the unique reference, but rather the reference to outlive its uniqueness.
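To make the restriction concrete, here is a minimal sketch built on the Foo type above. The `count` accessor is a hypothetical addition for illustration only, and the rejected line is left commented out so the sketch still compiles.

```rust
struct Foo<T>(T, usize);

impl<T> Foo<T> {
    pub fn new(val: T) -> Self {
        Foo(val, 0)
    }

    pub fn inner(&mut self) -> &T {
        self.1 += 1;
        &self.0
    }

    // Hypothetical read-only accessor, added only for this illustration.
    pub fn count(&self) -> usize {
        self.1
    }
}

fn main() {
    let mut foo = Foo::new(String::from("hello"));

    let s = foo.inner();
    // Uncommenting the next line is rejected (E0502): `foo` stays mutably
    // borrowed for as long as `s` is alive, even though `s` itself is shared.
    // let n = foo.count();
    println!("{}", s);

    // Once `s` is no longer used, `foo` can be borrowed again.
    println!("{}", foo.count());
}
```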

This leads to the conclusion that unique references could be extended to have 2 lifetimes. One for how long the reference lives, and one for the time in which it is unique.

I'm proposing the syntax &'a mut'b T, where 'a is the reference's lifetime and 'b is the lifetime of its uniqueness. Furthermore, I propose that both &mut and &'a mut desugar to &'a mut'a, which makes the change backwards compatible and completely opt-in. To fix the method, I'd then write

fn inner<'a, 'b>(&'a mut'b self) -> &'a T

I'm not sure whether this is something I want in the language, but it seemed like a solution to a problem, so I thought I should share it.

RustyYato · July 22, 2019, 1:53pm

This seems really complex, ~~and I'm not sure how this will generalize~~. To solve this problem, you can return both &Self and &T from your function, and then use the returned &Self wherever you need shared access later.

struct Foo<T>(T, usize);

impl<T> Foo<T> {
    pub fn new(val: T) -> Self {
        Foo(val, 0)
    }

    pub fn inner(&mut self) -> (&Self, &T) {
        self.1 += 1;
        (self, &self.0)
    }
}
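A short usage sketch of this workaround follows, under the assumption that the caller only needs shared access afterwards. The body of `inner` is spelled out with an explicit downgrade to `&Self` (the binding name `this` is illustrative) so the borrow structure is visible.

```rust
struct Foo<T>(T, usize);

impl<T> Foo<T> {
    pub fn new(val: T) -> Self {
        Foo(val, 0)
    }

    pub fn inner(&mut self) -> (&Self, &T) {
        self.1 += 1;
        // Downgrade the unique borrow once, then derive both results from it.
        let this: &Self = self;
        (this, &this.0)
    }
}

fn main() {
    let mut foo = Foo::new(String::from("hello"));
    let (shared, s) = foo.inner();
    // Both `shared` and `s` are shared references, so they can be used together.
    println!("{} {}", s, shared.1);
}
```

The caller still cannot name `foo` directly while `s` is alive; it has to go through the returned `shared` reference instead.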

adamAndMath · July 22, 2019, 1:56pm

Could you explain what you mean? I'm not sure in what direction to generalize, and it's often hard to see the complexity of one's own ideas.

RustyYato · July 22, 2019, 2:07pm

Lifetimes are already a tricky subject for new users of Rust, so any added complexity should be carefully reviewed to see if it is worth it. In this case, a whole new lifetime parameter for minor convenience is certainly not worth it.

Moreover, it is unclear what you mean by "lifetime of uniqueness" and "lifetime of reference", and what the difference between them is.

That was my bad, didn't mean to put that in, I was thinking of something else while writing that.

Ixrec · July 22, 2019, 2:11pm

I think this old post by @nikomatsakis is still a fairly accurate summary of where we're at with this sort of idea:

Probably the biggest change since he wrote that is simply that we have NLL now, but "a proposal for a Rust memory model" is still a ways off (see the rust-lang/unsafe-code-guidelines repository on GitHub, the forum for discussion about what unsafe code can and can't do).

mcy · July 22, 2019, 5:48pm

For what it's worth, I often feel that the way we teach references is suboptimal. I feel like "sharedness" and "uniqueness" aren't quite as dual as "const" and "mut", and I think that the idea that a reference can outlive its uniqueness is an interesting way to think about reborrowing.

I guess the way this could be confusing is how you might infer &'a mut'?0 T. Do we infer ?0 separately or do we always infer 'a unless specified by the user? The latter feels like it lines up better with what we expect of &mut T, if anything. I'd definitely class such a feature similar to 'a: 'b and for<'r>: advanced lifetime features that are necessary but not a core part of the mental model.

RustyYato · July 22, 2019, 5:52pm

I completely agree, I think we should start the process of changing all official documentation to use uniqueness instead of mutability. I have yet to meet someone (here, users forum, or on reddit) that doesn't understand uniqueness, so I think that this change will make Rust more accessible and easier to learn.

I don't think that this needs to be encoded into the types, especially when it is really easy to also return self if you want to allow shared usage of self along with the unique borrow.

mcy · July 22, 2019, 5:59pm

I'm not sure. I've never been a big fan of the Rust pattern of "maybe take ownership by taking a T and maybe returning a T". I'm not sure if it should be encoded in the &mut syntax, but decoupling "I have a reference inside 'r" from "I have unique access inside 'u" as a first-class concept is valuable to me.

RustyYato · July 22, 2019, 6:43pm

While yes, I agree that is valuable, I don't know if that is worth adding a whole new lifetime parameter. Given that lifetime parameters are already very confusing for newcomers, adding another may slow down Rust's growth too much.

How often does this come up for you? Personally this doesn't come up for me that often, so I don't see the value of adding a new lifetime parameter for what seems like minor convenience.

mcy · July 22, 2019, 11:09pm

I've mostly only encountered this sort of thing with data structures? For example, I have a vector-like thing where I can push in a T and, if that succeeds, get a &T which points into the data structure. Unfortunately, such a function is fn<'a>(&'a mut self) -> &'a T, so the borrows in 'a include a unique borrow to self, so I can't do

let t = ts.push(t);
let len = ts.len();
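A runnable sketch of this situation follows. The `Stack` type and its `last` method are hypothetical stand-ins for the "vector-like thing"; the rejected pattern is left commented out, and the workaround shown (re-fetching the element through a shared borrow) is only one option.

```rust
// Hypothetical stand-in for the vector-like data structure described above.
struct Stack<T> {
    items: Vec<T>,
}

impl<T> Stack<T> {
    fn new() -> Self {
        Stack { items: Vec::new() }
    }

    // Same shape as `fn<'a>(&'a mut self) -> &'a T`.
    fn push(&mut self, t: T) -> &T {
        self.items.push(t);
        self.items.last().expect("just pushed an element")
    }

    fn len(&self) -> usize {
        self.items.len()
    }

    fn last(&self) -> Option<&T> {
        self.items.last()
    }
}

fn main() {
    let mut ts = Stack::new();

    // Rejected: the unique borrow taken by `push` lasts as long as `t` is used.
    // let t = ts.push(1);
    // let len = ts.len();
    // println!("{} {}", t, len);

    // Workaround: let the unique borrow end, then look the element up again.
    ts.push(1);
    let t = ts.last().expect("just pushed an element");
    let len = ts.len();
    println!("{} {}", t, len);
}
```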

I've had similar problems with wanting a map of boxes to sort-of emulate std::unordered_map...

Ixrec · July 22, 2019, 11:26pm

Agreed. I think https://github.com/rust-lang/rfcs/pull/2025 helps with a lot of these cases in practice (after all, the standard motivating example is vec.push(vec.len());).
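For reference, the motivating example from that RFC compiles on current stable Rust. A minimal sketch (the helper name `push_own_len` is made up for illustration):

```rust
// Hypothetical helper wrapping the standard motivating example.
fn push_own_len(v: &mut Vec<usize>) {
    // Accepted thanks to two-phase borrows: the unique borrow for `push` is
    // only reserved while the argument `v.len()` is evaluated, and is
    // activated when the call itself happens.
    v.push(v.len());
}

fn main() {
    let mut vec = vec![10, 20];
    push_own_len(&mut vec);
    println!("{:?}", vec);
}
```

Note that this relaxes when a unique borrow starts, whereas the proposal in this thread is about when uniqueness ends.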

Interestingly, that thread explicitly proposes references having two lifetimes, but that appears to be borrow checker implementation / inference algorithm details and not anything close to a surface syntax proposal that users would be expected to deal with.

So although that's clearly not exactly what's being asked for in this thread, I strongly suspect that it already gets us most of the way to the optimal complexity/convenience trade-off, by making Rust just accept more code without forcing any new complexity on the user.

adamAndMath · July 23, 2019, 2:22pm

That RFC does have a lot in common with my proposal, though it's about delaying unique access. I don't see any way to expose this in userland, as one of the lifetimes is shorter than the variable's lifetime.

Furthermore, it can't solve the problem in question, as these lifetimes do live in userland. Thus any solution must enable naming of such lifetimes. If not, lifetime elision would be able to describe something that explicit lifetimes cannot.

It is interesting that "const/mut" leads to ideas like write-only references, while "shared/unique" led me to this idea. Neither makes much sense in the other paradigm.

CAD97 · July 23, 2019, 4:34pm

Just as a real-world example where "unique deterioration" would be useful, HashSet::get_or_insert is defined as fn get_or_insert(&'_ mut self, value: T) -> &'_ T. Instead, with this, it could be fn get_or_insert<'a>(&'_ mut'a self, value: T) -> &'_ T.

mcy · July 23, 2019, 7:29pm

I think this is equivalent to my example; I'd be interested to hear about different cases in which you want something like this.

I don't think that's quite true, actually! The only reason to want write-only references is to be able to have references that temporarily point at uninitialized memory. I think there is a reasonable mental model (though one which does not surface in the syntax perhaps) of various properties of a reference having different lifetimes:

Allocated-ness: during some lifetime, this reference points to memory that we have some kind of access to (hopefully someone like @RalfJung can specify the notion I want here; I believe that it is a weaker form of "validity" in the UWG sense). Addresses like 0x0, for example, are alloc for lifetime '![1], while data embedded into the binary is alloc for lifetime 'static; stack and heap memory are alloc for some lifetime 'a.
Initialized-ness: during some lifetime, this reference points to memory that is known to be initialized. Currently, &'a T implies initialized-ness for 'a, while a theoretical &'a uninit T might indicate initialized-ness for some unknown lifetime such as '!. To be read, a reference must be within a lifetime for which it is initialized.
Uniqueness: during some lifetime, this reference is the sole reference from which a particular memory region is reachable. Currently, &'a mut T implies uniqueness (and initialized-ness) for 'a; a theoretical &'a uninit T merely implies uniqueness. To be written to, a reference must be within a lifetime for which it is unique (pretend there is no such thing as const statics for a moment).

If we were going to write out a totally explicit reference from which the other reference types can be derived, you might write &'a init'i uniq'u T[2], and define

&'a T        := &'a init'a uniq'! T;
&'a mut T    := &'a init'a uniq'a T;
&'a uninit T := &'a init'! uniq'a T;

(Debatably, we could also define raw pointers as &'! init'! uniq'! T.)
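As a rough analogy only (not something proposed in this thread), today's standard library approximates the "unique but not yet initialized" case with `&mut MaybeUninit<T>`: writing through it yields an ordinary `&mut T`, which is both unique and initialized. A minimal sketch, where `init_slot` is a hypothetical helper:

```rust
use std::mem::MaybeUninit;

// Hypothetical helper: takes a unique reference to possibly-uninitialized
// memory, initializes it, and returns a reference known to be initialized.
fn init_slot(slot: &mut MaybeUninit<u32>, value: u32) -> &mut u32 {
    slot.write(value)
}

fn main() {
    let mut slot = MaybeUninit::<u32>::uninit();
    // Reading `slot` here would be undefined behaviour; only writing is allowed.
    let r = init_slot(&mut slot, 7);
    // After the write, `r` can be both read and written.
    *r += 1;
    println!("{}", *r);
}
```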

[1] '! is the "empty" region or "dead" lifetime, opposite to 'static. This has no meaning in actual Rust, since to witness a region 'a one must be within it.

[2] To be clear, we do not want to surface such insane syntax.

adamAndMath · July 23, 2019, 7:36pm

To be extra clear, we could not surface such syntax, as these references live longer than the lifetimes they contain. I do however think there is value in thinking of the properties in such a way.

Tom-Phinney · July 23, 2019, 9:06pm

To quote you, "I don't think that's quite true, actually!" There are embedded processors with peripherals that have some memory-mapped write-only registers. When there is more than one instance of such a peripheral, or the base address of the peripheral register block is configurable, it is useful to access the peripheral's registers by reference.

mcy · July 23, 2019, 9:15pm

You do not want references when dealing with MMIO in any situation, because LLVM can reorder reads and writes, and also speculatively read references. If you have an MMIO device where each sequential read triggers an update of the underlying register, you're gonna have a bad day.

You want to build a safe abstraction on top of raw pointers and volatile reads and writes. These are guaranteed to commit real reads and writes in true sequential order on every platform, or your money back. This is similar to how interacting with such memory via char* in C/C++ will have the wrong semantics, and should instead be declared volatile char*. Back when I wrote a toy operating system in C, a lot of the fussing about with the disk driver required volatile memory to work correctly wrt the memory model.
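A minimal sketch of such an abstraction, using `std::ptr::read_volatile` and `std::ptr::write_volatile`. The `Register` type is hypothetical, and an ordinary local variable stands in for a device register so that the example runs anywhere; a real driver would construct it from a device address.

```rust
use std::ptr;

// Hypothetical wrapper around one memory-mapped 32-bit register.
struct Register {
    addr: *mut u32,
}

impl Register {
    // Safety: `addr` must be non-null, aligned, and valid for volatile reads
    // and writes for as long as the `Register` is used.
    unsafe fn new(addr: *mut u32) -> Self {
        Register { addr }
    }

    fn read(&self) -> u32 {
        // SAFETY: guaranteed by the contract of `new`.
        unsafe { ptr::read_volatile(self.addr) }
    }

    fn write(&mut self, value: u32) {
        // SAFETY: guaranteed by the contract of `new`.
        unsafe { ptr::write_volatile(self.addr, value) }
    }
}

fn main() {
    // Stand-in for a device register (assumption for this sketch).
    let mut fake_device: u32 = 0;
    let mut reg = unsafe { Register::new(&mut fake_device) };
    reg.write(0xABCD);
    println!("{:#x}", reg.read());
}
```

No reference to the register's memory is ever created inside the wrapper, so the compiler has no licence to insert speculative reads of it.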

Tom-Phinney · July 23, 2019, 9:22pm

You are correct that all such memory-mapped I/O has to be volatile. If references as mapped to LLVM can't support that mode of access, then the access needs to use other functionality.

system · October 21, 2019, 9:36pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
