[Discussion] Hybrid borrow

While writting wrappers, I found something borrow issue that rust cannot handle very well. I called read-after-write interfaces.

fn raw(&mut self/*write*/)->&T/*read*/

Most of these interface could be directly splitted into

fn write(&mut self)->something;
fn read(&self)->&T;

But if we put the read and write in the same interface, we cannot give that function an approrite signature.

For example, the nightly interface, hash_set_entry #60896 has a signature get_or_insert(&mut self, value: T) -> &T, which receive a &mut T and yields a &T, considering the simple program below:

#![feature(hash_set_entry)]
use std::collections::HashSet;

fn main() {
    let mut hs = HashSet::new();
    let str1 = "hello";
    let a = hs.get_or_insert(str1); // insert is mutable, thus `hs` is borrowed exclusively
    // and then, we cannot borrow `hs` since `a` is not dropped.
    if hs.contains(str1) {
        println!("{a}")
    }
}

Although the program is logically correct, but the compiler refused to compile. Maybe we need a hybrid borrow instead of exclusive borrow in such case.

For example, adding a BorrowHybrid trait, adding a special immut function that convert BorrowHybrid objects to Borrow objects.

The signature could be

trait BorrowHybrid<'mutable, 'immutable, Borrowed:?Sized> {
    fn borrow<'a>(&'a self)->&'a Borrowed, where 'mutable:'a;
    fn borrow_mut<'a>(&'a mut self) -> &'a mut Borrowed, where 'mutable:'a;
    fn immut(self) -> &'immutable Borrowed;
}

We may use &'mutable mut 'immutable T as the signature of a hybrid borrow.

3 Likes

It's not clear to me how this would work. How would you use it with HashSet::get_or_insert for example?

How does this avoid unsoundness with internally mutable types? For example you can use RefCell::get_mut to get a reference to the contents of the RefCell, if you're then allowed to downgrade it to a shared borrow you would be able to call .borrow_mut() on the RefCell (because it is not longer mutably borrowed) and get a mutable reference to its contents while the shared one exists.

You mention BorrowHybrid to be a trait, but why the code snippet uses it in an impl as if it was a type?

Is this supposed to be new syntax? How does it interact with the trait, and why are both needed?

2 Likes

This sounds a lot like "downgradable" or "downgrading" borrows that were discussed before:

The main problem is that it must stay impossible to get a shared borrow from an unique/mutable borrow when interior mutability is involved. So your BorrowHybrid::immut() may not be called when Borrowed has interior mutability.

11 Likes

My fault.

The real situation is a little bit complex, and thus I choose the get_or_insert as an example.

The real case is something related to R FFI.

R has a function Rf_alloc, which allocate memory (with triggering GC if needed) and returns an allocated item, such item could be garbage collected in the next Rf_alloc call unless we PROTECT it.

Thus we could simply write a safe wrapper:

fn safe_alloc(zst:&mut BorrowChecker, params:Params) -> Unprotected<'a, SEXP>{
    // zst is exclusively borrowed, thus there's no immutable borrow while calling safe_alloc.
    // thus there should be no other unprotected items held by Rust.
    Unprotected{item:Rf_alloc(params), _phantom:PhantomData}
}

Since we cannot send lifetime other than the one send with zst, the exclusive borrow last for at least 'a, which makes the shared calls unreachable:

fn another_wrapper(zst:&BorrowChecker, params:Params) {
    // do something
}
fn main(){
    let zst = &mut BorrowChecker;
    let allocated = safe_alloc(zst, Params{...});
    // cannot call another_wrapper since allocated makes zst borrowed exclusively
}

The only design is that, seperate borrow and borrow_mut.

before immut is called, BorrowHybrid could be regard as BorrowMut directly. After immut is called, BorrowHybrid becomes Borrow, which could be considered as a normal borrow.

Function with BorrowHybrid signature could be regarded as 2 functions:

struct Foo;
impl Foo {
    fn mut_part(&'m mut self)->AllUsableParameters{...}
    fn immut_part(&'i self, aup: AllUsableParameters)->Ret<'i>{...}
    fn hybrid(&'m mut 'i self,params:Params)->Ret<'i>{
        let aup = self.mut_part(params);
        let immut_self = self.immut(); // change the `BorrowHybrid` to read only.
        immut_self.immut_part(aup)
    }
}
fn main(){
    let foo=Foo;
    let params = ...;// suppose we have that.
    // this code
    {
        let ret = foo.hybrid(params);
    }
    // has the same effect with
    {
        let temp = foo.mut_part(params);
        // here is the time to call `.immut()`
        let ret = foo.immut_part(temp);
    }
}

RefCell::get_mut returns a &mut T, rather than a &'a mut 'b T, thus the downgrade is a normal downgrading from &'a mut T to &'a T, no hybrid is taking considered in this case.

I propose a usable syntax, I strongly doubt about that, whether 'mutable mut 'immutable is better, or &'immutable mut 'mutable is better, since we may write mut ident, the mut seems mut something, but for lifetime, we wrote 'a mut rather than mut 'a, thus I have no idea using which grammar. Maybe something like &<mut 'a, 'b> is acceptable.

Firstly, BorrowHybrid is a trait, we could make it available only with the type marked as Freeze.

Secondly, the immut call accepts self, rather than &mut self, which behaves like a drop. This should be a good-enough design:

  • before immut, the BorrowHybrid behaves like a BorrowMut, thus any interior mutability does not work before immut is called
  • after a immut is called, we should regard the call as, drop the &mut borrow, and re-borrow the item. If any mutable borrows remains, such borrows should also remains after dropping the mutable reference, which is directly UB.

In my design, the hybrid borrow is not downgrade, it is re-borrow after drop, thus it should be safe enough.

I'm fine with using HashSet::get_or_insert as an example, what I want to understand however is how HashSet::get_or_insert and/or its uses need to be changed to benefit from your proposal.

That's the problem. RefCell::get_mut and similar get_mut methods on types with internal mutability are sound only because you cannot downgrade an exclusive borrow to a shared borrow.

That doesn't answer the second question, how does it interact with the trait and why do you need/want both of them?

You can bypass this trivially by using indirection, for example Box<RefCell<T>>: Freeze but still suffers from the problem mentioned above.

3 Likes

Changing the signature from (&mut self, value: T) -> &T to fn(&'a mut 'b self, value: T) -> &'b T, and add a .immut call in case the call is finished.

// implementation of `get_or_insert` in crate hashbrown:
    #[cfg_attr(feature = "inline-more", inline)]
    pub fn get_or_insert<'a,'b:'a>(&<'a mut, 'b> self, value: T) -> &'b T {
        self.map
            .raw_entry_hybrid() // instead of `raw_entry_mut`, we obtain a hybrid entry here.
            .from_key(&value) // same
            .or_insert(value, ()) // same
            .0 // same
            .immut() // call immut, convert &<'a mut, 'b> borrow into &'b borrow.
    }

You should notice that, here, a .immut performs just like a drop. Before the drop occurs, &<'a mut, 'b> is actually &'a mut, any downgrade, without a .immut() is called, any references of the hybrid borrow only extends the lifetime of 'mutable lifetime.

In my design, the borrow_hybrid is borrow it exclusively, drop, and then make a normal borrow

let mut a = hybrid.borrow_hybrid();
do_something(a);
let a = a.immut();
do_some_other_thing(a);

this is actually

let mut a = &mut hybrid;
do_something(a);
drop(a);
let a = &hybrid;
do_some_other_thing(a);

If interior mutability occurs with hybrid borrows, it must occurs in the normal borrows.

@SkiFire13 I've created a small demo with (almost) safe rust, and I have no idea how the interior mutability could affect the code:

use core::ops::{Deref, DerefMut};
use core::marker::PhantomData;
pub struct Avatar<'a,'b:'a, T:?Sized>(&'a mut T, PhantomData<&'b T>);
impl<'a,'b:'a,T:?Sized> Avatar<'a,'b,T> {
    pub fn from(t:&'b mut T)->Self{
        Self(t, PhantomData)
    }
    pub fn immut(self)->&'b T {
        // SAFETY: the original borrow lives at least `&'b mut`
        //         which allows such transmute directly
        unsafe {core::mem::transmute::<&'a mut T, &'b T>(self.0)}
        // since currently we always borrow `t` for at least &mut 'b
        // we could directly store the (&'b mut T) rather than storing `'a` and extending it here.
    }
}
impl<'a,'b:'a,T:?Sized> Deref for Avatar<'a,'b,T> {
    type Target = T;
    fn deref(&self)->&T {
        &self.0
    }
}
impl<'a,'b:'a,T:?Sized> DerefMut for Avatar<'a,'b,T> {
    fn deref_mut(&mut self)->&mut T {
        &mut self.0
    }
}
pub struct RAW(i32);
impl RAW {
    pub fn read(&self)->&i32{&self.0}
    pub fn pointer(&mut self)->&mut i32{&mut self.0}
    pub fn write(&mut self, data:i32){*self.pointer()=data}
}
impl<'a,'b:'a> Avatar<'a,'b,RAW> {
    // Due to current limitations, we cannot tell compiler the originally borrowed item is
    // borrowable now, thus we return its references.
    pub fn read_after_write(mut self, data:i32)->(&'b i32,&'b RAW) {
        self.write(data);
        let immut = self.immut();
        (immut.read(), immut)
    }
}

The avatar plays the role of a BorrowHybrid, with (almost) safe Rust. If you're worrying about safety, here is a totally safe version with relaxed lifetime markers(it is &'b mut rather than a shorter &'a mut is stored):

use core::ops::{Deref, DerefMut};
use core::marker::PhantomData;
pub struct Avatar<'a,'b:'a, T:?Sized>(&'b mut T, PhantomData<&'a T>); // In safe version, 'a is a
                                                                      // useless lifetime.
impl<'a,'b:'a,T:?Sized> Avatar<'a,'b,T> {
    pub fn from(t:&'b mut T)->Self{
        Self(t, PhantomData)
    }
    pub fn immut(self)->&'b T {
        self.0
    }
}
impl<'a,'b:'a,T:?Sized> Deref for Avatar<'a,'b,T> {
    type Target = T;
    fn deref(&self)->&T {
        &self.0
    }
}
impl<'a,'b:'a,T:?Sized> DerefMut for Avatar<'a,'b,T> {
    fn deref_mut(&mut self)->&mut T {
        &mut self.0
    }
}

Here is a demo, since currently Rustc does not allow return a shared reference back to variable borrowed exclusively then allow shared borrows of this struct, we do such thing manually, returning a shared references together with the wanted read-after-write references.

pub struct RAW(i32);
impl RAW {
    pub fn read(&self)->&i32{&self.0}
    pub fn pointer(&mut self)->&mut i32{&mut self.0}
    pub fn write(&mut self, data:i32){*self.pointer()=data}
    pub fn read_after_write(&mut self, data:i32)->&i32 { // original version.
        self.write(data);
        // now self is read-only
        self.read()
    }
}
impl<'a,'b:'a> Avatar<'a,'b,RAW> {
    // Due to current limitations, we cannot tell compiler the originally borrowed item is
    // borrowable now, thus we return its references.
    pub fn read_after_write(mut self, data:i32)->(&'b i32,&'b RAW) {
        self.write(data);
        // now self is read-only
        let immut = self.immut();
        // return both read-after-write references and a references of `RAW` struct.
        (immut.read(), immut)
    }
}
fn main(){
    let mut a = RAW(0);
    println!("{}", a.read());
    a.write(1);
    println!("{}", a.read());
    *a.pointer()=2;
    println!("{}", a.read());
    {
        let mut a = Avatar::from(&mut a);
        println!("{}", a.read());
        a.write(3);
        println!("{}", a.read());
        *a.pointer()=4;
        println!("{}", a.read());
        let (read,a) = a.read_after_write(5);
        // Due to the limitation of current compiler, I have no idea how to tell compiler that,
        // `a` *is currently borrowable*. Thus I have to send this borrow back.
        println!("{} {}", a.read(),read);
    }
    println!("{}", a.read());
}

The potential unsoundness is if downgrading mutable references allows you to do essentially:

let cell: &mut RefCell<i32> = /* ... */;
// (1) get an interior &mut without dynamic checks.
let get_mut: &mut i32 = cell.get_mut();
// (2) downgrade this to a shared reference.
let get_ref: &i32 = downgrade(get_mut);
// (3) cell is now only shared-borrowed.
// (4) borrow interior &mut with a dynamic check.
let borrow_mut: RefMut<i32> = cell.borrow_mut();
// (5) that does not panic; in fact, it can't.
// (6) now we have aliasing `&mut i32` and &i32`.
oops(&mut *borrow_mut, get_ref);
// (7) something before this point mustn't compile.

Rather than continue to talk around it — what exactly prevents you from writing this? What type signature would need to change in order for this to compile? What makes that signature change unsound?

I think I have a potential answer, but please consider what yours is before looking at mine[1]. Write out what the error message should be[2].


  1. The erroneous statement is (3). This would compile given fn RefCell<T>::get_mut(&<'a mut, 'b> self) -> &<'a mut, 'b> T, but that isn't the signature. The signature is fn RefCell<T>::get_mut(&<'a mut, '_> self) -> &<'a mut, 'a> T, such that downgrading the output &mut does not relinquish the mutable loan that it holds on the source cell. This does mean that approximately every single fn that returns &mut will need to be updated before downgrading its output &mut is more useful than simply reborrowing &*it, but surely that's no big deal, right? ↩︎

  2. The lifetime signature I gave would produce the same error message as today, because the error is in fact identical in cause:

    error[E0502]: cannot borrow `*cell` as immutable because it is also borrowed as mutable
      --> src/main.rs:11:35
       |
    3  | let get_mut: &mut i32 = cell.get_mut();
       |                         ---- mutable borrow occurs here
    ...
    8  | let mut borrow_mut: RefMut<i32> = cell.borrow_mut();
       |                                   ^^^^ immutable borrow occurs here
    ...
    11 | oops(&mut *borrow_mut, get_ref);
       |                        ------- mutable borrow later used here
    

    except I hope the issue with that is clear — a downgrade was (ineffectually) performed and yet since it didn't relinquish the mut loan, it doesn't even show up in the borrow error blame, leaving the dev no more aware as to why their downgrade isn't working. ↩︎

2 Likes

If the only difference between the safe and unsafe versions is the lifetime of the field why make two versions at all?

Doing this automatically for all types is unsound, as already mentioned due to types with internal mutability.

Meanwhile doing this manually is something that can already be done. In you example you just need to change RAW::read_after_write to be the same as the Avatar version.

There are two possible path

  1. Ensure that &mut has the same lifetime as get_ref, thus we cannot borrow cell (since it is already borrowed exclusively)
  2. allow panic in (5), which means some tweaks should be done to ensure the cell is marked as borrowed

The first one is not this discussion interested in, the hybrid borrow plays the same role as a normal &mut borrow. (Thus the error message is exactly the same what the compiler will show now).

The second one is tricky, if we modify get_mut to get_hybrid, only the signature changes. Then, the get_ref convert the mutable borrow into a immutable borrow. According to doc, RefCell should be marked as borrowed now. Thus the implementation of .immut of RefCell should be something like:

let cell: &<'a mut, 'b> RefCell<i32> = /* ... */;
let get_hybrid: &<'a mut, 'b> i32 = cell.get_hybrid(); // no runtime check occurs
let get_ref: &i32 = get_hybrid.immut(); // * some compicated things happened
// 1. drop &<'a mut, 'b> i32 (pre-immut step)
// 2. re-borrow as &'b i32 (post-immut step)
// since drop &<'a mut, 'b> i32 directly triggers the drop of `&<'a mut, 'b> RefCell<i32>`
// thus we should execute the following instructions after 1. and before 2.:
// 1.1. drop `&<'a mut, 'b> RefCell<i32>`
// 1.2. re-borrow `&<'a mut, 'b> RefCell<i32>` as `&'b RefCell<i32>`.
// Thus the final instructions it executed is:
// 1. drop &<'a mut, 'b> i32
//   currently, &<'a mut, 'b> RefCell<i32> has no extra borrows and could be dropped
// 2. drop &<'a mut, 'b> RefCell<i32>
//   here is the tricky thing,
//   for normal Rust, we just drop the `&<'a mut, 'b> RefCell<i32>` and thus this item is unreachable, 
//   which prevent further usage of this references in our execution flow until we return back to the function which generates the hybrid borrow.
//   but actually, the borrowchecker ensure that, currently, no borrow of any kind exists
//   thus we *could* perform a fresh borrow here.
// 3. generates a `&'b RefCell<i32>` magically.
//   as analyzed above, this is safe, since we are just create a fresh borrow
// 4. borrow `&'b i32` from `&'b RefCell<i32>`, which directly yields a `Ref<'b>` type and marked `RefCell` as borrowed.

// Thus
// (4) borrow interior &mut with a dynamic check.
let borrow_mut: RefMut<i32> = cell.borrow_mut();
// failed, since the step `4.` described above marked the cell as borrowed.

Such implementation remains an unsolve question, which might be done by borrow checker, but I'm definately not an expert about that.

Unsolved question: Partial hybrid borrow

(Will users writting such code?)

let a = &<'a mut, 'b> (0,1);
let b = &<'_ mut, '_> a.0;
let c = &<'_ mut, '_> a.1;
// (1) no conflict now, since `b` and `c` borrows different field of `a`.
// but
b.immut();
// b.immut() performs:
// 1. drop `b`
// 2. re-borrow `b`
// re-borrow requires `a.immut()` (otherwise the re-borrowed reference is borrowed from `&mut`)
// since `c` is alive, we cannot re-borrow a

The unsafe version uses a more strict life time restriction (mutable borrow lasts for 'a, and immutable borrow lasts a little bit longer, use this notation, people will not get confused about extend the mutable borrow 'a unexpectly)

I'm fighting with such internal mutability, and IMHO it could deal with internal mutability. (Although the Avatar I provide could not deal with it, but writting such signature is possible (as the example above))

You don't just borrow a RefCell to get a Ref<'b>, you have to call the .borrow() method. There's no precedent for the borrow checker inserting hidden calls to methods, rather there are already usecases that rely on the borrow checker not being required to compile a Rust program.

And even then this only works for RefCell and perhaps other types that yield runtime guards. Most notably Cell and the Atomic* types do not use such guards, thus your proposal would still be unsound for them.

I suggest you to look for a general solution that takes into consideration the properties/invariants of mutable/immutable borrows rather than trying to fix specific problematic cases (e.g. RefCell), as fixing those will never be able to prove that there aren't other cases to fix too.

There's no such restriction in the code you posted. Both mutable and immutable references last for a temporary lifetime (when obtained through deref/deref_mut) and when consumed both produce references with lifetime 'b. The only difference is the lifetime of the reference stored inside struct, but that's private and unobservable.

1 Like

This is what <&<'a mut, 'b>::RefCell>::immut should do.

Actually, make hybrid mut of such unsupported type could generate an error. This is why I designed it as a trait.

let cell: &<'a mut, 'b> AtomicI32 = /* ... */;
let get_hybrid: &<'a mut, 'b> i32 = cell.get_hybrid(); // no runtime check occurs
let get_ref: &i32 = get_hybrid.immut(); // * some compicated things happened
// 1. drop &<'a mut, 'b> i32 (pre-immut step)
// 2. re-borrow as &'b i32 (post-immut step)
// since drop &<'a mut, 'b> i32 directly triggers the drop of `&<'a mut, 'b> AtomicI32`
// thus we should execute the following instructions after 1. and before 2.:
// 1.1. drop `&<'a mut, 'b> AtomicI32`
// 1.2. re-borrow `&<'a mut, 'b> AtomicI32` as `&'b AtomicI32`.
// Thus the final instructions it executed is:
// 1. drop &<'a mut, 'b> i32
//   currently, &<'a mut, 'b> AtomicI32 has no extra borrows and could be dropped
// 2. drop &<'a mut, 'b> AtomicI32
//   here is the tricky thing,
//   for normal Rust, we just drop the `&<'a mut, 'b> AtomicI32` and thus this item is unreachable, 
//   which prevent further usage of this references in our execution flow until we return back to the function which generates the hybrid borrow.
//   but actually, the borrowchecker ensure that, currently, no borrow of any kind exists
//   thus we *could* perform a fresh borrow here.

// we could split the whole program into 2 parts, one stops here and another starts from here.
// both program is safe, since we do no dangerous operations
// since there is no hybrid borrows left, the life time of the temporary variables does not contains the hybrid lifetime.
// thus, we could firstly execute the program stops here, and then drop all the hybrid borrows
// and then execute the remain parts, re-generate a immutable borrow.

// 3. generates a `&'b AtomicI32` magically.
//   as analyzed above, this is safe, since we are just create a fresh borrow
// 4. borrow `&'b i32` from `&'b AtomicI32`

The main change is modify all RefCell<i32> to AtomicI32, and the .immut logic remains the same.

I have no idea why you said other cases are not fixed. If you have, please show me an example.

This is the restriction of Rust itself. Currently I cannot return the reference back to the original variable, such sad situation makes the demo less attractive.

The designed read-after-write signature should be

impl RAW {
    pub fn read_after_write<'a,'b:'a>(&<'a mut, 'b> self, data:i32)->&'b i32 {
        self.write(data);
        let immut = self.immut();
        immut.read()
    }
}

which allows borrow RAW struct immeditely after the read_after_write is called.

To show that it is allowed to do such thing, I uses the dirty implementation, return a borrow reference along with the readed result, which proof the usability of borrow RAW later.

If the read_after_write has the correct signature, we could directly borrow the struct after read_after_write is called. Such design only works for signature &<'a mut, 'b>. If the signature is &<'b mut, 'b>, we have to return the reference back.

1 Like

That is never called called in the example above. immut is only ever called on a &<'a mut, 'b> i32, which has no knowledge of being in a RefCell.

I suppose that is something that borrow checker should be done. If the exact signature is expected, maybe

&<'a mut, 'b, &<'a mut, 'b, !> RefCell<i32>> i32

is the correct one. Here,

&<'a mut, 'b, !> T

means T is hybrid borrowed trivially.

&<'a mut, 'b, &<'a mut,'b, !> T> U

means U is hybrid borrowed from a hybrid borrowed type T.

I think this should be fixed at HashSet API by adding:

/// Inserts a value in the hash set, if not already present.
///
/// Always returns a shared reference to the set and a shared reference to the entry.
/// The entry reference is either the inserted one when the result is `Ok`, or the existing
/// one when the result is `Err`. In this last case, the non-inserted entry is returned too.
fn bikeshed_insert(&mut self, value: T) -> (&Self, Result<&T, (&T, T)>);

The idea is to fusion get_or_insert and insert into a single "linear" (or think loss-less) function except it consumes the exclusive access to return shared access. Then both get_or_insert and insert are trivially implemented using this more powerful bikeshed_insert function.

If HashSet doesn't provide this function, one could hope to implement it themselves, but for that they would need Polonius to solve the conditional control flow across functions problem. And they would still need the HashSet API to provide a better insert.

fn bikeshed_insert<T: ...>(xs: &mut HashSet<T>, x: T) -> (&Self, Result<&T, (&T, T)>) {
    if let Some(old) = xs.get(&x) { // Here self is borrowed for a long lifetime
        return (xs, Err(old, x));
    }
    // With Polonius, that borrow would end now, such that we have exclusive access again.
    assert!(self.insert(x.clone())); // Ideally, HashSet::insert would return a reference...
    // We now only use shared access for the long lifetime
    (xs, Ok(xs.get(&x).unwrap()))
}