Allow method calls without borrowing entire struct


#1

If you have a struct Foo { a: String, b: String } where a and b are not pub but rather have one getter/accessor method each (get_mut_a(&mut self) -> &mut String, likewise for b), you cannot borrow both at the same time.

The following is barely a pre-pre-RFC, and also a disclaimer ahead: I couldn’t find any discussions on here or on Reddit although I am sure they exist because this issue, together with e.g. self-referencing structs, is IMO one of the most annoying limitations of the borrowchecker (I promise I searched thoroughly, but I also didn’t know under which name this issue is known)

To me the problem here is that both get_mut_a and get_mut_b mutably borrow self in its entirety. Explicitly, in their signatures. What if they didn’t have to?

Currently get_mut_a may implemented like this:

impl Foo {
    fn get_mut_a(&mut self) -> &mut String { &mut self.a }
}

But what if it was:

impl Foo {
    fn get_mut_a(&mut self.a) -> &mut String { &mut self.a }
}
  • self.a is still not a pub field, but it is exposed in the method signature. The API consumer knows that their Foo is not completely borrowed (and can rely on the fact that get_mut_b will not borrow the same part of Foo as get_mut_a), and can see that there is a field called a inside the struct. But they can’t access it directly as long as it isn’t marked as pub explicitly.
  • Inside get_mut_a, accessing self.b just doesn’t work. It’s not “in scope”.
  • if you need multiple fields, I guess you could write &mut self.{a,b}? Seems roughly consistent with how use works at least. EDIT: perhaps adding two self parameters is nicer: fn foo(&mut self.a, &mut self.b). Then you can also have distinct lifetime parameters, borrow one field as immutable etc

Not everything is a struct field

This solution is highly specific to struct fields. This is also a problem, but not at all solved by the new syntax above:

let mut foo = vec![1,2,3];
let a = foo.get_mut(1);
let b = foo.get_mut(2);

My dream is that, at some point, the direct correlation between &mut self.a in a function signature and an actual struct field a is removed. Such that one can introduce arbitrary “phantom fields” to tell the borrow checker which method calls on a struct would borrow which parts of the struct. Something like:

impl Vec<T> {
    fn get_mut(&mut self[i], i: usize) -> Option<T> {
        // unsafe block here???
    }
}

What do you think?


#2

I like the “phantom fields” idea, but rather than making it unsafe, what if the borrow checker enforced it?

Suppose you declared three methods, declared such that two of them borrow one set of fields and the third borrows another. Without declaring exactly which fields, you could simply declare them with some abstract indication of which ones overlap, and then the borrow checker could check that those overlap indications are consistent with the actual fields the structs borrow. You’d still be exposing which borrows overlap in the API, but not what precise fields they borrow.


#3

Yeah, I haven’t really thought about how this could “recurse” nicely (i.e. s.t. you can write a wrapper method for Vec::get_mut that has the same exact API and borrow-constraints without writing unsafe code). For the simple struct case Foo::get_mut_a it’s more imaginable.

Also an open question is: If you have code like let a = foo.get_mut(i); let b = foo.get_mut(j);, how do you prove to the compiler that i != j? I suppose the compiler could figure out the following cases on its own:

  • assert!(i != j);
    let a = foo.get_mut(i);
    let b = foo.get_mut(j);
    
  • if i != j {
        let a = foo.get_mut(i);
        let b = foo.get_mut(j);
    }
    

and if the compiler still thinks i and j might be equal, the programmer can always throw in an assert!.


#4

I personally like “borrow regions” proposal, although it does not solve problem with element-wise borrowing from containers, but I think it should be solved with Vec wrapper or modification which will implement run-time enforcement of element-wise borrowing rules (any suggested crates?) and not on the language level.


#5

Data-dependent borrows seem significantly harder. I’d settle for statically determined non-overlapping borrows in methods.


#6

For prior/related discussion:

  • Essentially the same problem happens with closure captures, but it’s arguably a much simpler case because we could fix it just by changing language rules without adding any new syntax. https://github.com/rust-lang/rfcs/pull/2229 is the RFC for doing just that.

  • In the context of methods borrowing self, I’ve seen this idea semi-consistently referred to as “partial self borrowing”, which I think is likely the best name we’re going to get for it.

  • Partial self borrowing syntax is an old thread, but it has multiple comments from @nikomatsakis and @withoutboats covering why partial borrows are a lot harder than they look.

To summarize the significant concerns/problems I found by skimming existing threads/issues:

  • First, note that this feature is totally redundant for public fields, since the client code can already do “partial borrows” just by accessing the fields themselves. So we can assume this is strictly about private fields.
  • How does partial borrowing work with traits? The fields in traits proposal would only help for public fields, so we’d probably need something like “borrow regions” in traits, which seems like a pretty heavy feature when the only benefit is making a few more method call sequences borrow-legal.
  • Exposing private fields in public method signatures seems like leaking implementation details, which is undesirable in principle. It’s likely that many use cases for this feature should consider simply making more fields public.
  • Partial self borrows create a new kind of API compatibility hazard, because two public methods that had disjoint partial self borrows may stop being disjoint if the private fields they use get merged into one field. So if we added this feature, anytime you actually used it on two of your methods, your public API would be implicitly making a guarantee that the sets of fields borrowed by those two methods will always remain disjoint.

So, now that I’m aware of these complications, I’m not sure the feature really pulls its weight. I’m struggling to think of a use case where turning the “borrow regions” into actual public fields (possibly of wrapper types that hide exactly how many fields implement each “region”) is not a viable solution, and you can reasonably commit to those borrow regions remaining disjoint in all future compatible versions.


#7

Note that regions are not only about public API, even in private code they will allow us to write code like this:

struct S {
    region buffer { buf: Vec<u32>, foo: Foo }
    region bar: Bar,
}

impl S {
    // take mutable reference `buf` using some complicated logic based
    // mutable use of `buf` and `foo`
    fn take_element(&mut(buffer) self) -> &mut u32 { .. }

    fn modify_bar(&mut(bar) self, val: &mut u32) { .. }

    fn do_stuff(&mut self) {
        let var = self.take_element();
        // currently impossible or unergonomic without "borrow regions"
        self.modify_bar(var);
    }
}

This example is quite simple, but demonstrates the general idea.


#8

Thanks, that thread about partial self-borrows is what I missed.

The syntax I propose doesn’t expose the fields. It’s just creating a “borrow region” that is called the same as the fields it borrows. It’s just that my proposal doesn’t allow for explicit definition of borrow regions yet (which I called “phantom fields”). I don’t see how one would expose more implementation details with my proposal, except for the field name.


#9

I forgot to respond to the rest of your points.

To clarify terminology: As far as I understand from the thread you linked, “phantom fields” and “fields” are both creating a borrow region. So “fields” can be called “implicit regions” and “phantom fields” perhaps “explicit regions”.

Yes exactly.

It wouldn’t have to, or at least I don’t think that this would have to be implemented at the same time the simple concrete-type-only variant gets implemented. It probably requires syntax for “explicit borrow regions”.

I agree that this idea doesn’t cover a significant portion of usecases one might have, it’s just supposed to “get the ball rolling”. Just like impl Trait is not working everywhere. If designed right (probably isn’t), it can be extended cleanly later.

I don’t see the implicitness of this. Current code borrows self in its entirety. If you start borrowing specific fields, you’re willfully adding this guarantee to your API.

The major difference is that you:

  • don’t have to commit to either the name or the type of the field you’re using. The name is visible in the function signature and therefore in rustdoc, but the intention is that it’s not possible to write code that relies on the name of that borrow region.

  • are not inherently exposing all pub members of that field unless you e.g. return a &mut pointer to it.


#10

That’s why I added the parenthetical about wrapper types. Apparently that wasn’t clear enough, so here’s what I meant in code:

struct FooA {
    str: String
}
impl FooA {
    pub fn get_mut(&mut self) -> &mut String { &mut self.str }
}
struct FooB {
    str: String
}
impl FooB {
    pub fn get_mut(&mut self) -> &mut String { &mut self.str }
}
struct Foo {
    pub a: FooA,
    pub b: FooB
}

...

let foo = Foo::new(...);
// foo.get_mut_a();
// foo.get_mut_b(); // ERROR two mut borrows
foo.a.get_mut();
foo.b.get_mut(); // OK

Obviously, it’s a significant amount of boilerplate, but I can’t think of a situation where this wouldn’t solve the same problem as a genuine partial self borrows feature, and this doesn’t require committing to the names or types of or exposing the “real” private fields. And it makes it extremely obvious that you’re committing to keeping the two regions separate in your public API.

So, to explicitly state what I was trying to suggest in my last post, to me the big questions here are:

  1. Are there any cases where refactoring to public fields of wrapper types like this is not a feasible solution at all, rather than a slightly tedious solution?
  2. Are use cases for partial self borrows common enough to justify adding this feature purely as a convenience (arguably even a “sugar”) to avoid the boilerplate needed today?

Since 2 can only be decided by how many other users speak up about this, 1 is probably the more productive one to discuss. So the only point that I’m actually asking for a response on is: Can you think of any non-contrived cases where public fields of wrapper types just don’t work at all?


#11

The problem with this is, now the method’s type and signature will depend on its implementation. It won’t be enough to say it takes &self, you’ll have to say how much / exactly what in &self. That’s already bad enough in itself, but it’s especially worrying in combination with public methods accessing private fields – because it leaks the function’s reliance on certain private fields as a public API constraint.


#12

I am convinced that you have to expose some information about which parts of your struct will be accessed. I don’t see the big difference in whether that happens by explicitly defining borrow regions, or by specifying the exact fields. The latter only leaks their names, it doesn’t leak their types or their pub members.


#13

Ah, that’s a good point. I can’t think of any real cases either. I am not sure how this would work if you (ever) run in a situation where borrows would partially, and only partially, overlap:

struct Foo {
    a: String,
    b: String,
    c: String
}

impl Foo {
    pub fn get_mut_ab(&mut self.a, &mut self.b) -> _ { ... }
    pub fn get_mut_bc(&mut self.b, &mut self.c) -> _ { ... }
}

But I can’t think of a real case for this either. Admittedly, to me the more interesting usecases generally involve data-dependent regions, such as for Vec::get_mut


#14

Could this not also be achieved by simply returning a tuple?

struct Foo { a: String, b: String }

impl Foo {
   fn get_mut_a_b ( &mut self ) -> ( &mut String, &mut String ) {
      ( &mut self.a, &mut self.b )
   }
}

let foo : Foo = Foo::new();

let mut ( a, b ) = foo.get_mut_a_b();

a.push_str( "Hello" );
b.insert_str( 4, "World"); 

#15

The abbreviated implementation of get_mut_ab is not the issue in my example. The issue is in the fact that neither method strictly needs the entirety of Foo, and that the pattern suggested by @Ixrec isn’t a sufficient replacement as far as I can tell.

Your example needs all fields of self anyway (the c is missing in your struct def), so neither the proposed partial self-borrow nor @Ixrec’s pattern is of any use.

EDIT: Just realized you were responding to the OP: Yes, absolutely. That’s basically the workaround: Having additional getters for when you want to borrow multiple things at once.


#16

(as an aside, here’s a minimal, compiling example.)


#17

Which is what split_at_mut() is.