Partial self borrowing syntax

This is valid D code:

struct Foo {
    uint x;
    uint[2] v;

    void bar() {
        x++;
    }

    void spam() {
        foreach (_; v)
            bar();
    }
}

void main() {}

A direct translation to Rust does not compile:

struct Foo {
    x: u32,
    v: [u32; 2]
}

impl Foo {
    fn new() -> Self {
        Foo { x: 0, v: [0; 2] }
    }

    fn bar(&mut self) {
        self.x += 1;
    }

    fn spam(&mut self) {
        for _ in self.v.iter() {
            self.bar();
        }
    }
}

fn main() {}

It gives an error:

error: cannot borrow `*self` as mutable because `self.v` is also borrowed as immutable [--explain E0502]
  --> ...\test.rs:17:13
   |>
16 |>         for _ in self.v.iter() {
   |>                  ------ immutable borrow occurs here
17 |>             self.bar();
   |>             ^^^^ mutable borrow occurs here
18 |>         }
   |>         - immutable borrow ends here

I solve the problem by using a “static” method (an associated function without self) and passing it mutable references to just the unborrowed struct fields it needs to use or modify:

struct Foo {
    x: u32,
    v: [u32; 2]
}

impl Foo {
    fn new() -> Self {
        Foo { x: 0, v: [0; 2] }
    }

    fn bar(x: &mut u32) {
        *x += 1;
    }

    fn spam(&mut self) {
        for _ in self.v.iter() {
            Self::bar(&mut self.x);
        }
    }
}

fn main() {}

But would it be a good idea to introduce some syntax that names just the subset of fields a method borrows, so that a more natural instance method can be used?

struct Foo {
    x: u32,
    v: [u32; 2]
}

impl Foo {
    fn new() -> Self {
        Foo { x: 0, v: [0; 2] }
    }

    fn bar(&self(mut x)) {
        self.x += 1;
    }

    fn spam(&mut self) {
        for _ in self.v.iter() {
            self.bar();
        }
    }
}

fn main() {}

That optional &self(mut x) fantasy syntax is also useful for regular instance methods, when you want to better specify (and enforce) in the method signature what instance fields the method is allowed to use. This makes the flow of information between the struct methods more explicit and could avoid some mistakes. Static analysis tools will love that extra information.

This one works.

struct Foo { x: u32, v: [u32; 2] }

impl Foo {
    fn new() -> Self {
        Foo { x: 0, v: [0; 2] }
    }

    fn bar(&mut self) {
        self.x += 1;
    }

    fn spam(&mut self) {
        let bla = self.v;

        for _ in bla.iter() {
            self.bar();
        }
    }
}

But this one only works because [u32; 2] is Copy, so self.v is copied into bla and the loop no longer borrows self. For the general case of a non-Copy field, partial borrowing is still an issue.
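For a non-Copy field, one possible workaround (not a fix for the underlying issue) is to move the field out temporarily and put it back afterwards. A minimal sketch, assuming a hypothetical variant of Foo whose v is a Vec<u32>; std::mem::take leaves an empty Vec in its place while the loop runs:

```rust
// Hypothetical variant of Foo: `v` is a Vec, which is not Copy.
struct Foo {
    x: u32,
    v: Vec<u32>,
}

impl Foo {
    fn bar(&mut self) {
        self.x += 1;
    }

    fn spam(&mut self) {
        // Move the Vec out of `self`, leaving an empty Vec behind,
        // so nothing borrows `self` during the loop.
        let v = std::mem::take(&mut self.v);
        for _ in v.iter() {
            self.bar();
        }
        // Put the original contents back.
        self.v = v;
    }
}
```

The cost is that self.v is observably empty while bar runs, so this only helps when bar really does not look at v.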

Related RFC issue: https://github.com/rust-lang/rfcs/issues/1215

I’ve thought about having some syntax. At one point I entertained the idea of (ab)using patterns, such that one would write:

fn bar(&mut Foo { ref mut x, .. }: &mut Foo) {
   *x += 1;
}

The idea would then be that this method only borrows the x field, and the caller knows it, which means that the borrow checker can accept calling code that still holds borrows of the other fields.

This doesn’t actually work for a number of reasons:

  • horrendously verbose and cryptic;
  • not necessarily backwards compatible, sort of changes the meaning of existing things;
  • no indication that this is a method (no self parameter);
  • seems weird that using a pattern should be visible to the caller; I think of the pattern as not part of the public signature.

I do think this problem is real, certainly. I’ve been reluctant to pursue adding more complex annotations into the parameter itself. I imagine maybe one could use something like:

fn bar(&mut self)
    use self.x
{
    self.x += 1;
}

where the use declarations are a way to indicate to the caller what parts of self you use (and naturally these are checked). This doesn’t seem too terrible. We’d presumably need to add similar things to fn types and so forth, though, and it (maybe) creates a kind of subtyping relationship: are you allowed to convert a fn bar(self: &mut Foo) use self.x into a fn bar(self: &mut Foo)? Presumably we’d support coercions or something.

It’s interesting to note the interaction of this with traits: it’s not obvious how this could work there at all, unless you can introduce abstract partitions into the state (though my RFC for fields in traits is certainly related). It’s also weird that this use exposes private information, like field names, into the public signature, which seems quite undesirable. Lots of stuff to work out here, which is why I’ve preferred to punt on this issue.

I think that ‘uses’ should be part of the type, so that they can be used with generics. For example:

impl PartialEq for MyStruct{id} {
   fn eq(&self, other: &MyStruct{id}) -> bool { self.id == other.id }
}

Then, say, a mutable method of MyStruct could test self == other even while having other fields mutably borrowed.

MyStruct{id} would be a subtype of MyStruct with automatic coercion from &MyStruct to &MyStruct{id}.

Can we infer the list of borrowed fields from the body of any function (without unsafe code)? If this can be done safely and efficiently, no syntax will be necessary. Even if it cannot be done in all cases, it would allow eliding certain annotations, making some (most?) of the code nicer to read and write.

Yes, there are many small troubles, and the syntax makes the method signatures more complex and heavier.

Annotations like &self(mut x) or use self.x are ways to specify which fields a method is allowed to use, and so to better specify the flow of information between the methods of a struct. This is handy even in regular OOP in languages like Java/D. Sometimes in those languages I've wanted a way to know which fields a method is allowed to use. In D you can only specify a method to be const, immutable, or mutable with regard to its usage of instance fields. An annotation like this can give more granular information.

The syntax &self(mut x) gives more information than use self.x, because it also specifies that the field x will be mutated (and the compiler could even give an error if you don't mutate x). So an annotation like &self(mut x, y) says that x will be written while y will only be read.

A fully specified syntax could use in, out, and inout, like: &self(in x, out y, inout z), this gives information regarding what instance variables are read, written, and read and written by the method. Static analysers like this kind of information a lot.

This kind of knowledge about the flow of information between methods is closely related to the #[outer()] annotation I discussed in the past; the purpose is almost the same.

No matter the syntax, it seems to me that if you restrict the scope to any fields, all of those fields need to have greater or equal publicity to the method. Otherwise this clearly violates encapsulation IMO.

I think a syntax like this makes the most sense, but YMMV:

fn foo(self { &mut bar, &baz }, quux: Quux) {  ... }

self would then be inaccessible in the method.

Yeah, but you don't actually want to be forced to make your fields public. In any case, at this point we are walking right into a whole field of research about how best to specify which APIs can be composed with what -- i.e., you might like to be able to say things like "the method foo can be called in parallel with bar", without having to say what state they affect.

One way to do this that I personally pursued as part of my PhD is to partition the state abstractly, so that you can say "the method foo uses only the group of fields called "foo-fields", and bar uses the group of fields called bar-fields, and these are disjoint", without actually revealing your fields. But there have been many other approaches (and probably I was just reinventing someone else's prior work, at this point I don't recall too well the full catalog).

I've been happy that with Rust we've largely sidestepped this whole problem. I've found that if "abstract groups of fields" are needed, you can usually achieve the same effect by defining two fields whose types are structs with private fields:

  • foo: FooFields (and putting the method foo on there); and,
  • bar: BarFields (and putting the method bar on there).

This sort of says the same thing but without needing complex language features. It is annoying, though, because you can't write self.foo(); you must write self.foo.foo(). And sometimes these divisions are not so simple and clear (for example, maybe foo and bar are disjoint, but baz is only disjoint from bar but not from foo, and so forth...).
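To make the grouping concrete, here is a rough sketch. FooFields and BarFields are the names from the list above; everything else (the state module, Owner, the count and items fields, the push, items, count and run methods) is made up for illustration. The point is that the borrow checker treats self.foo and self.bar as disjoint, while the fields inside each group stay private to the module:

```rust
mod state {
    // The fields are private to this module; only methods are exposed.
    #[derive(Default)]
    pub struct FooFields {
        count: u32,
    }

    impl FooFields {
        pub fn foo(&mut self) {
            self.count += 1;
        }

        pub fn count(&self) -> u32 {
            self.count
        }
    }

    #[derive(Default)]
    pub struct BarFields {
        items: Vec<u32>,
    }

    impl BarFields {
        pub fn push(&mut self, item: u32) {
            self.items.push(item);
        }

        pub fn items(&self) -> &[u32] {
            &self.items
        }
    }
}

// Hypothetical owner type holding the two disjoint groups.
#[derive(Default)]
struct Owner {
    foo: state::FooFields,
    bar: state::BarFields,
}

impl Owner {
    fn run(&mut self) {
        // `self.bar` is borrowed immutably and `self.foo` mutably;
        // the two field paths are disjoint, so this is accepted.
        for _ in self.bar.items() {
            self.foo.foo();
        }
    }
}
```

Note the self.foo.foo() call inside run, which is exactly the verbosity complained about above.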

I wonder if there is a way to overcome some of this annoyingness (e.g., by "mirroring" the methods of foo from self such that self.foo() is syntactic sugar for self.foo.foo() or something).

Similarly, if I have to do some complex contract, one thing I occasionally (though rarely) do in my own code is to define a mirror structure that borrows fields. So imagine I have:

struct TheOwner {
    map: HashMap,
    vec: Vec
}

and I have some algorithm that writes to map but reads from vec. It might be defined on a struct:

impl TheOwner {
    fn algorithm(&mut self) {
        (TheAlgorithm { map: &mut self.map, vec: &self.vec }).go();
    }
}

struct TheAlgorithm<'algorithm> {
    map: &'algorithm mut HashMap,
    vec: &'algorithm Vec,
}

impl<'algorithm> TheAlgorithm<'algorithm> {
    fn go(&mut self) { ... }
}

This is annoying to define, but very flexible, and achieves the same effect as the in, out, and inout categorizations.
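For anyone who wants to try the mirror-struct pattern, here is a compilable version of the sketch above. The concrete types (HashMap<String, usize>, Vec<String>) and the body of go are assumptions chosen only to have something to run; the original leaves them unspecified:

```rust
use std::collections::HashMap;

struct TheOwner {
    // Concrete type parameters are assumptions for this sketch.
    map: HashMap<String, usize>,
    vec: Vec<String>,
}

impl TheOwner {
    fn algorithm(&mut self) {
        // Borrow `map` mutably and `vec` immutably, field by field.
        (TheAlgorithm { map: &mut self.map, vec: &self.vec }).go();
    }
}

struct TheAlgorithm<'algorithm> {
    map: &'algorithm mut HashMap<String, usize>,
    vec: &'algorithm Vec<String>,
}

impl<'algorithm> TheAlgorithm<'algorithm> {
    // Placeholder body: count how often each string occurs in `vec`.
    fn go(&mut self) {
        for word in self.vec.iter() {
            *self.map.entry(word.clone()).or_insert(0) += 1;
        }
    }
}
```

The mutability of each reference in TheAlgorithm plays the role of the annotation: vec can only be read, map can be read and written.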

This RFC seems relevant. It doesn't address the idea of restricting borrow in delegation, but that could be an interesting approach.

However, it's interesting that you mention not wanting to be forced to make your fields public. My reasoning was sort of the opposite: we should be cautious about introducing backward compatibility traps by which changing the inner structure of a type could make two methods no longer disjoint, introducing a breaking change.

Is this issue of disjoint fields such a horrendous issue that it warrants a specific consideration at the language level?

I understand that sometimes the straightforward approach does not work because of it, but before considering enriching the language, I first wonder if this actually needs solving.

For now, it’s always been a very minor issue for me; I don’t encounter the problem often and can usually easily work around it.

I had been under the impression that this issue could be fixed by making the borrow checker somehow “smarter”. Had I misunderstood or has this option been taken off the table (and is it off for now or for good)?

The borrow checker can be modified to make it able to see when a method is not touching the unborrowed fields. But this increases the compilation times, and most importantly, you can't tell what fields are borrowed from the method signature, so you need a struct-level inference. Is this going to reduce the ability to perform separate compilation?
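For context, within a single function body the borrow checker already tracks fields separately; what it does not do is look through a method call, where it only sees the &mut self in the signature. A small sketch using the Foo from the first post (spam_inline is a made-up name):

```rust
struct Foo {
    x: u32,
    v: [u32; 2],
}

impl Foo {
    fn bar(&mut self) {
        self.x += 1;
    }

    // Accepted: `self.x` and `self.v` are disjoint field paths,
    // and both borrows are visible in this one body.
    fn spam_inline(&mut self) {
        let x = &mut self.x;
        for _ in self.v.iter() {
            *x += 1;
        }
    }

    // Rejected if uncommented (E0502): the call borrows all of `*self`
    // mutably, because only the signature of `bar` is considered.
    //
    // fn spam_call(&mut self) {
    //     for _ in self.v.iter() {
    //         self.bar();
    //     }
    // }
}
```

So the question in this thread is really about what information crosses the method boundary, not about whether disjoint fields can be tracked at all.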

That's not the issue, the issue is that small changes in the body of the method change the signature of the method. So API stability is prone to breakage. The same reasoning applies to const fn, pure, full lifetime inference, return type inference, ...

So the problem boils down to the fact that the computer wouldn't warn the programmer when he changed the (inferred) api by modifying the body of a function? If this is the case, then this adds confidence to my long-standing hunch that the IDE and the compiler should be one monolithic thing where one couldn't be separated from the other, and source files should be binary instead of text and the only way to modify them would be through that IDE/compiler application. This would enable all kinds of nice programming language features.

Are you really trying to start an editor war? You can make your IDE as fancy as you like, but don't try to tell me I can't use Vim...

Well, obviously I assumed that the Über-language would be integrated to Vim specifically. That goes without saying.

No, this is not necessary at all.

Currently, the type-level contract of the API is visible in the signature. If we were to introduce contract elements that were inferred, there are different ways to prevent accidental breaking changes. Elm has experimented with automatically identifying changes in the inferred types of public signatures. The integrations you enumerated may be interesting for other reasons, but they are not necessary to solve this problem.

Personally, while a tool like Elm's is interesting, I'm in favor of keeping the contract of an API explicit in the source.