Fields in Traits


#10

Thanks so much for taking this on! :heart: I started writing a monster response but I fear I’ll never finish it. And besides I think monster posts are kind of annoying to read. So I’m going to write a few smaller responses. =)


#12

Lately I’ve become enamored with the idea of using fields-in-traits to define “views” onto a struct as well. The idea would be to enable “partial self borrowing”.

I gave an example of source code in this post, but the problem usually arises like this:

  • You have some struct MyStruct with various fields
  • You have a helper function fn foo(&self) -> &Bar that accesses some subset of those fields
  • You have another helper function fn mutate_baz(&mut self, bar: &Bar) that mutates a different (and disjoint) subset of those fields
  • You want to invoke self.mutate_baz(self.foo()), but the borrow checker is unhappy.
  • Right now, there are two possible fixes:
    • break out those subsets of fields into distinct structs and put the methods on those structs (self.baz_fields.mutate(self.foo_fields.get()))
    • inline the two methods (e.g., foo and mutate_baz) into the caller
  • I find the problem is most acute between private methods, but it can arise in public interfaces too – e.g., it affects collections where you want to enable access to distinct keys (you can view split_at_mut as being a sort of solution to this desire for slices), though that obviously has complications.
    • In general though in a public interface you will want the ability to check and document the fact that methods can be invoked separately.
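
To make the problem concrete, here is a minimal sketch in today's Rust of the rejected call and of the first fix (breaking the field subsets out into their own structs). The payload types (a `Bar` wrapping a `u32`, a `baz: u32` counter) and the `update` method are assumptions made up for illustration; only the names foo, mutate_baz, foo_fields and baz_fields come from the description above.

```rust
// Hypothetical payload type; the thread only names it `Bar`.
pub struct Bar(pub u32);

// The subset of fields that `foo` reads.
pub struct FooFields {
    pub bar: Bar,
}

impl FooFields {
    pub fn get(&self) -> &Bar {
        &self.bar
    }
}

// The disjoint subset of fields that `mutate_baz` writes.
pub struct BazFields {
    pub baz: u32,
}

impl BazFields {
    pub fn mutate(&mut self, bar: &Bar) {
        self.baz += bar.0;
    }
}

pub struct MyStruct {
    pub foo_fields: FooFields,
    pub baz_fields: BazFields,
}

impl MyStruct {
    pub fn foo(&self) -> &Bar {
        &self.foo_fields.bar
    }

    pub fn mutate_baz(&mut self, bar: &Bar) {
        self.baz_fields.baz += bar.0;
    }

    pub fn update(&mut self) {
        // Rejected by the borrow checker: `self.foo()` keeps all of `self`
        // borrowed while `mutate_baz` needs `&mut self`.
        // self.mutate_baz(self.foo());

        // Accepted: the two borrows name different fields of `self`.
        self.baz_fields.mutate(self.foo_fields.get());
    }
}
```

The workaround compiles because the borrow checker tracks disjoint field paths within one function body, but it forces the struct layout to follow the borrowing pattern.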

Anyway, the goal here would be that one can solve this problem by declaring (somehow!) that those methods (foo and mutate_baz) operate on disjoint sets of fields. But how to do that? One idea was to leverage fields-in-traits and use those traits to define views on the original struct. I’ll sketch the idea here with let syntax:

struct MyStruct {
    field_a: A,
    field_b: B,
    field_c: C,
    field_d: D,
}

trait View {
    let field_a: A; // not declared as `mut`
    let mut field_b: B; // declared as `mut`.
}

impl View for MyStruct {
    let field_a = self.field_a;
    let mut field_b = self.field_b;
}

Now I could create a view like this:

impl MyStruct {
    fn foo(&mut self) {
        let view: &mut dyn View = &mut *self;
        ...
    }
}

Under the base RFC, this is two operations: we create a pointer (self) of type &mut MyStruct, then we coerce that into a trait reference (as usual). But we could think of a more “composite operation” that the borrow checker is more deeply aware of: that is, a kind of borrow where the result is not a &mut MyStruct that is then coerced, but rather where the result is directly a &mut dyn View. In that case, the borrow checker can understand that this borrow can only affect the fields named in the view. This means that we can then permit other borrows of the same path for different views, so long as those views are compatible.

Once we’ve defined the views, you can imagine using them in the self type like so: fn mutate_bar(self: &mut BarView). This is an obvious case where the borrow checker can make self.mutate_bar(...) use this more limited form of borrow.
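
For comparison, something close to a view can be approximated in today's Rust with a struct of references that is built in one place. This is only a sketch under assumptions: the names `ViewAB`, `ViewC` and `views` are hypothetical, the field types are placeholders, and unlike the proposal the caller has to ask for both views at once.

```rust
pub struct MyStruct {
    pub field_a: u32,
    pub field_b: String,
    pub field_c: Vec<u32>,
}

/// Hypothetical view: shared access to `field_a`, mutable access to `field_b`.
pub struct ViewAB<'a> {
    pub field_a: &'a u32,
    pub field_b: &'a mut String,
}

/// Hypothetical second view: mutable access to `field_c` only.
pub struct ViewC<'a> {
    pub field_c: &'a mut Vec<u32>,
}

impl MyStruct {
    /// Hand out both views together. This is accepted because the borrow
    /// checker can see, inside this one function, that the fields are disjoint.
    pub fn views(&mut self) -> (ViewAB<'_>, ViewC<'_>) {
        (
            ViewAB {
                field_a: &self.field_a,
                field_b: &mut self.field_b,
            },
            ViewC {
                field_c: &mut self.field_c,
            },
        )
    }
}

impl ViewAB<'_> {
    /// Reads `field_a` and mutates `field_b` through the view.
    pub fn append_a(&mut self) {
        self.field_b.push_str(&self.field_a.to_string());
    }
}

impl ViewC<'_> {
    pub fn push(&mut self, value: u32) {
        self.field_c.push(value);
    }
}
```

The difference from the idea above is that here the disjointness is established once, inside `views`, rather than being something the borrow checker understands at every call site.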

Things I don’t love about using traits for this:

  • You have to impl them, and presumably there are some restrictions on the traits/impls so that we can identify the fields that are affected.
  • &mut dyn View instead of &mut View

Things I do love:

  • I like having named views because they are intuitive and can be documented and part of your public API if you really want.
    • We can maybe also check that they access disjoint sets of fields, though I think the current RFC doesn’t quite address this need.
  • This feels like a pretty clean and comprehensible mechanism, even if we layer some sugar on top.

#13

Let me elaborate on what I was thinking here, though it’s been a while since I’ve had my head in this space and I think that the “gnome-class” effort has evolved quite a bit. Still, I think it’s worth talking about, because the use case seems like an important one.

The idea was that sometimes field offsets do need to be computed dynamically. In the case of GObject, there is a little bit of code that is ordinarily baked into a macro, which computes a negative offset from the pointer if I recall.

I had hoped to allow people to write “unsafe impls” where you give a little snippet of code to compute the field offset. I imagined code that would return a *mut T (or *const T for read-only fields). Something like:

impl Foo for Bar {
    let x = unsafe {
        // a block of code where `self` is in scope
        GObject_helper_compute_offset(self, 0) // or whatever
    };
}

It would then be on the implementor to guarantee the disjointness requirements. And yes, this seems to imply that we extend the proposal with the ability to support fields that are reached not via an interior offset but via executing some code found in the vtable. (More on that in a second.)
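
As a rough illustration of what such an implementor-checked contract looks like in today's Rust, here is a sketch that uses an unsafe trait with a pointer-returning function in place of the proposed syntax. Everything in it is an assumption made up for the example: the trait `HasX`, the function `x_ptr`, the `Bar` layout, and the helper `set_x`. It does not model GObject's actual offset computation.

```rust
use std::ptr;

// Hypothetical layout; `x` sits at an offset that users of the trait do not know.
#[repr(C)]
pub struct Bar {
    pub header: u64,
    pub x: u32,
}

/// Hypothetical stand-in for a trait field `x`.
///
/// # Safety
/// Implementors promise that `x_ptr` returns a valid, aligned pointer into the
/// object, and that it does not overlap any other field exposed this way.
pub unsafe trait HasX {
    /// # Safety
    /// `this` must point to a live, properly aligned `Self`.
    unsafe fn x_ptr(this: *mut Self) -> *mut u32;
}

unsafe impl HasX for Bar {
    unsafe fn x_ptr(this: *mut Self) -> *mut u32 {
        // SAFETY: the caller guarantees `this` is valid; this only computes
        // the field's address and does not read or write through it.
        unsafe { ptr::addr_of_mut!((*this).x) }
    }
}

/// Generic code that writes the "field" without knowing where it lives.
pub fn set_x<T: HasX>(value: &mut T, new_x: u32) {
    // SAFETY: `value` is a live exclusive reference, and the implementor of
    // `HasX` vouches for the returned pointer.
    unsafe {
        *T::x_ptr(value) = new_x;
    }
}
```

As in the proposal, the disjointness guarantee lives in the safety contract of the impl, not in anything the compiler checks.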

Well, there is a tension, but I’d not say mutually exclusive. In order to achieve performance parity with C++, we already need the ability to tag traits and place limits on their impls. For example, it would be useful to be able to tag traits as #[repr(prefix)], which means that the fields in the traits must appear as a prefix of the structs that implement those traits (this in turn implies limitations on the impls: e.g., you can only implement this for a struct in the current crate, etc etc). So – presumably – limiting to interior fields, but with arbitrary offsets, would be another kind of repr (roughly corresponding to “virtual inheritance” in C++). And the most general form would permit executing a small shim to identify the offset.

I think in the end we want this anyhow, even for safe code, because it allows us to support general paths:

struct Foo {
    data: Box<FooBox>
}

struct FooBox {
    a: A,
    b: B
}

impl SomeTrait for Foo {
    let f = self.data.a; // this is not interior to `self`
}

So, while I could see trying to cut out the unsafe part and leave that for a possible future extension, I do think we should make provisions for executing shims, which then leaves the door open for those shims to be written by the user.
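
In today's Rust the closest equivalent of such a shim is an accessor method stored in the vtable. The sketch below shows the general-path case that way; the accessor names `f` and `f_mut`, the helper `bump`, and the `u32` payloads of `A` and `B` are assumptions for illustration, not part of the proposal.

```rust
// Placeholder payload types; the thread only names them `A` and `B`.
pub struct A(pub u32);
pub struct B(pub u32);

pub struct FooBox {
    pub a: A,
    pub b: B,
}

pub struct Foo {
    pub data: Box<FooBox>,
}

/// Stand-in for a trait with a field `f: A`, written as accessor methods.
pub trait SomeTrait {
    fn f(&self) -> &A;
    fn f_mut(&mut self) -> &mut A;
}

impl SomeTrait for Foo {
    // The "shim": follows the `Box`, so the result is not interior to `self`.
    fn f(&self) -> &A {
        &self.data.a
    }

    fn f_mut(&mut self) -> &mut A {
        &mut self.data.a
    }
}

/// Works on any implementor through dynamic dispatch.
pub fn bump(obj: &mut dyn SomeTrait) {
    obj.f_mut().0 += 1;
}
```

Each access through `dyn SomeTrait` costs an indirect call here, which is the overhead that the offset-based representations discussed above are meant to avoid.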


#14

I’ve been wondering about this too. It basically comes down to the ability to borrow – that is, we could certainly permit you to define a “get-set-only” field that cannot be borrowed (so &self.a would fail – or perhaps create a temporary – but let x = self.a would work). This is strongly related to the desire for DerefGet (where let x = &*self would fail) and IndexGet (let x = data[x] works, but not &data[x]).

I’m somewhat torn about this. When it comes to DerefGet and IndexGet, I’ve leaned towards saying “just use the fn traits” – so write let x = data(x) instead of let x = data[x] – this would preserve the syntactic property that any lvalue (that is, assignable path) can be borrowed. But I think maybe I’m preserving a distinction that isn’t that important, actually, and it’d be nicer to just enable the sugar.
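
One place where the get-set-only shape comes up naturally is packed data, where there is no addressable value to hand out a reference to. A small sketch with a hypothetical `PackedFlags` type (not something from the RFC):

```rust
/// Hypothetical bit-packed container: 64 flags stored in a single `u64`.
pub struct PackedFlags {
    bits: u64,
}

impl PackedFlags {
    pub fn new() -> Self {
        PackedFlags { bits: 0 }
    }

    /// By-value read. Returning `&bool` is not possible because no `bool`
    /// exists in memory for flag `i`. `i` must be less than 64.
    pub fn get(&self, i: u32) -> bool {
        (self.bits >> i) & 1 == 1
    }

    /// By-value write. `i` must be less than 64.
    pub fn set(&mut self, i: u32, value: bool) {
        if value {
            self.bits |= 1u64 << i;
        } else {
            self.bits &= !(1u64 << i);
        }
    }
}
```

With something like IndexGet, `flags[3]` could be sugar for `flags.get(3)`, while `&flags[3]` would be rejected (or would borrow a temporary).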


#15

To be clear, I don’t think we would need to “roll those in” to this RFC – just saying that the path we chart here affects those proposals too.


#16

I don’t think this is true in the existing proposal, but I think it arises in the “views” variant I’ve been talking about. That is, in the existing proposal, the disjointness requirement isn’t something we have to check in “client code” – rather, we check when you define the impl that all the disjointness conditions are met.

But there are some borrow checker interactions that weren’t clearly defined in the RFC. For example, would accessing a trait field a be considered to overlap with a struct field b, presuming that b is not mapped to a? I had actually assumed it would be, and hence this code would error:

trait Foo {
    let a: u32;
}

struct Bar {
    b: u32,
    a2: u32,
}

impl Foo for Bar {
    let a = self.a2;
}

fn main() {
    let mut bar = Bar { ... };
    let p = &mut bar.b;
    let q = &mut bar.a; // where `a` is defined in the trait
}

Put another way, the borrow checker here sees two paths, where I’ve written the field names with “fully qualified paths” telling you where they came from:

bar.(Bar::b) // inherent field of `Bar`
bar.(Foo::a) // field defined in trait `Foo`

My assumption was that we would consider two inherent fields (e.g., b and a2) to be disjoint if they come from the same struct. We would also consider two trait fields to be disjoint if they come from the same trait (or supertrait/subtrait relationship). But fields from two unrelated traits would be considered to maybe overlap – and the same for a field from some trait and some struct.

But I guess we can imagine the borrow checker “seeing through” the borrow of a to understand that it really maps to a2 and hence is disjoint from b. In that case, we do want to think about privacy/encapsulation.
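
For reference, this is how the same situation plays out in today's Rust, with an accessor method standing in for the trait field. The method name `a` and the two helper functions are assumptions for the sketch.

```rust
pub struct Bar {
    pub b: u32,
    pub a2: u32,
}

/// Stand-in for `trait Foo { let a: u32; }`, written as an accessor.
pub trait Foo {
    fn a(&mut self) -> &mut u32;
}

impl Foo for Bar {
    fn a(&mut self) -> &mut u32 {
        &mut self.a2
    }
}

/// Two inherent fields of the same struct are known to be disjoint.
pub fn bump_inherent(bar: &mut Bar) {
    let p = &mut bar.b;
    let q = &mut bar.a2;
    *p += 1;
    *q += 10;
}

/// Through the accessor, the borrow checker cannot "see through" to `a2`.
pub fn bump_via_trait(bar: &mut Bar) {
    // Rejected if uncommented: `bar.a()` borrows all of `*bar` mutably
    // while `p` is still live.
    // let p = &mut bar.b;
    // let q = bar.a();
    // *p += 1;
    // *q += 10;

    // Accepted: the two borrows do not overlap in time.
    *bar.a() += 10;
    bar.b += 1;
}
```

The accessor version behaves like the conservative rule described above: anything reached through the trait is assumed to maybe overlap with the struct's own fields.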

And certainly this comes up in the views concept I was kicking around.


#17

So, the RFC disallows moves from a field, roughly for this reason. That interacts also with the idea of “getter” fields, I guess, since they must produce new owned values always.


#19

I don’t think that this fits the views idea very well. This is because to implement a trait you might want to use multiple fields for a method, but if the trait only gave you one you are now screwed. The views idea seems like a good one, but I think that it would be so substantially different from what is here that it should be a different proposal (possibly obsoleting this one).

If I was implementing the views proposal I would want to write something like this.

trait Foo {
  view ViewA
  view ViewB

  fn first(...) use ViewA -> &Thing;
  fn second(...) use ViewB -> &mut Thing;
}

As in, I would want the “view” to be completely abstracted from fields so as not to constrain the impl’ing type. For an impl using only safe code I think you would have to map a view to some set of fields (0 or more), but an unsafe impl could possibly do something else.


#20

This seems like it falls back to partial borrows. If that is the only thing that we want I think that binding it to virtual fields seems overly restrictive and a method can work just as well if you can specify what part gets borrowed.


#21

This is definitely an interesting idea: providing three methods of dispatch to choose from (indirect function call, indirect offset, and direct). This seems to be focused on the performance aspect.

The more I think about it, the more I think that two (or more) problems are being confused. It is important that one isn’t excluded by solving the other, but I think we should consider the performance and partial borrow cases separately.


#22

You could split these into two traits. It might not be the most natural way to do it, but it seems like something that sugar could be added for later, e.g.

trait FooA {
     let a;
     fn first(...) -> &Thing;
}

trait Foo: FooA {
   let b;
   fn second(...) -> &mut Thing;
}

impl<T: Foo> FooA for T;

#23

I think that two traits actually makes the problem worse. I don’t want to allow partial borrows between traits[1]. Otherwise we need a complicated way of defining what traits can be used together. My intention in that example was that a single trait could define a subset of itself that could be borrowed. So you could borrow ViewA independently of ViewB. This makes the following code legal.

let foo: Foo = unimplemented!();
let borrow1 = foo.first();
let borrow2 = foo.second();

In current Rust (assuming that first and second have the signature Fn(&self) -> &Thing) this would be illegal because you are borrowing from foo twice, and one is mutable.

[1] There might need to be a story of this with “related” traits like a trait that requires another to be implemented.
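
For reference, the usual workaround in today's Rust is a single method that hands out both borrows together. A minimal sketch, where the `Thing` payload, the constructor, and the names `first_and_second` and `add_first_into_second` are assumptions for illustration:

```rust
pub struct Thing(pub u32);

pub struct Foo {
    first: Thing,
    second: Thing,
}

impl Foo {
    pub fn new(first: u32, second: u32) -> Self {
        Foo {
            first: Thing(first),
            second: Thing(second),
        }
    }

    pub fn first(&self) -> &Thing {
        &self.first
    }

    pub fn second(&mut self) -> &mut Thing {
        &mut self.second
    }

    /// Both borrows come from one call, so the disjointness is checked
    /// inside this method body, where the fields are visible.
    pub fn first_and_second(&mut self) -> (&Thing, &mut Thing) {
        (&self.first, &mut self.second)
    }
}

pub fn add_first_into_second(foo: &mut Foo) {
    // Rejected if uncommented: `a` keeps `*foo` borrowed while `second`
    // needs a mutable borrow of it.
    // let a = foo.first();
    // let b = foo.second();
    // b.0 += a.0;

    let (a, b) = foo.first_and_second();
    b.0 += a.0;
}
```

A view-aware signature would let the two separate calls type-check without the combined method.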


#24

How do you specify that? And how do you document it for your users? This is what a named view offers, basically – a way to document it that is not “here is a list of private fields that you should not know about; now check if they intersect one another”. (And, even better, by giving names to the views, potentially a way to assert that they ought to be disjoint.)

I agree that those are distinct problems, but I am not yet convinced that they want to be distinct language mechanisms. To be honest, I think that the performance aspects (which have to do with how we will compile code that is using a trait object that contains fields) and this other concept are pretty much orthogonal.


#25

Yes, this is sort of the ‘new thing’. I’m not sure I would characterize it as “complicated”, but it seems like something we would have to add.

Maybe it’s just a layer on top of the core system – that would be nice. I have to get a bit more crisp in my thinking about how views would interact with the borrow checker, and in particular what it might mean to have “public views”.

Everything here also intersects with this question I raised earlier, which isn’t specific to views – how smart would the borrow checker be with understanding disjointness?


#26

So I think what I am trying to express is that I think we need something to express these borrows (which I think you agree with), but binding them directly to fields feels too restrictive to me. I think that we would want to have something more abstract like the views you were describing.

However these features seem to make the “fields in traits” idea a little useless since this covers 1 and 2. Maybe we could consider some simple syntax sugar for mapping get/get_mut to a field with an appropriate view.


#27

I really like this idea. But what about introducing more sugaring to it? So it feels more like a “view”. I imagined something like this:

view CustomAccess for MyStruct {
    let field_a: A = self.field_a;
    ...
}

// This is basically sugar for:
// trait CustomAccess {
//     let field_a: A;
// }
//
// impl CustomAccess for MyStruct {
//     let field_a = self.field_a;
// }

A “view” then should probably be a restrictive subset of a trait/impl/field-in-trait (only fields are allowed).


#28

TLDR at the end;

My two cents on the matter: I do not think Traits should or realistically could include fields because:

  • It forces the user of a trait to maintain a field they may not want (I’ll explain forces later)
  • Traits already effectively have this power without forcing it upon the user.
  • Trait fields would either break the current implementation of Traits or force overhead on users.

Example of what I’m talking about:

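use std::borrow::BorrowMut;

pub struct Foo {
    x: usize
}

pub trait Bar: BorrowMut<Foo> {
    fn do_something_cool_with_foo(&mut self);
    fn do_other_thing_with_foo(&mut self) {
        // Your awesome implementation.
    }
}

// Blanket impl: every type that implements BorrowMut<Foo> gets Bar.
impl<T> Bar for T where T: BorrowMut<Foo> {
    // The required method needs a body here for the impl to compile;
    // this one is only a placeholder.
    fn do_something_cool_with_foo(&mut self) {
        BorrowMut::<Foo>::borrow_mut(self).x += 1;
    }
}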

This is valid for Traits as they currently are and, if implemented in a library crate, the functionality of Bar is provided for all types which choose to implement BorrowMut<Foo> implicitly with little to no overhead.

This implementation also effectively implements “traits with fields” by dint of the fact that before a user can make use of the Bar trait for their MyType type they must first provide some implementation to treat MyType as if it had the data of Foo.
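
To show the user's side of this pattern, here is a self-contained sketch of a type opting in by implementing BorrowMut<Foo>. The method names `bump` and `current` and the fields of `MyType` are assumptions for illustration, not the poster's API.

```rust
use std::borrow::{Borrow, BorrowMut};

pub struct Foo {
    pub x: usize,
}

pub trait Bar: BorrowMut<Foo> {
    /// Mutates the borrowed `Foo` data.
    fn bump(&mut self) {
        BorrowMut::<Foo>::borrow_mut(self).x += 1;
    }

    /// Reads the borrowed `Foo` data (`Borrow` is a supertrait of `BorrowMut`).
    fn current(&self) -> usize {
        Borrow::<Foo>::borrow(self).x
    }
}

// Every type that can be borrowed as a `Foo` gets `Bar`.
impl<T: BorrowMut<Foo>> Bar for T {}

/// A user type that consciously embeds the data `Bar` needs.
pub struct MyType {
    pub name: String,
    pub foo: Foo,
}

impl Borrow<Foo> for MyType {
    fn borrow(&self) -> &Foo {
        &self.foo
    }
}

impl BorrowMut<Foo> for MyType {
    fn borrow_mut(&mut self) -> &mut Foo {
        &mut self.foo
    }
}
```

The memory cost of `Foo` is visible in the definition of `MyType`, which is the point being argued here. The trade-off is that a borrow through `borrow_mut` covers all of `MyType`, so this pattern does not help with the partial-borrow problem discussed earlier in the thread.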

However if Traits themselves could be made with fields either:

  • Every type that wants to make use of the Bar trait would have to implement each of the fields currently in Foo in some way, requiring they think and understand how the data should be managed.
  • The Bar trait provides implementation for initialising and managing the data already and the line “impl<T> Bar for T where T: BorrowMut<Foo> {}” would implement Bar on all types which make use of it without necessarily telling the user that there’s a memory overhead added to their MyType by using Bar or otherwise clogging up their compiler output with warnings to let them know it’s happening.

TLDR: Traits as they currently exist alongside Rust’s borrowing can already implement “traits with fields” in an effective and user friendly way that requires a conscious decision by the user to take responsibility for any memory overhead.


#29

This looks really promising (so I hope that it moves forward). However, one thing is missing from this (and even from view traits) compared to getters/setters as they are used in languages like C#: with getters/setters you can define bounds and checks, whereas with this you cannot.

I think that the ability to return a result from a setter is very cool and succinct.

example:

if (foo.bar = 15)? > 10 { ... }

This would try to assign 15 to the bar field in foo and then check whether it is above 10.
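
For comparison, the same check can be written in today's Rust with an explicit setter method that returns a Result. This is a sketch only: the error type `OutOfRange`, the upper bound of 100, and the function names are assumptions made up for the example.

```rust
/// Hypothetical error carrying the rejected value.
#[derive(Debug, PartialEq)]
pub struct OutOfRange(pub u32);

pub struct Foo {
    bar: u32,
}

impl Foo {
    pub fn new() -> Self {
        Foo { bar: 0 }
    }

    pub fn bar(&self) -> u32 {
        self.bar
    }

    /// Checked setter: stores the value and returns it, or rejects it.
    /// The bound of 100 is an arbitrary choice for this sketch.
    pub fn set_bar(&mut self, value: u32) -> Result<u32, OutOfRange> {
        if value > 100 {
            return Err(OutOfRange(value));
        }
        self.bar = value;
        Ok(self.bar)
    }
}

/// Method-call spelling of `if (foo.bar = 15)? > 10 { ... }`.
pub fn assign_and_check(foo: &mut Foo) -> Result<bool, OutOfRange> {
    Ok(foo.set_bar(15)? > 10)
}
```

The call `foo.set_bar(15)?` makes it visible at the use site that code runs and can fail, which an assignment expression would hide.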


#30

I would really hate if an assignment to a field (or even the seemingly read-only access thereof) could now invoke arbitrary functions. The “setter returns a value” way is an extension of this and I find it even more convoluted. I would not ever want to have to read any code using this sort of feature.


#31

What’s wrong with doing it the old-fashioned way - through a function call?