Fields in Traits

You already have the Index and Deref traits which allow impls that may panic and do arbitrary "hidden" computations to what only looks like memory access (at least in the eyes of a C programmer).
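To make that concrete, here is a minimal sketch in today's Rust of an Index impl that hides a computation and a possible panic behind what reads as plain element access. The Ring type and demo_index function are made up for illustration; they are not from any proposal.

```rust
use std::ops::Index;

/// Hypothetical ring buffer, used only for this illustration.
struct Ring {
    items: Vec<u32>,
}

impl Index<usize> for Ring {
    type Output = u32;

    fn index(&self, i: usize) -> &u32 {
        // Hidden computation: the index wraps around with a modulo.
        // On an empty buffer this panics, because `i % 0` is a
        // remainder by zero.
        &self.items[i % self.items.len()]
    }
}

fn demo_index() -> u32 {
    let ring = Ring { items: vec![10, 20, 30] };
    // Reads like a memory access, but calls `index` (4 % 3 == 1).
    ring[4]
}
```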


I'd like to take a step back and ponder the nature of traits. I don't feel totally comfortable with the idea that a trait can specify the contents of a type -- it feels too close to inheritance.


I think if you were disallowed from borrowing from multiple traits at the same time this wouldn't be an issue. But is that a reasonable restriction? I think it is probably the right decision, since it allows implementers to focus only on the single trait they are implementing without worrying about breaking users or other traits.

One major downside that I can imagine is related traits and how aliasing would work between them. You might want to use two traits together or have a trait that encompasses two traits and ensures that each trait can be used simultaneously.

I'm not a C programmer though. I also don't think the existence of those is a good reason to introduce more places that can panic.


Maybe this subject has changed a lot since I last read about it, but I was under the impression that the primary, overriding motivation for “fields in traits” was to allow enforcing a performance guarantee that certain field lookups really are just field lookups, but that in order to retain basic composability in the typical case we did not want to restrict where in the type those fields might be located. Thus, enforcing “prefix layout” to get not-even-“virtual” field lookups would be a separate feature requiring opt-in.

In particular, I thought that meant it would be perfectly legal for a type to map multiple trait fields to the same concrete field, which I thought ruled out the possibility that we’d get any finer-grained borrow information from this feature (in addition to what @HadrienG said). It sounds like to actually get fine-grained borrow information we’d have to enforce that multiple trait fields always mean multiple fields in the type, and never allow borrowing “through” multiple traits, which seems like a pretty harsh restriction to get this information only in fields-in-traits scenarios.


I haven’t seen anyone yet talk about a use case where “virtual field lookup” is good enough for performance but “virtual methods” are not. Or about what the concrete, technical requirements are for integration with things like GObject. That’s what I’d like to hear more about, since the potential borrow checker benefit seems pretty dubious, and “convenience” in this case could be easily solved by sugar.


Thanks so much for taking this on! :heart: I started writing a monster response but I fear I'll never finish it. And besides I think monster posts are kind of annoying to read. So I'm going to write a few smaller responses. =)

Lately I've become enamored with the idea of using fields-in-traits to define "views" onto a struct as well. The idea would be to enable "partial self borrowing".

I gave an example of source code in this post, but the problem usually arises like this:

  • You have some struct MyStruct with various fields
  • You have a helper function fn foo(&self) -> &Bar that accesses some subset of those fields
  • You have another helper function fn mutate_baz(&mut self, bar: &Bar) that mutates a different (and disjoint) subset of those fields
  • You want to invoke self.mutate_baz(self.foo()), but the borrow checker is unhappy.
  • Right now, there are two possible fixes:
    • break out those subsets of fields into distinct structs and put the methods on those structs (self.baz_fields.mutate(self.foo_fields.get()))
    • inline the two methods (e.g., foo and mutate_baz) into the caller
  • I find the problem is most acute in between private methods, but it can arise in public interfaces too -- e.g., it affects collections where you want to enable access to distinct keys (you can view split_at_mut as being a sort of solution to this desire for slices), though that obviously has complications.
    • In general though in a public interface you will want the ability to check and document the fact that methods can be invoked separately.
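For reference, the first of the two fixes listed above (breaking the subsets out into distinct structs) can be sketched in current Rust roughly like this. The names FooFields, BazFields, update and demo_split are placeholders I made up, and Bar is reduced to a wrapper around a u32.

```rust
struct Bar(u32);

// The fields that `foo` reads, grouped into their own struct.
struct FooFields {
    bar: Bar,
}

// The disjoint set of fields that `mutate_baz` writes.
struct BazFields {
    total: u32,
}

impl FooFields {
    fn get(&self) -> &Bar {
        &self.bar
    }
}

impl BazFields {
    fn mutate(&mut self, bar: &Bar) {
        self.total += bar.0;
    }
}

struct MyStruct {
    foo_fields: FooFields,
    baz_fields: BazFields,
}

impl MyStruct {
    fn update(&mut self) -> u32 {
        // Accepted: the borrow checker sees borrows of two distinct
        // fields of `self`, one shared and one mutable.
        self.baz_fields.mutate(self.foo_fields.get());
        self.baz_fields.total
    }
}

fn demo_split() -> u32 {
    let mut s = MyStruct {
        foo_fields: FooFields { bar: Bar(5) },
        baz_fields: BazFields { total: 1 },
    };
    s.update()
}
```

This works, but it forces the struct layout to follow the borrowing pattern, which is exactly the cost the views idea is trying to avoid.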

Anyway, the goal here would be that one can solve this problem by declaring (somehow!) that those methods (foo and mutate_baz) operate on disjoint sets of fields. But how to do that? One idea was to leverage fields-in-traits and use those traits to define views on the original struct. I'll sketch the idea here with let syntax:

struct MyStruct {
    field_a: A,
    field_b: B,
    field_c: C,
    field_d: D,
}

trait View {
    let field_a: A; // not declared as `mut`
    let mut field_b: B; // declared as `mut`.
}

impl View for MyStruct {
    let field_a = self.field_a;
    let mut field_b = self.field_b;
}

Now I could create a view like this:

impl MyStruct {
    fn foo(&mut self) {
        let view: &mut dyn View = &mut *self;
        ...
    }
}

Under the base RFC, this is two operations: we create a pointer (self) of type &mut MyStruct, then we coerce that into a trait reference (as usual). But we could think of a more "composite operation" that the borrow checker is more deeply aware of: that is, a kind of borrow where the result is not a &mut MyStruct that is then coerced, but rather where the result is directly a &mut dyn View. In that case, the borrow checker can understand that this borrow can only affect the fields named in the view. This means that we can then permit other borrows of the same path for different views, so long as those views are compatible.

Once we've defined the views, you can imagine using them as the type of self, like so: fn mutate_bar(self: &mut BarView). This is an obvious case where the borrow-checker can make self.mutate_bar(...) use this more limited form of borrow.
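None of the view syntax exists today, but the effect can be approximated by hand with structs of references, which may help show what "compatible views" would mean. This is only a sketch: ViewAB, ViewCD, views, bump and push_flag are hypothetical names, and the field types are concrete stand-ins for A, B, C and D.

```rust
struct MyStruct {
    field_a: String,
    field_b: u32,
    field_c: Vec<u8>,
    field_d: bool,
}

// Hand-written "view": shared access to `field_a`, mutable to `field_b`.
struct ViewAB<'a> {
    field_a: &'a String,
    field_b: &'a mut u32,
}

// A second view over a disjoint set of fields.
struct ViewCD<'a> {
    field_c: &'a mut Vec<u8>,
    field_d: &'a bool,
}

impl MyStruct {
    // Both views come out of one borrow of `self`; the borrow checker
    // accepts this because each field is borrowed exactly once.
    fn views(&mut self) -> (ViewAB<'_>, ViewCD<'_>) {
        (
            ViewAB { field_a: &self.field_a, field_b: &mut self.field_b },
            ViewCD { field_c: &mut self.field_c, field_d: &self.field_d },
        )
    }
}

impl ViewAB<'_> {
    fn bump(&mut self) {
        *self.field_b += self.field_a.len() as u32;
    }
}

impl ViewCD<'_> {
    fn push_flag(&mut self) {
        self.field_c.push(*self.field_d as u8);
    }
}

fn demo_views() -> (u32, Vec<u8>) {
    let mut s = MyStruct {
        field_a: "abc".to_string(),
        field_b: 1,
        field_c: Vec::new(),
        field_d: true,
    };
    let (mut ab, mut cd) = s.views();
    ab.bump(); // mutates through one view...
    cd.push_flag(); // ...while the other view is still alive
    (s.field_b, s.field_c.clone())
}
```

The difference from the proposal is that here the disjointness is established once, inside views, rather than being something the borrow checker infers at each call site from the view's declaration.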

Things I don't love about using traits for this:

  • You have to impl them, and presumably there are some restrictions on the traits/impls so that we can identify the fields that are affected.
  • &mut dyn View instead of &mut View

Things I do love:

  • I like having named views because they are intuitive and can be documented and part of your public API if you really want.
    • We can maybe also check that they access disjoint sets of fields, though I think the current RFC doesn't quite address this need.
  • This feels like a pretty clean and comprehensible mechanism, even if we layer some sugar on top.

Let me elaborate on what I was thinking here, though it's been a while since I've had my head in this space and I think that the "gnome-class" effort has evolved quite a bit. Still, I think it's worth talking about, because the use case seems like an important one.

The idea was that sometimes field offsets do need to be computed dynamically. In the case of GObject, there is a little bit of code that is ordinarily baked into a macro, which computes a negative offset from the pointer if I recall.

I had hoped to allow people to write "unsafe impls" where you give a little snippet of code to compute the field offset. I imagined code that would return a *mut T (or *const T for read-only fields). Something like:

impl Foo for Bar {
    let x = unsafe {
        // a block of code where `self` is in scope
        GObject_helper_compute_offset(self, 0) // or whatever
    };
}

It would then be on the implementor to guarantee the disjointness requirements. And yes, this seems to imply that we extend the proposal with the ability to support fields that are reached not via an interior offset but via executing some code found in the vtable. (More on that in a second.)
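As a rough approximation in today's Rust, the "unsafe impl with a shim" idea could be modelled as an unsafe trait whose method returns a raw pointer to the field. Everything named here (HasX, x_ptr, x_mut, demo_shim) is hypothetical, and the shim just takes the address of an ordinary field rather than doing any GObject-style offset arithmetic.

```rust
use std::ptr::addr_of_mut;

/// Hypothetical stand-in for a trait declaring one field `x: u32`.
///
/// # Safety
/// Implementors promise that, given a pointer to a live `Self`, `x_ptr`
/// returns a pointer valid for reads and writes of a `u32`, and that it
/// does not alias any other field exposed through this trait.
unsafe trait HasX {
    /// # Safety
    /// `this` must point to a live value of type `Self`.
    unsafe fn x_ptr(this: *mut Self) -> *mut u32;
}

struct Bar {
    _header: u64,
    x: u32,
}

unsafe impl HasX for Bar {
    unsafe fn x_ptr(this: *mut Self) -> *mut u32 {
        // The "shim": arbitrary code could run here to locate the field.
        // SAFETY: the caller guarantees `this` points to a live `Bar`.
        unsafe { addr_of_mut!((*this).x) }
    }
}

// Safe wrapper playing the role of the compiler-generated field access.
fn x_mut<T: HasX>(value: &mut T) -> &mut u32 {
    // SAFETY: `value` is a live, exclusive reference, and the
    // `unsafe impl` promises the returned pointer is valid.
    unsafe { &mut *T::x_ptr(value) }
}

fn demo_shim() -> u32 {
    let mut bar = Bar { _header: 0, x: 1 };
    *x_mut(&mut bar) += 41;
    bar.x
}
```

The unsafe on the trait is what puts the disjointness obligation on the implementor, as described above.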

Well, there is a tension, but I'd not say mutually exclusive. In order to achieve performance parity with C++, we already need the ability to tag traits and place limits on their impls. For example, it would be useful to be able to tag traits as #[repr(prefix)], which means that the fields in the traits must appear as a prefix of the structs that implement those traits (this in turn implies limitations on the impls: e.g., you can only implement this for a struct in the current crate, etc etc). So -- presumably -- limiting to interior fields, but with arbitrary offsets, would be another kind of repr (roughly corresponding to "virtual inheritance" in C++). And the most general form would permit executing a small shim to identify the offset.

I think in the end we want this anyhow, even for safe code, because it allows us to support general paths:

struct Foo {
    data: Box<FooBox>
}

struct FooBox {
    a: A,
    b: B
}

impl SomeTrait for Foo {
    let f = self.data.a; // this is not interior to `self`
}
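For comparison, what you would write today for that mapping is a pair of accessor methods, which are in effect the shim spelled out by hand. A and B are placeholder wrapper types here, and f_mut and demo_path are names I added for the sketch.

```rust
struct A(u32);
struct B(u32);

struct FooBox {
    a: A,
    b: B,
}

struct Foo {
    data: Box<FooBox>,
}

// Today's equivalent of a trait field `f`: accessor methods.
trait SomeTrait {
    fn f(&self) -> &A;
    fn f_mut(&mut self) -> &mut A;
}

impl SomeTrait for Foo {
    fn f(&self) -> &A {
        // Follows the `Box` pointer, so this is not an interior offset.
        &self.data.a
    }

    fn f_mut(&mut self) -> &mut A {
        &mut self.data.a
    }
}

fn demo_path() -> u32 {
    let mut foo = Foo {
        data: Box::new(FooBox { a: A(1), b: B(2) }),
    };
    foo.f_mut().0 += 10;
    foo.f().0 + foo.data.b.0
}
```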

So, while I could see trying to cut out the unsafe part and leave that for a possible future extension, I do think we should make provisions for executing shims, which then leaves the door for those shims to be written by the user.

I've been wondering about this too. It basically comes down to the ability to borrow -- that is, we could certainly permit you to define a "get-set-only" field that cannot be borrowed (so &self.a would fail -- or perhaps create a temporary -- but let x = self.a would work). This is strongly related to the desire for DerefGet (where let x = &*self would fail) and IndexGet (let x = data[x] works, but not &data[x]).

I'm somewhat torn about this. When it comes to DerefGet and IndexGet, I've leaned towards saying "just use the fn traits" -- so write let x = data(x) instead of let x = data[x] -- this would preserve the syntactic property that any lvalue (that is, assignable path) can be borrowed. But I think maybe I'm preserving a distinction that isn't that important, actually, and it'd be nicer to just enable the sugar.
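A small illustration of why a by-value "get" is a different thing from a borrow: in a packed bit set there is no bool in memory to hand out a reference to, so element access has to be a method that returns a value. Bits, get, set and demo_bits are made-up names for this sketch.

```rust
/// Hypothetical 64-entry bit set; each element is one bit of `word`.
struct Bits {
    word: u64,
}

impl Bits {
    // Produces a fresh `bool` each time. There is no stored `bool` that
    // a `&bool` could point into, which is the case a get-only index
    // would have to cover.
    fn get(&self, i: u32) -> bool {
        (self.word >> i) & 1 == 1
    }

    fn set(&mut self, i: u32, on: bool) {
        if on {
            self.word |= 1u64 << i;
        } else {
            self.word &= !(1u64 << i);
        }
    }
}

fn demo_bits() -> (bool, bool) {
    let mut b = Bits { word: 0 };
    b.set(3, true);
    (b.get(3), b.get(4))
}
```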

To be clear, I don't think we would need to "roll those in" to this RFC -- just saying that the path we chart here affects those proposals too.

I don't think this is true in the existing proposal, but I think it arises in the "views" variant I've been talking about. That is, in the existing proposal, the disjointness requirement isn't something we have to check in "client code" -- rather, we check when you define the impl that all the disjointness conditions are met.

But there are some borrow checker interactions that weren't clearly defined in the RFC. For example, would accessing a trait field a be considered to overlap with a struct field b, presuming that b is not mapped to a? I had actually assumed it would be, and hence this code would error:

trait Foo {
    let a: u32;
}

struct Bar {
    b: u32,
    a2: u32,
}

impl Foo for Bar {
    let a = self.a2;
}

fn main() {
    let mut bar = Bar { ... };
    let p = &mut bar.b;
    let q = &mut bar.a; // where `a` is defined in the trait
}

Put another way, the borrow checker here sees two paths, where I've written the field names with "fully qualified paths" telling you where they came from:

bar.(Bar::b) // inherent field of `Bar`
bar.(Foo::a) // field defined in trait `Foo`

My assumption was that we would consider two inherent fields (e.g., b and a2) to be disjoint if they come from the same struct. We would also consider two trait fields to be disjoint if they come from the same trait (or supertrait/subtrait relationship). But fields from two unrelated traits would be considered to maybe overlap -- and the same for a field from some trait and some struct.

But I guess we can imagine the borrow checker "seeing through" the borrow of a to understand that it really maps to a2 and hence is disjoint from b. In that case, we do want to think about privacy/encapsulation.
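The closest thing to this in current Rust is the contrast between direct field paths, which the borrow checker treats as disjoint, and an accessor method, which it treats as borrowing all of bar. A sketch, with a_mut standing in for the trait field a (the method and demo_disjoint are my own names):

```rust
struct Bar {
    b: u32,
    a2: u32,
}

impl Bar {
    // Plays the role of trait field `a`, which maps to `a2`.
    fn a_mut(&mut self) -> &mut u32 {
        &mut self.a2
    }
}

fn demo_disjoint() -> (u32, u32) {
    let mut bar = Bar { b: 1, a2: 2 };
    {
        // Two inherent fields of the same struct: known to be disjoint.
        let p = &mut bar.b;
        let q = &mut bar.a2;
        *p += 10;
        *q += 20;
    }
    // Through the accessor, the same pattern is rejected, because the
    // signature says `a_mut` borrows all of `bar`:
    //
    //     let p = &mut bar.b;
    //     let q = bar.a_mut(); // error[E0499]: second mutable borrow
    //     *p += 1;
    //
    // so the accesses have to be sequential instead.
    *bar.a_mut() += 100;
    (bar.b, bar.a2)
}
```

"Seeing through" the trait field would amount to treating the accessor case like the direct case, which is where the privacy question comes from: the caller's code would then depend on which private field a maps to.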

And certainly this comes up in the views concept I was kicking around.


So, the RFC disallows moves from a field, roughly for this reason. That interacts also with the idea of "getter" fields, I guess, since they must produce new owned values always.

I don’t think that this fits the views idea very well. This is because to implement a trait you might want to use multiple fields for a method, but if the trait only gave you one you are now stuck. The views idea seems like a good one, but I think that it would be so substantially different from what is here that it should be a different proposal (possibly obsoleting this one).

If I was implementing the views proposal I would want to write something like this.

trait Foo {
  view ViewA
  view ViewB

  fn first(...) use ViewA -> &Thing;
  fn second(...) use ViewB -> &mut Thing;
}

As in, I would want the “view” to be completely abstracted from fields so as not to constrain the impl’ing type. For an impl using only safe code I think you would have to map a view to some set of fields (0 or more), but an unsafe impl could possibly do something else.

This seems like it falls back to partial borrows. If that is the only thing that we want I think that binding it to virtual fields seems overly restrictive and a method can work just as well if you can specify what part gets borrowed.

This is definitely an interesting idea: providing three methods of dispatch that can be chosen from (indirect function call, indirect offset, and direct). This seems to be focused on the performance aspect.

The more I think about it, the more I think that two (or more) problems are being confused. It is important that one isn't excluded by solving the other, but I think we should consider the performance and partial borrow cases separately.

You could split these into two traits, it might not be the most natural way to do it, but it seems like something that sugar can be added for later, e.g.

trait FooA {
     let a;
     fn first(...) -> &Thing;
}

trait Foo: FooA {
   let b;
   fn second(...) -> &mut Thing;
}

impl<T: Foo> FooA for T;

I think that two traits actually makes the problem worse. I don't want to allow partial borrows between traits[1]. Otherwise we need a complicated way of defining what traits can be used together. My intention in that example was that a single trait could define a subset of itself that could be borrowed. So you could borrow ViewA independently of ViewB. This makes the following code legal.

let foo: Foo = unimplemented!();
let borrow1 = foo.first();
let borrow2 = foo.second();

In current Rust (assuming that first and second have the signature Fn(&self) -> &Thing) this would be illegal because you are borrowing from foo twice, and one is mutable.
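To show the status quo, here is a sketch of the usual workaround in current Rust: one method that hands out both borrows at once, so the split happens inside the impl where the fields are visible. Thing is reduced to a wrapper around a u32, and first_and_second and demo_both are names I made up.

```rust
struct Thing(u32);

struct Foo {
    one: Thing,
    two: Thing,
}

impl Foo {
    fn first(&self) -> &Thing {
        &self.one
    }

    fn second(&mut self) -> &mut Thing {
        &mut self.two
    }

    // Workaround: return both borrows from a single call.
    fn first_and_second(&mut self) -> (&Thing, &mut Thing) {
        (&self.one, &mut self.two)
    }
}

fn demo_both() -> u32 {
    let mut foo = Foo { one: Thing(2), two: Thing(3) };

    // Rejected today if both results are used together:
    //
    //     let borrow1 = foo.first();
    //     let borrow2 = foo.second(); // error[E0502]
    //     borrow2.0 += borrow1.0;

    let (borrow1, borrow2) = foo.first_and_second();
    borrow2.0 += borrow1.0;

    let a = foo.first().0;
    let b = foo.second().0;
    a + b
}
```

The combined method does not scale well (one per combination of borrows), which is the gap that per-method views like ViewA and ViewB would fill.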

[1] There might need to be a story of this with "related" traits like a trait that requires another to be implemented.

How do you specify that? And how do you document it for your users? This is what a named view offers, basically -- a way to document it that is not "here is a list of private fields that you should not know about; now check if they intersect one another". (And, even better, by giving names to the views, potentially a way to assert that they ought to be disjoint.)

I agree that those are distinct problems, but I am not yet convinced that they want to be distinct language mechanisms. To be honest, I think that the performance aspects (which have to do with how we will compile code that is using a trait object that contains fields) and this other concept are pretty much orthogonal.

Yes, this is sort of the 'new thing'. I'm not sure I would characterize it as "complicated", but it seems like something we would have to add.

Maybe it's just a layer on top of the core system -- that would be nice. I have to get a bit more crisp in my thinking about how views would interact with the borrow checker, and in particular what it might mean to have "public views".

Everything here also intersects with the question I raised earlier, which isn't specific to views -- how smart would the borrow checker be with understanding disjointness?

So I think what I am trying to express is that I think we need something to express these borrows (which I think you agree with), but binding them directly to fields feels too restrictive to me. I think that we would want to have something more abstract like the views you were describing.

However these features seem to make the "fields in traits" idea a little useless since this covers 1 and 2. Maybe we could consider some simple syntax sugar for mapping get/get_mut to a field with an appropriate view.

I really like this idea. But what about adding more sugar to it, so it feels more like a "view"? I imagined something like this:

view CustomAccess for MyStruct {
    let field_a: A = self.field_a;
    ...
}

// This is basically sugar for:
// trait CustomAccess {
//     let field_a: A;
// }
//
// impl CustomAccess for MyStruct {
//     let field_a = self.field_a;
// }

A "view" then should probably be a restrictive subset of a trait/impl/field-in-trait (only fields are allowed).
