Fields in Traits

Hello everyone. I am looking to follow up on the “Fields in Traits” RFC, which aims to provide the ability for a trait to contain fields as well as methods. As currently envisioned, this would boil down to a memory offset which could be used statically or put into the vtable to locate the desired field in implementing types.

Continuing the discussion from https://github.com/rust-lang/rfcs/pull/1546
Current RFC state: https://github.com/nikomatsakis/fields-in-traits-rfc/blob/master/0000-fields-in-traits.md

The main thing I am looking to do right now is collect different possible use cases and requirements for this feature. I have collected a few below, gathered from the RFC, discussions, and personal use cases. Please let me know of others.

  1. Performance. This can transform a virtual method call into an indirect lookup.
  2. Better borrow granularity. This can allow concurrent borrows of different parts of an object from a trait, as each virtual field can be borrowed independently.
  3. Convenience: This allows &foo.x, &mut foo.x and foo.x = as opposed to a collection of getters/setters.
  4. Integration with other object systems (e.g., GObject). I think this falls under “Convenience”.
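For comparison, here is roughly what use cases 1 to 3 look like with today's accessor methods. This is only a minimal sketch in current Rust; `HasPosition`, `Sprite`, and `nudge` are made-up names for illustration, not anything from the RFC.

```rust
// Hypothetical trait: today the "fields" have to be exposed through accessors.
trait HasPosition {
    fn x(&self) -> &i32;
    fn x_mut(&mut self) -> &mut i32;
    fn y_mut(&mut self) -> &mut i32;
}

struct Sprite {
    x: i32,
    y: i32,
}

impl HasPosition for Sprite {
    fn x(&self) -> &i32 { &self.x }
    fn x_mut(&mut self) -> &mut i32 { &mut self.x }
    fn y_mut(&mut self) -> &mut i32 { &mut self.y }
}

// Every access through `dyn HasPosition` is a virtual method call.
fn nudge(p: &mut dyn HasPosition) -> i32 {
    *p.x_mut() += 1;
    *p.y_mut() += 1;
    // Holding both borrows at once is rejected, because each accessor
    // borrows all of `*p`:
    //     let (x, y) = (p.x_mut(), p.y_mut()); // error[E0499]
    *p.x()
}

fn main() {
    let mut s = Sprite { x: 1, y: 2 };
    let x = nudge(&mut s);
    println!("x = {}, y = {}", x, s.y);
}
```

With fields in the trait, the hope would be that the commented-out line could be written as two independent field borrows.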

If you want me to detail any of these use cases just ask.

It’s worth noting that I believe 1 and 4 are mutually exclusive (unless we are going to generate vtables at runtime) but the others seem to be covered by the RFC as is with only minor rewording.

1-3 is what I had in mind.

The only worry I have about fields in traits is that, as currently specified, they must map to a field (duh), that is, there is no way for them to map to a const, or to a value computed from two other types. So if you want to implement the trait for two types, and in one type there is no need for the field because it is either constant or can be recomputed from something else then… AFAICT you are out of luck. Either you add a field to the type, or you can’t implement the trait.

This is part of the trade-off of indirect lookups vs virtual method calls, but IMO it severely limits the situations in which using fields in traits is a good idea. They can only be used for traits in which you are 100% sure that all current and future types are going to have to store the “value” as a field. If you are only 99% sure, you might as well just go with a getter/setter pair or similar.
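To make the concern concrete, here is a small sketch in current Rust of a getter-based trait where only one of three implementors actually stores the value. All names (`Length`, `Buffer`, `Pair`, `Span`) are invented for the example.

```rust
// Hypothetical trait; with a getter, each impl decides where the value lives.
trait Length {
    fn length(&self) -> usize;
}

// Stores the value in a real field.
struct Buffer {
    length: usize,
}

// The value is a constant, so no field is needed.
struct Pair;

// The value can be recomputed from other fields.
struct Span {
    start: usize,
    end: usize,
}

impl Length for Buffer {
    fn length(&self) -> usize { self.length }
}

impl Length for Pair {
    fn length(&self) -> usize { 2 }
}

impl Length for Span {
    fn length(&self) -> usize { self.end - self.start }
}

fn main() {
    let items: Vec<Box<dyn Length>> = vec![
        Box::new(Buffer { length: 5 }),
        Box::new(Pair),
        Box::new(Span { start: 3, end: 10 }),
    ];
    for item in &items {
        println!("{}", item.length());
    }
}
```

If `length` were a trait field that had to map to a stored field, `Pair` and `Span` would each need to grow an extra field just to implement the trait.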

It also effectively prevents enums from implementing the trait. I’d like to see some way to weasel oneself out of the necessity for there to be an actual backing field, even if it were unsafe: one could override the “fieldness” with an unsafe, implicitly called method that returns a reference to a memory location, where the unsafe code promises not to have side effects and that the memory location is disjoint from the memory locations provided by the other fields. Of course this is just a strawman idea, and one with quite a lot of downsides…
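A sketch of the enum case in current Rust, with invented names (`Named`, `Animal`): an accessor can `match` on the variant, whereas a single field mapping has nothing fixed to point at, since the compiler is free to place `name` differently in each variant.

```rust
// Hypothetical trait exposing a "field" through an accessor.
trait Named {
    fn name(&self) -> &str;
}

enum Animal {
    Dog { age: u8, name: String },
    Cat { name: String },
}

impl Named for Animal {
    fn name(&self) -> &str {
        // The location of `name` depends on the active variant, so some
        // code has to run to find it.
        match self {
            Animal::Dog { name, .. } => name.as_str(),
            Animal::Cat { name } => name.as_str(),
        }
    }
}

fn main() {
    let pets = [
        Animal::Dog { age: 3, name: "Rex".to_string() },
        Animal::Cat { name: "Tom".to_string() },
    ];
    for pet in &pets {
        if let Animal::Dog { age, .. } = pet {
            println!("dog aged {}", age);
        }
        println!("{}", pet.name());
    }
}
```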

I’m a bit worried about how this would interact with the borrow checker. In the current design, I understand that I can have two unrelated traits A and B which both alias the same field in a given struct. But this means that changing the mapping of a field in a trait impl is a breaking change, as it can create mutable aliasing situations which did not exist before, and thus lead the borrow checker to reject some existing client code which borrows mutably from both A and B.

So far, changing a trait impl could not cause trait clients to stop compiling due to an implementation detail of another trait impl, and this is probably a property that we want to keep. So unless a clear answer to this concern has already been given, I would rather disallow aliasing of fields across trait impls entirely in the first version of this RFC. But the question is: in a distributed development environment, can it be done?

Wouldn’t it have to map to normal fields to allow normal function? If it looks like a field you’d probably want to support &mut val.foo which won’t work with a const, and taking a reference will generally be problematic if it’s a computed owned value. The latter would also mean you could hide computation behind field access, meaning foo.x + foo.x could perform two computations (and maybe even mutations).

Another thing I’ve been wondering is how destructuring is going to work. E.g. let Foo { x, y } = value when a trait supplies a new z field. Without the mapping to fields, you might break code that destructures things if they have to be mentioned as well, or if you don’t have to mention it, you might introduce invisible and unexpected Drop::drop invocations.

In general I’d be opposed to anything that can make x.foo or let Foo { x } panic.

You already have the Index and Deref traits which allow impls that may panic and do arbitrary "hidden" computations to what only looks like memory access (at least in the eyes of a C programmer).
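As a small illustration in current Rust (the `Counted` type is made up for this example), a `Deref` impl can run arbitrary code behind what reads like a plain dereference:

```rust
use std::cell::Cell;
use std::ops::Deref;

// Hypothetical wrapper that counts how often it is dereferenced.
struct Counted {
    value: i32,
    reads: Cell<u32>,
}

impl Deref for Counted {
    type Target = i32;

    fn deref(&self) -> &i32 {
        // Hidden side effect on every `*c`.
        self.reads.set(self.reads.get() + 1);
        &self.value
    }
}

fn main() {
    let c = Counted { value: 21, reads: Cell::new(0) };
    let sum = *c + *c; // runs `deref` twice
    println!("sum = {}, reads = {}", sum, c.reads.get());
}
```

`Index` behaves similarly: indexing a `Vec` out of bounds panics even though it looks like a memory access.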


I'd like to take a step back and ponder the nature of traits. I don't feel totally comfortable with the idea that a trait can specify the contents of a type -- it feels too close to inheritance.

I think if you were disallowed from borrowing from multiple traits at the same time this wouldn't be an issue. However, is this a reasonable restriction? I think it is probably the right decision, since it allows implementers to focus only on the single trait they are implementing without worrying about breaking users or other traits.

One major downside that I can imagine is related traits and how aliasing would work between them. You might want to use two traits together or have a trait that encompasses two traits and ensures that each trait can be used simultaneously.

I'm not a C programmer though. I also don't think the existence of those is a good reason to introduce more places that can panic.

Maybe this subject has changed a lot since I last read about it, but I was under the impression that the primary, overriding motivation for “fields in traits” was to allow enforcing a performance guarantee that certain field lookups really are just field lookups, but that in order to retain basic composability in the typical case we did not want to restrict where in the type those fields might be located. Thus, enforcing “prefix layout” to get not-even-“virtual” field lookups would be a separate feature requiring opt-in.

In particular, I thought that meant it would be perfectly legal for a type to map multiple trait fields to the same concrete field, which I thought ruled out the possibility that we’d get any finer-grained borrow information from this feature (in addition to what @HadrienG said). It sounds like to actually get fine-grained borrow information we’d have to enforce that multiple trait fields always mean multiple fields in the type, and never allow borrowing “through” multiple traits, which seems like a pretty harsh restriction to get this information only in fields-in-traits scenarios.


I haven’t seen anyone yet talk about a use case where “virtual field lookup” is good enough for performance but “virtual methods” are not. Or about what the concrete, technical requirements are for integration with things like GObject. That’s what I’d like to hear more about, since the potential borrow checker benefit seems pretty dubious, and “convenience” in this case could be easily solved by sugar.

Thanks so much for taking this on! :heart: I started writing a monster response but I fear I'll never finish it. And besides I think monster posts are kind of annoying to read. So I'm going to write a few smaller responses. =)

Lately I've become enamored with the idea of using fields-in-traits to define "views" onto a struct as well. The idea would be to enable "partial self borrowing".

I gave an example of source code in this post, but the problem usually arises like this:

  • You have some struct MyStruct with various fields
  • You have a helper function fn foo(&self) -> &Bar that accesses some subset of those fields
  • You have another helper function fn mutate_baz(&mut self, bar: &Bar) that mutates a different (and disjoint) subset of those fields
  • You want to invoke self.mutate_baz(self.foo()), but the borrow checker is unhappy.
  • Right now, there are two possible fixes:
    • break out those subsets of fields into distinct structs and put the methods on those structs (self.baz_fields.mutate(self.foo_fields.get()))
    • inline the two methods (e.g., foo and mutate_baz) into the caller
  • I find the problem is most acute in between private methods, but it can arise in public interfaces too -- e.g., it affects collections where you want to enable access to distinct keys (you can view split_at_mut as being a sort of solution to this desire for slices), though that obviously has complications.
    • In general though in a public interface you will want the ability to check and document the fact that methods can be invoked separately.
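The first of the two fixes listed above can be sketched in current Rust as follows. The field contents (`Bar(u32)`, `total`) and the `update` method are placeholders invented for the example; only the overall shape follows the bullets.

```rust
struct Bar(u32);

// The subset of fields that `foo` reads.
struct FooFields {
    bar: Bar,
}

// The disjoint subset of fields that `mutate_baz` writes.
struct BazFields {
    total: u32,
}

impl FooFields {
    fn get(&self) -> &Bar {
        &self.bar
    }
}

impl BazFields {
    fn mutate(&mut self, bar: &Bar) {
        self.total += bar.0;
    }
}

struct MyStruct {
    foo_fields: FooFields,
    baz_fields: BazFields,
}

impl MyStruct {
    fn update(&mut self) {
        // The borrow checker sees two distinct field paths, so the shared
        // borrow of `foo_fields` and the mutable borrow of `baz_fields`
        // can coexist.
        self.baz_fields.mutate(self.foo_fields.get());
    }
}

fn main() {
    let mut s = MyStruct {
        foo_fields: FooFields { bar: Bar(5) },
        baz_fields: BazFields { total: 1 },
    };
    s.update();
    println!("{}", s.baz_fields.total);
}
```

The cost is that the grouping of fields has to be baked into the struct layout, which is what the views idea tries to avoid.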

Anyway, the goal here would be that one can solve this problem by declaring (somehow!) that those methods (foo and mutate_baz) operate on disjoint sets of fields. But how to do that? One idea was to leverage fields-in-traits and use those traits to define views on the original struct. I'll sketch the idea here with let syntax:

struct MyStruct {
    field_a: A,
    field_b: B,
    field_c: C,
    field_d: D,
}

trait View {
    let field_a: A; // not declared as `mut`
    let mut field_b: B; // declared as `mut`.
}

impl View for MyStruct {
    let field_a = self.field_a;
    let mut field_b = self.field_b;
}

Now I could create a view like this:

impl MyStruct {
    fn foo(&mut self) {
        let view: &mut dyn View = &mut *self;
        ...
    }
}

Under the base RFC, this is two operations: we create a pointer (self) of type &mut MyStruct, then we coerce that into a trait reference (as usual). But we could think of a more "composite operation" that the borrow checker is more deeply aware of: that is, a kind of borrow where the result is not a &mut MyStruct that is then coerced, but rather where the result is directly a &mut dyn View. In that case, the borrow checker can understand that this borrow can only affect the fields named in the view. This means that we can then permit other borrows of the same path for different views, so long as those views are compatible.

Once we've defined the views, you can imagine using them as the self type, like so: fn mutate_bar(self: &mut BarView). This is an obvious case where the borrow checker can make self.mutate_bar(...) use this more limited form of borrow.
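For readers who want something that compiles today, a rough approximation of views is a struct of references that is built by splitting one borrow of `self`. This is only a sketch under my own assumptions: `ViewAB`, `ViewC`, `views`, `bump`, and `push` are invented names, and the concrete field types stand in for `A`, `B`, and `C`.

```rust
struct MyStruct {
    field_a: String,
    field_b: u32,
    field_c: Vec<u8>,
}

// Hand-written "views": each one names the fields it may touch.
struct ViewAB<'a> {
    field_a: &'a String,  // read-only in this view
    field_b: &'a mut u32, // mutable in this view
}

struct ViewC<'a> {
    field_c: &'a mut Vec<u8>,
}

impl MyStruct {
    // One borrow of `self` is split into two compatible views.
    fn views(&mut self) -> (ViewAB<'_>, ViewC<'_>) {
        (
            ViewAB { field_a: &self.field_a, field_b: &mut self.field_b },
            ViewC { field_c: &mut self.field_c },
        )
    }
}

impl ViewAB<'_> {
    fn bump(&mut self) {
        *self.field_b += self.field_a.len() as u32;
    }
}

impl ViewC<'_> {
    fn push(&mut self, byte: u8) {
        self.field_c.push(byte);
    }
}

fn main() {
    let mut s = MyStruct { field_a: "abc".into(), field_b: 0, field_c: vec![] };
    let (mut ab, mut c) = s.views();
    ab.bump();
    c.push(7);
    println!("{} {:?}", s.field_b, s.field_c);
}
```

The difference from the proposal is that here the split has to be spelled out by hand in `views`, whereas a borrow-checker-aware view borrow would derive it from the view definitions.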

Things I don't love about using traits for this:

  • You have to impl them, and presumably there are some restrictions on the traits/impls so that we can identify the fields that are affected.
  • &mut dyn View instead of &mut View

Things I do love:

  • I like having named views because they are intuitive and can be documented and part of your public API if you really want.
    • We can maybe also check that they access disjoint sets of fields, though I think the current RFC doesn't quite address this need.
  • This feels like a pretty clean and comprehensible mechanism, even if we layer some sugar on top.

Let me elaborate on what I was thinking here, though it's been a while since I've had my head in this space and I think that the "gnome-class" effort has evolved quite a bit. Still, I think it's worth talking about, because the use case seems like an important one.

The idea was that sometimes field offsets do need to be computed dynamically. In the case of GObject, there is a little bit of code that is ordinarily baked into a macro, which computes a negative offset from the pointer if I recall.

I had hoped to allow people to write "unsafe impls" where you give a little snippet of code to compute the field offset. I imagined code that would return a *mut T (or *const T for read-only fields). Something like:

impl Foo for Bar {
    let x = unsafe {
        // a block of code where `self` is in scope
        GObject_helper_compute_offset(self, 0) // or whatever
    };
}

It would then be on the implementor to guarantee the disjointness requirements. And yes, this seems to imply that we extend the proposal with the ability to support fields that are reached not via an interior offset but via executing some code found in the vtable. (More on that in a second.)
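The "unsafe impl supplies the pointer, implementor promises disjointness" idea can be approximated in current Rust with an `unsafe trait`. This is a sketch under my own assumptions, not the RFC's design: `TwoFields`, `x_ptr`, `y_ptr`, and `split` are hypothetical names.

```rust
use std::ptr::addr_of_mut;

/// Hypothetical contract: for a pointer to a live value, the two returned
/// pointers must be valid and must not overlap. The trait is `unsafe`
/// because generic code relies on that promise.
unsafe trait TwoFields {
    unsafe fn x_ptr(this: *mut Self) -> *mut u32;
    unsafe fn y_ptr(this: *mut Self) -> *mut u32;
}

struct Bar {
    x: u32,
    y: u32,
}

unsafe impl TwoFields for Bar {
    unsafe fn x_ptr(this: *mut Self) -> *mut u32 {
        // SAFETY: the caller guarantees `this` points at a live `Bar`.
        unsafe { addr_of_mut!((*this).x) }
    }

    unsafe fn y_ptr(this: *mut Self) -> *mut u32 {
        // SAFETY: as above.
        unsafe { addr_of_mut!((*this).y) }
    }
}

// Safe wrapper: hands out both "fields" mutably at the same time.
fn split<T: TwoFields>(t: &mut T) -> (&mut u32, &mut u32) {
    let p: *mut T = t;
    // SAFETY: `p` comes from a live `&mut T`, and the `unsafe impl`
    // promises that the two pointers are valid and disjoint.
    unsafe { (&mut *T::x_ptr(p), &mut *T::y_ptr(p)) }
}

fn main() {
    let mut bar = Bar { x: 1, y: 2 };
    let (x, y) = split(&mut bar);
    *x += 10;
    *y += 20;
    println!("{} {}", bar.x, bar.y);
}
```

An impl that returned the same pointer from both methods would make `split` unsound, which is exactly the burden the unsafe impl places on the implementor.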

Well, there is a tension, but I'd not say mutually exclusive. In order to achieve performance parity with C++, we already need the ability to tag traits and place limits on their impls. For example, it would be useful to be able to tag traits as #[repr(prefix)], which means that the fields in the traits must appear as a prefix of the structs that implement those traits (this in turn implies limitations on the impls: e.g., you can only implement this for a struct in the current crate, etc etc). So -- presumably -- limiting to interior fields, but with arbitrary offsets, would be another kind of repr (roughly corresponding to "virtual inheritance" in C++). And the most general form would permit executing a small shim to identify the offset.
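To illustrate the middle option (interior fields at arbitrary offsets), here is a hand-rolled stand-in for a vtable slot that stores a byte offset. Everything here is an assumption made for illustration: `FieldSlot`, `slot_for_a2`, and `field_ptr` are invented, and real vtables are generated by the compiler rather than built like this.

```rust
struct Bar {
    b: u32,
    a2: u32,
}

/// Stand-in for one vtable slot: the byte offset of a `u32` field
/// inside the implementing type.
struct FieldSlot {
    offset: usize,
}

// Compute the offset of `a2` by measuring a probe value.
fn slot_for_a2() -> FieldSlot {
    let probe = Bar { b: 0, a2: 0 };
    let base = &probe as *const Bar as usize;
    let field = &probe.a2 as *const u32 as usize;
    FieldSlot { offset: field - base }
}

/// Locate the field with plain pointer arithmetic, no function call
/// into the implementing type.
///
/// SAFETY contract: `base` must point at a live value whose layout has
/// a `u32` at `slot.offset`.
unsafe fn field_ptr(base: *mut u8, slot: &FieldSlot) -> *mut u32 {
    unsafe { base.add(slot.offset) as *mut u32 }
}

fn main() {
    let mut bar = Bar { b: 1, a2: 2 };
    let slot = slot_for_a2();
    let base = &mut bar as *mut Bar as *mut u8;
    // SAFETY: `base` points at `bar`, and the slot was computed for `Bar`.
    unsafe { *field_ptr(base, &slot) = 40 };
    println!("{} {}", bar.b, bar.a2);
}
```

A `#[repr(prefix)]`-style restriction would make the offset a compile-time constant, while the shim form replaces the stored offset with a small function.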

I think in the end we want this anyhow, even for safe code, because it allows us to support general paths:

struct Foo {
    data: Box<FooBox>
}

struct FooBox {
    a: A,
    b: B
}

impl SomeTrait for Foo {
    let f = self.data.a; // this is not interior to `self`
}

So, while I could see trying to cut out the unsafe part and leave that for a possible future extension, I do think we should make provisions for executing shims, which then leaves the door for those shims to be written by the user.
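In current Rust the closest equivalent of such a shim is an ordinary accessor method stored in the vtable. A minimal sketch, with placeholder types `A(u32)` and `B(u32)` and a made-up helper `read`:

```rust
struct A(u32);
struct B(u32);

struct FooBox {
    a: A,
    b: B,
}

struct Foo {
    data: Box<FooBox>,
}

trait SomeTrait {
    // Plays the role of the shim: code that locates `f`.
    fn f(&self) -> &A;
}

impl SomeTrait for Foo {
    fn f(&self) -> &A {
        // Follows the `Box` pointer, so `f` is not at any fixed offset
        // inside `Foo` itself.
        &self.data.a
    }
}

// Generic code only sees the trait object and calls through the vtable.
fn read(t: &dyn SomeTrait) -> u32 {
    t.f().0
}

fn main() {
    let foo = Foo { data: Box::new(FooBox { a: A(7), b: B(8) }) };
    println!("{} {}", read(&foo), foo.data.b.0);
}
```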

I've been wondering about this too. It basically comes down to the ability to borrow -- that is, we could certainly permit you to define a "get-set-only" field that cannot be borrowed (so &self.a would fail -- or perhaps create a temporary -- but let x = self.a would work). This is strongly related to the desire for DerefGet (where let x = &*self would fail) and IndexGet (let x = data[x] works, but not &data[x]).

I'm somewhat torn about this. When it comes to DerefGet and IndexGet, I've leaned towards saying "just use the fn traits" -- so write let x = data(x) instead of let x = data[x] -- this would preserve the syntactic property that any lvalue (that is, assignable path) can be borrowed. But I think maybe I'm preserving a distinction that isn't that important, actually, and it'd be nicer to just enable the sugar.
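A small example of why a by-value "get" is attractive, written in current Rust with an invented `Bits` type: the value is computed, so there is nothing natural to hand out a reference to.

```rust
use std::ops::Index;

// Hypothetical 64-bit set; assumes indices below 64.
struct Bits(u64);

impl Bits {
    // "IndexGet"-style access spelled as a method: returns an owned value.
    fn get(&self, i: usize) -> bool {
        (self.0 >> i) & 1 == 1
    }
}

// `Index` must return a reference. No `bool` is stored inside `Bits`,
// so the best this impl can do is point at a promoted constant.
impl Index<usize> for Bits {
    type Output = bool;

    fn index(&self, i: usize) -> &bool {
        if self.get(i) { &true } else { &false }
    }
}

fn main() {
    let bits = Bits(0b101);
    println!("{} {} {}", bits.get(0), bits.get(1), bits[2]);
}
```

The `Index` impl works for reads only through this workaround, and there is no way to write a matching `IndexMut`, which is the gap that get-only fields and `IndexGet` would both be addressing.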

To be clear, I don't think we would need to "roll those in" to this RFC -- just saying that the path we chart here affects those proposals too.

I don't think this is true in the existing proposal, but I think it arises in the "views" variant I've been talking about. That is, in the existing proposal, the disjointness requirement isn't something we have to check in "client code" -- rather, we check when you define the impl that all the disjointness conditions are met.

But there are some borrow checker interactions that weren't clearly defined in the RFC. For example, would accessing a trait field a be considered to overlap with a struct field b, presuming that b is not mapped to a? I had actually assumed it would be, and hence this code would error:

trait Foo {
    let a: u32;
}

struct Bar {
    b: u32,
    a2: u32,
}

impl Foo for Bar {
    let a = self.a2;
}

fn main() {
    let mut bar = Bar { ... };
    let p = &mut bar.b;
    let q = &mut bar.a; // where `a` is defined in the trait
}

Put another way, the borrow checker here sees two paths, where I've written the field names with "fully qualified paths" telling you where they came from:

bar.(Bar::b) // inherent field of `Bar`
bar.(Foo::a) // field defined in trait `Foo`

My assumption was that we would consider two inherent fields (e.g., b and a2) to be disjoint if they come from the same struct. We would also consider two trait fields to be disjoint if they come from the same trait (or supertrait/subtrait relationship). But fields from two unrelated traits would be considered to maybe overlap -- and the same for a field from some trait and some struct.
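The rule in the previous paragraph can be written down as a toy decision function. This is purely a model for discussion, not how the borrow checker is implemented; `Origin`, `FieldPath`, and `may_overlap` are invented, and supertrait/subtrait relationships are ignored for simplicity.

```rust
#[derive(Clone, Copy, PartialEq)]
enum Origin {
    Struct(&'static str), // inherent field of the named struct
    Trait(&'static str),  // field declared in the named trait
}

#[derive(Clone, Copy, PartialEq)]
struct FieldPath {
    origin: Origin,
    name: &'static str,
}

// Conservative answer to "could borrows of `p` and `q` alias?".
fn may_overlap(p: FieldPath, q: FieldPath) -> bool {
    if p.origin == q.origin {
        // Same struct or same trait: distinct names are disjoint.
        p.name == q.name
    } else {
        // Unrelated traits, or a trait field vs. a struct field:
        // assume they may map to the same storage.
        true
    }
}

fn main() {
    let b = FieldPath { origin: Origin::Struct("Bar"), name: "b" };
    let a2 = FieldPath { origin: Origin::Struct("Bar"), name: "a2" };
    let a = FieldPath { origin: Origin::Trait("Foo"), name: "a" };

    println!("b vs a2: {}", may_overlap(b, a2));
    println!("b vs a:  {}", may_overlap(b, a));
}
```

Under this model the `main` in the example above errors because `bar.(Bar::b)` and `bar.(Foo::a)` have different origins.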

But I guess we can imagine the borrow checker "seeing through" the borrow of a to understand that it really maps to a2 and hence is disjoint from b. In that case, we do want to think about privacy/encapsulation.

And certainly this comes up in the views concept I was kicking around.

So, the RFC disallows moves from a field, roughly for this reason. That interacts also with the idea of "getter" fields, I guess, since they must produce new owned values always.

I don’t think that this fits the views idea very well. This is because to implement a trait you might want to use multiple fields for a method, but if the trait only gave you one you are now screwed. The views idea seems like a good one, but I think it is sufficiently different from what is here that it should be a separate proposal (possibly obsoleting this one).

If I was implementing the views proposal I would want to write something like this.

trait Foo {
  view ViewA;
  view ViewB;

  fn first(...) use ViewA -> &Thing;
  fn second(...) use ViewB -> &mut Thing;
}

That is, I would want the “view” to be completely abstracted from fields so as not to constrain the impl’ing type. For an impl using only safe code I think you would have to map a view to some set of fields (0 or more), but an unsafe impl could possibly do something else.

This seems like it falls back to partial borrows. If that is the only thing that we want I think that binding it to virtual fields seems overly restrictive and a method can work just as well if you can specify what part gets borrowed.

This is definitely an interesting idea: providing three methods of dispatch that can be chosen from (indirect function call, indirect offset, and direct). This seems to be focused on the performance aspect.

The more I think about it, the more I think that two (or more) problems are being confused. It is important that one isn't excluded by solving the other, but I think we should consider the performance and partial borrow cases separately.

You could split these into two traits, it might not be the most natural way to do it, but it seems like something that sugar can be added for later, e.g.

trait FooA {
    let a;
    fn first(...) -> &Thing;
}

trait Foo: FooA {
    let b;
    fn second(...) -> &mut Thing;
}

impl<T: Foo> FooA for T;