[Pre-RFC] Associated field/constant accessors in traits

Summary

Add associated "accessors" to traits, so that member fields can be accessed through trait objects without virtual function calls, and so that associated constants can be stored directly in the vtable.

(Edit: perhaps also provide borrow splitting, and serve as a realization of view types.)

Motivation

Currently, to access data stored in a type via a trait object, you have to use a getter method, which incurs a virtual function call. This has extra overhead and inhibits optimization.

The same goes for associated constants: they require a virtual call, when ideally they could be stored directly in the vtable. (See also: Make associated consts object safe)

Design

We introduce a new kind of associated item in traits: accessors. They can be seen as side-effect-free methods that return a constant value or a reference to a member of the implementing type.

For example:

struct Earth {
    mass: f32,
}

struct Moon {
    mass: f32,
}

trait CelestialBody {
    // Syntax pending, see below for alternatives
    /// Mass in kilograms
    fn mass(&self) -> &f32 = _;

    // mutable accessor can exist too:
    // fn mass(&mut self) -> &mut f32 = _;
}

impl CelestialBody for Earth {
    fn mass(&self) -> &f32 = &self.mass;
}

impl CelestialBody for Moon {
    fn mass(&self) -> &f32 = &self.mass;
}

// Using the accessor:

let x: &dyn CelestialBody = &Earth { mass: 1. };
x.mass(); // syntax pending

Inside the vtable, an offset into the implementing type is stored. When the accessor is used, the value is retrieved by dereferencing at that offset from the start of the value. Because only a relative offset is stored, the accessor stays valid when the value is moved.

More concretely:

#[repr(C)]
struct Vtable {
    drop_in_place: unsafe fn(*mut ()),
    size: usize,
    align: usize,
    mass_offset: isize,
}

// use (`vtable` stands for the vtable that `x` points to)
let x: &dyn CelestialBody = &Earth { mass: 1. };
unsafe { *(x as *const dyn CelestialBody as *const ()).byte_offset(vtable.mass_offset).cast::<f32>() };
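The lookup can be emulated in today's Rust with a hand-built table. This is only a minimal sketch: `MassVtable`, `read_mass`, `earth_mass` and `moon_mass` are hypothetical names, `Moon` is given an extra field here so that the two offsets differ, and `std::mem::offset_of!` stands in for the offset the compiler would emit.

```rust
use std::mem::offset_of;

struct Earth {
    mass: f32,
}

// An extra leading field (an assumption for this sketch) so that `mass`
// sits at a different offset than it does in `Earth`.
#[repr(C)]
struct Moon {
    phase: u64,
    mass: f32,
}

/// Hypothetical stand-in for the compiler-generated vtable entry.
struct MassVtable {
    mass_offset: usize,
}

static EARTH_VTABLE: MassVtable = MassVtable { mass_offset: offset_of!(Earth, mass) };
static MOON_VTABLE: MassVtable = MassVtable { mass_offset: offset_of!(Moon, mass) };

/// Reads the `f32` found `mass_offset` bytes past `data`: a load, not a call.
///
/// # Safety
/// `data` must point to a live value of the type `vtable` was built for.
unsafe fn read_mass(data: *const (), vtable: &MassVtable) -> f32 {
    unsafe { *data.byte_add(vtable.mass_offset).cast::<f32>() }
}

fn earth_mass(earth: &Earth) -> f32 {
    unsafe { read_mass(earth as *const Earth as *const (), &EARTH_VTABLE) }
}

fn moon_mass(moon: &Moon) -> f32 {
    unsafe { read_mass(moon as *const Moon as *const (), &MOON_VTABLE) }
}

fn main() {
    let earth = Earth { mass: 5.972e24 };
    let moon = Moon { phase: 3, mass: 7.342e22 };
    println!("earth: {} kg", earth_mass(&earth));
    println!("moon (phase {}): {} kg", moon.phase, moon_mass(&moon));
}
```

The same `read_mass` works for both types because everything type-specific is captured by the stored offset.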

Similarly, constant associated items look like this:

trait CelestialBody {
    // &self signifies this must be used with an actual reference to an instance of
    // CelestialBody.
    const fn MASS(&self) -> f32 = _;
}

impl CelestialBody for Earth {
    const fn MASS(&self) -> f32 = 5.972e24; // This must be a const expression
}

// use
let x: &dyn CelestialBody = &Earth { mass: 1. };
x.MASS(); // 5.972e24

In this case, the actual value is stored directly inside the vtable. The value is byte-copied when it is accessed.
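To make the layout concrete, here is a rough emulation in today's Rust with a hand-rolled fat pointer. `BodyVtable` and `DynBody` are hypothetical stand-ins for what the compiler would generate, and the Moon's mass is an illustrative value that does not come from the proposal.

```rust
use std::marker::PhantomData;

/// Hypothetical vtable layout: the constant lives in the table itself.
struct BodyVtable {
    mass: f32,
}

struct Earth;
struct Moon;

static EARTH_VTABLE: BodyVtable = BodyVtable { mass: 5.972e24 };
static MOON_VTABLE: BodyVtable = BodyVtable { mass: 7.342e22 };

/// Hand-rolled stand-in for `&dyn CelestialBody`: data pointer plus vtable pointer.
struct DynBody<'a> {
    #[allow(dead_code)] // the constant accessor never touches the data pointer
    data: *const (),
    vtable: &'static BodyVtable,
    _lifetime: PhantomData<&'a ()>,
}

impl<'a> DynBody<'a> {
    fn from_earth(earth: &'a Earth) -> Self {
        DynBody {
            data: earth as *const Earth as *const (),
            vtable: &EARTH_VTABLE,
            _lifetime: PhantomData,
        }
    }

    fn from_moon(moon: &'a Moon) -> Self {
        DynBody {
            data: moon as *const Moon as *const (),
            vtable: &MOON_VTABLE,
            _lifetime: PhantomData,
        }
    }

    /// Equivalent of `x.MASS()`: a copy out of the vtable, with no call.
    fn mass(&self) -> f32 {
        self.vtable.mass
    }
}

fn main() {
    let (earth, moon) = (Earth, Moon);
    let bodies = [DynBody::from_earth(&earth), DynBody::from_moon(&moon)];
    for body in &bodies {
        println!("{}", body.mass());
    }
}
```

An instance is still required to reach the vtable, which is why the proposed syntax takes `&self` even though the value itself does not depend on the instance's data.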

Why not use the existing associated constant syntax

We could choose to make associated consts object safe. However, this approach has a difficult problem:

Constant associated accessors are fundamentally different from normal associated constants: they can only be accessed through an actual reference to an instance of a given trait, whereas a normal associated constant can be accessed through just a type name.

For example:

trait Trait {
    const CONSTANT: u32;
}

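// Use (named `constant_of` here, since `use` is a reserved keyword)
fn constant_of<T: Trait + ?Sized>() -> u32 {
    <T as Trait>::CONSTANT
}

Since dyn Trait implements Trait, we should be able to call constant_of::<dyn Trait>(). But this can't work: there is no instance, and therefore no vtable, to read the value from.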
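The following sketch shows the situation in today's Rust. The implementing types `A` and `B` are made up for illustration, and the helper is named `constant_of` because `use` is a reserved keyword.

```rust
trait Trait {
    const CONSTANT: u32;
}

struct A;
struct B;

impl Trait for A {
    const CONSTANT: u32 = 1;
}

impl Trait for B {
    const CONSTANT: u32 = 2;
}

// The value is resolved at compile time from the type alone;
// no value of type `T` is ever involved.
fn constant_of<T: Trait + ?Sized>() -> u32 {
    <T as Trait>::CONSTANT
}

fn main() {
    println!("{}", constant_of::<A>());
    println!("{}", constant_of::<B>());
    // `constant_of::<dyn Trait>()` has nothing to resolve against: without an
    // instance there is no vtable to consult. Today the compiler rejects
    // `dyn Trait` here outright, because an associated const makes the trait
    // unusable as a trait object.
}
```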

If we introduce a different syntax for accessing associated consts via a trait object, then we create a sort of implicit type bound that can't be known without looking at the function body. And forcing the use of the new syntax wherever a T: ?Sized bound exists would potentially be a breaking change.

So instead, we add a new kind of associated item, which can be accessed with a consistent syntax whether through a trait object or through a concrete type: you must use an actual reference to a value.

(This problem was pointed out by @steffahn in the discussion linked above.)

Drawbacks

?

Alternative syntaxes

use

trait Trait {
    use self::{field1: u32, mut field2: bool};
}
struct A(u32, bool);
impl Trait for A {
    use self::{0 as field1, 1 as field2};
}

let

trait Trait {
    let value: u32;
    let mut value_mut: u32;
}

struct Type(u32);
impl Trait for Type {
    let value = self.0;
}

// use
let concrete: &dyn Trait = &Type(10);
println!("{}", concrete.value); // 10

This syntax seems more natural, but a problem arises when we consider associated constants:

trait Trait {
    // What should syntax for constant accessor be?
    // const VALUE: u32 // this syntax already exists

    // Proposal 1:
    dyn const VALUE: u32;
    // Proposal 2:
    const VALUE: u32 where Self: ?Sized; // somewhat difficult to convey
                                         // this can not be accessed like T::VALUE
}

Same as suggested but with a different keyword

Same as the examples in the design section, but replace fn with something else.

Use an attribute

#[accessor]
fn mass(&self) -> &f32;

// impl
#[accessor]
fn mass(&self) -> &f32 { &self.mass }

The problem with this is that users might think they are writing a normal function, and be surprised when things that work in a normal function do not work here.

Bare

Like members in a struct definition:

trait Trait {
    value: u32;
    mut value_mut: u32;
}

This syntax just feels wrong.

Unresolved questions

  1. Is borrow splitting allowed?

Prior art

Using a simple offset implies that the assigned expression (&self.mass) must not have any additional indirection. It could allow digging into nested fields, but not through any dereferencing. Off-hand, I can't think of any existing language feature with that kind of constraint.

Right, the accessors are for accessing direct member fields of self. I chose &self.mass because it's a familiar syntax; maybe that doesn't convey the limitation well enough?

Maybe the limitation can be lifted a little bit. e.g.

struct A(u32);
struct B(A);

// this should probably be allowed, like you suggested
fn get(&self) -> &u32 = &self.0.0;

But designing exactly what is and what is not allowed can be tricky (what about array indexing? unsafe code that adds an offset to self and then dereferences?), so I went for the simplest option.

In theory, it's possible to encode more information in the vtable to support multiple levels of indirection. But that is like embedding a small bytecode program with instructions on how to dereference, and its cost will quickly catch up with that of a virtual function call.
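To give a feel for what such a "program" might look like, here is a rough sketch. The `Step` instruction set, the `follow` interpreter, and the `Outer`/`Inner` types are all invented for illustration. It relies on `Box<T>` for sized `T` being represented as a single pointer.

```rust
use std::mem::offset_of;

#[repr(C)]
struct Inner {
    tag: u8,
    value: u32,
}

#[repr(C)]
struct Outer {
    id: u64,
    inner: Box<Inner>,
}

/// Hypothetical instruction set a vtable entry would have to carry.
enum Step {
    /// Advance the current pointer by this many bytes.
    Offset(usize),
    /// Load the pointer stored at the current address and continue from it.
    Deref,
}

/// The "program" for reaching `outer.inner.value`.
static OUTER_VALUE_PATH: [Step; 3] = [
    Step::Offset(offset_of!(Outer, inner)),
    Step::Deref,
    Step::Offset(offset_of!(Inner, value)),
];

/// Interprets `path` starting from `ptr`.
///
/// # Safety
/// `ptr` must point to a live value whose layout matches what `path` describes.
unsafe fn follow(mut ptr: *const u8, path: &[Step]) -> *const u8 {
    for step in path {
        ptr = match step {
            Step::Offset(n) => unsafe { ptr.add(*n) },
            Step::Deref => unsafe { *ptr.cast::<*const u8>() },
        };
    }
    ptr
}

fn outer_value(outer: &Outer) -> u32 {
    unsafe { *follow(outer as *const Outer as *const u8, &OUTER_VALUE_PATH).cast::<u32>() }
}

fn main() {
    let outer = Outer { id: 7, inner: Box::new(Inner { tag: 1, value: 42 }) };
    println!("id={} tag={} value={}", outer.id, outer.inner.tag, outer_value(&outer));
}
```

Each access now runs a loop with a branch per step, which is the kind of overhead the single-offset design avoids.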

What advantage does this offer over

impl CelestialBody for Earth {
    fn mass(&self) -> &f32 { &self.mass }
}

Is it just intended to force a more optimized vtable representation?

Yes. Calling mass() through a dyn CelestialBody forces a virtual function call, and the compiler can't optimize away the call (even when the returned value is not used!) unless the call is devirtualized.

This is meant to eliminate the virtual call altogether.

Are these supposed to offer borrow splitting as well? That's the primary advantage of public fields over accessors normally.
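For readers unfamiliar with the term, this small example shows the difference in today's Rust. The `Body` type and its methods are made up for illustration.

```rust
struct Body {
    pub mass: f32,
    pub radius: f32,
}

impl Body {
    fn mass_mut(&mut self) -> &mut f32 {
        &mut self.mass
    }

    fn radius_mut(&mut self) -> &mut f32 {
        &mut self.radius
    }
}

/// Direct field access: the borrow checker sees two disjoint places,
/// so both mutable borrows may be live at the same time.
fn scale_via_fields(body: &mut Body, factor: f32) {
    let mass = &mut body.mass;
    let radius = &mut body.radius;
    *mass *= factor;
    *radius *= factor;
}

/// Getter methods: each call borrows all of `body`, so the two borrows
/// must not overlap in time.
fn scale_via_getters(body: &mut Body, factor: f32) {
    *body.mass_mut() *= factor;
    *body.radius_mut() *= factor;
    // Holding both at once is rejected with error[E0499]:
    // let m = body.mass_mut();
    // let r = body.radius_mut();
    // *m *= factor;
}

fn main() {
    let mut body = Body { mass: 2.0, radius: 3.0 };
    scale_via_fields(&mut body, 2.0);
    scale_via_getters(&mut body, 0.5);
    println!("{} {}", body.mass, body.radius);
}
```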

Unfortunately borrow splitting could be unsound if done naively.

trait Trait {
    fn field1(&mut self) -> &mut u32 = _;
    fn field2(&mut self) -> &mut u32 = _;
}

struct A(u32);
impl Trait for A {
    fn field1(&mut self) -> &mut u32 = &mut self.0;
    fn field2(&mut self) -> &mut u32 = &mut self.0;
}

Maybe we can require all accessors to reference different fields. I'm not sure whether that would have unforeseen consequences.
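Since each accessor boils down to an offset and a size, one way to picture such a rule is as a check that the byte ranges do not overlap. The following is only a sketch of that idea, not a proposed compiler algorithm. `AccessorRange`, `all_disjoint` and `Pair` are hypothetical names.

```rust
use std::mem::{offset_of, size_of};

/// Byte range `[offset, offset + size)` that one accessor exposes.
#[derive(Clone, Copy)]
struct AccessorRange {
    offset: usize,
    size: usize,
}

impl AccessorRange {
    fn overlaps(self, other: AccessorRange) -> bool {
        self.offset < other.offset + other.size && other.offset < self.offset + self.size
    }
}

/// True if no two accessors in one impl expose overlapping bytes.
fn all_disjoint(ranges: &[AccessorRange]) -> bool {
    ranges
        .iter()
        .enumerate()
        .all(|(i, a)| ranges[i + 1..].iter().all(|b| !a.overlaps(*b)))
}

#[repr(C)]
struct Pair {
    first: u32,
    second: u32,
}

fn main() {
    let first = AccessorRange { offset: offset_of!(Pair, first), size: size_of::<u32>() };
    let second = AccessorRange { offset: offset_of!(Pair, second), size: size_of::<u32>() };

    // Two accessors on different fields: splitting would be fine.
    println!("{}", all_disjoint(&[first, second]));
    // Two accessors on the same field, as in the example above: rejected.
    println!("{}", all_disjoint(&[first, first]));
}
```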

That's harder (impossible?) if there are multiple traits involved, possibly across multiple crates.

We can allow borrow splitting only when all used accessors are from the same trait, and not when accessors from different traits are used at the same time.

(Disclaimer: Just musing out loud here. Beware undigested thoughts.)

One version might be to use something more pattern-like here. Maybe fields in traits could be exposed more like a view type, so you provide not each field separately, but everything in a particular view together.

That way it would be relatively easy for the compiler to check (it's the same as it already does in a method today) and checking the consumption would be plausible too (as they would need to come from the same view).

Of course, at that point just returning a view type from a method becomes much more similar...

This definitely sounds interesting. What would the declaration in the traits look like?

Actually, it sounds like accessors could serve as a sort of view type:

trait A {
    fn field(&mut self) -> &mut u32 = _;
}
struct T(u32, bool);
impl A for T {
    fn field(&mut self) -> &mut u32 = &mut self.0; // <--- (1)
}

fn use_field<U: A>(value: &mut U) { ... }

// main
let mut x = T(0, true);
let x1 = &mut x.1;
use_field(&mut x); // this only borrows x.0, because of (1)
*x1 = false;
*x1 = false;

// this would only work if the trait only has accessors and no methods.
// i.e. methods can be seen as borrowing all fields by default

This feels a bit like scope creep, but definitely interesting.

To add a bit more: view types alone are not enough to enable the kind of optimization I am seeking.

Consider this hypothetical syntax:

trait Trait {
    fn field(& {_} self) -> &u32;
}
impl Trait for ... {
    fn field(& {some_member} self) -> &u32 { ... }
}

It's not enough to just store the offset of some_member in the vtable for fn field, as the function body can be arbitrarily complex.

Compared to:

trait Trait {
    fn field(&self) -> &u32 = _;
}
impl Trait for ... {
    fn field(&self) -> &u32 = &self.some_member;
}