Generalized Partial Borrows

Credit to pczarn in the thread "Partial borrowing (for fun and profit)" (Issue #1215, rust-lang/rfcs, GitHub) for the general idea and syntax.

I was looking through some suggestions regarding view types and partial borrowing and had an epiphany. This is my proposal:

Definitions

The Nill Lifetime

For my proposal to be sound and integrate well into the existing rules of borrowing, I had to include the nill lifetime - an antithesis of the 'static lifetime. Let '! denote the lifetime that represents no code regions at all. In other words, for<'a> 'a: '! is a true statement. Effectively, it is meaningless - no reference &'! T can even be used because everything outlives it. On its own, the nill lifetime is useless, but it will come in handy later.
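For orientation, the relation that '! is meant to mirror already exists at the other end of the ordering: 'static outlives every lifetime, so a &'static T can always be shortened to a &'a T. A minimal sketch in today's Rust follows; the function name `shorten` and the `_anchor` parameter are made up for illustration, and '! itself cannot be written in current Rust.

```rust
// `'static: 'a` holds for every `'a`, so this coercion always type-checks.
// The proposed `'!` would sit at the opposite end: `'a: '!` for every `'a`.
fn shorten<'a>(_anchor: &'a u8, s: &'static str) -> &'a str {
    // `_anchor` exists only to give `'a` a concrete, shorter lifetime.
    s
}
```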

Lifespan

Let us define a lifespan to be a lifetime-access pair, where 'access' is either mutable or immutable. In other words, 'a mut is a lifespan, and so is 'static, and so on. I will denote a lifespan as "a, analogous to the 'a syntax.

Hence, any reference may be boiled down to a lifespan: &'a mut T is the same as &"a T, where "a = 'a mut.

On its own, this notion is virtually useless. It is important for my definition of composite lifespans.

Composite Lifespan

A composite lifespan is a "tuple" containing multiple (composite) lifespans, like ('a, 'b mut, 'static). For clarity, I will use the syntax ~a to denote a composite lifespan named a. Can't think of a better symbol to use, so it's a placeholder. Note that composite lifespans may be recursive: ('a, ('b mut, 'c), 'd mut) is a valid composite lifespan.

Fragmented References

Composite lifespans allow us to define a more general version of partial borrows and views into structs. The main challenge with implementing view types is the private fields of a (tuple) struct. I therefore propose the following syntax:

pub struct Struct<T>{
    pub x: i32,
    pub y: i32,
    z: T
}

struct ref &("x, "y, ~z) Struct<T>
where
    &~z T:
{
    x: "x,
    y: "y,
    z: ~z
}

In the struct ref block we tell the compiler that references to Struct shall have composite lifespans consisting of two lifespans "x and "y and a composite lifespan ~z that may be used with the type T (hence the where &~z T: clause).

To use this syntax in function calls one would do:

impl<T> Struct<T> {
    pub fn get_x<'x>(&('x mut, '!, '!) self) -> &'x mut i32 {
        &mut self.x
        // `&mut self.y`
        //--^ value does not live long enough: '! needs to outlive 'x
    }
    
    pub fn get_y<'y>(&('!, 'y mut, '!) self) -> &'y mut i32 {
        &mut self.y
    }
    
    pub fn get_z<'z>(&('!, '!, 'z mut) self) -> &'z mut T {
        &mut self.z
    }
}

fn main() {
    let mut s: Struct = ...;
    let x = s.get_x();
    let y = s.get_y();
    let z = s.get_z();
    *x = -1;
    *y = 42;
    *z = ...;
}

Let's observe what happens behind the scenes:

  • We create an object of type Struct.
  • The call to Struct::get_x takes a fragmented reference &('0 mut, '!, '!) s which tells the borrowck "whatever inside s that is tied to the first lifetime is borrowed mutably for '0".
  • The subsequent call to Struct::get_y uses a fragmented reference &('!, '1 mut, '!) s. Normally, borrowing the same struct mutably twice is impossible. In this case, however, the stuff in s that is mutably borrowed throughout the lifetime '1 (namely, s.x, though it could be any number of private fields!) cannot be borrowed inside the body of the function Struct::get_y, for it has been declared to possess the nill lifetime, which does not overlap with any other lifetime ('nothing' cannot overlap with anything, by definition).
  • By the same reasoning, Struct::get_z may be called in conjunction with the first two methods, thus letting us borrow the private field disjointly from the public ones. Currently, such a mechanism is impossible due to the inability to deconstruct private fields.
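For comparison, here is roughly how the same situation looks in today's Rust. This is only a sketch: the `new` constructor, the `split` method and the `demo` function are hypothetical additions, not part of the proposal. Each getter taking &mut self locks the whole struct, so the usual workaround is a single method that hands out all disjoint borrows at once.

```rust
pub struct Struct<T> {
    pub x: i32,
    pub y: i32,
    z: T,
}

impl<T> Struct<T> {
    pub fn new(x: i32, y: i32, z: T) -> Self {
        Struct { x, y, z }
    }

    // Borrows all of `self`, so no other getter can be used while the
    // returned reference is alive.
    pub fn get_x(&mut self) -> &mut i32 {
        &mut self.x
    }

    // Hypothetical workaround: return every disjoint borrow from one call.
    pub fn split(&mut self) -> (&mut i32, &mut i32, &mut T) {
        (&mut self.x, &mut self.y, &mut self.z)
    }
}

fn demo() -> (i32, i32, String) {
    let mut s = Struct::new(0, 0, String::from("old"));

    // let x = s.get_x();
    // let (_, y, _) = s.split(); // rejected: `s` is already mutably borrowed
    // *x = -1;

    let (x, y, z) = s.split();
    *x = -1;
    *y = 42;
    z.push_str("+new");

    // The borrows have ended here, so the fields can be read again.
    (s.x, s.y, s.z)
}
```

The cost of `split` is that the signature has to enumerate every field it exposes, which is the coupling the fragmented references above are meant to avoid.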

Rules

  • Any number of fragmented references are allowed to coexist simultaneously as long as the sets corresponding to each entry across all said references obey the "multiple readers, single writer" rule. For instance, given a type with a fragmented reference &(~a, ~b, ~c), a set of {&(~a_i, ~b_i, ~c_i)} can exist if the sets {&~a_i}, {&~b_i}, {&~c_i} obey the ownership rules. Because fragmented references may be recursive, so is the rule described above. The terminating condition is when we get to a set of lifespans, for which the ownership rules already exist in the language.
  • All non-uniform fragmented references to primitive types shall be valid yet unusable in safe Rust - to be discussed later.
  • Only one type of a composite lifespan shall exist for product types (structs, tuple structs and tuples).
  • A regular reference with lifespan "0 to a type with a defined composite lifespan (~1, ~2, .., ~n) is the same as &("0, "0, .., "0).
  • A lifespan "0 may coerce to "1 if and only if:
    1. "0 is mut or both aren't.
    2. The lifetime of "0 contains the lifetime of "1
  • A composite lifespan ~0 may coerce to a composite lifespan ~1 if and only if every entry in ~0 can coerce to the corresponding entry in ~1. Recursion stops at lifespans.
  • Any lifetime may coerce to '!.
  • '! shall not outlive any lifetime that isn't '!.
  • Any number of (im)mutable references with the nill lifetime may exist at the same time with any set of references that itself obeys the ownership rules.
  • Only &'_ T may implement the Copy trait.
  • If a type doesn't implement a composite lifespan rule, every field/element receives a unique associated lifespan by default.
  • Only fragmented references of the same "kind" to a primitive type shall exist at the same time. For instance, one cannot use a &('!, '_ mut) i32 while a &('_, '!, '!) i32 to the same object already exists and its lifetime overlaps with that of the former.
  • Given a type with public fields, during a partial borrow/deconstruction the object is borrowed with a fragmented reference in which the lifespans associated with the borrowed fields are inferred, with the rest being '! or a recursive composite lifespan that ends with '!.

These rules ensure that composite lifespans and fragmented references are useful, complement the concepts of ownership and references, and do not break the already established rules of the language.
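The coexistence rule can be modelled as a small checker, which may help when reasoning about examples. This is a toy model under loud assumptions: the names `Span`, `flatten` and `can_coexist` are invented here, lifetimes are ignored (all references are assumed to be live at the same time), and nested composites are simply flattened into their leaf lifespans.

```rust
#[derive(Clone, Debug, PartialEq)]
enum Span {
    Nil,                  // models `'!` (with or without `mut`)
    Shared,               // models `'a`
    Mut,                  // models `'a mut`
    Composite(Vec<Span>), // models a nested composite lifespan
}

// Flatten a (possibly nested) composite lifespan into its leaf lifespans.
fn flatten(span: &Span, out: &mut Vec<Span>) {
    match span {
        Span::Composite(inner) => {
            for s in inner {
                flatten(s, out);
            }
        }
        leaf => out.push(leaf.clone()),
    }
}

// True if the given fragmented references may all be live at once.
fn can_coexist(refs: &[Span]) -> bool {
    let flat: Vec<Vec<Span>> = refs
        .iter()
        .map(|r| {
            let mut leaves = Vec::new();
            flatten(r, &mut leaves);
            leaves
        })
        .collect();
    let width = match flat.first() {
        Some(first) => first.len(),
        None => return true,
    };
    // References of different shape cannot be compared slot by slot.
    if flat.iter().any(|f| f.len() != width) {
        return false;
    }
    // Per slot: any number of readers, or exactly one writer. Nil never counts.
    (0..width).all(|slot| {
        let muts = flat.iter().filter(|f| f[slot] == Span::Mut).count();
        let shared = flat.iter().filter(|f| f[slot] == Span::Shared).count();
        muts == 0 || (muts == 1 && shared == 0)
    })
}
```

With this model, the three references used by get_x, get_y and get_z pass the check, while a second reference that reads or writes the first slot alongside get_x does not.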

Syntax

  • Defining a fragmented reference for a struct:
    struct ref &(~a, ..., "b, ..., 'c, ...) Struct<T, ...>
    where
        &~a T:,
        ...
    {
        field0: ~a,
        ...,
        field_m: "b,
        ...,
        field_n: 'c,
        ...
    }
    
    Any field can be assigned any of the mentioned (composite) lifespans/lifetimes. For any generic field there must be a corresponding where clause. Fields may be associated with the nill lifetime, making the field unusable via a (fragmented) reference.
  • Defining a fragmented reference for a tuple struct:
    struct ref &(~a, ..., "b, ..., 'c, ...) Struct<T, ...>(~a, ..., "b, ..., 'c, ...)
    where
        &~a T:,
        ...;
    
    Or
    struct ref &(~a, ..., "b, ..., 'c, ...) Struct<T, ...>
    where
        &~a T:,
        ...
    {
        0: ~a,
        ...,
        m: "b,
        ...,
        n: 'c,
        ...
    }
    
    The same considerations as above apply here too.
  • Defining a fragmented reference to a (tuple) struct as a function parameter:
    fn foo<'a, ...>(_: &('a, 'a mut, '!, '! mut, (...), ...) Struct) {...}
    
    Or
    fn foo(_: &('_, '_ mut, '!, '! mut, (...)) Struct) {...}
    
    In other words, one should use the nill lifetime and a set of generic lifetimes or the elided lifetime, perhaps recursively composite if the type allows so.
  • Defining a fragmented reference to a tuple as a function parameter:
    fn foo<'a, ...>(_: &('a T1, 'a mut T2, '! T3, '! mut T4, (...) T...)) {...}
    fn bar(_: &('_ T1, '_ mut T2, '! T3, '! mut T4, (...) T5)) {...}
    
    Which is syntactic sugar for
    fn foo<'a, ...>(_: &('a, 'a mut, '!, '! mut, (...), ...) (T1, T2, ...)) {...}
    fn bar(_: &('_, '_ mut, '!, '! mut, (...)) (T1, T2, T3, T4, T5)) {...}
    

Usage

Apart from the example shown previously, with partial borrowing across function boundaries, fragmented references can be used as views in an iterative context. Suppose we have a slice of Points, with Point being

pub struct Point(f32, f32);

There is currently no safe, regulated way to obtain an iterator over the first coordinate separately from the second coordinate. That is, a function set_xs must either accept an iterator that yields the first element of each Point or a mutable slice to the whole thing, allowing one to mutate the second element though the API clearly speaks against it. With fragmented references one could do the following:

struct ref &("x, "y) Point("x, "y);

fn set_xs(curr: &('_ mut, '!) [Point], new: &[f32]) {
   todo!()
}

Behind that todo!() should be a new function for the slice type:

impl<~a, T> &~a [T] 
where
    &~a T:
{
    pub fn iter_frag(self: &~a [T]) -> IterFrag<~a, T> {
        // `self` is unusable as a non-uniform fragmented reference
        let ptr: *mut _ = unsafe{ transmute::<_, &mut [T]>(self) } as _;
        IterFrag::new(ptr)
    }
}

IterFrag is a generalised Iter/IterMut that yields composite references instead of regular ones, thus restricting the access to the unwanted fields of the underlying data. Though we had to resort to unsafe to somehow use the slice reference, the whole operation is safe because of the ownership rules regarding fragmented references and the inner implementation of IterFrag which does not touch the fields which are not borrowed by the caller.
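For reference, the iterator-based alternative mentioned earlier can be written in today's Rust as below. It is a sketch of the status quo, not of IterFrag: `set_xs` here takes an iterator instead of a fragmented slice reference, and `demo` is a hypothetical caller.

```rust
pub struct Point(pub f32, pub f32);

// Accepts only references to first coordinates, so the callee cannot reach
// the second ones. The caller has to build the projection itself.
fn set_xs<'a>(curr: impl Iterator<Item = &'a mut f32>, new: &[f32]) {
    for (x, &value) in curr.zip(new) {
        *x = value;
    }
}

fn demo() -> Vec<(f32, f32)> {
    let mut points = vec![Point(0.0, 1.0), Point(0.0, 2.0)];
    // Project each `&mut Point` to a `&mut f32` pointing at field 0.
    set_xs(points.iter_mut().map(|p| &mut p.0), &[10.0, 20.0]);
    points.iter().map(|p| (p.0, p.1)).collect()
}
```

This only works because both fields of Point are public here; with a private field the projection closure could not be written outside the defining module.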

Partial Borrowing

The proposed concepts must not interfere with their current alternative - partial borrows. Instead, they must complement it. When a partial borrow/deconstruction occurs, the borrowck treats the event as if a fragmented reference was acquired. For example:

pub struct Point(pub f32, pub f32, f32);

struct ref &("xy, "z) Point("xy, "xy, "z);

let mut p: Point = ...;
let &mut Point { 0: ref x, 1: ref y, .. } = &mut p;

In the last line the compiler inserts a reference of the kind &('0 mut, '0 mut, '!) p so that methods that use &('!, '!, '_ mut) Point may still be called. Note that the same would be true if only one of the fields was accessed, such as let &mut Point { 0: ref x, .. } = &mut p, for the lifespan associated with p.0 and p.1 is the same.
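The behaviour this builds on can be observed in today's Rust, where the borrow checker already tracks individual fields inside a single function body but not across a method call. A minimal sketch, with `bump_z` and `demo` as hypothetical helpers living in the same module as Point:

```rust
pub struct Point(pub f32, pub f32, f32);

impl Point {
    // Takes all of `self`, even though it only touches the third field.
    fn bump_z(&mut self) {
        self.2 += 10.0;
    }
}

fn demo() -> (f32, f32, f32) {
    let mut p = Point(1.0, 2.0, 3.0);

    // Destructuring borrows only fields 0 and 1.
    let Point(ref mut x, ref mut y, ..) = p;
    // Accepted: a field-level borrow of `p.2` is disjoint from `x` and `y`.
    let z = &mut p.2;
    // p.bump_z(); // rejected here: the call needs a borrow of all of `p`

    *x += 10.0;
    *y += 10.0;
    *z += 10.0;

    // Accepted once the field borrows are no longer used.
    p.bump_z();
    (p.0, p.1, p.2)
}
```

Under the proposal, the commented-out call would be allowed if bump_z were declared with a fragmented reference that marks the first two entries as '!.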


Note that people tried to do this previously and it ended up being unsound. (Albeit I don't remember exactly how it was unsound.)

That then led to unsafe<'a> Foo<'a> instead of Foo<'unsafe>:


Could you please elaborate on the link between the nill lifetime I propose and the unsafe binders you referred to?

The connection would be interesting to hear in more detail.

Reading the initial design doc in the issue linked above, it sounds like unsafe_binders makes a clear safety boundary for an unsafe lifetime. Since you wouldn't be able to do anything with a '! nill lifetime in OP, I wonder what trickiness you'd need to get into to cause an issue.

IMHO, my attempt for partial borrow is better:

Example:

struct StructABC { a: u32, b: i64, c: f32, }

// function with partial parameter Struct
fn ref_a (s : & StructABC.{a}) -> &u32 {
    &s.a
}

let s = StructABC {a: 4, b: 7, c: 0.0};

// partial expression, partial reference and partial argument
let sa1 = ref_a(& s);
let sa2 = ref_a(& s.{a});

I don't see a resolution of the private field issue. In the RFC you linked there is no mention of exposure of private fields, so I assume it isn't implemented. How does one specify the access/lifetime of a private field without leaking private info?

It is written in "unresolved question" section:

It would be wonderful to have some pseudo-field, which meant "all not public(private) fields". Maybe !pub is Ok.

let fprivs = &foo.{!pub}; 
// fprivs : Foo.{privfld1, privfld2, privfld3, privfld4, privfld5,}

Not all private fields are made equal. What if I want to get access to only one private field and leave the rest unborrowed? What if I want to borrow all of them but one? With my approach it's straightforward - just tweak the struct ref implementation, no strings attached.

Because you said the nil lifetime is the other end of the lattice from 'static. And that's exactly what 'unsafe was too, so your nil lifetime is just as unsound as that was.

It is not forbidden to borrow any number of private fields, for example:

let fprivs = &foo.{privfld1, privfld3, privfld5}; 

I wish to have &foo.{!pub} case because in many ways I don't want to know internal structure.


Interesting idea. I like having some way to specify exactly which field(s) of a structure you are taking which kind of a reference to. I am not qualified to analyze soundness so instead I'm going to point out some minor irritants:

  • The word nil is spelled with only one L. (The word null is spelled with two Ls, except when you're talking about the legacy ASCII control character U+0000. Yes, English is annoying.)
  • Please do not introduce new uses of unpaired quotation marks. 'x for regular lifetimes was already a mistake. (It might be better to split the proposal; the "x notation appears to be necessary only to enable people to be generic over mutability and that's a whole separate can of worms, I think.)

Splitting the proposal would further complicate things, for everything is tied to everything else. For instance, the nil lifetime is not needed in the language without the rest of the proposal. Being generic over mutability might sound scary, but since you cannot know whether the reference is mutable or not at the definition site, the only thing you could do is coerce the reference to an immutable one or use unsafe. You could also forbid the usage of lifespans ("x or any other syntax you prefer) in generic contexts, only allowing composite lifespans to be used generically, and those are entirely unusable for primitive types and can only be used with a product type if specified.

The causality here doesn't track. Just because one concept which you claim is "on the other side of the lattice from 'static" (which I'm not convinced of) isn't sound doesn't mean that another such concept is unsound too. Without an analogy, may you please point out what exactly is wrong with a nil lifetime that cannot be used?

With your syntax you can either borrow specific private (usually undocumented) fields by naming them (which leaks private information) or borrow all of them indiscriminately by using !pub. Is there some middle ground that doesn't expose structure?

I think there is a way to combine both the desire to not export internal structure and being flexible and borrow private fields individually: Let a struct definition (or its crate) specify groups of private fields that can be borrowed together. Moving a private field to another public group is a breaking change of course, but in the following we could move c into Inner without a breaking change, as long as we update the ref_groups accordingly.

struct Inner {
    i_a: u32,
    i_b: u32,
}

struct StructABC {
    a: u32,
    b: u32,
    c: Inner,
    pub d: u32,
    pub e: u32,

    // (This syntax probably has a lot of issues)
    pub ref_group private = [a, b, c], // Defined implicitly (`!pub`)?
    pub ref_group x = [a, b],
    pub ref_group y = [c.i_a],
    pub ref_group z = [c.i_b, d], // Groups could include public variables
    ref_group w = [a, c], // Intentionally private, used as alias in this crate.
}

Everything accessible from the outside is marked as pub, we're not exposing the internal structure. If the crate that defines the struct doesn't use ref_groups we still have the !pub or implicit private one (unless we want that to be opt-in). And crates using this still have a lot of flexibility (see below).

fn ref_a (s : & StructABC.{a}) -> &u32 {
    &s.a
}
fn do_something (s : & StructABC.{z, x}) {}
fn ref_private(s: &StructABC.{d, private}) {}

And in most cases you likely only need the private (!pub) ref_group and the names of public fields.
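Something close to ref_groups can be approximated in today's Rust with hand-written view structs, which may help in judging what the feature would save. The sketch below is an assumption-laden emulation: `GroupX`, `GroupZ`, `groups`, `bump_x`, `bump_z` and `demo` are all hypothetical names, and only the groups x = [a, b] and z = [c.i_b, d] are modelled.

```rust
struct Inner {
    i_a: u32,
    i_b: u32,
}

pub struct StructABC {
    a: u32,
    b: u32,
    c: Inner,
    pub d: u32,
    pub e: u32,
}

// Hypothetical stand-ins for `ref_group x = [a, b]` and `z = [c.i_b, d]`.
// Their fields stay private, so the internal layout is not exposed.
pub struct GroupX<'s> {
    a: &'s mut u32,
    b: &'s mut u32,
}

pub struct GroupZ<'s> {
    i_b: &'s mut u32,
    d: &'s mut u32,
}

impl StructABC {
    // Hands out both groups at once; they borrow disjoint fields.
    pub fn groups(&mut self) -> (GroupX<'_>, GroupZ<'_>) {
        (
            GroupX { a: &mut self.a, b: &mut self.b },
            GroupZ { i_b: &mut self.c.i_b, d: &mut self.d },
        )
    }
}

fn bump_x(g: GroupX<'_>) {
    *g.a += 1;
    *g.b += 1;
}

fn bump_z(g: GroupZ<'_>) {
    *g.i_b += 1;
    *g.d += 1;
}

fn demo() -> [u32; 4] {
    let mut s = StructABC { a: 1, b: 2, c: Inner { i_a: 0, i_b: 3 }, d: 4, e: 5 };
    let (gx, gz) = s.groups();
    bump_x(gx);
    bump_z(gz);
    [s.a, s.b, s.c.i_b, s.d]
}
```

The emulation needs one struct per group and one method per combination of groups, which is the boilerplate a built-in grouping syntax would remove.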


Sounds awfully similar to my proposal with associated lifetimes, just with different syntax


Yes. Mainly because I find this hard to read, especially since it introduces 3 new symbols to an already complex lifetime system: "a, ~a, '! and less intuitive than having a single 'lifetime annotation and specifying the fields after the type. [1]

fn foo(_: &('_, '_ mut, '!, '! mut, (...)) Struct) {...}
fn normal(s: &StructABC) -> &u32 {}
fn ref_a(s: &StructABC{a}) -> &u32 {}
fn do_something(s: &StructABC{z, x}) -> &u32 {}
fn do_something_mut(s: &mut StructABC{&z, x}) -> &u32 {}

In foo: Without having a close look at the ref definition of Struct you have no idea which fields you can access in what way. [2]

I like the syntax by @VitWW a lot more because it's easier to parse (mentally), while the issue of private field exposure can still be solved (see my last comment).

Additionally, it doesn't require changes everywhere if you add another lifetime/reference-group, since each one has a name (the public field name or the name of the ref_group) instead of being a list of lifetimes like a tuple.

I think the issue of referring to private fields isn't a big problem [3] (hence my attempt at showing how it can be solved there).

With your syntax lifetime elision becomes weird/barely possible when using partial borrows (from what I can tell):

pub fn get_z(&('!, '!, mut) self) -> &mut T {
    &mut self.z
}

While we get it by default here:

pub fn get_z(&mut self{z}) -> &mut T {
    &mut self.z
}

Even when returning multiple references: Unless x and z need separate lifetime annotations (which I think is the case because both reference self) we only need one mention of the lifetime 'a (in the arguments).

pub fn get_z<'a>(&('a, '!, 'a mut) self) -> (&'a i32, &'a mut T) {
    (&self.x, &mut self.z)
}

// The inner & is only needed to indicate that x doesn't need to be mutable.
pub fn get_z(&mut self{&x, z}) -> (&i32, &mut T) {
    (&self.x, &mut self.z)
}
// Without lifetime elision
pub fn get_z<'a>(&'a mut self{&x, z}) -> (&'a i32, &'a mut T) {
    (&self.x, &mut self.z)
}
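As a baseline for the comparison above, this is how elision behaves in today's Rust when the whole of self is borrowed. The method names `get_xz` and `get_xz_explicit` are made up for this sketch; the struct mirrors the one from the original post.

```rust
pub struct Struct<T> {
    pub x: i32,
    pub y: i32,
    z: T,
}

impl<T> Struct<T> {
    // Elided form: both returned references take the lifetime of `&mut self`.
    pub fn get_xz(&mut self) -> (&i32, &mut T) {
        (&self.x, &mut self.z)
    }

    // The same signature with the single lifetime written out.
    pub fn get_xz_explicit<'a>(&'a mut self) -> (&'a i32, &'a mut T) {
        (&self.x, &mut self.z)
    }
}
```

In both forms `y` stays locked for as long as the returned tuple is in use, which is the restriction either partial-borrow syntax is trying to lift.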

  1. Is there a (parsing) reason for the . between the type name and the list of fields? ↩︎

  2. ref_groups have the same issue, but they are named and are only for private fields, which you can't access directly anyway. ↩︎

  3. Famous last words ↩︎


Any implementation of this concept must introduce new syntax, there's no way around it. I get the issue with "x - it may be considered redundant and included as a special case of ~x = ("x,), though that would entail more where clauses.

'! is arguably an intuitive thing in the context of ownership and lifetimes, though 'nil would be more ergonomic (at the cost of introducing a new keyword).

The whole shtick with partial borrowing and views revolves around lifetimes, which is why I think they should be an integral part of the syntax.

This is a serious problem with my syntax, so I propose a revision - similarly to how Rust has tuple structs and structs with named fields, so should fragmented references. For example:

pub struct Foo {
    pub x: ...,
    pub y: ...,
    z: ...,
    w: ...
}

struct ref &{x: "x, y: "y, priv: "p} Foo {
    x: "x,
    y: "y,
    z: "p,
    w: "p
}


fn bar(_: &{x, .. /*`..` are assumed to be `'!`*/} Foo) -> (&i32, &str) {...}  // without lifetime elision

fn baz(_: &{x: '_, ..} Foo) -> (&'_ i32, &'_ str) {...}  // with elision

fn hoo<'x>(_: &{x: 'x, ..} Foo) -> (&'x i32, &'x str) {...}  // explicit lifetime

And a tuple fragmented reference could be accessed by indices:

fn foo(_: &{0, 2, ..}) {...}

I like your sub-idea about additional grouping :star_struck: I've changed the syntax a bit:

struct StructABC {
    a: u32,
    b: u32,
    c: Inner,
    pub d: u32,
    pub e: u32,

    // grouping pseudo-fields
    pub let private : {a, b, c}, // Defined implicitly (`!pub`)?
    pub let x : {a, b},
    pub let y : {c.{i_a}},
    pub let z : {c.{i_b}, d}, // Groups could include public variables
    let w : {a, c}, // Intentionally private, used as alias in this crate.
}

Tuples are weird in both representations/syntaxes:

fn foo(value: &(u8, u8, u8).{1}) {}
fn foo(value: &{1} (u8, u8, u8)) {}

Aren't non-borrowed variables (let x: MyStruct) also just a view (think about padding)? You could thus probably argue that all struct definitions already define a view (same for enums) and thus all types are already a view, regardless of the existence of lifetimes or borrowing.

Though I think in the end it doesn't really matter which way to see it as long as it's intuitive to read/understand, which both variants are now.

:+1:

I wonder if there is any situation where you wouldn't want , .. here.

Personally I still think Foo is more relevant/important (in regards to what you can give the function) than which fields of foo are relevant (this one means you almost have to read the type + lifetime backwards to know where x comes from), but I get where you're coming from and think both variants are fine.