Notes on partial borrows

One of the borrow checker's major limitations is that it's not possible to create a reference that borrows from a subset of a type (such that borrows of non-overlapping subsets will not conflict). A language-level solution to this problem has long been desired.

For example, given a struct Foo { x: i32, y: String, z: u32 }, here's a (non-exhaustive) list of things people might want to express:

  • A reference to a Foo that takes a shared borrow of fields x and y only, both for lifetime 'a
  • A reference to a Foo that takes a shared borrow of field x and a mutable borrow of field z, both for lifetime 'a
  • A reference to a Foo that takes a shared borrow of field x for lifetime 'a, and a mutable borrow of field y for lifetime 'b
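
To make the limitation concrete, here is a minimal sketch in today's Rust. The helper names (x_ref, disjoint_fields, through_method) are made up for illustration. Borrowing fields directly inside one function body is already tracked per field; the conflict only appears once a borrow goes through a function signature, which can only mention the whole Foo.

```rust
struct Foo {
    x: i32,
    y: String,
    z: u32,
}

impl Foo {
    // The signature borrows all of `Foo`, although only `x` is read.
    fn x_ref(&self) -> &i32 {
        &self.x
    }
}

// Direct field borrows are tracked per field, so these do not conflict.
fn disjoint_fields(foo: &mut Foo) -> usize {
    let x = &foo.x; // shared borrow of `foo.x` only
    let z = &mut foo.z; // mutable borrow of `foo.z` only
    *z += 1;
    *x as usize + foo.y.len()
}

// Going through a method hides which fields are actually used.
fn through_method(foo: &mut Foo) -> i32 {
    let x = foo.x_ref(); // shared borrow of the whole `foo`
    // foo.z += 1; // rejected today: `foo` is still borrowed by `x`
    *x
}
```

A view reference on x_ref's receiver is what would let the commented-out line compile.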

Prior discussion:

There are two potential approaches here. Either the referent (T in &T/&mut T) specifies what can be accessed, which is usually called "view types"; or the reference (the &/&mut itself) does, which I'll call "reference views".

Approach 1: View types

Niko Matsakis's blog post takes the first approach, with the following justification:

When I’ve thought about this problem in the past, I’ve usually imagined that the list of “fields that may be accessed” would be attached to the reference. But that’s a bit odd, because a reference type &mut T doesn’t itself have an fields. The fields come from T.

However, this model has several issues:

  • Mutability gets specified twice, as part of both the reference and the view.
    • This inconvenience is balanced by the advantage of being able to specify mutability restrictions for owned values; but RFC 3323 likely makes that feature redundant.
  • It's impossible to have a reference that borrows multiple fields for different lifetimes.
  • How should owned view types be treated from an opsem perspective? Can they be memcpyed? Transmuted to [MaybeUninit<u8>]? Stored in a Box? mem::swapped? drop()ped? etc.

For these reasons, I believe that Niko's original instincts were correct, and putting the view on the reference is the right choice.

Approach 2: Reference views

Placeholder syntax

What might "reference views" look like? The &'a T/&'a mut T family would be extended with many more access modes. I won't make a full syntax proposal in this post. But for the sake of examples, this is the notation I'll be using, given the definition of Foo from earlier:

Field x              | Field y               | Field z               | Reference type
Shared borrow for 'a | Shared borrow for 'a  | Not borrowed          | &<'a x, 'a y> Foo
Shared borrow for 'a | Not borrowed          | Mutable borrow for 'a | &<'a x, 'a mut z> Foo
Not borrowed         | Mutable borrow for 'a | Mutable borrow for 'b | &<'a mut y, 'b mut z> Foo

&'a Foo is equivalent to &<'a x, 'a y, 'a z> Foo; similarly, &'a mut Foo is equivalent to &<'a mut x, 'a mut y, 'a mut z> Foo. (Or are they? See "Aliasing model" section.) &<> Foo is a reference with access to nothing.

Subtyping implications

For view references to be usable, a function that accepts them should be strictly more flexible than one that accepts the corresponding non-view reference. For example:

  • fn foo<'a>(_: &<'a x> Foo) is more flexible than fn foo<'a>(_: &<'a x, 'a y> Foo) is more flexible than fn foo<'a>(_: &'a Foo).

  • fn foo<'a>(_: &<'a mut z> Foo) is more flexible than fn foo<'a>(_: &<'a mut y, 'a mut z> Foo) is more flexible than fn foo<'a>(_: &'a mut Foo).

Function signatures are contravariant in their inputs, so the above implies that:

  • &<'a x> Foo is a supertype of &<'a x, 'a y> Foo is a supertype of &'a Foo.
  • &<'a mut x> Foo is a supertype of &<'a mut x, 'a mut y> Foo is a supertype of &'a mut Foo.

How does this extend to views with both shared and mutable portions? First of all, views compose:

  • fn foo<'a>(_: &<'a x> Foo) and fn foo<'a>(_: &<'a mut y> Foo) are more flexible than fn foo<'a>(_: &<'a x, 'a mut y> Foo).

Therefore,

  • &<'a x> Foo and &<'a mut y> Foo are supertypes of &<'a x, 'a mut y> Foo.

But, what's the relationship between &<'a x> Foo and &<'a mut x> Foo? Well,

  • Logically, fn foo<'a>(_: &<'a x> Foo) should be more flexible than fn foo<'a>(_: &<'a mut x> Foo), right?
  • So, &<'a x> Foo is a supertype of &<'a mut x> Foo.
  • Therefore, &<'a x, 'a y, 'a z> Foo is a supertype of &<'a mut x, 'a mut y, 'a mut z> Foo.
  • Therefore, &'a Foo is a supertype of &'a mut Foo.
  • Therefore, &'b &'a Foo is a supertype of &'b &'a mut Foo.
  • But that would be unsound.

Therefore, either substituting fn foo<'a>(_: &<'a mut x> Foo) with fn foo<'a>(_: &<'a x> Foo) must be forbidden, or the variance rules must give view mutability special treatment.
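
To see why the last step of that chain would be unsound, it may help to look at what today's Rust allows through a nested reference. The sketch below is only an illustration (read_through, read_long and demo are hypothetical names): reading through &'b &'a mut i32 yields a shared reference for the outer lifetime 'b only, and the commented-out function shows what the hypothetical subtyping would additionally permit.

```rust
// Accepted today: a shared reference obtained through `&'b &'a mut i32`
// lives only for the outer lifetime `'b`.
fn read_through<'a, 'b>(outer: &'b &'a mut i32) -> &'b i32 {
    &**outer
}

// If `&'b &'a mut i32` were a subtype of `&'b &'a i32`, the following would
// type-check and hand out a shared reference valid for all of `'a`:
//
// fn read_long<'a, 'b>(outer: &'b &'a mut i32) -> &'a i32 {
//     let widened: &'b &'a i32 = outer; // the hypothetical subtyping step
//     *widened                          // copies out a `&'a i32`
// }
//
// Once `'b` ends, the `&'a mut i32` is usable again, so the caller would hold
// a live `&'a i32` and a live `&'a mut i32` to the same value.

fn demo() -> i32 {
    let mut value = 1;
    let unique = &mut value;
    let seen = *read_through(&unique); // the outer borrow ends on this line
    *unique += 1; // fine: no shared reference outlives the outer borrow
    seen + *unique
}
```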

Monomorphization and TypeId

Under this scheme, &'a T and &'a mut T are equivalent modulo views. Yet they also have different TypeIds. Therefore, we must say that references that differ in views have distinct TypeIds and are monomorphized separately.
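
The premise that &'a T and &'a mut T already have different TypeIds can be checked directly in today's Rust. A minimal check, with a made-up function name:

```rust
use std::any::TypeId;

// `TypeId::of` requires a `'static` type, so both references use `'static`.
fn shared_and_mut_ids_differ() -> bool {
    TypeId::of::<&'static i32>() != TypeId::of::<&'static mut i32>()
}
```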

Multiple-lifetime views

Consider the following trait, and its implementation for Foo:

trait Frob {
    fn frob<'a>(&'a mut self) -> &'a mut u32;
}

impl Frob for Foo {
    fn frob<'a>(&'a mut self) -> &'a mut u32 {
        self.x += 1;
        &mut self.z
    }
}

As written, Foo::frob's signature requires that self remain mutably borrowed, and therefore inaccessible, in its entirety for as long as the returned &mut u32 is live.

fn test(mut f: Foo) {
   let mut_u32 = f.frob();
   dbg!(&f.x); // ERROR: f is mutably borrowed
   dbg!(mut_u32);
}

This is needlessly restrictive, however: the returned mutable borrow only involves the z field. The Frob implementation could be updated to use a view reference to reflect this:

impl Frob for Foo {
    fn frob<'a: 'b, 'b>(&<'b mut x, 'a mut z> self) -> &'a mut u32 {
        self.x += 1;
        &mut self.z
    }
}

In the updated signature, the returned &mut only references 'a; self.x was borrowed for 'b, not 'a, so it can be borrowed again as soon as frob returns, even while the &'a mut u32 is still live.

fn test(mut f: Foo) {
   let mut_u32 = f.frob();
   dbg!(&f.x); // works! only `f.z` is still borrowed.
   dbg!(mut_u32);
}
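
For comparison, today's Rust can only express this lifetime split by passing the fields separately, which gives up the method and the trait. A rough sketch, where frob_fields and caller are hypothetical names and not part of the proposal:

```rust
struct Foo {
    x: i32,
    y: String,
    z: u32,
}

// `x` is borrowed only for the short lifetime `'b`; `z` for the longer `'a`.
fn frob_fields<'a, 'b>(x: &'b mut i32, z: &'a mut u32) -> &'a mut u32 {
    *x += 1;
    z
}

fn caller(mut f: Foo) -> (i32, u32) {
    let mut_u32 = frob_fields(&mut f.x, &mut f.z);
    let x_now = f.x; // accepted: only `f.z` is still borrowed
    *mut_u32 += 10;
    (x_now, f.z)
}
```

The view-reference signature would keep the same two lifetimes while letting frob remain a method on Foo.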

However, the new signature introduced a new lifetime parameter 'b to frob's signature. As long as the parameter is late-bound (as it is in this example), this transformation is at most a minor breaking change (might break turbofish); but if it's early-bound, it could be major breakage (I think?).

Aliasing model

A view reference like &<x, mut z> Foo might have access to several discontiguous regions of memory, and there could be another reference like &<y> Foo concurrently accessing memory in the "gap" between x and z. The Rust aliasing model, and the annotations that Rust emits to LLVM, would need to reflect this.

One question that would need an answer is which view references are allowed to read or write padding bytes. There are several options here:

  • View references have no access to padding, only full &T and &mut T do. (This means that &T/&mut T are no longer equivalent types to a view over all fields.)
  • Only a view over all fields has access to padding, and this access is mutable only if all the views are mutable-access.
  • A reference with a view of two fields separated only by padding has a view over that padding, and that view is mutable iff the view of both fields is mutable. Access to padding at the beginning or end of a struct is granted by a view over its first or last field (as laid out in memory) respectively.
  • References can access the padding immediately following each field they can view (with the corresponding mutability); access to padding before the first field is controlled by access to the first field.
  • Etc…
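
For a sense of how much padding is at stake, here is a small sketch that measures it. Padded and padding_bytes are illustrative names, and the figures in the comments assume a target where u32 has 4-byte alignment (true on mainstream platforms, but an assumption nonetheless).

```rust
use std::mem::size_of;

// `repr(C)` fixes the field order, so the layout is predictable:
// `a` at offset 0, 3 padding bytes, `b` at 4..8, `c` at 8, 3 trailing padding bytes.
#[repr(C)]
struct Padded {
    a: u8,
    b: u32,
    c: u8,
}

// Bytes of `Padded` that belong to no field.
fn padding_bytes() -> usize {
    let field_bytes = size_of::<u8>() + size_of::<u32>() + size_of::<u8>();
    size_of::<Padded>() - field_bytes
}
```

Under that assumption half of the struct's 12 bytes are padding, split between a gap after a and a tail after c, which are exactly the two cases the options above treat differently.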

Conclusion

I have no plans to develop this into a full proposal, but I hope these notes are helpful to anyone else who decides to do so. I would appreciate people's thoughts, especially on view types versus reference views, mutability subtyping, and aliasing-model questions.


(Member of T-opsem; NON-NORMATIVE thoughts.)

To note, both the SB and TB proposed models operate on a per-byte basis, so discontinuous borrows aren't intrinsically problematic. (In fact, it's not impossible that we'll permit discontinuous allocated objects.) As a lower bound, &<x, mut y> Foo can behave like (&x, &mut y).
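
That lower bound can be written out in today's Rust as a struct of per-field references. The sketch below is only a stand-in for the proposed &<'a x, 'a mut y> Foo; ViewXMutY, append_x and demo are hypothetical names.

```rust
struct Foo {
    x: i32,
    y: String,
    z: u32,
}

// One reference per viewed field; `z` is simply absent.
struct ViewXMutY<'a> {
    x: &'a i32,
    y: &'a mut String,
}

fn append_x(view: ViewXMutY<'_>) {
    let ViewXMutY { x, y } = view;
    y.push_str(&x.to_string());
}

fn demo(mut foo: Foo) -> Foo {
    let view = ViewXMutY { x: &foo.x, y: &mut foo.y };
    foo.z += 1; // `z` is outside the view, so it stays accessible
    append_x(view);
    foo
}
```

Unlike a real view reference, this struct is one pointer wide per field and has to be built at the call site, since a constructor taking &mut Foo would borrow everything again.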

But padding would need to be addressed somehow. It's an identical question to the one that arises with UnsafeCell: we permit deallocation of borrowed memory if it's entirely covered by UnsafeCell (rationale: Arc getting freed while another thread is still in AtomicUsize::fetch_sub), so is padding ever considered “covered” by UnsafeCell when the type's fields are? It would imo be nonideal if the field adjacency rules for padding's retag UnsafeCellness differed from those for its mutness (though they do have different trade-offs and implications, potentially justifying a difference).

UnsafeCell has a bit of a cop-out option available, where UnsafeCell is (w.r.t. validity) “infectious,” and a reborrow is always either entirely “Freeze” or not[1]. (This is how rustc and LLVM work today.) The same cop-out mostly can't work for borrow views without weakening uniqueness guarantees[2].

...Except that maybe it can, by utilizing similar machinery to how TB fixes the “&Header problem” (also extern type). In (overly reductive) short, TB provisionally retags the “extended” provenance range, but only actually (retroactively) commits the effects of that reborrow (i.e. invalidation of other provenance) when those bytes are accessed. It might be possible to extend that behavior to padding. For the same reason as &Header, it should definitely apply to view-excluded fields[3].

I think the most straightforward approach would be to include padding in the retag of its minimally permissive neighbor. That's equivalent to your third option. If you allow

Also, &<> Foo might be more interesting than &() — specifically, it would be reasonable for it to apply an “exists” guard (LLVM dereferenceable(size_of)) to the Foo allocation, even if it doesn't otherwise eagerly retag any bytes.

Multiple-lifetime views

Could this potentially solve the &mut T -> &T “borrow downgrade” case as well? E.g. fn foo(&<'a mut self, 'b self> T) -> &'b T. TB is already equipped to handle this, as it treats all &mut borrows as “three phase” (cf. two-phase borrows).
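
For readers who haven't run into the “borrow downgrade” case, a small illustration in today's Rust (bump and demo are made-up names): a function that takes &mut but returns a shared reference still keeps its argument mutably borrowed for as long as the result is live.

```rust
// The result is a shared reference, but it carries the lifetime of the
// `&mut` argument, so the caller's mutable borrow lasts as long as it does.
fn bump(counter: &mut u32) -> &u32 {
    *counter += 1;
    counter
}

fn demo() -> u32 {
    let mut n = 0;
    let first = bump(&mut n);
    // let early = &n; // rejected today: `n` is still mutably borrowed while `first` is live
    let seen = *first;
    let peek = &n; // accepted once `first` is no longer used
    seen + *peek
}
```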

View Types vs Reference Views

Just to note, “enum variants as types” is quite similar to views as types. You can almost model enum variant types as view types, where the view is the fields of the variant; the limiting factor is the special handling of the virtual discriminant field.

I believe rustc is also more equipped to deal with type refinement than reference refinement (e.g. Freeze/Unpin are qualities of the type at compile time despite being properties of the borrow at runtime). But on the other hand, the existence of &Foo.{x, y} implies the ability to manipulate Foo.{x, y} by value[4]... which might be useful for self methods that only consume a subset of fields?

Not really? That's about more granular access/visibility control. It does have some overlap, but doesn't cover all expressible cases, such as a method only partially moving from self.

View types theoretically aren't that complicated, as they're “just” the use of “preserved” padding instead of “clobbered” padding. They also vastly simplify the interaction with generics, as for<T> &T can monomorphize to &Foo.{x,y} or &Foo.{z} just fine, but you'd probably want both &<'_ x, '_ y> Foo and &<'_ z> Foo to have T=Foo, and it's the & which differs. Add a trait bound and now this really matters, since e.g. Debug::fmt takes an unrestricted &T as input.

Subtyping implications

Also, subtyping opens up some impressively annoying questions w.r.t. impl applicability. Pattern restricted types (and enum variant types, which are a special case of such) are (last I saw) struggling with figuring out reasonable rules; it's important for availability of universal traits like Debug, Eq, and Any.

Lifetime variance doesn't cause problems because of the limited way of constraining impl applicability. You can constrain the “top,” but only relatively; it's always an applicable impl with all 'static lifetimes. You can constrain the “bottom,” but only by requiring it be exactly 'static. So you can only ever have one impl apply to the variant type family.

I think pattern restricted types have so far leant towards just forbidding any impls on restricted types, requiring you to use a newtype to write any new impls. (The newtype is also necessary to enable use of any created niches, as without it the restricted type is a subtype of the unrestricted type.) But a) “pattern restriction safe” (i.e. no &mut self methods) is a new quality (and breakage axis) for traits that doesn't currently exist, and b) pattern restricted types are subtypes of the unrestricted type, whereas view restricted types are supertypes of the unrestricted type, thus the existing impls are for the subtype, not the supertype, making our case fundamentally the inverse of pattern restricted types' anyway.

Conclusion

Fully understandable. This is the closest I've been to believing view restrictions are plausible, so it's good discussion. The big additional thing I'd personally want to see for a full proposal is some resolution to how impl selection and coherence should work with views.

It's also probably worthwhile to mention how view restrictions differ from pattern restrictions, and how we'll differentiate the two. They're very different (the restrictions go in opposite directions), but that's not immediately obvious from just syntax, especially since mut is a part of pattern syntax.


  1. But that does open a bit of a question: should UnsafeCell still “infect” a viewed reborrow which excludes any UnsafeCell fields? ↩︎

  2. It's considered reasonably acceptable for UnsafeCell (i.e. shared mutable) to be cross field infectious because it being cross variant infectious is effectively a necessity (such that reborrows don't need to read the (potentially niched) discriminant). But it's not 100% a given, as it's possible to use that granularity for optimization, even if LLVM can't (currently). ↩︎

  3. Which does build a bit of a footgun if UnsafeCell doesn't infect excluding views, since the field would be retagged as if it weren't covered by UnsafeCell. (This is necessary for LLVM noalias semantics, which can't be span restricted.) ↩︎

  4. By value views should be invariant over their field set because they'd forget to drop any additionally provided fields. Which is interesting combined with by reference views being variant. ↩︎


I very much hope so! Though the scheme I've described is insufficiently general to handle all cases of this. For example, fn frob<'a: 'b, 'b>(&<'b mut x, 'a z> self) -> &'a u32 would have only the shared borrow persist past the end of the function, but mutable access to z would be forbidden within the function. To fully address all cases, you would want a view like &<'b mut 'a x> Foo, where the reference can access Foo for a length of 'a but that access is mutable only for the length of 'b.

Now I'm wondering, would it make sense to have a view type that grants access to fields in several variants of an enum, conditional on the enum being in that variant? All such view references would have a read view of the discriminant (and would thus conflict with a full &mut, which would be the only way to get write access to the discriminant). I think the aliasing model would break this, but maybe I'm wrong?

Consuming something means assuming responsibility for dropping it—which, as you note, means that a by-value view type would need to be invariant in the view. This makes them useless for refining existing API contracts. Given that fact, I struggle to see a use-case.

Actually, I think reference views run into none of the problems that pattern types have here.

  • View references, despite their subtyping relationships, have unique TypeIds. This means we can allow writing distinct impls for different views of the same type without any soundness issues. HRTB subtyping already works like this (with a FCW, but there's no remaining unsoundness).
  • As you say, view references are supertypes of the standard references. This means, among other things, that implementing most of the standard traits for them (Eq, Debug etc) makes very little sense. Non-mut view references would need Copy and Clone impls, and Send/Sync would also need to be properly handled, but I think that more or less covers it.

Pattern restrictions go on the type, view restrictions go (I propose) on the reference. This distinct placement makes them easy to tell apart, as in &<view> Type is pattern, where the view attaches to the reference and the pattern to the type.

Okay this is as off topic as it gets, but I really really wanted that! Mainly for storing the tag of an enum separated from the payload, but still work with safe Rust. But also:

If we could have multiple structs at the same memory location, but whose fields were at non-overlapping memory offsets (like view types, but for owned structs), so that the fields of struct A would be considered to be padding by struct B (and vice versa)... that would be pretty cool. Maybe even a language feature to split a struct into two or more structs located at the same memory address. For example, splitting struct X(i32, f32) into a struct that has only a field i32 at offset 0 (and then padding), and a struct that has only a field f32 at another offset, both at the same memory location (and thus can be accessed at the same time in safe Rust)

Well now this doesn't seem so off topic anymore: this is just an owned version of view types, and then view types proper are a borrow of those split structs.
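
The borrowed half of this already exists in today's Rust, in the sense that one exclusive borrow of a struct can be split into disjoint per-field borrows. A minimal sketch using the struct X from above (split and demo are illustrative names); the owned "two structs at one address" idea has no current equivalent and is not shown.

```rust
struct X(i32, f32);

// Splits one exclusive borrow of `X` into two disjoint field borrows.
fn split(x: &mut X) -> (&mut i32, &mut f32) {
    (&mut x.0, &mut x.1)
}

fn demo() -> (i32, f32) {
    let mut x = X(1, 2.0);
    let (a, b) = split(&mut x);
    *a += 1; // both halves are usable at the same time
    *b *= 2.0;
    (x.0, x.1)
}
```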

I had thought about view types, too, and this is a concern I came up with. I feel like treating different views with different TypeIds has the potential to overdo on monomorphization.

Of course, &'a T and &'a mut T already have different TypeIds, but maybe one can avoid giving a new one to each kind of view type, too.

And/or maybe there are in general improvements to be had around cutting down on monomorphization. TypeIds and specialization seem to be the main thing hindering this possibility.

Again, for view types in particular, any &'static<…some view…> Foo would be a 'static type, and thus with the current Any API, different views require different TypeIds based on that observation alone. I’ve been contemplating whether it’s possible to de-couple T: 'static from T: Any (with a backwards-compatibility story where prior to edition X, a T: 'static bound [even an implied one] sort-of desugars to T: Any). But that seems hard, too.

I suspect this won't be a huge issue either. View references (as I'm imagining them) won't implement most traits, and won't be created unless you explicitly use the "view reference" operator. I don't expect them to show up everywhere. (This is in contrast to pattern types, which will absolutely show up everywhere, absolutely cannot afford to have different TypeIds, and where dealing with Any is absolutely a problem I have no idea how to solve.)

@Jules-Bertholet I've created 2 variants of partial types for resolving the partial-borrow problem (both closed as RFCs). Now the third version is in progress as a draft: partial_types3.

Main idea: give the borrow checker a mathematical guarantee that borrowing the whole variable with a partial type, while pretending to borrow just the permitted fields, is fully safe.

And since it is a guarantee by type, not by value, it has zero cost in the binary! Any type error is a compiler error, so there are no errors at runtime.

The core idea of this proposal is "Proxy Borrowing": we borrow the whole variable, but the borrow checker pretends it borrows just the permitted fields of this variable.

And the type checker gives a mathematical guarantee that all fields without read access remain intact and all partly-immutable fields remain immutable!

And this means that Proxy Borrowing is fully safe and zero cost in the binary.

How it looks:

struct StructABC { a: u32, b: i64, c: f32, }

// function with partial parameter Struct
fn ref_a (s : & StructABC.{a}) -> &u32 {
    &s.a
}

let s = StructABC {a: 4, b: 7, c: 0.0};

// partial expression at argument
let sa  = ref_a(& s.{a});
let sbc = & s.{b, c};

And yes, for full flexibility, partial mutability is also needed:

struct S4 {a : i32, b : i32, c : i32, d : i32}

let mut.{b, c} s_mbc  = S4 {a: 6, b: 7, c: 8, d: 9};

// full flexibility 
impl Point {
   pub fn mx_rstate(self : &mut.{x} Self.{x, state}) 
   { /* ... */ }

   pub fn my_rstate(self : &mut.{y} Self.{y, state}) 
   { /* ... */ }

   pub fn mxy_rstate(self : &mut.{x,y} Self.{x, y, state}) { 
    /* ... */
    self.{x, state}.mx_rstate();
    /* ... */
    self.{y, state}.my_rstate();
    /* ... */
   }
}

Much of the logic and syntax is already worked out, and it looks Rust-friendly.