Blog post: View types for Rust

FYI, using partial_borrow invokes UB, at least according to miri with the track-raw-pointers feature.

MIRIFLAGS="-Zmiri-track-raw-pointers" cargo +nightly miri run
1 Like

Yes. I think this is a false positive, although it's not entirely clear. According to the current SB 2.1 spec, raw pointers are not tagged. cf https://github.com/rust-lang/unsafe-code-guidelines/issues/134 Storing an object as &Header, but reading the data past the end of the header · Issue #256 · rust-lang/unsafe-code-guidelines · GitHub Stacked Borrows cannot properly handle `extern type` · Issue #276 · rust-lang/unsafe-code-guidelines · GitHub

Okay, fair point, I guess it might not be clear yet whether this really is UB or not. Note however that your crate does not interact nicely with the API provided by crates such as replace_with.

use partial_borrow::prelude::*;
use replace_with::replace_with_or_abort;

#[derive(PartialBorrow)]
struct Foo {
    field: String,
}

fn main() {
    let mut foo = Foo { field: String::new() };
    let r: &mut partial!(Foo mut field) = foo.as_mut();
    replace_with_or_abort(r, |p| {
        #[repr(C)]
        struct S {
            p: partial!(Foo mut field),
            field: String,
        }

        let mut s = S {
            p, field: "Hello World!".to_owned(),
        };

        let ref_1: &mut String = &mut *s.p.field;
        let ref_2: &mut String = &mut s.field;
        dbg!(&ref_1, &ref_2);
        // outputs:
        // [src/main.rs:26] &ref_1 = "Hello World!"
        // [src/main.rs:26] &ref_2 = "Hello World!"

        let r: &str = ref_2;
        std::mem::take(ref_1);

        println!("{}", r);
        // outputs garbage such as
        // *~c��G
        // (use after free)

        s.p
    });
}
1 Like

How alarming. Thanks for pointing this out. I would welcome an issue here Issues · Ian Jackson / rust-partial-borrow · GitLab (I just discovered that the issue tracker was disabled, sorry, I have enabled it.) That might be better than having this conversation here in the Internals thread...

Trying to log into that GitLab instance, I first tried to sign in with Google, only to get a strange error from Google:

Authorization Error

Error 403: org_internal

This client is restricted to users within its organization.

Then I tried to register a regular account, which worked, but apparently my account needs to be approved by administrators.

Anyway… this sort of thing is why we should stabilize extern types, while also making size_of_val panic when used on them (as opposed to the current nightly behavior of returning 0). The partial! type has size 0 and so replace_with thinks it can move it around in memory by copying 0 bytes. But in reality, pointers to that type are more like opaque handles. Anything that tries to check such a type's size is probably about to do something dangerous, so it should really be a panic.

3 Likes

I really like this proposal, looking at this thread, I see two large points left to be resolved;

  • How would the syntax look like?
  • How would this interact with traits and lifetimes?

I think both of those could be developed in a proper RFC, I strongly urge to bring this concept there.

Side note - this is unfortunately not true when it comes to impl Trait (including the async fn desugaring) and auto traits. Auto traits 'leak' through the underlying type through the opaque type, so you need to type-check the body of the callee to know if its opaque return type is Send/Sync

3 Likes

Is there a mistake in the example from the blog post? The code compiles when I change into_iter() to iter(), which seems easier than using view patterns.

        // for (bar, i) in self.bars.into_iter().zip(0..) {
        // replaced with:
        for (bar, i) in self.bars.iter().zip(0..) {

I think you’ll have to assume that a ChocolateBar cannot be cloned, and has a by-value fn into_wrapped(self, opt_ticket: Option<GoldenTicket>) -> WrapperChocolateBar method.

Sorry for bumping this thread again. I've tried to imagine a better syntax inspired by some ideas that people expressed ITT and based on some of my ideas about similar feature. Hence, I've came up with something interesting which I want to share here.

Still not sure if that wouldn't drag this thread into a wrong direction or expand its scope into unmanageable scale, however, as discussion is stalled adding something new to it doesn't seem to be inappropriate. That said, first of all I'm interested in whether this should be moved in a separate thread or suggestion is natural enough to belong here.

I'll start with expressing some thoughts about the original idea which links it to my:

1. Too much freedom over view creation

There's the following construct:

let self1 = &{golden_tickets} self;

I think it may not be necessary because it could be easily substituted with methods and functions. This requires more boilerplate but for the majority of use cases it may not be too much:

impl WonkaShipmentManifest {
    fn golden_tickets_view(&{golden_tickets} self) -> &{golden_tickets} Self {
        self
    }
}
let self1 = self.golden_tickets_view();

// Or even

fn tickets(t: &{golden_tickets} WonkaShipmentManifest) -> &{golden_tickets} WonkaShipmentManifest {
    t
}
let self1 = tickets(&wonka_shipment_manifest);

// And if functions could be generic over any view we may wrap this into macro:

let self1 = view!(wonka_shipment_manifest { golden_tickets, bars });

That said, it isn't clear why do we need an extra construct besides of that. API authors could predict which views API users may need, and providing such methods is rather trivial. Users then won't be lost in a big number of possibilities of what's possible to do with structs. Furthermore, there would be less unfamiliar syntax to learn, no exposure of private fields in public API (at least in expression context), and we will have some safeguard that prevents using views in places where reference would be fine.

Perhaps I just have a wrong expectation of how often views would be used, or miss some subtle detail, so it would be really helpful if you'll motivate better this construct.

2. Named groups of fields are more ergonomic

Enumerated fields in view notation for me feels too complicated. Even if there are 1-2 fields as in your examples the resulting type signature appears too long and disproportionate amid the other code. Moreover, it allows reordering so there would be many ways to represent the same type; for example &{bars, golden_tickets} WonkaShipmentManifest and &{golden_tickets, bars} WonkaShipmentManifest. Type aliases might help with hiding all of that, however, for me it seems to be just another layer of complexity on top.

Because of that perhaps what you've described as "named groups" must be the default requirement. Even with views that consists from a single field it could be simpler to eye-parse and reason about. And anyway, if my previous guess that API authors must expose appropriate views with methods is correct then it shouldn't be either a big burden for them to appropriately name these views along the way.

The payoff will be that we would have some common language for referring to views, plus less heavy syntax (without braces perhaps) and less repetition:

// Something like this:

struct WonkaShipmentManifest {
    bars: Vec<ChocolateBar>,
    golden_tickets view tickets: Vec<usize>,
} //               ^^^^^^^^^^^^
  // We may annotate `bars` with the same label to create a group

impl WonkaShipmentManifest {
    // This is just placeholder syntax which at this point doesn't matter
    //                          v                 v
    fn golden_tickets_view(&self.tickets) -> &Self.tickets {
        self
    }
}

fn example(shipment: &WonkaShipmentManifest.tickets) { ... }

That said, field reordering and exactness of &{x, y} Z syntax also could be better motivated so people would have a chance to know why an obvious alternative of named groups is wrong.

3. Similarities with lifetimes are too intrusive

Without the mentioned previously functionality we've left with almost the same mental model as with lifetimes. Namely, we cannot manually create lifetimes like let t: &'x T = t; and it's all the responsibility of API authors to name them on struct declarations, attach them to fields, group them, and then to expose an appropriate interface for that through methods... Moreover, even ITT people expressed that they have an intuition that there might be some similarities between views and lifetimes. Because the bar for novelty in Rust is very high this similarity should be of particular interest for us.

So, it would be useful to know why analogy with lifetimes is wrong if you would decide to promote the initially suggested syntax into proposal or RFC as is.




My idea: structs generic over views (assignments) similarly to lifetimes (scopes)

This could be too lengthy, so skim to the end if you're interested in the final result.

It's the next logical step after everything said above. We already know that views may have the same constrains as lifetimes, so let imagine what if we also give them the same properties:

  • To reuse the same 'a notation for declaring views (if it would be possible to infer where's the lifetime and where's the view) and the same &'a X, &'a mut X syntax for referring to them.
  • To make structs generic over views into them similarly to how currently we can make them generic over lifetimes declared inside.
  • To declare view inheritance like a: b: c where a contains fields of b and b contains fields of c similarly to how we declare with 'a: 'b: 'c that reference 'a outlives 'b and 'b outlives c.

How this could look is demonstrated on the next snippet:

//  This isn't a lifetime but a generic label
//  which can point either to lifetime or view.
//  We may have a convention for disambiguation:
//  [a-m] for lifetimes and [n-z] for views.
//  It may look weak but on the other hand side
//  the syntax remains extremely lightweight.
//                           vv
struct WonkaShipmentManifest<'n> {
    bars: Vec<ChocolateBar>,
    golden_tickets 'n: Vec<usize>,
} //               ^^
  // This is how views are "attached" to fields.
  // By annotating multiple fields with the same label
  // it allows to create a named group that could be referred later.
  // Maybe we should find a better placement for attachment —
  // this just looks similar to labels on expressions and I like it.

// Right, structs are generic over their views similarly
// to how they're generic over their lifetimes. This repetition
// might be annoying with a big number of views, but that's how
// generics in Rust works and as for now we must accept it.
//    vv                        vv
impl <'n> WonkaShipmentManifest<'n> {
    // Using views in methods looks not different from using lifetimes.
    // The premise here is that they're similar conceptually.
    //                       vv
    fn should_insert_ticket(&'n self, index: usize) -> bool {
        self.golden_tickets.contains(&index)
    }

    // Returning views also isn't different from returning references.
    //                                   vv
    fn golden_tickets_view(&'n self) -> &'n Vec<usize> {
        self.golden_tickets
    }
}

So, view declaration and view usage syntax here are very familiar: only 'n: field is added. I've reused the ' symbol because people are already get used to it and it was proven as the best choice for lifetimes. Through, at this stage it's hard to tell whether it's good choice for views: perhaps something like field *a: Type for declaration and *a mut self / *a self for usage would be cleaner. Still sigil is anyway not the most important in my suggestion so let just assume that ' was added as a placeholder which we may change in the future.

Yet the above example didn't demonstrated declaring view inheritance through 'n: 'o, 'o: 'p, 'p: 'etc syntax and what matters what's the use case for it.

I want to introduce another example which demonstrates the use case: such overlapping views could allow builder pattern to enforce a particular order of method calls or require some method calls without needing typestates — so called typed builder pattern. That said, we annotate every builder field with different view, define that each subsequent view inherits every previous, and then upgrade with setters from 'n to 'o, from 'o to 'p, etc. until the "complete" view — which contains all fields required in .build(), will become ready.

Below is naive semi-functional implementation:

#[derive(Default)]
struct Builder<'n, 'o, 'p> where 'o: 'n, 'p: 'o {
    a 'n: A,
    b 'o: B,

    // Required to create the "final" view
    end 'p: (),
}

impl <'n, 'o, 'p> Builder<'n, 'o, 'p> {
    fn new() -> Self { Default::default() }

    // Obviously ownership/borrowing system wouldn't allow this to work
    // as expected and we must find a workaround to make it functional.
    //        vvvvvvvvvvvv                          .........      vvvvvv
    fn with_a(&'n mut self, a: A) -> &'o mut Self { &mut Self { a, ..self } }
    fn with_b(&'o mut self, b: B) -> &'p mut Self { &mut Self { b, ..self } }
    fn build(&'p self) -> { .. } //  ^^^^^^^^^^^^
} //         ^^^^^^^^ Taking `&self` into `build` is also problematic

So, this only demonstrates the intuition; fortunately I see the path how to make it fully functional.


The first problem here is that we cannot transform e.g. &'n mut Self to &'o mut Self inside of method body mainly because of two independent reasons:

  • The field provided by 'n is behind reference and we cannot move out of it
  • The view created in method's body would be dropped at the end of scope and we cannot reference it

Someone may expect that fix would be embodied into something like "owned" views suggested in the original proposal, so signatures of setters would become ('n: self, ..) -> 'o: Self. However, that would increase the complexity of views and creates an ugly discrepancy between "upgradability" of owned views and borrowed. I really think that views must always stay behind references similarly to lifetimes to keep it simple.

So, this could be fixed relatively easily and fix even seems to be a very expected feature alongside with views: we allow upgrading from view 'n to 'o using assign/deref-assign syntax. That said, to upgrade we simply must assign every field from 'o view into 'n view:

fn with_a(&'n mut self, a: A) -> &'o mut Self { *self.a = a; self }
//         ^^^^^^^^^^^           ^^^^^^^^^^^^
// It's possible to assign `self.b` because we know that `'n` becomes `'o'.

// BTW this may compensate the lack of `let self1 = &{golden_tickets} self;`.

Next, the second problem with the above semi-functional example is that build takes self as reference so we will have trouble to move the underlying value out of it. Moreover, it requires self being "marked" with an extra 'o view which is absolutely useless for anything else while adding a lot of complicatedness and verbosity. Ideally build should take self by value; although, taking self by value is possible to achieve just by changing the signature this will allow writing Builder::new().build() where calling setters is skipped. We obviously don't want that since the whole point of using views in builder is to enforce setters and method call order.

So, the fix is quite interesting on its own: we introduce an "empty" view which would be supertype for every other view so we'd be able to "upgrade" from it into any other view by assigning all required fields of the expected view. Next step is that we return type generic over it from new while expecting build to take self by value — these types wouldn't match and method order remains preserved. So, there's no more need in the "final" marker view which I've introduced earlier:

#[derive(Default)]
struct Builder<'n, 'o> where 'o: 'n, {
    a 'n: A,
    b 'o: B,
}

impl <'n, 'o> Builder<'n, 'o> {
    // This is just placeholder name which we may change in the future.
    //                  vvvvv vvvvv
    fn new() -> Builder<'nil, 'nil> { Default::default() }

    fn with_a(&'nil mut self, a: A) -> &'n mut Self { *self.a = a; self }
    fn with_b(&'n mut self, b: B) -> &'o mut Self { *self.a = a; self }
    //                               ^^^
    // We assume that view which contains all fields is equivalent to `Self`.
    //       vvvv
    fn build(self) -> { .. }
}

This gives a lot of extra useful properties:

  • We don't need the final marker view anymore because we have "initial" 'nil marker instead
  • This is very similar to 'static lifetime which is implicitly available everywhere
  • Constructor, finalizer, first and intermediate setters are distinct between each other
  • We can enhance it further and skip Default and Option usage in builder altogether!

Especially interesting is the last: it's possible to allow substituting Default::default() in new with something like Self {} and completely skip deriving Default for Builder struct. That's because underlying memory behind 'nil is guaranteed to be reassigned on method's call side before being read ('nil doesn't provide access to any field but could be upgraded) so locally it's safe to allow this memory to stay uninitialized before that. This even renders holding unassigned fields of builder in Option unnecessary (when we know that it always must be assigned) as we've basically achieved null safety on type level!

struct Builder<'n, 'o> where 'o: 'n, {
    a 'n: A,
    b 'o: B,
}

impl <'n, 'o> Builder<'n, 'o> {
    fn new() -> Builder<'nil, 'nil> { Self {} }
    ..
}

//    Builder<'nil, 'nil>
//      vvvvvvvvvvvvvv
let x = Builder::new().with_a(a) ..
//                     ^^^^^^^^^
//                 Builder<'n, 'nil>

I either like the alignment of code with "empty" views and how easy it's possible to gather all necessary information to understand it (in comparison with typed builder implementation, of course). Perhaps this wouldn't project onto a real world code; however, I've neither demonstrated a complete solution...

Because something still didn't added up:

fn with_b(..) -> &'n mut Self { .. }
fn build(self) -> { .. }

The third, the final, and the most obnoxious problem with the above semi-functional example is that (&'n mut self, ..) -> &'o mut Self setters don't transfer ownership into build and this last step just prevents the whole pattern to function properly. As it was said earlier, we don't want to taking self by reference in build to become a requirement since that may create troubles with moving out of this reference.

Fix for that isn't trivial but still looks viable: we need to explicitly embed into method signature mention that it updates the reference from view 'n into view 'o. Then on method's call side view update would be treated in the same way as with assignment e.g. x.with_a(a); is considered the same as x.a = a;. And if we'd think about this a bit it becomes quite logical: methods that takes &mut self are basically wrappers around = — especially those that are setters on builders.

It's the most complicated fix to explain in words, so example might explain better:

// This is placeholder syntax which may change in the future.
//        vvvvvvvvvvv
fn with_a(&'nil -> 'n mut self, a: A) {
    *self.a = a;
}

Notice that I've also removed the -> &'m mut Self from this method signature because it anyway wouldn't allow method chaining: we won't be able to pass reference returned from this method into build. Right, the complete solution doesn't support convenient method chaining. However, this gives another advantage: if view upgrading propagates through a different mechanism than method chaining then we don't need view inheritance in builder pattern. That said, the following definition of Builder struct should work just fine:

struct Builder<'n, 'o> {
    a 'n: A,
    b 'o: B,
}

...

let mut x = Builder::new();
x.with_a(a);
x.with_b(b);
let y = x.build();

// The following is also possible:
let mut x = Builder::new();
x.b = b;
x.with_a(a);
let y = x.build();

// We still can enforce order of assignment with making builder 
// private under some module so only setters would be public. 

So, let take a look at how all combines together:

struct ProductBuilder<'n, 'o, 'p> {
    a 'n: A,
    b 'o: B,
    c 'p: C,
}

impl <'n, 'o, 'p> ProductBuilder<'n, 'o, 'p> {

    fn new() -> ProductBuilder<'nil, 'nil, 'nil> {
        Self {}
    }

    fn with_a(&'nil -> 'n mut self, a: A) {
        self.a = a
    }

    fn with_b(&'n -> 'o mut self, b: B) {
        self.b = b
    }

    fn with_c(&'o -> 'p mut self, c: C) {
        self.c = c
    }

    fn build(self) -> Product {
        Product::new(self.a, self.b, self.c)
    }
}


/************ USAGE *************/

let mut p = ProductBuilder::new(); // <'nil, 'nil, 'nil>
p.with_a(a);                       // <'n, 'nil, 'nil>
p.with_b(b);
p.with_c(c);                       // <'n, 'o, 'p> == Self
let product = p.build()


// It's possible to do the following:

let mut p = ProductBuilder::new();
p.a = a; // We can read this field after that.
println!("{}", p.a);
p.b = b;
p.with_c(c); // Works because all fields of `'o` are assigned.
drop(p.c); // Lowers to view `'o`.
p.with_c(d);
if condition {
    p.with_a(e); // Views are never "downgraded".
}
let product = p.build()


// BTW, we may have a special syntax for method chaining
// which in other languages is called "method cascading".
// For example with reusing notation from Dart programming
// language the first snippet might be simplified into:

let p = ProductBuilder::new()
    ..with_a(a)
    ..with_b(b)
    ..with_c(c)
    .build();

Honestly, I don't know how hard something like that would be to implement if even possible as e.g. upgrading views may require an extra analysis to be performed by compiler, and overall addition might be simply "too advanced" to fit into Rust's weirdness budget. Perhaps it's possible to implement simple state machines with this syntax but it's hard to say how practical the result would be. It's also hard to predict what corner cases this might introduce and overall how hard something like that would be to learn for newcomers.


Ultimately, three questions remain:

  • Am I misunderstood/miscalculated/forgot something which renders the above approach impractical?
  • What else patterns this could improve besides of typed builder?
  • Is there any useful overlap between views and lifetimes so we can write where: 'view + 'lifetime or even annotate lifetimes and views with the same label in order to allow self-referential structs or any similar functionality?

Hi there! I'm quite new to Rust and just recently hit this very limitation that view types is trying to solve. To be honest reading through all of the examples mentioned here makes me wanna throw rust away and forget about it lol, no offense =)

IMO for solving original problem it seems unnecessary to introduce any new syntax at all. Why can't we let compiler (borrow checker) to do the job for us by letting it collect info/metadata on which path's are being used in borrows of the function recursively and then checking this metadata at the places where functions are called?

In fact such functionality already exist at least partially for closures.

The benefit would be that no extra syntax and no new concepts have to be added to rust and as such the learning curve can remain what it is today and not worse. Also there is no risk of messing with public/private APIs as the checks are done in the context of usage so only what is accessible in particular context needs to be validated which is nice.

Am I talking nonsense here or does it make any sense at all?

This is a non-starter, unfortunately. One of the core Rust principles is that every breaking change has to be reflected on the function signature, to prevent semver hazards. If borrow splitting is computed via the function's body, it's a violation of this rule and opening the door to lots of problems and accidental breaking changes.

This is mentioned on the blog post, too:

This is part of the reason that Rust does its analysis on a per-function basis: checking each function independently gives room for other functions to change, so long as they don’t change their signature, without disturbing their callers.

5 Likes

Okay, I can see how the recursive nature of inference could be propagating a cascading breaking change through to a consumer, and that indeed would be a mess.

But what is we can limit this type of auto inference to only a user code and not anything else like externs/crates? This would allow users to have full control over only their functions so there is no risk of cascading BCs.

I think for the most part this feature is only useful in your own codebase only.

I just came across this blog post and I love the idea of views, but do they have to be disjoint? The use-case I’m thinking of is a split method on an async TcpStream that returns read and write halves of the original TcpStream. Internally the Read and Write halves still point to the same TcpStream struct, but can both be used mutably. Using syntax similar to what was shown in the blog post with named views, it might look something like fn TcpStream::split(&self) -> (&{Read} Self, &{Write} Self) { (self, self) }

This does put more responsibility on the programmer to ensure correct usage of self, which might be a bit of a contentious idea, but I think we would still want overlapping views even if just for Sync (and Send?) types.

Edit: After skimming through this thread again, I think this might be the same idea @josh had by "defining compatible sets of fields". Also Tokio's TcpStream is thread safe so having mutable views with shared fields that aren't Sync might not even be an issue (which I guess doesn't make sense either). Still a really neat idea and possibly another step towards fearless concurrency.

P.S. I know this is completely unrelated to view types, but I like the idea of Rust automatically figuring out disjoint views as a “magical” experience moment. Especially if it can be done trivially and the magic is fairly mundane, if you take a minute to figure it out. Rust novices get a feature that kinda just works, and for more advanced uses, computer witches can opt-in to explicit views :woman_mage:.

Could Drop be allowed to be implemented for a view type? Currently, implementing Drop for a type forbids all partial moves out of the type, because you don't know what the Drop impl will access. But if you implement Drop for a view type, Rust knows that dropping can only use the fields in the view, so you are free to move out of every other field.

1 Like

Another idea: could pin-projection work through this system? One view of a type would be projectable Pin<&mut Type> -> &mut UnpinFieldsView, while pinned fields would have a projection Pin<&mut Type> -> Pin<&mut PinFieldsView>.

A sketch of a design:

// In `core`

use core::marker::PhantomData;
use core::pin::Pin;

/// `DummyInvariant` is invariant over views. In other words,
/// `DummyInvariant<Type>` is not a subtype of `DummyInvariant<ViewOfType>`.
/// `&mut Type` *is* a subtype of `&mut ViewOfType`.
/// This is never constructed, it only exists for the purpose of its invariance.
#[lang = "dummy_invariant"]
#[fundamental]
pub struct DummyInvariant<T>(PhantomData<T>);

#[sealed]
pub trait DummyTrait {
    type Inner;
}

impl<T> DummyTrait for DummyInvariant<T> {
    type Inner = T;
}

pub trait UnpinProject: DummyTrait {}

impl<'a, T> Pin<&'a mut T> where
    T: ?Sized, 
    DummyInvariant<T> as DummyTrait: UnpinProject,
{
    fn get_mut(self) -> &'a mut T {
         unsafe { self.get_unchecked_mut() }
    }
}

// In user crate

use core:: ... ::{DummyInvariant, DummyTrait, UnpinProject};
use core::pin::pin;

struct PinnedStruct {
    pinned_field: i32,
    unpinned_field: i32,
}

impl UnpinProject for DummyInvariant<view of PinnedStruct { unpinned_field }> {}

fn main() {
   let pin_s: Pin<&mut PinnedStruct> = pin!(PinnedStruct { 0, 0 });
   
   // `Pin<&mut PinnedStruct>` is a subtype of `Pin<&mut (view of PinnedStruct { unpinned_field })>`

   // We get an unpinned mutable reference, but of a view with only `unpinned_field`.
   // So we can't get a `&mut pinned_field`
   let unpin_view: &mut (view of PinnedStruct { unpinned_field }) = pin_s.get_mut();
    *unpin_view.unpinned_field = 3;
}

Not sure how to extend this to project pinned fields as well...

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.