Can `Pin` own its referent instead of borrowing mutably?

Which matches up with the approach taken in your blog post which I was referring to as an ouroboros borrow (ouroborrow?) :wink:. An interesting difference is that the &move approach separates ownership from storage, so the owner (PinMove) can still be moved around and accessed, with only the storage being immovable/inaccessible. With the ouroboros approach, the owner (Slot; see Appendix: Ouroboros approach) has to become immovable/inaccessible because it also stores the pinned value.

Ownership might also be the solution to @cramertj’s invariant—i.e. Drop::drop is called before the storage is freed. With a PinBox<T> you know that T isn’t owned by a ManuallyDrop, so you can safely perform setup that must be undone before freeing by Drop::drop. The stack equivalent could be to perform that setup where the ownership is still known. If you can intercept the &'a mut Slot<'a, T> before it becomes a Pin<'a, T>, you know the owner isn’t ManuallyDrop and can safely perform setup. With PinMove, you would also know the owner, but because you retain access to the PinMove throughout the pinning, there’s some more flexibility.

Appendix: Ouroboros approach

I’m arbitrarily renaming the Pin in your blog post to Slot to remove the name conflict with the current pinning design. I at one point suggested Hasp instead, but I have a feeling that one won’t catch on :wink:.

struct Slot<'a, T> {
    value: T,
    _marker: PhantomData<&'a mut &'a ()>,
}

impl<'a, T> Slot<'a, T> {
    fn new(value: T) -> Self {
        Slot { value: value, _marker: PhantomData }
    }
    
    fn pin(&'a mut self) -> Pin<'a, T> {
        unsafe { Pin::new_unchecked(&mut self.value) }
    }
}
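
For anyone who wants to poke at the ouroboros borrow on a current compiler, here is a minimal sketch. It assumes today's std::pin::Pin<&'a mut T> in place of the Pin<'a, T> discussed in this thread, and ouroboros_demo is just an illustrative helper name.

```rust
use std::marker::PhantomData;
use std::pin::Pin;

// Same shape as the Slot above. Assumption: it returns today's
// Pin<&'a mut T> instead of the thread's Pin<'a, T>.
struct Slot<'a, T> {
    value: T,
    _marker: PhantomData<&'a mut &'a ()>,
}

impl<'a, T> Slot<'a, T> {
    fn new(value: T) -> Self {
        Slot { value, _marker: PhantomData }
    }

    // Borrowing `self` for the full `'a` keeps the slot mutably borrowed
    // for as long as it exists: the borrow "eats its own tail".
    fn pin(&'a mut self) -> Pin<&'a mut T> {
        // Relies on the slot being inaccessible for the rest of its life.
        unsafe { Pin::new_unchecked(&mut self.value) }
    }
}

// Illustrative helper: pins a value on the stack and mutates it in place.
fn ouroboros_demo() -> u32 {
    let mut slot = Slot::new(5u32);
    let mut pinned = slot.pin();
    // u32 is Unpin, so safe mutable access through the pin is allowed.
    *pinned.as_mut().get_mut() += 1;
    // Uncommenting the next line fails to compile: `slot` is still borrowed.
    // let moved = slot;
    *pinned
}
```

Note that the owner and the storage are the same object here, which is exactly why the whole Slot becomes inaccessible once pinned.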

Inspired by this thread and the recent DynSized RFC, I had an idea for how to replace Pin with a forwards-compatible, library-only subset of !Move. Which is actually just !DynSized.

In ‘short’:

  • Pin<'a, T> is replaced by &'a mut T. So Future, Generator, etc. can just take &mut self again! Here I’ll talk about Generator for convenience, but the same applies to Future.
  • To enforce immovability, users never get access to owned generators directly; thus, instead of generator functions returning impl Generator, they would return something like impl Anchor<Inner=impl Generator>, for some trait
    trait Anchor {
        type Inner: ?Sized;
        // PinBox and friends would use this method to implement DerefMut;
        // it's unsafe because you should only call it if you can guarantee
        // that `self` will never move again.
        unsafe fn get_mut(&mut self) -> &mut Self::Inner;
    }
    
  • Okay, the hard part. Why doesn’t the above ‘just work’?
    • Because of mem::swap and friends. Given &mut MyGenerator you can move it or (in some cases) extract an owned MyGenerator, which breaks the assumption of immovability.
  • So what if we mark MyGenerator as !Sized? (Notwithstanding the semantic strangeness of making something !Sized that doesn’t actually have an unknown size.) That would block mem::swap and similar.
    • But there’s one dangerous function that doesn’t have a Sized bound: mem::size_of_val. In particular, the Box-to-Rc conversion works even for unsized types and assumes it’s kosher to move them by memcpying mem::size_of_val(foo) bytes.
    • Although you couldn’t actually use that conversion to break the above scheme, since you’d never be allowed to get a Box<MyGenerator>, its presence in the standard library is a license for unsafe code to use the same technique in broader scenarios.
  • But this is the same problem meant to be solved by !DynSized!
    • The above-linked RFC proposes having size_of_val either panic or return 0 for a !DynSized type, combined with a lint; an alternative mentioned is to make DynSized a new default bound and add ?DynSized.
    • Meant to be used with extern type, i.e. FFI opaque pointers, which itself is already accepted and implemented (but not stable).
    • But it would work just as well for immovable types. The desired semantics are effectively the same at least from a generic code perspective.
  • But even ignoring how crazy that sounds, how is it a library-only solution if DynSized doesn’t even exist yet?

Well, for now we can simulate it with a silly hack. Generator functions would expand to something like:

// The anchor contains the actual state...
struct MyGeneratorAnchor {
    // local variables go here...
    x: i32,
}

// The generator type is just an extern type!
// In other words, &MyGenerator is an 'opaque pointer' that secretly points
// to MyGeneratorAnchor.
extern { type MyGenerator; }

impl Anchor for MyGeneratorAnchor {
    type Inner = MyGenerator;
    unsafe fn get_mut(&mut self) -> &mut Self::Inner {
        // create the opaque pointer by casting to &mut MyGenerator:
        &mut *(self as *mut MyGeneratorAnchor as *mut MyGenerator)
    }
}

impl Generator for MyGenerator {
    fn resume(&mut self) -> whatever {
        // undo the above cast:
        let anchor: &mut MyGeneratorAnchor = unsafe { &mut *(self as *mut MyGenerator as *mut MyGeneratorAnchor) };
        // … now we can use local variables …
    }
}

So, how does this solve the problem?

  • MyGenerator is !Sized (extern type already has that effect), so anything that tries to use the standard mem::swap, mem::size_of, etc. just won’t compile.
  • mem::size_of_val(foo: &MyGenerator) will work but return 0 (because it’s an extern type). So if some tricky unsafe code tries to, say, swap two &mut MyGenerator instances by memcpying bytes around, that’ll just silently do nothing, leaving the generators’ actual data intact. That’s not great, in the sense that it’s a surprising result, but it’s not unsound either. And again, this is the exact same problem that applies to ‘real’ FFI opaque pointers.
    • …There might be some even weirder things unsafe code might try to do that would be unsound – though it’s hard to say it’d be justified in doing so – but that would be equally unsound with extern type used for FFI! (For example, a version of take_mut that takes ?Sized and returns the existing object copied to a box.)
  • In the hopefully near future, the DynSized RFC or something similar will land, providing for a lint or error when someone tries to use size_of_val here.
  • In the meantime, since essentially no generic code actually does that, you’ll be able to use the generic ecosystem as usual with &MyGenerator and &mut MyGenerator – with no need for everything to add an extra variant for Pin. ?Sized is enough.
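
To make the size_of_val concern concrete, here is a small sketch using only ordinary unsized types (slices and str), since extern type is not available on stable. The helper names byte_len and box_to_rc are mine, purely for illustration.

```rust
use std::mem;
use std::rc::Rc;

// No Sized bound is needed: size_of_val reads the size through the
// reference (for slices and str, from the length in the fat pointer).
fn byte_len<T: ?Sized>(value: &T) -> usize {
    mem::size_of_val(value)
}

// The Box-to-Rc conversion mentioned above, via the standard From impl.
// It works for unsized T and moves the contents into a new allocation.
fn box_to_rc(boxed: Box<[i32]>) -> Rc<[i32]> {
    Rc::from(boxed)
}
```

Generic code that takes only T: ?Sized can call byte_len today, which is why a type that must not be copied by size needs something stronger than !Sized.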

Thoughts?


It doesn’t seem worth it to me. The design of DynSized is influenced by a couple things: backwards compatibility and being relevant to a low-level feature. This creates a mismatch in design goals compared to an entirely new higher-level usability feature (Pin supports generators and other self-referential types). People will typically expect to run into something a bit hackish when doing FFI, but not for a headline language feature.

Well, for one thing, backwards compatibility is important to the design of any new Rust feature. I don’t see how that differs much between this and extern type. After all, just as Pin<'a, T> is a wrapper for &'a mut T for weird T, FFI users could make MyFFIReference<'a> wrapper structs; but extern type provides extra sugar to allow native references to be used safely, a goal that I think applies here as well.

And keep in mind that this would be a replacement for Pin, which is itself hackish – I’d argue more so. Since Pin<T> is not a native mutable reference:

  • You can’t reborrow it for a shorter lifetime;
  • The syntax is clunkier, and it’s non-obvious from the name that Pin<'a, T> acts like &'a mut T;
  • It doesn’t work with any existing generic code that expects &mut T.

Also:

  • You can’t borrow a field as a Pin, i.e. given struct Struct { field: MyGenerator }, you can’t go from Pin<Struct> to Pin<Field>

…except my proposal wouldn’t allow that either (with both versions it’s possible to achieve the same effect, but only with further hacks). However, my proposal is forwards-compatible with some form of native support for immovable structs – a feature I’d love to see added in the future – which could make this ‘just work’. That is, you could write something like

struct TwoGenerators {
    some: SomeGenerator,
    another: AnotherGenerator,
}
fn get(two: &mut TwoGenerators) -> &mut SomeGenerator {
    &mut two.some
}

By contrast, if such a feature were designed based on Pin, the language could hypothetically add magic to make this work:

fn get(two: Pin<TwoGenerators>) -> Pin<SomeGenerator> {
    &mut two.some
}

but that would be a lot more awkward. (maybe &pin?)

Anyway, the main source of hackiness in my proposal – well, that doesn’t also apply to Pin – is the idea that size_of_val(&MyGenerator) would compile but return 0 (or panic in the future). This matches the latest DynSized RFC’s design (link again for reference) – but even that RFC suggests a “Milestone 3” where size_of_val would eventually gain a proper DynSized bound, resolving the issue.

It does question whether the epoch system will permit such a change. As an alternative, there’s nothing about that RFC that would be incompatible with eventually moving to a ?DynSized-based approach (where DynSized is added everywhere as an implicit bound). Or perhaps having it be implicit but only for old-epoch code.

Either way, I think that the vast majority of generic parameters currently marked ?Sized can work with non-DynSized types. If a DynSized bound were made implicit, almost everything would want to migrate from ?Sized to ?DynSized; if it’s kept explicit, very little code would need to change from ?Sized to ?Sized + DynSized.

But again, extern type already exists (although it’s not stable), and has the same requirements for correctness as what I’m proposing. That is, to properly support it, there’s no way around either adding some form of DynSized, or adding an unsafe code guideline strictly limiting the use of size_of_val – either of which would also suffice for generators. It’s true that having !DynSized be just for FFI would let us sort of “shove it under the rug”, whereas having it also be for generators would force it to be more prominent. But I’m not convinced by that argument.

  • For one thing, safe FFI wrappers ought to be, well, safe, and at least potentially (depending on the design of the foreign library) no less hacky than any other Rust API. If DynSized-related issues are unacceptable for normal Rust code (whether compile-time breakage or runtime panics), they’re also unacceptable for FFI wrappers, and we should just get rid of extern type before it’s stabilized.

  • For another, I personally think that full-fledged support for immovable types will be an important part of Rust’s future – or ought to be. So the only question is whether to stick with Pin forever, or eventually have some form of !Move. But most code that could work with T: !Move can work just as well with T: !DynSized, and that’s largely also the same as the code that currently takes T: !Sized – it’s mostly a question of whether some code only handles references to T (in which case the most general bound, !DynSized, should work), or whether it handles it by value (in which case you currently need Sized anyway). There are only a few exceptions, like the Box-to-Rc thing.

  • Thus, it makes perfect sense for the ultimate design of Move to only be a slight addition on top of DynSized, a subtrait that most code won’t want to explicitly bound on. If Move were somehow orthogonal to DynSized, that would just feel like two redundant language features. So it’s logical, not merely a hack, to base an initial version of immovable types (i.e. generators/async fns) on top of !DynSized. In the future, they could switch to impl’ing DynSized but not Move, to properly express that they actually do have a size, and that would be a backward-compatible change just like adding an impl for any other trait. Or we could just stick with DynSized = Move forever, since as I said, there’s very little justification for generic code to query the size of some generic T if it’s not allowed to memcpy based on that size.


What I meant for backwards compatibility was that extern types are already in the wild, whereas immovable types aren’t. Subordinating immovable types to the design considerations of extern types forces compromises into the design of immovable types which don’t need to exist.

Viewing Pin<T> as a wrapper for &mut T for weird T begs the question somewhat, since you’re arguing for it to be equivalent to &mut T where T: !DynSized. Also, to the extent that this describes the implementation, you’re violating the abstraction barrier. Pin<'a, T> can be viewed in its own right as a unique reference type with lifetime 'a which, for T: !Unpin, can only exist where &mut / move access has been permanently disabled for the referent.

You can’t reborrow it for a shorter lifetime;

https://doc.rust-lang.org/nightly/std/mem/struct.Pin.html#method.borrow

For one thing, safe FFI wrappers ought to be, well, safe, and at least potentially (depending on the design of the foreign library) no less hacky than any other Rust API. If DynSized-related issues are unacceptable for normal Rust code (whether compile-time breakage or runtime panics), they’re also unacceptable for FFI wrappers, and we should just get rid of extern type before it’s stabilized.

In principle, yes, but extern types are a bit of a necessary evil unless/until the soundness issues can be fully addressed. The opportunity exists to have immovable types enter the world entirely absent these issues.

Box-to-Rc comes to mind. I’m pretty sure you’d end up with an uninitialized T at the end of it. I’m not convinced by the argument that immovable types will always be sufficiently abstract that you can’t put them in a Box. Maybe you can come up with a scheme that works for generators, but that’s far short of a generalized feature, and doesn’t address what external library code might (validly?) do.

If DynSized is accepted, and if it moves sufficiently close to closing the soundness holes (for some meaning of “sufficiently close”), then yes, I can see this approach being valid. I don’t think an unmitigated zero-size hack should make it into stable/production code.

If the idea is for this proposal to be fully backward-compatible with the existing Pin design, then this doesn’t work. Being !DynSized is inseparable from the type, whereas the T in a Pin<T> can have had a fulfilling life as a movable value before getting pinned. Also, unless you want a transition period in which !Unpin types are warned to implement !DynSized with ultimate breakage, Pin<T> needs to be able to maintain its invariants without any changes needed to T. For these reasons, Pin<T> would probably have to be equivalent to something like &mut Pinned<T> where Pinned<T>: !DynSized. (Edit: And this doesn’t address any potential method conflicts, Deref impl conflicts, …).

The current design of Pin has that for the &T case. But yeah, having it just work for &mut T where T: ?Sized is nice, though I’m not sure how common that is compared to &T where T: ?Sized. I’m not entirely convinced about the need for distinguishing whether the referent is movable vs immovable for &T.


Fair enough. But – first of all, I don’t think this question really affects any of the points I’ve made. I wrote:

After all, just as Pin<'a, T> is a wrapper for &'a mut T for weird T, FFI users could make MyFFIReference<'a> wrapper structs; but extern type provides extra sugar to allow native references to be used safely, a goal that I think applies here as well.

I guess that if Pin is seen as a fundamental reference type, the guidance could be that instead of making your own MyFFIReference<'a> wrapper, you should design your APIs around Pin<'a, MyFFIType>. But then what do you use for non-unique references? &MyFFIType doesn’t work because of the same issue with size_of_val. You still need DynSized. On the other hand, if you purely used, e.g., MyFFIReference<'a> and MyFFIReferenceMut<'a>, then you wouldn’t need either DynSized or extern type, but the API would be less ergonomic, which is the same issue that applies to Pin.

In any case, my problem with viewing Pin as a fundamental reference type is that at least for general self-referential structs, there are also use cases for reference types that correspond to & and, potentially, &move (the original proposal in this thread), while guaranteeing immovability. So we’d end up with 6 reference types instead of 3 (or without &move, 4 instead of 2).

For example, this is unsound using regular &:

struct S {
    foo: [i32; 32],
    // points to one of foo's elements:
    bar: Cell<Option<&'foo i32>>, 
}
impl S {
    fn set_bar(&self) {
        self.bar.set(Some(&self.foo[0]));
    }
}

But it would be sound with an immovable version of &.
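
Here is a rough, safe-code illustration of why the move is the problem. It stores the address as a plain integer instead of a real &'foo i32 (which can't be written today), so nothing actually dangles; the struct and function names are made up for the demo.

```rust
use std::cell::Cell;
use std::mem;

struct SelfAddr {
    foo: [i32; 32],
    // Address of foo[0], kept as an integer so the demo stays in safe code.
    bar: Cell<Option<usize>>,
}

impl SelfAddr {
    fn new() -> Self {
        SelfAddr { foo: [0; 32], bar: Cell::new(None) }
    }

    // Mirrors set_bar above: records where foo[0] currently lives.
    fn set_bar(&self) {
        self.bar.set(Some(&self.foo[0] as *const i32 as usize));
    }

    // True while the recorded address still points into this value.
    fn bar_points_into_self(&self) -> bool {
        self.bar.get() == Some(&self.foo[0] as *const i32 as usize)
    }
}

// Returns (consistent before the swap, consistent after the swap).
fn swap_breaks_self_reference() -> (bool, bool) {
    let mut a = SelfAddr::new();
    let mut b = SelfAddr::new();
    a.set_bar();
    b.set_bar();
    let before = a.bar_points_into_self() && b.bar_points_into_self();
    // Moving both values: each now holds the other's recorded address.
    mem::swap(&mut a, &mut b);
    let after = a.bar_points_into_self() || b.bar_points_into_self();
    (before, after)
}
```

With a real reference in bar, the state after the swap would be a reference into the wrong object, which is what an immovable & would rule out.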

(&Pin<T> could work as a substitute, but has limitations due to being a double pointer, e.g. you can’t go from &Pin<Struct> to &Pin<SomeField>.)

[edit: See also @RalfJung’s recent post Safe Intrusive Collections with Pinning, which explores a somewhat different direction than what I’m proposing, but also concludes that an immutable pin is useful. I think the invariant he wants can also be achieved with my Pin-less approach.]

Anyway…

You can’t reborrow it for a shorter lifetime;

https://doc.rust-lang.org/nightly/std/mem/struct.Pin.html#method.borrow

I missed that, but let’s correct the point to: “you can’t implicitly reborrow it for a shorter lifetime”.

In principle, yes, but extern types are a bit of a necessary evil unless/until the soundness issues can be fully addressed. The opportunity exists to have immovable types enter the world entirely absent these issues.

I think that’s short-term thinking. To the extent that soundness issues exist, they’re an urgent problem and need to be solved before extern type can be stabilized. (Or else there’s not much benefit to having it when unit structs already satisfy the unsound-but-mostly-works use case.) However, I don’t think they’re a big deal; in particular, I’m not sure whether any code actually exists that uses size_of_val to copy bytes from a non-owned type. And the DynSized RFC proposes having size_of_val panic (instead of returning 0), which will eliminate any such unsoundness, at the cost of surprising behavior.

Box-to-Rc comes to mind. I’m pretty sure you’d end up with an uninitialized T at the end of it. I’m not convinced by the argument that immovable types will always be sufficiently abstract that you can’t put them in a Box. Maybe you can come up with a scheme that works for generators, but that’s far short of a generalized feature, and doesn’t address what external library code might (validly?) do.

Regarding Box-to-Rc specifically: in the short term, exposing a Box<Immovable> to safe code would be unsound because you could move out of it.

In the medium term, having size_of_val panic fully removes the potential for unsoundness, again at the cost of surprising behavior.

In the long term, the goal is to have the standard library include appropriate DynSized bounds so the compiler can reject such code at compile time. The DynSized RFC proposes one way of doing that while minimizing disruption.

Actually, I’d only skimmed that RFC, and it turns out I misunderstood what it was proposing :sweat:. Now that I understand, I like it much better! And I think it’s clearly better than !DynSized. I thought it was proposing a lint at monomorphization time if size_of_val etc. is instantiated with !DynSized types. But it’s actually proposing that for each caller, the compiler try to prove, in a generic context, that T: DynSized holds, and lint if it can’t. So to avoid the lint, callers must change their bounds to add either T: DynSized or #[assume_dyn_sized], either of which propagates the requirement outward. Thus, as long as your code compiles without producing the lint, you have a strong guarantee it’s fully DynSized-safe, even in the face of clients instantiating your generic code with arbitrary parameters.

You probably already knew that, but I didn’t, and after thinking about it, I’m now much more confident that DynSized can be adopted quickly and painlessly. I also commented on the RFC with a suggestion: instead of adding a special-case #[assume_dyn_sized], we can get the same effect using specialization, if the compiler is changed to support #[deprecated] on impls.

Moving on:

If the idea is for this proposal to be fully backward-compatible with the existing Pin design, then this doesn’t work.

No, it’s not meant to be backwards-compatible with Pin. Even if it’s a bit late, there’s still some time to kill Pin before it’s stabilized. :stuck_out_tongue:

The current design of Pin has that for the &T case. But yeah, having it just work for &mut T where T: ?Sized is nice, though I’m not sure how common that is compared to &T where T: ?Sized. I’m not entirely convinced about the need for distinguishing whether the referent is movable vs immovable for &T.

Above I showed a use case for immovable &T – basically any use of interior mutability with self-references. The current design… oh, I missed the unconditional Deref impl on Pin even for T: !Unpin. So you get to work with native &T, at the cost that immovable types based on Pin can never have interior mutability. Not great, IMO…

To be honest, there’s a lot about your proposal that confuses me @comex. Here are some points:

  • What does calling a generator return? It seems like it returns a pointer to the generator, but then where is the generator allocated? On the heap? That’s exactly what we don’t want - to heap allocate generators every time. (If we were okay with that, this would all be fairly trivial actually: we’d just implement Generator for PinBox<{anon type}> and call it a day).
  • Your commentary about the relationship between DynSized and Unpin doesn’t seem correct to me. We know the size of a generator - statically in fact. This is very important: DynSized is required to be able to deallocate something, because the dealloc API takes a Layout argument. If we were to say that the anonymous generator were !DynSized, we’d never be able to deallocate them.

I am similarly confused. To add to @withoutboats list:

Pin is forwards compatible with introducing &pin into the language some day. That would provide both implicit reborrowing and implicit field access (which currently both exist, but only explicitly and, for field access, unsafely: Pin::map).

I also have to say it sounds extremely hackish to me to tie pinning and sizedness together. There can be sized types that require pinning (like most generators), and there can be unsized types that do not (like all the unsized types we have currently). You are right that a !DynSized type is implicitly pinned, but requiring all pinned types to be !DynSized seems overkill. For example, I might want to take two futures and put them into a pair, but only the last element of a struct can be !Sized – so you’d restrict composition of futures, for no good reason.
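
A quick sketch of the struct restriction I mean, using a slice as the unsized field (WithTail and tail_info are illustrative names, nothing more):

```rust
use std::mem;

struct WithTail<T: ?Sized> {
    head: u32,
    // Only the last field of a struct may be unsized.
    tail: T,
}

// Returns (head, tail length, size in bytes) seen through the unsized view.
fn tail_info() -> (u32, usize, usize) {
    let sized = WithTail { head: 7, tail: [1u8, 2, 3] };
    // Unsizing coercion: &WithTail<[u8; 3]> to &WithTail<[u8]>.
    let unsized_ref: &WithTail<[u8]> = &sized;
    (unsized_ref.head, unsized_ref.tail.len(), mem::size_of_val(unsized_ref))
}

// A pair with the unsized field first is rejected by the compiler:
// struct Bad<T: ?Sized> { first: T, second: u32 }
```

So if every generator were !Sized, a struct could hold at most one of them, and only in last position.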

Also, forgive me if you have commented on that already (in that case I missed it), but did you comment on whether this is a “local extension”? The fact that Pin is automatically correct for any type that does not care about pinning is a big win. In contrast, your comment about Rc and Box seems to indicate that your proposal is incompatible with some existing code.

EDIT: Oh, also I think your proposal is incompatible with code I consider legal. I would think one can have a strange unsized variant of swap with a type like

fn swap_unsized<T: ?Sized>(x: &mut &mut T, y: &mut &mut T)

that does the following:

  • Compare the (dynamic) sizes of *x and *y to make sure they are equal; panic if they are not.
  • swap the fat pointer information between *x and *y
  • mem::swap the memory at **x and **y (it’s the same size, after all)

Is this crazy? Yes. But I think it should be legal. Not because it is useful, but because it demonstrates that unsized types really are just that, unsized, and do not have some other move-related magic to them that nobody expects there.

Unsizing and pinning really are two orthogonal features. We should keep them separate.


I’d also like to make a higher-level comment:

You seem to be worried that we have a combinatorial explosion of references. I fully sympathize with that! This combinatorial explosion of references is paralleled by a combinatorial explosion of “typestates” and invariants, and that worries me very much. It’s why I was opposed to shared pinned references. So I very much agree there is a problem here with Pin that I’d like to see solved some day. I just think that sizedness is not the solution.

On the formal model side, I actually have some thoughts for how to reduce the number of invariants. At some level it’s just a mathematical trick, but I think it could actually be nice. Instead of having four of them (T.own, T.shr, T.pin and T.shrpin), we could have just two of them, with types like

enum PinState { Unpinned, Pinned }

// This trait defines which invariants make up a Rust type.
// It's an unsafe trait because the invariants actually have to satisfy some axioms, as discussed in my blog posts.
unsafe trait Type {
  fn own(self, pin: PinState, ptr: Pointer);
  fn shr(self, pin: PinState, ptr: Pointer, lft: Lifetime);
}

(I am using you as a guinea pig for an even more Rust-y syntax for all this math. Let’s see how that goes :wink: )

So, it wouldn’t actually be four invariants, it’d just be two parameterized invariants. That will probably be much more convenient to work with.

Now, I said this is just a mathematical trick because the amount of information is still exactly the same – instead of two functions, I have one function taking a two-element type. That’s the same thing. But it may be informative, and it may even feed back into language design. This approach invites us to think of pin not as a reference type, but as a reference modifier. So, there wouldn’t be &pin, there would be &[list of modifiers]. We could have &pin, &mut pin and &move pin (assuming &move ever becomes a thing). Maybe even mut and move could become a modifier, so a reference is determined by

  • mut, move or shared (with no keyword)
  • pin or not

There could be some kind of modifier polymorphism as well, something we want anyway for shared and mutable references, to avoid writing every function on slices twice.

Given that the invariants of the owned and the pinned typestates are meaningfully different, I don’t think we want to conflate concepts here. But it would indeed be nice to be able to talk about references in a more general way, without stating the exact typestate the referent is in.


Sorry, I guess I wasn’t clear enough.

Calling a generator function returns, by value, a type MyAnchor satisfying MyAnchor: Anchor, MyAnchor::Inner: Generator. For clarity I’ll refer to MyAnchor::Inner as MyGenerator.

MyAnchor is not a pointer. In fact, it contains the generator function’s state, and is sort of the “real” type of the generator, whereas &MyGenerator is a “fake” type that actually points to MyAnchor. (MyGenerator never exists as a value itself.)

The reason it makes sense to have a separate type is to represent the fact that the generator can be moved before being pinned. So &mut MyAnchor is like &mut MyGenerator under the existing design, whereas &mut MyAnchor::Inner is like Pin<MyGenerator>.

MyAnchor, the real type, is Sized. MyGenerator, the marker type, is !DynSized which means you can’t try to deallocate it – but that’s okay, because it should only ever exist as a reference.

True, but for now it is less ergonomic, and in the long term it still means twice as many reference types. Also, unless the unconditional Deref impl on Pin is removed, a backwards compatible builtin version won’t work properly with self-referencing types containing interior mutability.

It is a local extension. Hypothetically, it may be incompatible with existing unsafe code that uses size_of_val the way the Box-to-Rc conversion does (that is, as the number of bytes to memcpy), but applies it to a &mut T rather than any kind of owned T, and for something other than swapping (which is why your example below is OK). As I said, I suspect that no such code exists, and in any case the DynSized RFC would fix this properly. Also, the same issue applies to all uses of extern type.

If I understand correctly, this would not be unsound under my proposal. Under the initial implementation where size_of_val would return 0, it would succeed but do nothing. If the DynSized RFC is implemented, it would trigger the size_of_val panic, but swap_unsized would also get linted for using size_of_val without a T: DynSized bound.

Now I’m even more confused. What prevents you from moving around MyAnchor between calls to resume?

The fact that Anchor::get_mut is unsafe, and should only be called by types such as PinBox that ensure the anchor will never move again.
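
Roughly, the division of labour could look like the sketch below. Everything except the Anchor trait itself is hypothetical: Resume stands in for Generator, AnchorBox stands in for PinBox, and the inner type is an ordinary Sized struct (in the actual proposal it would be an extern type, so safe code couldn't swap it).

```rust
trait Anchor {
    type Inner: ?Sized;
    /// Safety: the caller must guarantee `self` never moves again.
    unsafe fn get_mut(&mut self) -> &mut Self::Inner;
}

// Hypothetical stand-in for Generator.
trait Resume {
    fn resume(&mut self) -> Option<u32>;
}

struct CountdownState {
    remaining: u32,
}

impl Resume for CountdownState {
    // Yields remaining-1 down to 0, then None.
    fn resume(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            return None;
        }
        self.remaining -= 1;
        Some(self.remaining)
    }
}

// The anchor owns the state and may be moved freely until it is pinned.
struct CountdownAnchor {
    state: CountdownState,
}

impl Anchor for CountdownAnchor {
    type Inner = CountdownState;
    unsafe fn get_mut(&mut self) -> &mut CountdownState {
        &mut self.state
    }
}

// Hypothetical PinBox-like owner: the anchor lives on the heap, the field
// is private, and nothing ever moves the anchor out again.
struct AnchorBox<A: Anchor> {
    anchor: Box<A>,
}

impl<A: Anchor> AnchorBox<A> {
    fn new(anchor: A) -> Self {
        AnchorBox { anchor: Box::new(anchor) }
    }

    fn inner_mut(&mut self) -> &mut A::Inner {
        // Upholds get_mut's contract: the boxed anchor stays put.
        unsafe { A::get_mut(&mut *self.anchor) }
    }
}

fn run_countdown() -> Vec<u32> {
    let anchor = CountdownAnchor { state: CountdownState { remaining: 3 } };
    // Moving `anchor` around here is fine; it is not pinned yet.
    let mut boxed = AnchorBox::new(anchor);
    let mut out = Vec::new();
    while let Some(v) = boxed.inner_mut().resume() {
        out.push(v);
    }
    out
}
```

Safe code only ever sees the anchor by value before pinning, or the inner type by reference after pinning; the unsafe call is confined to the owner that provides the guarantee.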

Why that? The built-in version just provides nicer syntax and automatically performs some operations that are already sound to do now (though some require unsafe code). I see no conflict between &pin and the RefCell. The concern about Pin::deref is entirely orthogonal to having nice syntax and automatic reborrowing.

What would that code look like?

Oh, you’re assuming pinned things would be !DynSized. I see. Makes sense. (Also this is an example where returning 0 would have catastrophic effects as the vtables got swapped. Yet another reason why size_of_val should panic or error at compile-time, but not return bogus fake data. But that’s a separate discussion.)

My point is that Pin::deref is the only reason that the existing Pin design largely provides compatibility with existing generic code taking &T. If that’s removed, you’d have to use &Pin<T>, which is not as nice because you need something keeping the Pin<T> alive – in particular, as I’ve said, you can’t go from &Pin<SomeStruct> to &Pin<SomeField>. But if it’s not removed, then it won’t be a good long-term design (thus, not a good basis for &pin) because of the interior mutability issue. On the other hand, my proposal completely avoids this issue because &T is a pinned reference.

There could be some kind of modifier polymorphism as well, something we want anyway for shared and mutable references, to avoid writing every function on slices twice.

Such a feature definitely sounds useful for the subset of cases where & and &mut both work. But it will necessarily have its share of mental overhead. It’s better if we can avoid foisting it on everything that takes &T and doesn’t try to move it (but also doesn’t care if it gets moved later).

More generally, I claim that almost all generic code falls into one of two categories, for a given generic parameter T:

  • Handles T by value: currently requires Sized; with the unsized rvalues RFC, could extend to non-Sized but DynSized types. Doesn’t make sense for FFI !DynSized types because the size is unknown. Also doesn’t make sense for immovable types because handling by value = moving.

  • Only handles references to T (immutable, mutable, or even a hypothetical &move): currently works with !Sized types; should work fine with !DynSized types as well as immovable types.

Since whether a given piece of code works with immovable types and with !DynSized types is very highly correlated, it makes sense to “conflate” them somewhat.

Now, they don’t need to be completely conflated. My thinking is that we’ll eventually want a separate Move trait, with this hierarchy:

Sized : Move : DynSized

Why that particular hierarchy? Well:

  • Sized must inherit from Move, because existing generic code with only T: Sized as a bound expects to be able to move Ts around. […well, an alternative would be to have all parameters have a Move bound by default, but that would make immovable types nearly useless.]

  • Move may as well inherit from DynSized, because it makes no sense to have a type that’s !DynSized (meaning, at least naively, that Rust has no idea what its layout is), yet can be moved.

There’s no real reason why we couldn’t add Move up front (other than making for a slightly more complex proposal), but it should be a fully backwards compatible change to add it later, and change generators to impl DynSized but not Move. After all, adding trait impls is generally backwards compatible.
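As a rough illustration of the hierarchy, here it is modelled with ordinary user-defined marker traits. Everything in this sketch is a stand-in: `SizedLike`, `MoveLike` and `DynSizedLike` are hypothetical names, not compiler-known traits, and `Immovable` is of course still movable in real Rust. The point is only how the supertrait bounds partition generic code.

```rust
/// Stand-ins for the proposed `Sized : Move : DynSized` chain.
pub trait DynSizedLike {}
pub trait MoveLike: DynSizedLike {}
pub trait SizedLike: MoveLike {}

// An ordinary type sits at the bottom of the chain and gets all three.
impl DynSizedLike for u32 {}
impl MoveLike for u32 {}
impl SizedLike for u32 {}

/// Plays the role of a generator: its size is known, but it opts out of
/// the `MoveLike` (and therefore `SizedLike`) impls.
pub struct Immovable(pub u32);
impl DynSizedLike for Immovable {}

/// By-value code needs at least `MoveLike`.
pub fn relocate<T: MoveLike>(value: T) -> T {
    value
}

/// Reference-only code gets by with the weakest bound.
pub fn inspect<T: DynSizedLike>(value: &T) -> usize {
    std::mem::size_of_val(value)
}
```

With these bounds, `relocate(Immovable(1))` is rejected by the type checker while `inspect(&Immovable(1))` is accepted.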

You might argue that Sized: Move still makes no sense, and if my proposal requires it, then that’s just another reason to reject my proposal. And… you’d have a point. :stuck_out_tongue: But I still think it’s better than adding a whole new family of references for not much benefit.

Er… right, that’d be true if you converted the generators to trait objects. But otherwise there’d be no vtable.

The interior mutability issue is not helped by making this built-in syntax. I don’t see the connection.

This is a completely separate question, and your concern is resolved by having a PinShr with the ability to go from PinShr<SomeStruct> to PinShr<SomeField>. Having that is orthogonal to having Pin::deref, and it is orthogonal to having built-in syntax.

However, this is indeed a good counter-argument for "just use &Pin<T>", which I have been saying. I am not saying that any more. :slight_smile: Based on what I realized when writing my last blog post, I am saying, either we abandon the entire pinned-shared thing (but nobody wants that), or we have a dedicated type of shared pinned references. I guess in built-in form, we’d eventually have &pin (shared) and &mut pin.

Indeed I am arguing exactly that. :wink: This prevents you from composing futures by putting two of them in a pair. Why is that not a big deal?

Not orthogonal :slight_smile:

PinShr<T> would work, and &pin T would work better, but neither would be compatible with existing code or traits that expect &T. &Pin<T> would be compatible but it has the limits mentioned. &T would be compatible but it requires the problematic Deref impl.

This is not an obscure use case! For example

  1. Let’s say we want a self-referencing type to impl Debug. It could be any self-referencing type: a future combinator, a builtin async function (which could at least print the file:line of the function, maybe even the stored variables), or something totally unrelated to async/generators, the kinds of general self-referencing structs that are currently the domain of rental.

    Well, Debug takes &self. Assuming we don’t have a Deref impl for PinShr<T: !Unpin>, the only option is to impl Debug for PinShr<T> and take &PinShr<T>. In this case, that’s not the end of the world since the caller can create the reference, but it’s a double pointer, which is inefficient.

  2. What if we want to impl one of Index, Borrow, AsRef, or Deref, among other similar traits? Not so useful for futures/generators, but perfectly reasonable for general self-referencing structs. All of these traits have signatures like

    fn deref(&self) -> &Self::Target;
    

    In this case, impling for PinShr<'a, T> doesn’t work well, because the result reference will only last as long as the caller’s temporary borrow of the PinShr, not for all of 'a.

  3. Of course, there’s another problem: what if Self::Target is itself immovable, so we want PinShr<'a, T> -> PinShr<'a, Target>? Well, each of those traits has a mutable counterpart, and in theory we could create new immutable-pin and mutable-pin variants of each – either as separate traits, or someday with some kind of abstraction mechanism for reference modifiers, as you’ve alluded to. But:

    • Creating separate traits would add quite a lot of boilerplate to both the standard library and custom container types, so I don’t see it happening.
    • A modifier abstraction mechanism would definitely be nice to have, if only to unify the immutable and mutable variants of the traits (and maybe someday move variants). But it sounds pretty far off, considering that for now we don’t even have a sketch of a design. And even with an abstraction, doubling the number of cases (to 4 or 6) still adds complexity and mental overhead.
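The lifetime problem from item 2 can be reproduced on stable Rust with a plain wrapper around `&'a T`. This is a sketch: `Shr` is a hypothetical stand-in for `PinShr`, and the pinning guarantee itself is not modelled.

```rust
use std::ops::Deref;

/// Hypothetical stand-in for `PinShr<'a, T>`: a handle wrapping `&'a T`.
pub struct Shr<'a, T: ?Sized>(&'a T);

impl<'a, T: ?Sized> Shr<'a, T> {
    pub fn new(r: &'a T) -> Self {
        Shr(r)
    }

    /// Inherent method: consumes the handle, so the result lives for all of `'a`.
    pub fn get(self) -> &'a T {
        self.0
    }
}

impl<'a, T: ?Sized> Deref for Shr<'a, T> {
    type Target = T;

    /// The trait fixes the signature: the result is tied to the borrow of the
    /// handle (`&self`), not to `'a`, even though the data lives for `'a`.
    fn deref(&self) -> &T {
        self.0
    }
}

pub fn long_lived<'a>(handle: Shr<'a, str>) -> &'a str {
    // Returning `&*handle` here is rejected: it would borrow the local
    // `handle`, which is dropped when this function returns.
    handle.get()
}
```

Any trait shaped like `fn deref(&self) -> &Self::Target` hits the same wall when implemented on the handle type rather than on the pointee.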

It doesn’t, not exactly, as you can still compose anchors. This does make the implementation of future combinators a bit more complicated, and it’s the part of the design I’m least confident about. My dream is that in the long run, construction in place will ‘just work’ with both function calls and returns (I detailed a possible design in another thread), so there will be no need for anchors; you just deal with !Move types directly.

But that’s just a dream, so what would it look like for now? Well, combinators would use the same two-type approach as generator/async functions. Here’s an example based on Join from the futures library, a combinator that combines two futures:

trait FutureAnchor = Anchor where Self::Inner: Future;

// All of this stuff is boilerplate that could be generated by a macro: {

// The anchor itself:
struct JoinAnchor<A: FutureAnchor, B: FutureAnchor> { a: A, b: B }
// The marker type, which impls Future.
extern { type Join<A: FutureAnchor, B: FutureAnchor> };

impl<A: FutureAnchor, B: FutureAnchor> Anchor for JoinAnchor<A, B> {
    type Inner = Join<A, B>;
    unsafe fn get_mut(&mut self) -> &mut Join<A, B> {
        &mut *(self as *mut _ as *mut Join<A, B>)
    }
}
// Private helpers:
impl<A: FutureAnchor, B: FutureAnchor> Join<A, B> {
    fn cast_to_join_anchor(&mut self) -> &mut JoinAnchor<A, B> {
        unsafe { &mut *(self as *mut _ as *mut JoinAnchor<A, B>) }
    }
    // Field accessors:
    // Note that these call `Anchor::get_mut()` on the anchor fields, returning
    // the inner futures themselves.
    fn a(&mut self) -> &mut A::Inner {
        unsafe { self.cast_to_join_anchor().a.get_mut() }
    }
    fn b(&mut self) -> &mut B::Inner {
        unsafe { self.cast_to_join_anchor().b.get_mut() }
    }
}

// } End boilerplate.

impl<A: FutureAnchor, B: FutureAnchor> Future for Join<A, B> {
    // ... same as existing impl, but using a() and b() accessors
}

This requires unsafe code for the same reason that in a Pin version, unsafe code would be required to go from PinShr<Join<A, B>> to PinShr<A>: you have to promise that you won’t try to move one of the struct fields by hand. However, the unsafety can be entirely encapsulated by the (hypothetical) macro: it would only expose the accessor methods and prevent the user code from directly accessing the fields of JoinAnchor, e.g. by using an internal mod declaration and privacy.
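For comparison, here is roughly what the equivalent field projection looks like on the Pin side. This sketch uses the `std::pin::Pin` API as it exists in today's standard library (`Pin<&mut T>` and `Pin::map_unchecked_mut`) rather than the `PinMut`/`PinShr` names used in this thread, and the `join` module and its `Join` struct are illustrative only. The encapsulation strategy is the same: private fields, with accessors as the only way in.

```rust
mod join {
    use std::pin::Pin;

    /// Fields are private, so code outside this module cannot move them.
    pub struct Join<A, B> {
        a: A,
        b: B,
    }

    impl<A, B> Join<A, B> {
        pub fn new(a: A, b: B) -> Self {
            Join { a, b }
        }

        /// Projects a pinned `Join` to its pinned `a` field.
        pub fn a(self: Pin<&mut Self>) -> Pin<&mut A> {
            // SAFETY: nothing in this module moves `a` out of a pinned
            // `Join`, and `Join` has no `Drop` impl that could do so.
            unsafe { self.map_unchecked_mut(|this| &mut this.a) }
        }

        /// Projects a pinned `Join` to its pinned `b` field.
        pub fn b(self: Pin<&mut Self>) -> Pin<&mut B> {
            // SAFETY: same argument as for `a`.
            unsafe { self.map_unchecked_mut(|this| &mut this.b) }
        }
    }
}
```

The `unsafe` blocks carry the same promise in both designs: the module never moves a field by hand.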


By the way: I’m silly and only just realized that making size_of_val panic for !DynSized can be done as a pure library change. And even if it couldn’t, it would be a straightforward compiler change, not something complicated and unspecified like, say, implementing native immovable structs would be. So let’s just call it part of the base proposal; forget the whole thing about it returning 0 to start with, and the resulting unsoundness question.

I think @comex’s proposal could be fixed/generalized like this:

Define

extern
{
    type NoMoveExtern;
}

// T could also be PhantomData<T> (we aren't constructing or dropping this type, so it doesn't matter)
struct NoMove<T>(T, NoMoveExtern);

Now we replace PinMut<'a, T> with &'a mut NoMove<T> and PinShr<'a, T> with &'a NoMove<T> and otherwise keep the same behavior of the pin RFC. Due to NoMove<T> being !DynSized, this should be an equivalent formulation since it can’t be swapped out.

Thus, traits like Generator could be implemented for &mut T where T: ?DynSized rather than PinMut<T>, and there would be a blanket impl of Generator for NoMove<T> where T: Generator (movable generators would rely on that, while immovable ones would implement Generator for NoMove).

If it works, it seems this might be a bit better than PinMut/PinShr since it doesn’t need a new reference type.

Having a ?DynSized bound on Self for traits by default would be required, though, so that traits supporting immovable types are not special at all. I think this is a backwards-incompatible change, but maybe it could be made with a Rust edition.

Note however that you can no longer do “-> impl Generator” for such traits, but it will need abstract type syntax, so you can say “abstract type T where NoMove<T>: Generator” and then do “-> T”.


What is the case for a type that is !Move and yet DynSized? It seems like for such types I could write my swap_unsized function. So any actually immovable types likely have to be !DynSized. Right?
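A sketch of what such a function could look like on today's Rust, where every type's size is available at run time through `size_of_val`. This is not the `swap_unsized` from the other thread, just a hypothetical reconstruction. It is marked `unsafe` because equal size does not imply equal type for trait objects.

```rust
use std::{mem, ptr};

/// Swaps two unsized values byte-by-byte if their run-time sizes match.
/// Returns `false` (and does nothing) if the sizes differ.
///
/// # Safety
/// Both references must point to values of the same dynamic type. For
/// slices and `str` this follows from equal size; for `dyn Trait` it does not.
pub unsafe fn swap_unsized<T: ?Sized>(a: &mut T, b: &mut T) -> bool {
    let size = mem::size_of_val(&*a);
    if size != mem::size_of_val(&*b) {
        return false;
    }
    // `a` and `b` are distinct exclusive borrows, so the two regions cannot
    // overlap, and each is valid for `size` bytes.
    ptr::swap_nonoverlapping(a as *mut T as *mut u8, b as *mut T as *mut u8, size);
    true
}
```

The only ingredient is the run-time size, which is exactly what a `!DynSized` type would withhold.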

Yes, I was expecting that. New reference types come with new matching traits. So, all of this is still “just” spelling out the cost of having more reference types.


So if I understand correctly, this boils down to whether we want to have a new reference type (or, rather, two of them), with all the baggage that implies; or whether we want to “hack” something together based on !DynSized. Both have ergonomics hits:

  • A new reference type requires support in all container-like data structures if we want to have pinned stuff in them, and new Deref traits if we want smart pointers and the like.
  • !DynSized requires Future implementations to actually have two types, two trait impls, and some non-trivial relationship between them. (Notice that formally, having two types is like having twice as many invariants–so the owned and shared invariant of the extern type would correspond to the pinned and shared-pinned invariant in the Pin proposal.)

For futures alone, there are likely going to be far more implementations of the trait than people wanting to call poll directly. So in that context (right now the main motivation for pinning) I think Pin has significantly less boilerplate. Also notice that the vast majority of existing smart pointers are unsound for Pin anyway.

Now, much of the additional complexity in Pin (the part that requires changes and consideration in containers, like adding Pin-related methods to Vec) is to be able to safely obtain pinned references. I noticed your Anchor::get_mut is unsafe. Is there any way, in safe code, to call resume on a future? With Pin, there is, thanks to PinBox and the possibility of a stack pinning API. A version of RefCell that hands out pinned references also seems feasible, and one could imagine adding a get_pin(self: Pin<Vec<T>>, index: usize) -> Pin<T>. What would any of that look like in your model?
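For reference, this is what safe pinning looks like with the standard library as it exists today. It is a sketch under some assumptions: `Box::pin` plays the role of `PinBox`, the `std::pin::pin!` macro plays the role of the stack pinning API, and `Future::poll` stands in for `resume`. `NoopWake` and the two helper functions are made up for the example.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing, just enough to build a `Context`.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Heap pinning: the future is moved into a box and never moved again.
pub fn poll_once_boxed<F: Future>(fut: F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut pinned: Pin<Box<F>> = Box::pin(fut);
    pinned.as_mut().poll(&mut cx)
}

/// Stack pinning: the future is pinned to this function's stack frame.
pub fn poll_once_on_stack<F: Future>(fut: F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut pinned = pin!(fut); // has type `Pin<&mut F>`
    pinned.as_mut().poll(&mut cx)
}
```

Neither helper contains `unsafe`, which is the property being asked about for the anchor-based model.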

For more general use of pinning, like for intrusive collections, the trade-off might shift. However, as you observed, Pin::deref mitigates much of that. I find this function funny, but less funny than using sizedness :wink: and the only concrete issue we have so far is that we cannot have fn get_pin(self: Pin<RefCell<T>>) -> Pin<T> just return a reference. However, so far I don’t see how your model handles safe creation of “pinned” references at all, so this disadvantage on the Pin side is far outweighed on the !DynSized side by not even being able to express this pattern. This is similar to the smart pointer situation btw; how would I obtain (in safe code) something like Arc<MyGenerator>?

(Based on @glaebhoerl’s RefCell in the other thread, we actually can have fn get_pin(self: Pin<RefCell<T>>) -> Pin<T> if we make it first set the pinned bit of the RefCell, and then return. This is compatible with Pin::deref and fixes the unsoundness I discovered in the RFC. That code will then instead panic, which is not entirely satisfactory but then this is RefCell we are talking about, which is all about run-time checks instead of static checks.)
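A minimal sketch of the "pinned bit" idea, written against today's `std::pin::Pin`. This is my own illustrative reconstruction, not @glaebhoerl's code: `PinRefCell`, its field names and its panic messages are all assumptions, and only the two methods needed to show the behaviour are included.

```rust
use std::cell::{Cell, RefCell, RefMut};
use std::pin::Pin;

/// Hypothetical `RefCell` variant with a run-time "pinned" flag.
pub struct PinRefCell<T> {
    pinned: Cell<bool>,
    inner: RefCell<T>,
}

impl<T> PinRefCell<T> {
    pub fn new(value: T) -> Self {
        PinRefCell { pinned: Cell::new(false), inner: RefCell::new(value) }
    }

    /// Mutable access is only allowed while the contents are not pinned.
    pub fn borrow_mut(&self) -> RefMut<'_, T> {
        assert!(!self.pinned.get(), "contents are pinned; cannot borrow mutably");
        self.inner.borrow_mut()
    }

    /// Sets the pinned bit first, then hands out a shared pinned reference.
    pub fn get_pin(self: Pin<&Self>) -> Pin<&T> {
        let this = self.get_ref();
        // Panics if a `RefMut` is still alive; the guard is dropped at once.
        drop(this.inner.try_borrow().expect("already mutably borrowed"));
        this.pinned.set(true);
        // SAFETY: the cell itself is pinned, no `RefMut` exists right now,
        // and `borrow_mut` refuses to create one from here on.
        unsafe { Pin::new_unchecked(&*this.inner.as_ptr()) }
    }
}
```

Once `get_pin` has been called, any later `borrow_mut` panics instead of handing out a `&mut T` that could move the contents.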
