Strong updates status

They can:

fn main() {
    let mut foo = Vec::new();
    let bar: &mut Vec<u8> = &mut foo;
    std::panic::catch_unwind(std::panic::AssertUnwindSafe(move || {
        bar.push(42);
        panic!();
    })).unwrap_err();
    assert_eq!(foo[0], 42);
    println!("Mutable reference crossed panic boundary");
}

My mistake, so we either can't keep &mut [P .. P] references across function calls that may panic or we just abort in that case. Not good, not horrible.

I guess you meant &mut [P .. Q] where P is not a subtype of Q.

And yes, that conclusion (can't panic if a type can't be dropped) is what I wrote earlier and I think it's anyway needed if we want to handle linear types in a more general way.

After some thought: it would be best to abort and warn. An unwind may already abort at any time if a drop call panics (something goes wrong). And since there may be functions that Rust wrongly assumes can unwind, we want people to still be able to compile code that holds such a reference across calls they know will not panic in their use, while informing them of the consequences.

This seems to fully abide by the expectations currently put on panics/unwinds while preserving user freedom and preventing abort pitfalls (in case somebody relies on catch_unwind).

Yes, that was my conclusion too (can't panic if a type can't be dropped) above. You could indeed just see it as &mut [P .. Q] implementing Drop:

  • If P is a subtype of Q, then drop() does nothing and simply returns.
  • If P is not a subtype of Q, then drop() panics (describing the intent that it can't be dropped) possibly aborting execution if dropped during unwinding (describing the intent that a panic can't happen if it can't be dropped).
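A rough way to picture this Drop behaviour in today's Rust is a guard object that panics when it is dropped unfinished. This is only a runtime approximation under stated assumptions: `Pending` and `finish` are hypothetical names, and the proposal would track completion in the type rather than in a `done` flag.

```rust
use std::mem::MaybeUninit;

/// Hypothetical stand-in for `&mut [MaybeUninit<T> .. T]`.
struct Pending<'a, T> {
    slot: &'a mut MaybeUninit<T>,
    done: bool, // runtime substitute for "P is a subtype of Q"
}

impl<'a, T> Pending<'a, T> {
    fn new(slot: &'a mut MaybeUninit<T>) -> Self {
        Pending { slot, done: false }
    }

    /// Completes the update; only after this may the guard be dropped quietly.
    fn finish(mut self, value: T) {
        self.slot.write(value);
        self.done = true;
    }
}

impl<T> Drop for Pending<'_, T> {
    fn drop(&mut self) {
        if !self.done {
            // If this runs while another panic is already unwinding,
            // the process aborts, matching the second bullet above.
            panic!("strong update dropped before completion");
        }
    }
}
```

Dropping a finished guard does nothing, while dropping an unfinished one panics, which mirrors the two cases in the list.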

We can expand this to the whole Non-Drop and Non-Forget problem.

To solve it, we have two options, elegant but maybe impossible or quick and dirty:

  • Elegant: We introduce Drop and Forget as auto traits and require them, like Sized, for any type parameter. Drop would have to be backwards compatible, while Forget is a new trait. Dropping a !Drop type fails to compile, and likewise forgetting a !Forget type. (The only weirdness is that we can't by default accept a & &mut [ T..U ] as a generic &T.) Unwind points may be treated as drop points (not recommended).
  • Dirty: We implement Drop for &mut [ T..U ] with T≠U so that it just aborts, and we warn about it. Likewise, forget just aborts if such a reference is passed in.

While the second one is less desirable, it is more attainable.

Just to note, the traits used for the solutions already sort-of/partially exist.

  • std::marker::Destruct marks types which can be dropped. Currently that is all types, and the bound is only useful as a const capability, but it exists as a lang trait already. In a future where we support more exotic types, perhaps via some unsize hierarchy, Destruct could fairly easily participate as a default-bound capability.
  • Pin/Unpin discuss one aspect of forgetting; specifically, if a value is pinned (and isn't Unpin), then forgetting it and then invalidating its memory without running its destructor is unsound.
    • (But you need a stronger guarantee that prevents forgetting even if the immediate memory remains valid, as lifetime validity isn't protected by pinning.)
  • Implementing Drop for &mut [ P .. Q ] would also necessarily implement it for &mut T, but the drop timing implications can already be avoided with the "dropck eyepatch" #[may_dangle] semantics. …But actually this wouldn't be sufficient if the update is at lifetime expiry but subtyping validation happens at Drop timing with any of the usual #[may_dangle] demos (uninitialized let assigned in an inner scope).
    • In short: the dirty solution doesn't support &mut [ T .. T ] being the same as &mut T because whether a type has a semantic destructor impacts lifetimes.

So I thought about this again: dropping and explicit forgetting aren't the only ways to break this. You can also move the reference behind a pointer and then drop that. This can be done with Box::leak and can't easily be guarded against.
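To make the Box::leak route concrete, here is a small sketch in today's Rust. `MustFinish` is a hypothetical guard whose destructor only records that it ran. The point is that leaking the box discards the last handle without running Drop at all, so a panicking or aborting Drop impl would never get a chance to fire.

```rust
use std::cell::Cell;

/// Hypothetical guard standing in for an unfinished reference.
struct MustFinish<'a> {
    dropped: &'a Cell<bool>,
}

impl Drop for MustFinish<'_> {
    fn drop(&mut self) {
        // Record that the destructor actually ran.
        self.dropped.set(true);
    }
}

fn leak_guard(flag: &Cell<bool>) {
    let guard = Box::new(MustFinish { dropped: flag });
    // `Box::leak` hands back a plain reference; when it goes out of scope
    // nothing is dropped, so the guard's destructor never runs.
    let _leaked: &mut MustFinish<'_> = Box::leak(guard);
}
```

After `leak_guard` returns, the flag is still unset, whereas an ordinary `drop` of the same type sets it.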

The first option I thought of would be to disallow putting unfinished references in a Box (or any other structure that can be leaked) and to give Box no interface for changing the type within it. The problem is that this requires more compiler magic to detect these types through nesting (in structs/enums), and it greatly diminishes their utility if you can't use them for data on the heap.

Or maybe we could somehow disallow them to effectively leave scope unfinished, treating unfinished ones as if they were returned, put in a static or moved (sort of &mut [ T..U ]: 'static).

Edit: The second option only partially makes sense, not the reference, but the reference to the reference needs to be somehow forced 'static, or at least live until the inner reference is finished.

One more possibility is available if you have a Borrow Type, like I have proposed in my blog post. Since we know that there is a one to one correspondence between a mutable borrow and its mutable reference, we can require the mutable reference to be available to invalidate when destroying the borrow by accessing it. To make good code compile again, dropping the (completed) reference would remove the borrow, this is where the one to one correspondence is really needed.

Sorry for joining the conversation so late, but this is actually not a problem. When passing a closure to a function, the compiler doesn't assume that the closure is actually called – in other words, catch_unwind could even be defined like this:

fn catch_unwind<R>(f: impl FnOnce() -> R + UnwindSafe) -> R {
    // do nothing
}

Therefore, the compiler must assume that the type of place hasn't changed, because the closure might not have been called:

fn transform(it: &mut [T .. S]) { /* … */ }

let mut place = T::new();
// place is T
let _ = catch_unwind(|| {
    transform(&mut place);
    // place is S
});
// place is still T

This is okay because S is conceptually a subtype of T. For example, turning T into MaybeUninit<T> is sound, so we can be conservative. The same goes for this example:

If we can't determine the correct place at compile time, we can simply ignore the type update, so both places remain MaybeUninit.
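The conservative rule can be imitated in today's Rust, where the place simply stays MaybeUninit and only a runtime fact justifies reading it. This is a sketch under stated assumptions: `init_may_panic` and `try_init` are hypothetical helpers, and the `unsafe` read is exactly the step a strong update would let the type system justify.

```rust
use std::mem::MaybeUninit;
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Hypothetical initializer that may panic before writing anything.
fn init_may_panic(slot: &mut MaybeUninit<u32>, fail: bool) {
    if fail {
        panic!("initialization failed");
    }
    slot.write(7);
}

fn try_init(fail: bool) -> Option<u32> {
    // Statically, `place` is MaybeUninit before and after the call.
    let mut place = MaybeUninit::<u32>::uninit();
    let outcome = catch_unwind(AssertUnwindSafe(|| init_may_panic(&mut place, fail)));
    match outcome {
        // Only the `Ok` result tells us the write happened.
        Ok(()) => Some(unsafe { place.assume_init() }),
        // After a panic the place must still be treated as uninitialized.
        Err(_) => None,
    }
}
```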

I think this misunderstands the proposal. &mut [T .. S] is not a valid type, but a promise that the value is currently a &mut T, but it will be a &mut S when the function completes (without panicking). Therefore, it only makes sense in a function argument – you can't put it in a struct or implement a trait for it. Maybe a better syntax would avoid this misunderstanding.

Changing the type of a binding is currently impossible, so we need a built-in function for this. For example:

pub fn new_in_place(
    foo: &mut [MaybeUninit<Foo> .. Foo]
) {
    // initialization goes here
    unsafe {
        transmute_mut::<_, Foo>(foo);
    }
}

If transmute_mut is an unsafe fn<T, S>(&mut [T .. S]), this will Just Work on a type level.

Yes I probably misunderstood the comment if it was referring to an alternative proposal that is restricted to functions and not a proper type. My proposal is about &mut [T .. S] being a type, not just a second-class syntax. I believe this is important to avoid second-class constructs such that refactoring is possible (modulo type annotation if inference can't follow).

The problem is, we might not want S to be required to be a subtype of T.

For me in particular (wanting to combine it with partial types; partially initialized), that doesn't work, because a type that has been (partially) moved is not a supertype of the non-moved one or vice versa.

Making it just some notation also does not work for me, since one of my aims is to make futures safe, and for that I need to initialize/change the types step by step. After an await point, you change the variant and then move the value into it, so I need to be able to change types behind references. The whole point there is to avoid needing unsafe, so such pure notation doesn't cut it.

The application in this thread is just one example, there are more (potential) ones.


If that was the intention, I misunderstood the proposal myself. Of course, this would severely limit where strong updates can be used. For example, in a closure you couldn't update a value that lives outside the closure.

And the requirement that both types need to have the same size and alignment can't be lifted either.

Well, one example in the original post was &mut [Door<Closed>..Door<Open>], which is not a subtype relation. That a closure can't change the type of its captures only makes sense, as it can be called multiple times. If it is an argument, of course it could change that type, though type inference might be hard.

The way I understand it, a &mut [T..U] immediately changes the type of the referenced value on creation, that makes it generally possible, but requires it to always complete for soundness.

Of course size and alignment have to stay the same (at most get smaller), as the space outside of it might already be used. That is an acceptable requirement.
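The size and alignment requirement can be written down as a small check. This is a sketch only: `fits` is a hypothetical helper, and it deliberately covers nothing beyond size and alignment.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical check: can a place laid out for `T` hold a `U`?
/// `U` may be smaller or less strictly aligned than `T`, but never larger
/// or more strictly aligned, since the surrounding memory may be in use.
fn fits<T, U>() -> bool {
    size_of::<U>() <= size_of::<T>() && align_of::<U>() <= align_of::<T>()
}
```

For example, `fits::<u32, [u8; 4]>()` holds, while `fits::<[u8; 4], u32>()` does not, because a `[u8; 4]` place is not guaranteed to be aligned for a `u32`.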


The proposal does not require the type S at end of borrow to be a subtype of the current type T for a mutable reference &mut (T .. S) or the other way around.

However, if it happens that T is a subtype of S, then &mut (T .. S) is not linear (e.g. can be forgotten before a value of type S being written to). This means you can still have closures taking &mut (T .. S) as long as T is a subtype of S.


I've spent quite a long time thinking about strong updates, and was considering writing a proposal for them, when I stumbled across this thread and saw that it suggested something very similar.

The primary difference with my planning is that the types work somewhat differently, using place generics @a that allow you to specify that two references reference the same place (possibly with different types), allowing you to write function signatures like

fn new_in_place<@a: MaybeUninit<Foo>>(foo: &@a mut MaybeUninit<Foo>)
    -> &@a mut Foo

The caller of this function would then be able to change the type of their local variable (in the example in the OP, foo) by consuming the resulting reference (but they could also just use the resulting &@a mut Foo as a normal &mut Foo). The place generic is needed so that the caller can know that they're transmuting the correct place, rather than some other place that the function happened to have a reference to. (The place generic also contains the functionality of a lifetime generic – the lifetimes are calculated as though @a were 'a – because in practice the two will always be combined.)
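For comparison, the closest thing available in today's Rust uses an ordinary lifetime in place of the place generic. In this sketch `Foo` and its `id` field are hypothetical. The returned reference is usable as a normal `&mut Foo`, but the caller's variable keeps the type `MaybeUninit<Foo>`, so getting the value out still needs `unsafe`; that gap is what the `@a` design is meant to close.

```rust
use std::mem::MaybeUninit;

/// Hypothetical payload type.
struct Foo {
    id: u32,
}

/// Today's approximation: `'a` ties the output to the input borrow,
/// but it cannot express "this is the same place, now holding a Foo".
fn new_in_place<'a>(foo: &'a mut MaybeUninit<Foo>) -> &'a mut Foo {
    // `MaybeUninit::write` initializes the slot and returns `&mut Foo`.
    foo.write(Foo { id: 1 })
}
```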

As in this example, I'd noticed that this is only type-safe for dropping if the type of the place is always able to hold the current type of the variable – to allow arbitrary type conversions in arbitrary places, you'd have to prevent the new type of reference being dropped early (including during panic-unwinds), just as was discussed in this thread. However, it'd probably be OK to just restrict &@a T to hold types T that were subtypes of the storage type of @a, because it can safely be very general (e.g. a big union, or MaybeUninit which can hold anything that fits in a size/alignment sense): in that case, if the reference gets dropped unexpectedly, the referenced object gets leaked but there are no soundness issues.

One big reason I wanted the references to be separate on the input and output is that you may want to write functions that conditionally change the type of an argument. For example, imagine a signature like

fn try_new<@a: MaybeUninit<Foo>>(out: &@a mut MaybeUninit<Foo>)
    -> Option<&@a Foo>

This way, if try_new succeeds, you can know that your Foo is initialised, but if it fails, you don't get that promise. This works for conditional destruction, too, e.g. a "nonempty iterator" API could look like

fn next<@a: MaybeUninit<MyIterator>>(&@a mut MyIterator)
    -> (MyIterator::Item, Option<&@a mut MyIterator>)

which only gives you the ability to get at the next item if there actually is a next item.
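The conditional pattern can be approximated today by returning the reference inside an Option. This is a sketch under assumptions: `Foo`, its `id` field, and the `input` parameter are hypothetical, and a plain lifetime stands in for the place generic, so the caller's slot does not change type.

```rust
use std::mem::MaybeUninit;

/// Hypothetical payload type.
struct Foo {
    id: u32,
}

/// Returns `Some(&mut Foo)` only if the slot was actually initialized.
fn try_new<'a>(out: &'a mut MaybeUninit<Foo>, input: &str) -> Option<&'a mut Foo> {
    // On a parse failure we return early and the slot stays untouched.
    let id: u32 = input.parse().ok()?;
    Some(out.write(Foo { id }))
}
```

Holding the `Some` value is the caller's evidence that initialization happened; with `None` there is no reference and therefore no such promise.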

In any case, I don't have a complete design for this and the syntax could probably do with cleaning up and simplifying (this syntax is somewhat ugly and pretty verbose), but I thought I'd share my thoughts on this so far as I think strong updates would be a great addition to the language, and maybe my half-finished design helps to inspire someone else.


I like the idea of returning something (in some sense that's similar to strong references in Flux). This could also be interpreted as the (linear) capability (like in ) to exclusively use the place at the given type for the given lifetime.

A few comments:

  • I didn't follow the meaning of <@a: MaybeUninit<Foo>> in fn new_in_place<@a: MaybeUninit<Foo>>(foo: &@a mut MaybeUninit<Foo>) -> &@a mut Foo. My best guess would be that it's the capability you get when unwinding. At every panic point, the value must satisfy the type of that capability.
  • I like the idea of overloading the lifetime variable to also be the place variable (what L³ calls location variable).
  • I'm not sure how this will work with things like Vec<&@a mut T> because the elements are different places but they use the same place variable which is wrong. We would need existential types over place variables like in L³ to pack mutable references into ∃@a &@a mut T.

I think it's overall a pretty good improvement over what was said in this thread so far. And in hindsight this might seem obvious given Flux and L³. The additional idea is to use the mutable reference value itself as a "ghost" representative of its capability. A function returning Option<&@a mut T> really returns a bool.

It's intended as a supertype which bounds which types can be stored in the place @a (you can only store subtypes of it). I originally added it to be able to statically check that the place has enough size and alignment to store the new type, but as you suggested, it can also be used as a type to fall back to upon unwinding.

This wouldn't be outright banned, but is pretty much useless, because it would be a vector of mutable references that all referenced the same place. But it doesn't seem reasonable to be able to change the type of individual vector elements anyway because all the elements of a Vec have to have the same type. It would be possible to implement something like a type-changing map, but the signature would take an &@a mut Vec<T> and a for<@b> impl Fn(&@b mut T) -> &@b mut U.

Thanks for the clarification! Note that it's not enough to look at size and alignment. Other things like niches matter. Consider something like this:

fn foo<@a: ???>(x: &@a mut NonZeroU8) -> &@a mut u8 {
    *x = 0;
    x
}

fn bar() {
    let mut x: Option<NonZeroU8> = Some(NonZeroU8::new(1).unwrap());
    foo(x.as_mut().unwrap()); // NonZeroU8 and u8 have same layout
    // x: Option<u8> // not same layout as Option<NonZeroU8>
}

So the &@a [MaybeUninit<u8>] -> &@a [u8] motivating example only works because there are no niches in u8.
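The niche effect can be observed directly with `size_of`. In this sketch `layout_report` is a hypothetical helper; the size of `Option<NonZeroU8>` is documented, while the exact size of `Option<u8>` is what current compilers produce rather than a language guarantee.

```rust
use std::mem::size_of;
use std::num::NonZeroU8;

/// Sizes of [NonZeroU8, u8, Option<NonZeroU8>, Option<u8>] in bytes.
fn layout_report() -> [usize; 4] {
    [
        size_of::<NonZeroU8>(),
        size_of::<u8>(),
        // The forbidden value 0 is used as the `None` niche.
        size_of::<Option<NonZeroU8>>(),
        // u8 has no spare bit pattern, so a separate tag is needed.
        size_of::<Option<u8>>(),
    ]
}
```

The inner types agree in size, but once wrapped in Option the layouts diverge, which is why a strong update from NonZeroU8 to u8 behind the Option cannot be allowed.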

Good point, but I'd say that's a limitation of Vec. Essentially the map function you suggest can only be implemented with unsafe since the intermediate values of Vec with mixed types can't be well-typed otherwise.

This also makes me think that another problem is the lack of layout guarantees in Rust. If you take a &@a mut Vec<T> and return a &@a mut Vec<S>, it's not enough for T and S to have the properties needed by strong updates (size, alignment, niche, etc). You also need Vec<T> and Vec<S> to have the same layout.

So there are quite a lot of difficulties at this time in getting any kind of strong update working in Rust without using custom types with well-defined layouts (repr(C) or other).

One workaround for Box, Vec and similar is a moving API that consumes the original container. This covers most use cases.

impl<T> Box<MaybeUninit<T>> {
  pub fn init<F: FnOnce(&@a mut MaybeUninit<T>) -> &@a mut T>(self, f: F) -> Box<T> {..}
}
impl<T> Vec<MaybeUninit<T>> {
  pub fn init<F: FnMut(&@a mut MaybeUninit<T>) -> &@a mut T>(self, f: F) -> Vec<T> {..}
}

While these methods need to use unsafe internally, the layout is a non-issue. It is a sound API.
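A version of the Box method can be sketched in today's Rust as a free function. This is an approximation under stated assumptions: `init_box` is a hypothetical name, and because an ordinary lifetime cannot prove that the closure returned a reference to the same place, the sketch substitutes a runtime address check for the place generic.

```rust
use std::mem::MaybeUninit;

/// Hypothetical moving API: consumes the uninitialized box, returns an initialized one.
fn init_box<T, F>(mut b: Box<MaybeUninit<T>>, f: F) -> Box<T>
where
    F: for<'a> FnOnce(&'a mut MaybeUninit<T>) -> &'a mut T,
{
    // Address of the heap slot, kept only for comparison.
    let slot_addr: *mut T = (&mut *b as *mut MaybeUninit<T>).cast();
    let returned: *mut T = f(&mut *b);
    // Runtime stand-in for `@a`: the closure must hand back this very slot.
    assert!(
        std::ptr::eq(returned, slot_addr),
        "closure returned a reference to a different place"
    );
    // MaybeUninit<T> has the same layout as T, so the allocation can be
    // reinterpreted; no assumption about Box<T> vs Box<S> layout is needed.
    unsafe { Box::from_raw(Box::into_raw(b).cast::<T>()) }
}
```

A typical call would be `init_box(Box::new(MaybeUninit::uninit()), |slot| slot.write(value))`, where the closure initializes the slot in place.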