?Uninit types [exist today]. Also let's talk about DerefMove

The basic idea is moving things in and out of Box without Box being magic (instead moving the magic into a DerefMove trait or similar).

The basic operations on the Box include:

  • Box<T((partially?) initialized)> -> Box<T(empty)>
  • Box<T(empty)> -> Box<T((partially?) initialized)>
  • The ability to fold such operations.

For example, using syntax similar to our previous ?Uninit type proposals, say you have a Box<(String, String)>, and you wanna move out the first item:

  1. Box<(String, String)> -> Box<(String, String)()> (aka: take a fully-initialized tuple and move it out, leaving a fully-deinitialized tuple)
  2. Box<(String, String)()> -> Box<(String, String)(1)> (aka: take a fully-deinitialized tuple and move into it, yielding a partially-initialized tuple where only entry 1 is initialized)

This is, at least in the case of Box, equivalent to moving out only the first element.
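For reference, today's Rust already accepts moving a single field out of a Box, but only because the compiler special-cases Box. A minimal sketch of that existing behaviour (the function name `split_box` is made up for this illustration):

```rust
/// Moves both fields out of a boxed tuple, one at a time.
/// `split_box` is a hypothetical name used only for this illustration.
pub fn split_box(b: Box<(String, String)>) -> (String, String) {
    // Partial move: field 0 leaves the box, field 1 stays initialized.
    let first = b.0;
    // The remaining field is still usable; `b` as a whole no longer is.
    let second = b.1;
    (first, second)
}
```

No user-defined smart pointer can do the same through Deref or DerefMut, which is the gap a DerefMove-like trait is meant to close.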

Altho Uninit types help formalize this stuff, they're still not good enough. Here's what we're struggling with. Consider a hypothetical non-generic type:

struct MyStruct(Vec<u8>);

and you can have Deref and DerefMut for it:

impl Deref for MyStruct {
  type Target = Vec<u8>;
  ...
}

but for something like DerefMove to work, well... you're not changing the type of the Target, but the type of Self itself (which does need to reflect the type of the Target btw, and we're not sure how to handle that either). We can't think of any way of doing this within the (implied) framework we were working with when we wrote the previous "?Uninit types" stuff, unless we're missing something really obvious here.
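To make the setup concrete, here is a sketch of both impls in today's Rust. The method bodies are an assumption (the post elides them), and `push_and_len` is a hypothetical helper added only to exercise the impls:

```rust
use std::ops::{Deref, DerefMut};

pub struct MyStruct(pub Vec<u8>);

impl Deref for MyStruct {
    type Target = Vec<u8>;
    // Assumed body: hand out a shared reference to the wrapped Vec.
    fn deref(&self) -> &Vec<u8> {
        &self.0
    }
}

impl DerefMut for MyStruct {
    // Assumed body: hand out a mutable reference to the wrapped Vec.
    fn deref_mut(&mut self) -> &mut Vec<u8> {
        &mut self.0
    }
}

/// Hypothetical helper: uses the Target through both traits.
pub fn push_and_len(s: &mut MyStruct, byte: u8) -> usize {
    s.push(byte); // goes through DerefMut
    // `let v: Vec<u8> = **s;` would be rejected here: the value sits
    // behind a reference, so it cannot be moved out.
    s.len() // goes through Deref
}
```

Both traits only ever expose the Target behind a reference, so neither signature has a place to record that Self has changed state.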

We can define DerefMove in terms of &move references, just like we did with DerefMut and &mut. So now we have to figure out how to make owning references sound and working.

One obvious, but not very good, solution to the initialization concerns is to force the programmer to re-initialize the value before the end of the scope, so that no match branch leaves uninitialized data behind.

So an example like:

let b = Box::new(String::new());
match (some_bool, b) {
    (true, *s) => { /* consume s, leaving `b` uninitialized */ }
    (false, _) => { /* don't touch the box */ }
}

would be rejected, because the first branch must re-initialize the value.
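For comparison, today's Rust accepts a conditional move out of a Box when it is written without the hypothetical `*s` pattern. The compiler tracks whether `b` is still initialized on each path (with a drop flag where it cannot decide statically). A sketch, with a made-up function name:

```rust
/// `maybe_consume` is a hypothetical name used only for this illustration.
pub fn maybe_consume(some_bool: bool) -> Option<String> {
    let b = Box::new(String::from("boxed"));
    if some_bool {
        // Moves the String out of the box; `b` is uninitialized on this path.
        let s = *b;
        Some(s)
    } else {
        // `b` is untouched here and is dropped normally at the end of scope.
        None
    }
}
```

Forcing every branch to re-initialize would rule this pattern out, which is why the restriction reads as a step backwards.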

We would like to define DerefMove in terms of DerefMut + type operations. This maintains backwards compatibility with drop flags and whatnot, and is honestly the simplest Partially Initialized Type (PIT)-based solution.

(Also doesn't require adding and defining semantics for &move references.)

This feels like adding language support for take_mut
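For readers who haven't seen it, the idea behind take_mut can be sketched roughly as below. This is an illustrative re-implementation under the name `take_and_put_back` (hypothetical), not the crate's actual source:

```rust
use std::panic::{self, AssertUnwindSafe};
use std::{process, ptr};

/// Temporarily moves the value out of `slot`, runs `f` on it, and moves
/// the result back in. Hypothetical name; sketch only.
pub fn take_and_put_back<T>(slot: &mut T, f: impl FnOnce(T) -> T) {
    unsafe {
        // Move the value out, leaving `*slot` logically uninitialized.
        let old = ptr::read(slot);
        // If `f` panics there is nothing valid to put back, so abort
        // instead of unwinding past an uninitialized `*slot`.
        let new = panic::catch_unwind(AssertUnwindSafe(|| f(old)))
            .unwrap_or_else(|_| process::abort());
        // Move the new value in without dropping the stale contents.
        ptr::write(slot, new);
    }
}
```

The window between the read and the write is exactly the "deinitialized" state that the type-level proposals here try to make visible to the compiler.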


something along the lines of

trait DerefMove<T, U> where Self: DerefMut, T: DerefMut, Self::Family = T::Family, some other bounds probably {
  fn deref_move(&mut Self->T, f: impl FnOnce(&mut Self::Target->T::Target) -> U) -> U;
}

would work with partially initialized types and provide all the necessary semantics. altho we're not happy with this API tbh. .-.

Another idea is to turn both move references and PITs into one single feature: References to Partially Initialized bindings (RPI).

The core idea is the same as that of &move references: move the value, but not the memory it's contained in. However, we could add a bit of PIT semantics there, to reason about the binding the reference refers to.

For example:

struct A {
    a: String,
    b: String,
}
fn main() {
    let clo = |&move arg: A(a)| { // closure knows that only the `a` field is initialized,
        println!("{}", arg.a); // so we can print it
        arg.b = String::from("Hello"); // but we have to initialize the `b` field.
    };
    let pb: A;
    pb.a = "We say".into();
    clo(&move pb); // it's perfectly legal, even though we didn't initialize `pb.b`
}

First thing to notice: &move A is not a valid RPI type, as it carries no initialization info. Creating the reference in the example relies on inferring the RPI type from context, namely &move A(a).

Second thing to notice: in generic code, &move(..) types that name fields are impossible, as there is no way to name the fields of a type parameter. Therefore the only two RPIs allowed there are &move T() and &move T(..), where the latter refers to a fully initialized binding.

RPIs are subtyped as follows: &move A(a) and &move A(b) are both supertypes of &move A(a,b) and subtypes of &move A().
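One way to read that rule is as set inclusion on the initialized fields: a reference that guarantees more fields can stand in for one that guarantees fewer. A toy model of just that ordering (the `InitSet` type and its constants are hypothetical, and nothing like this exists in the compiler):

```rust
/// Hypothetical model: each bit marks one field of `A` as initialized.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct InitSet(pub u8);

pub const NONE: InitSet = InitSet(0b00); // models &move A()
pub const A_ONLY: InitSet = InitSet(0b01); // models &move A(a)
pub const B_ONLY: InitSet = InitSet(0b10); // models &move A(b)
pub const BOTH: InitSet = InitSet(0b11); // models &move A(a,b)

impl InitSet {
    /// `self` is a subtype of `other` when it guarantees at least
    /// every field that `other` guarantees.
    pub fn is_subtype_of(self, other: InitSet) -> bool {
        (self.0 & other.0) == other.0
    }
}
```

Under this reading &move A(a) and &move A(b) are unrelated to each other, which matches the rule as stated.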

Usage:

DerefMove trait:

trait DerefMove {
    type Target;
    fn deref(self: &move Self(..)) -> &move Self::Target(..);
}

TODO:

Discover a way to require a consumer of such a reference to initialize or deinitialize parts of a binding.

I'm not happy either since that's not valid code and you didn't provide an explanation for what type Family is and what the ->T syntax should do.

eh we strongly believe what is or isn't valid code doesn't matter for a language proposal because the point of language proposals is to make currently-invalid code valid.

so like, the idea is that T() (an uninitialized T) and T(..) (an initialized T) are part of the same family. DerefMove would enable conversions between types in the same family, while moving things in or out. in particular note that U can be anything - even the empty tuple, so converting from type T() to type T(..) through the closure is completely valid, and the same DerefMove works for both types of moves: placing and extracting. this is analogous to Deref and DerefMut, which only have one method each.
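The "one method, both directions" shape can be approximated in today's Rust if the state is tracked at runtime instead of in the type. In the sketch below `Slot` is a hypothetical type, with `None` standing in for T() and `Some(_)` for T(..); the real proposal would track this statically:

```rust
/// Hypothetical stand-in for a box whose contents may be absent.
/// `None` plays the role of `T()`, `Some(_)` the role of `T(..)`.
pub struct Slot<T>(Box<Option<T>>);

impl<T> Slot<T> {
    pub fn full(value: T) -> Self {
        Slot(Box::new(Some(value)))
    }

    pub fn empty() -> Self {
        Slot(Box::new(None))
    }

    pub fn is_init(&self) -> bool {
        self.0.is_some()
    }

    /// Single entry point for both placing and extracting.
    /// `U` carries whatever the closure wants to return, possibly `()`.
    pub fn deref_move<U>(&mut self, f: impl FnOnce(&mut Option<T>) -> U) -> U {
        f(&mut *self.0)
    }
}
```

Extracting is `slot.deref_move(|inner| inner.take())` and placing is `slot.deref_move(|inner| { *inner = Some(value); })`. The `->` annotations in the proposed signature would replace the runtime `Option` with a compile-time state change.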

this DerefMove is based on the original ?Uninit types thread, where all conversions must be done through calling a function. you could argue these &mut T(a)->T(b) not-quite-references (they're part of a function's signature) are comparable to &move references but honestly we think they solve a lot of unanswered questions with &move references, like how do you know what you're (de)initializing and whatnot, and what happens when you stick one of them in another type.

We want a way to express that some fields of a binding must be initialized by the time the move reference is no longer used.

Other things that could go in &move A(here) are a!, meaning that we won't deinitialize a, and a*, meaning that we are going to initialize a possibly uninitialized a.

Placement new possibility:

fn place_a(arg: &move A(a*)) { // we promise to leave `a` initialized
    ...
}
fn place(arg: &move A*) { ... } // this will definitely initialize the referenced binding.

fn main() {
    let smth: A;
    place_a(&move smth); // we now have `smth.a` initialized
    let smth_else: A;
    place(&move smth_else); // smth_else is now initialized
}
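The closest existing tool for this is std::mem::MaybeUninit, where the "promise to initialize" lives in documentation and unsafe contracts rather than in the type. A sketch of both placement functions in today's Rust follows. The struct `A` is taken from the earlier example, `finish_b` is a hypothetical helper, and the field values are arbitrary:

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

pub struct A {
    pub a: String,
    pub b: String,
}

/// Rough analogue of `place(arg: &move A*)`: initializes the whole slot.
pub fn place(slot: &mut MaybeUninit<A>) -> &mut A {
    slot.write(A { a: "We say".into(), b: "Hello".into() })
}

/// Rough analogue of `place_a(arg: &move A(a*))`: initializes only `a`.
/// Nothing in the type records this; the caller has to remember it.
pub fn place_a(slot: &mut MaybeUninit<A>) {
    let p = slot.as_mut_ptr();
    // SAFETY: `p` is valid and aligned for `A`, and `write` neither
    // reads nor drops the old (uninitialized) field.
    unsafe { addr_of_mut!((*p).a).write(String::from("We say")) };
}

/// Hypothetical helper that completes the value after `place_a`.
///
/// # Safety
/// Field `a` must already be initialized and field `b` must not be.
pub unsafe fn finish_b(mut slot: MaybeUninit<A>) -> A {
    let p = slot.as_mut_ptr();
    unsafe {
        addr_of_mut!((*p).b).write(String::from("Hello"));
        slot.assume_init()
    }
}
```

The safety comment on `finish_b` is precisely the information that `&move A(a*)` would move into the signature.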

The reasons to have separate semantics are the following:

  • not complicating &mut further
  • separating bindings from the values they hold:
    By binding I mean a fixed place in memory whose value can be moved in and out. While the use of values is clear, the bindings themselves are consumed by moves, and while move semantics are the most convenient, they don't cover transferring ownership of only the value rather than of the value and the binding together.
    The impossibility of transferring ownership without also transferring the binding ultimately resulted in the development of Pin-ned references and their known pitfalls.

List of features RPIs are designed to incorporate:

  • Move references;
  • Placement new: &move T* implies exactly this use;
  • Partial initialization;

The problem is not that it's not valid, rather that I can't seem to find a specification/set of rules under which it would be valid.

This doesn't explain what the Family associated type should be.

This feels like mixing the concept of an owning reference (won't uninitialize) and something like placement new (I will surely initialize).

Do we really need &move A(a*) for this? Couldn't this just pass &move smth.a?


it is hard to formalize the other ?Uninit types thread, so you're on your own tbh.

Both should have the same Family, which is an opaque type because trying to formalize this is too hard for us and just putting it as "an opaque type" works well enough for this. .-.

(Altho we guess it should be Self::Target::Family rather than Self::Family... or well, both really.)

I guess &move smth.a would have to produce &move String(??).

Yes, that's the point. I wanted a unified mechanism for partial initialization within &move references, solving many of the concerns with them.

Initially, I was concerned about the bitfield inside a PIT, so I decided that this functionality fits move refs better: the bitfield would then live in the move reference rather than in the type's layout.
However, I now think it's possible to check statically what such a reference does to the data, and get rid of these bitfields altogether.

Notes over the syntax:

We have to develop a nice (at least I hope) syntax for &move A(..!) and &move A(..*). The tuple syntax is also an open question: &move (u32!,String*,i16) is clear, but what if there is a fourth element that we don't touch, but which still has to be part of the tuple?

On &move A(): it's an RPI that neither requires anything to be initialized nor promises to initialize anything, which makes it practically useless. I'm therefore inclined to make this case invalid and instead use a plain &move A to express defined deinitialization, &move A! for in-place value transformation, and &move A* to ask for initialization.

Further considerations

Pin

Another thing is the interaction with Pin: to make the feature sound, we should only allow Unpin types to be moved to other bindings out of this kind of reference.

For example, with this proposal the Generator trait's resume method could now take &move Self! instead of Pin<&mut Self> as its self type.
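The Unpin restriction mirrors what Pin already enforces today: safe code can only move a value out from behind a pinned reference when the type is Unpin. A small sketch (`take_pinned` is a hypothetical helper name):

```rust
use std::pin::Pin;

/// Moves the value out from behind a pinned reference, leaving
/// `T::default()` in its place. `Pin::get_mut` is only available
/// because of the `T: Unpin` bound.
pub fn take_pinned<T: Unpin + Default>(p: Pin<&mut T>) -> T {
    std::mem::take(p.get_mut())
}
```

Dropping the `Unpin` bound makes this fail to compile, which is the guarantee a `&move Self!` receiver would have to reproduce.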

Patterns

RPIs may also be involved in patterns, like so:

match (smth_not_unpin,bool) {
    (ref move gen_exmp, true) => {
        // here we do smth. with gen_exmp, which has type `&move T!`
    },
    _ => {}
}

The reason for having such a pattern produce a &move T! obligation is that we may not want the match to be able to suddenly deinitialize a binding.

Then why are you even proposing it in the first place? I don't think a proposal without a set of rules on how it should work will ever be accepted.

That's kind of hacky. It's also unclear to me how this could work with Box. Would Box<T> and T have the same family?

The problem is that in your proposal they're still different concepts, you just unified part of the syntax (&move, but not the ! vs *, effectively making them different).

Also, you previously mentioned that you didn't want to further complicate &mut, but how is &move T! different than &mut T?

How does &move Self! prevent the Generator from being moved from outside the resume method?

So asking for help is discouraged?

Why would they have the same family? They're not even the same struct (or enum or w/e). It's just a way to tie Foo() and Foo(..) (for some struct Foo) together.

Then I don't see how this could be used for partial moves out of and placement new in Box-like structs.

The ->'s in the functions' signatures are what do the trick. In particular the Self->T, together with the Self::Target->T::Target.

From the caller's perspective they make no difference, but inside the function this allows moving out of a reference, which is forbidden for mutable references.

This is up to the caller, but in general we could simply forbid moves of non-Unpin types, if that were backward compatible.

The very problem with plain &move refs is that you have no mechanism for saying "this was deinitialized, but that wasn't". As a result, the plain version makes the referenced binding impossible to use afterwards, negating any benefit of introducing move refs in the first place.

I agree that partially initialized types on their own are a completely different feature, but at the very heart of it was the idea of declaring what will be initialized, what will be deinitialized and what will remain untouched.

They are not different, but very close: in fact, ! is meant to be just a stronger version of *. The former requires (part of) a binding to be already initialized, while the latter doesn't. Both * and ! promise to leave the memory initialized.

Mhm, I think I see it now. You can give deref_move partially uninitialized type which the function promises to partially/fully initialize. Then it uses the provided closure to initialize itself. The U I guess is just used to return data from the closure. This can also work in the other direction.

However, for this to work with Box you need Box<A> and Box<B> to have the same family if A and B do, which in turn means that Box<A(..)> and Box<B>(..) would also have the same family.

Also, what happens if f or deref_move panics? The reason take_mut wasn't accepted in the stdlib was that it solves this problem by aborting, so I don't think it would be a valid thing to do here.

Actually, they do. One lets the caller pass uninitialized data and get back initialized data, the other instead works kinda like a mutable reference.

You can't move out of a mutable reference because there's no guarantee that you will put back a value. And even if the compiler checked that you do, there's still the problem of panics that it would have to forbid.
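What safe Rust offers instead is swapping: you may take the old value only if you hand over a replacement in the same operation, so the place is never observed empty. A minimal sketch with a hypothetical `Holder` type:

```rust
use std::mem;

/// Hypothetical example type.
pub struct Holder {
    pub name: String,
}

/// Returns the old name and stores `new_name` in one step.
pub fn rename(holder: &mut Holder, new_name: String) -> String {
    // `let old = holder.name;` would be rejected: the field sits
    // behind `&mut`, and nothing guarantees a value is put back.
    mem::replace(&mut holder.name, new_name)
}
```

Because the swap is a single call with no user code in between, there is no point at which a panic could leave `holder.name` uninitialized.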

Generator needs a guarantee that after the first call to resume it will never be moved again. You can't leave this up to the caller unless you make Generator::resume an unsafe function. Preventing moves of non-Unpin types would instead be a pretty big breaking change.

Then they are different. Only one of them has actual features, the other (at least for now) is just a mutable reference with a different name. Even the uses are different. One can be used to initialize something, while the other should allow temporarily moving out of something.

Yes, they do have the same family - a family is a type with the state elided/erased.

On panic, it should fall back to the deinitialized state always. Oh hm this doesn't exactly play nice with inference does it...

This makes the static type dependent on runtime behaviour, which is no good. Alternatively you could make the user always handle the worst case, but that makes deref_move unusable if you ever need to do an indexing operation inside f.
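The indexing problem can be reproduced with a runtime-tracked slot, where `Option` stands in for the initialized/deinitialized state. In this sketch `byte_at` is a hypothetical function; it takes the value out, indexes it, and puts it back:

```rust
/// Takes the Vec out of `slot`, reads one byte, and puts the Vec back.
/// If the index is out of bounds the panic unwinds while `slot` is
/// still `None`, i.e. in the "deinitialized" state.
pub fn byte_at(slot: &mut Option<Vec<u8>>, idx: usize) -> Option<u8> {
    let v = slot.take()?; // slot is now None
    let byte = v[idx]; // may panic
    *slot = Some(v); // only reached when indexing succeeded
    Some(byte)
}
```

With a runtime flag the caller can simply check the slot after catching the panic. A static scheme has no such flag, so it would have to assume the worst case after any operation that can panic.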