Safe Partial Initialization

y86-dev · April 8, 2022, 7:34pm

In systems programming, partial initialization is often unavoidable (for example, the Linux kernel has a lot of self-referential structs). At the moment, Rust facilitates partial initialization through the std::mem::MaybeUninit<T> type. However, the relevant functions of this interface are unsafe, making partial initialization error-prone and in general more difficult to deal with than the safe parts of Rust.
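For reference, here is a minimal sketch of what field-by-field initialization looks like with today's MaybeUninit API. The struct `Foo` mirrors the one used later in this post, and the function name `make_foo` is only illustrative.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

struct Foo {
    a: u64,
    b: String,
}

/// Builds a `Foo` one field at a time using the existing `MaybeUninit` API.
fn make_foo() -> Foo {
    let mut slot = MaybeUninit::<Foo>::uninit();
    let p = slot.as_mut_ptr();
    unsafe {
        // SAFETY: `p` points to properly aligned storage for a `Foo`.
        // `addr_of_mut!` creates field pointers without creating references
        // to uninitialized memory, and `write` does not drop old contents.
        addr_of_mut!((*p).b).write("i am already initialized".to_owned());
        addr_of_mut!((*p).a).write(10);
        // SAFETY: both fields were written above. Nothing checks this claim;
        // forgetting one of the writes would be undefined behavior.
        slot.assume_init()
    }
}
```

The compiler accepts this code whether or not every field was written, which is the gap the proposal tries to close.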

I found these attempts to fix this problem, but none of them were implemented:

the &uninit T pointer RFC
partially initialized types, again discussed here

I want to suggest an approach similar to the partially initialized types, because I think that this is a case for the type system. However, I think an approach that does not introduce new syntax is not only possible, but beneficial. This feature is very important for the ergonomics of some applications, but adding syntax does not seem to offer any real improvement over the approach that I found, and leaving the syntax as is also makes integration with existing code much easier.

The code to add to the stdlib^[1]:

//// In core::marker

/// This union is used to mark that a field should be uninitialized,
/// because it will be initialized later and this will be checked by
/// the compiler through the type system.
pub union Uninit<T> {
    _value: std::mem::ManuallyDrop<T>,
    empty: (),
}

impl<T> Uninit<T> {
    /// Creates a new `Uninit`. This function is necessary because `Uninit<T>`
    /// is a union; if we could use a struct instead, it might be more
    /// ergonomic to write just `Uninit`.
    pub fn new() -> Self {
        Uninit { empty: () }
    }
}

/// This trait marks all types, which do not contain an Uninit.
/// It helps facilitate soundness when using partially initialized
/// types, because having multiple points in the code, that could create 
/// mutable references may change the type (by initializing parts of it) 
/// of the data present behind them, thus invalidating the type of other refs.
pub unsafe auto trait FullyInit {}

/// Uninit does not implement FullyInit and thus all partially initialized
/// types generated by the compiler will also not implement this trait.
impl<T> !FullyInit for Uninit<T> {}

//// -----

//// In core

/// Compiler built in macro in type position that creates (and reuses if 
/// occurring multiple times) a partially initialized type replacing all
/// fields in the list by Uninit<$field>.
#[macro_export]
macro_rules! partial_uninit {
    (($($field:ident),* $(,)?) init($($init:ident),* $(,)?) $typ:ty) => {
        /* compiler built-in */
    };
    (($($field:ident),* $(,)?) $typ:ty) => {
        /* compiler built-in */
    };
}

//// -----

//// In alloc::sync

/// additional FullyInit bound to disable cloning Arc<!FullyInit>, because 
/// this way you could get two Arc<Mutex<T>> and 
/// Arc<Mutex<partial_uninit!((...) T)>> to refer to the same data, which 
/// is not sound, because you could initialize it a second time (not dropping the
/// value present)
impl<T: ?Sized + FullyInit> Clone for Arc<T> {
    fn clone(&self) -> Self {
        // unchanged...
    }
}

//// -----

//// In alloc::rc
/// additional FullyInit bound to disable cloning Rc<!FullyInit>, because 
/// this way you could get two Rc<RefCell<T>> and 
/// Rc<RefCell<partial_uninit!((...) T)>> to refer to the same data, which 
/// is not sound, because you could initialize it a second time (not dropping the
/// value present)
impl<T: ?Sized + FullyInit> Clone for Rc<T> {
    fn clone(&self) -> Self {
        // unchanged...
    }
}

//// -----

How it would look to use this API:

// you can use any struct type with this approach.
struct Foo {
    a: u64,
    b: String,
}

// create Foo step by step
fn make_foo1() -> Foo {
    let mut foo = Foo {
        // this is a special type in std::marker. The compiler automatically coerces any type
        // that contains it to the correct uninitialized variant.
        a: Uninit::new(),
        b: "i am already initialized".to_owned(),
    };
    println!("{}", foo.b); // printing initialized field is ok
    foo.a = 10; // assigning to uninit field is ok, this does not drop foo.a
    foo
}

// naming the partially initialized type is done using a type position macro containing the list of
// uninitialized fields; only fields accessible to this module can be mentioned here.
fn make_partial_foo1() -> partial_uninit!((b) Foo) {
    Foo {
        a: 0,
        b: Uninit::new(),
    }
}

// then another function could take a partially initialized Foo and fully initialize it
fn init_foo1(mut foo: partial_uninit!((b) Foo)) -> Foo {
    foo.b = "hello world".to_owned();
    foo
}

// multiple uninitialized fields
fn init_foo2(mut foo: partial_uninit!((a, b) Foo)) -> Foo {
    foo.a = 42;
    foo
}

// initializing through a pointer requires additional information about what you are initializing,
// thus transforming the pointer (in this case to &mut Foo)
fn init_foo3(foo: &mut partial_uninit!((a) init(a) Foo)) {
    foo.a = 4242;
    // forgetting this assignment (or having at least one path where it is not initialized) would
    // produce an error along the lines of "foo.a is not initialized, foo.a needs to be initialized
    // in this function due to this `init(a)`"
}

// the type variant is also suitable for use with generics
fn init_foos(foos: Vec<partial_uninit!((b) Foo)>) -> Vec<Foo> {
    foos.into_iter().map(init_foo1).collect()
}

// also calling generic functions with it is safe
fn create_boxed_foo() -> Box<partial_uninit!((b) Foo)> {
    Box::new(Foo {
        a: 0,
        b: Uninit::new(),
    })
}

// even initializing through Arc<Mutex<_>> is fine
fn init_foo4(foo: Arc<Mutex<partial_uninit!((a, b) init(a, b) Foo)>>) -> Arc<Mutex<Foo>> {
    foo.lock().unwrap().a = 0;
    foo.lock().unwrap().b = "bar".to_owned();
    foo
}

// you can also write explicit impl blocks for the type variant:
impl partial_uninit!((a) Foo) {
    // this self type also is special, you are allowed to perform a 
    // pointer transformation!
    fn init(self: &mut partial_uninit!((a) init(a) Foo)) {
        self.a = 0;
    }
}

// even implementing traits is allowed!
impl Debug for partial_uninit!((a) Foo) {
    fn fmt(&self, f: &mut Formatter) -> Result {
        write!(f, "Foo {{ a: <uninit>, b: {} }}", self.b)
    }
}

The problem that this function could create

fn bad() {
    let foo: Arc<Mutex<partial_uninit!((a, b) Foo)>> = Arc::new(Mutex::new(Foo {
        a: Uninit::new(),
        b: Uninit::new(),
    }));
    let foo2: Arc<Mutex<partial_uninit!((a, b) Foo)>> = Arc::clone(&foo);
    let foo = init_foo4(foo);
    // now this is bad: we have two differently typed handles to the same data
    let _: Arc<Mutex<Foo>> = foo;
    let _: Arc<Mutex<partial_uninit!((a, b) Foo)>> = foo2;
}

Is mitigated by the introduction of the FullyInit marker trait. While it is too strict to ask for T: FullyInit to allow Arc<T>: Clone, this requirement may be softened later without breaking compatibility. You still need to allow the creation of Arc<!FullyInit>, because you may want to pin something using a reference count and then set its fields, kinda like this:

/// Binding to some C struct:
#[repr(C)]
pub struct DoubleListNode {
    next: *const (),
    prev: *const (),
}
#[repr(C)]
pub struct SomethingInAList {
    node: DoubleListNode,
    data: u64,
}

impl SomethingInAList {
    pub fn new(data: u64) -> partial_uninit!((node) SomethingInAList) {
        SomethingInAList {
            node: Uninit::new(),
            data,
        }
    }
}

impl partial_uninit!((node) SomethingInAList) {
    pub fn init(self: Pin<&mut partial_uninit!((node) init(node) SomethingInAList)>) {
        unsafe {
            // SAFETY: We are pinned, which means node is also pinned.
            let mut node = Pin::map_unchecked_mut(self, |s| &mut s.node);
            node.next = &mut *node as *const DoubleListNode as *const ();
            node.prev = &mut *node as *const DoubleListNode as *const ();
        }
    }
}

fn main() {
    let sial = Arc::pin(Mutex::new(SomethingInAList::new(42)));
    unsafe {
        // SAFETY: sial is pinned in the Arc and we never misuse the Mutex (the mutex api is very
        // bad for this purpose, but imagine if it would play nicely here [we would not need this
        // unsafe])
        Pin::new_unchecked(&mut *sial.lock().unwrap()).init();
    }
    let _: Pin<Arc<Mutex<SomethingInAList>>> = sial;
}

Benefits of this approach:

requires no new syntax.
makes simple partial initialization safe and easy.
gives more complex tools to those who have more complicated paths for uninitialized data.
having a compiler supported way of partial initialization avoids having to write unsafe constructor functions with the invariant, that another init function needs to be called before the use of the returned type.
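To make the last point concrete, here is a sketch of the pattern in use today: an unsafe constructor whose contract says that a separate init step must run before the value is used. The names `Node`, `new_unlinked`, `init` and `is_self_linked` are made up for this illustration and do not come from any existing crate.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::ptr;

/// A self-referential list head: after `init`, `next` and `prev` point at the node itself.
struct Node {
    next: *const Node,
    prev: *const Node,
    data: u64,
    _pin: PhantomPinned,
}

impl Node {
    /// # Safety
    /// The caller must call `init` on the returned node before treating it
    /// as a list head. Nothing in the type records whether that happened.
    unsafe fn new_unlinked(data: u64) -> Pin<Box<Node>> {
        Box::pin(Node {
            next: ptr::null(),
            prev: ptr::null(),
            data,
            _pin: PhantomPinned,
        })
    }

    /// Second initialization phase: links the node to itself.
    fn init(self: Pin<&mut Self>) {
        // SAFETY: we only assign fields and never move the node out of its pinned location.
        let this = unsafe { self.get_unchecked_mut() };
        let addr: *const Node = &*this;
        this.next = addr;
        this.prev = addr;
    }

    fn is_self_linked(&self) -> bool {
        let addr: *const Node = self;
        self.next == addr && self.prev == addr
    }

    fn data(&self) -> u64 {
        self.data
    }
}
```

The before and after states share the type `Pin<Box<Node>>`, so the "call `init` first" rule lives only in documentation.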

Problems to tackle:

how does the compiler do its builtin magic? I do not know if it even is possible to achieve the described type coercion of pointers mid-function. And what parts of the compiler would need changing.
would it be beneficial/easy to try to generalize the required properties of this feature to provide support for even more compiler based checking? (for example tagging raw pointers with a designated cleanup strategy, so C FFI writers would be able to know which pointers need to be freed and which would need Box::from_raw etc.) in the context of this issue i think speed of implementation matters more than immediate generalization, so separating those two would be a good idea, if the general approach is needed and feasible.
better syntax for partial_uninit!
improving names
is FullyInit a sound concept? because it is rather arbitrary that types like Rc and Arc need to handle these, because only in combination with RefCell/Mutex unsoundness arises.
FullyInit makes Arc's too restrictive, because you cannot store an Arc<!FullyInit> even if you cannot get a &mut inner reference (e.g. your application stores a partially initialized struct in some state accessible by multiple threads for some time of your program [because they need to read it], at some point you take control of all the Arc's, drop all but one and then call into_inner after which you fully initialize your struct and use it. with the current implementation this would result in a compile Error even though it would actually be sound^[2])
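The scenario in the last item can be sketched with today's types. This is only an approximation: `PartialFoo` uses `Option<String>` as a safe stand-in for an uninitialized field, and `share_then_finish` is a made-up name. Under the proposal as written, cloning the `Arc` here would be rejected even though no `&mut` to the inner value is ever produced while it is shared.

```rust
use std::sync::Arc;
use std::thread;

/// Stand-in for a partially initialized `Foo`: `b` is not set yet.
struct PartialFoo {
    a: u64,
    b: Option<String>,
}

struct Foo {
    a: u64,
    b: String,
}

/// Shares the partial value read-only across threads, then reclaims sole
/// ownership and finishes initialization.
fn share_then_finish() -> Foo {
    let shared = Arc::new(PartialFoo { a: 7, b: None });

    // Several readers only look at the already initialized field.
    let readers: Vec<_> = (0..4)
        .map(|_| {
            let handle = Arc::clone(&shared);
            thread::spawn(move || handle.a)
        })
        .collect();
    for reader in readers {
        assert_eq!(reader.join().unwrap(), 7);
    }

    // All clones were dropped when the threads finished, so unwrapping succeeds.
    let partial = Arc::try_unwrap(shared)
        .ok()
        .expect("all other handles were dropped");

    Foo {
        a: partial.a,
        b: partial.b.unwrap_or_else(|| "hello world".to_owned()),
    }
}
```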

I did not mark this as a pre-RFC because I have never written a pre-RFC and thus know very little about the process. If no big issues arise and most of the problems can be formalized, then I would gladly try to turn this into my first RFC.

The documentation on these elements is nowhere near sufficient/expressive; I added it to explain each of the new items in stdlib. ↩︎
You could fix this using an additional trait implemented for Mutex<T>/RefCell<T> (probably UnsafeCell<T>) where T: !FullyInit and then disallow clone for that trait, but this seems like a workaround with a high chance of unsoundness, so I think having a tighter restriction at the beginning is safer. ↩︎

ekuber · April 8, 2022, 7:56pm

I explored how to expose this functionality in a relatively safe manner in a crate:

It acts as a regular proc-macro providing a statically checked builder, but it also exposes the underlying MaybeUninit<T> in order to manually set bytes, if needed.
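As a rough illustration of what "statically checked builder" means, here is a hand-written typestate builder. It is not the code the crate generates: this sketch stores fields in `Option` to stay entirely safe, and the names `FooBuilder`, `Set` and `Unset` are assumptions made for the example.

```rust
use std::marker::PhantomData;

struct Foo {
    a: u64,
    b: String,
}

/// Marker types recording, at the type level, whether a field was set.
struct Set;
struct Unset;

struct FooBuilder<A, B> {
    a: Option<u64>,
    b: Option<String>,
    _state: PhantomData<(A, B)>,
}

impl FooBuilder<Unset, Unset> {
    fn new() -> Self {
        FooBuilder { a: None, b: None, _state: PhantomData }
    }
}

impl<B> FooBuilder<Unset, B> {
    /// Only callable while `a` is unset; the return type records that it is now set.
    fn set_a(self, a: u64) -> FooBuilder<Set, B> {
        FooBuilder { a: Some(a), b: self.b, _state: PhantomData }
    }
}

impl<A> FooBuilder<A, Unset> {
    fn set_b(self, b: String) -> FooBuilder<A, Set> {
        FooBuilder { a: self.a, b: Some(b), _state: PhantomData }
    }
}

impl FooBuilder<Set, Set> {
    /// `build` exists only once both fields are set, so the `expect`s cannot fail.
    fn build(self) -> Foo {
        Foo {
            a: self.a.expect("type state guarantees `a` is set"),
            b: self.b.expect("type state guarantees `b` is set"),
        }
    }
}
```

Calling `build` before both setters, or calling a setter twice, is a compile error because no such method exists for that type state.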

scottmcm · April 8, 2022, 8:10pm

I think this amounts to inter-function typestate?

My first instinct here is that this is lots of new compiler magic. And I think it's effectively new syntax -- it might look like a macro, but as far as I know the macro in something like

fn init_foo1(mut foo: partial_uninit!((b) Foo)) -> Foo {

can't expand to anything that currently exists. (After all, such a macro can't know anything about Foo because such information is not available at macro expansion time.)

The lang team is also skeptical of new auto traits -- for example, I might have written a custom Rc type, but it sounds like this proposal would make it UB because I'm not guarding my clone by T: FullyInit. And making existing sound code unsound isn't something we generally want to do. Or, worse, I can call Arc::clone for any T today, so adding a new bound to it would be a breaking change, and thus is a non-starter.

That said, I know libs-api is interested in making it easier to work with uninit stuff. One thing that came up recently is that things like Arc are growing a longer and longer list of APIs like new_uninit and new_uninit_slice and new_zeroed and ..., so anyone interested in making a plan for a more holistic way to do things like that across Rc/Arc/Box/more would be welcome.

My suggestion for now would be to start with seeing what can happen at the library level, and see how far you can get. (Probably outside std initially, for the fastest iteration speed.) For example, I suspect that "create a second type that wraps all the fields in MaybeUninit<...>" would be possible in a derive macro, and that's likely fine -- making this opt-in for types would be fine, since not every type needs or wants it. (That doesn't do the typestate parts, but could be a useful primitive underneath it. And that derive could also fail for things that aren't repr(C), since you're clearly wanting transmute through this, but that's UB for most types. Etc.)
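A hand-written sketch of the "second type that wraps all the fields in MaybeUninit<...>" idea might look like the following. The name `FooUninit` and its methods are hypothetical; a real derive could generate something quite different. This version moves the fields out one by one instead of transmuting, so it does not rely on the two structs having the same layout.

```rust
use std::mem::MaybeUninit;

struct Foo {
    a: u64,
    b: String,
}

/// Hypothetical output of a derive on `Foo`: same fields, each wrapped in `MaybeUninit`.
struct FooUninit {
    a: MaybeUninit<u64>,
    b: MaybeUninit<String>,
}

impl FooUninit {
    fn new() -> Self {
        FooUninit {
            a: MaybeUninit::uninit(),
            b: MaybeUninit::uninit(),
        }
    }

    /// # Safety
    /// Every field must have been written before calling this.
    unsafe fn assume_init(self) -> Foo {
        Foo {
            // SAFETY: guaranteed by the caller (see the function contract).
            a: unsafe { self.a.assume_init() },
            b: unsafe { self.b.assume_init() },
        }
    }
}
```

As noted above, this provides only the primitive: tracking which fields are set would still need a typestate layer on top, and a `FooUninit` dropped after a partial write leaks the written fields.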

y86-dev · April 8, 2022, 8:22pm

I see, that crate looks really useful and awesome in situations where you own the data. However when dealing with pinned self referential structs in reference counted wrappers (or when interfacing with existing types, where all you have is a pinned pointer), you cannot use the builder pattern:

use makeit::*;
use std::{pin::*, sync::*};

/// Binding to some C struct:
#[derive(Builder)]
#[repr(C)]
pub struct DoubleListNode {
    next: *const (),
    prev: *const (),
}

#[derive(Builder)]
#[repr(C)]
pub struct SomethingInAList {
    node: DoubleListNode,
    data: u64,
}

fn main() {
    let sial = Arc::pin(Mutex::new(SomethingInAList::builder().set_data(42)));
    unsafe {
        // SAFETY: sial is pinned in the Arc and we never misuse the Mutex (the mutex api is very
        // bad for this purpose, but imagine if it would play nicely here [we would not need this
        // unsafe])
        let guard = sial.lock().unwrap();
        let ptr = &mut *guard as *const _ as *const ();
        Pin::new_unchecked(&mut *guard).set_next(ptr);
        // ^^ compiler error here: method not found for Pin<&mut SomethingInAListBuilder<NodeUnset, DataSet>> 
        Pin::new_unchecked(&mut *guard).set_prev(ptr);
        // same problem here
    }
    let _: Pin<Arc<Mutex<SomethingInAList>>> = sial;
    // ^^ also type error here, because sial is not completely initialized. 
}

I think your crate solves the issue of partial initialization of owned data perfectly, sadly not only that kind of partial initialization exists.

y86-dev · April 8, 2022, 9:21pm

Yes inter-function typestate is a good way to describe this, maybe that would be a possible generalization.

It definitely is a lot of compiler magic, but I feel like Rust already provides a lot of compiler magic (e.g. borrow checker, futures, matching, ZST etc.), so I feel like it would be the language that could do this right. The inter-function typestate could also be rather restrictive at the beginning. Partial initialization (PI) just does not feel ergonomic at all in Rust; in C it still feels like doing normal C. But coding C also feels like balancing a running chainsaw on a unicycle while blindfolded. PI is nothing that should require unsafe in the first place (a simple let a; a = ...; deals with it, although in this case the part is already well defined for the compiler).
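The `let a; a = ...;` case mentioned above already works in safe Rust, because the compiler tracks initialization of local variables within one function. A small example (the function name `pick` is arbitrary):

```rust
/// Deferred initialization of a local: the compiler checks that `label`
/// is assigned exactly once on every path before it is read.
fn pick(flag: bool) -> String {
    let label; // declared, but not yet initialized
    if flag {
        label = "yes".to_owned();
    } else {
        label = "no".to_owned();
    }
    // Removing either assignment above makes this line a compile error.
    label
}
```

This tracking stops at function boundaries and does not extend to individual struct fields behind a pointer, which is the part the proposal is about.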

scottmcm:

And I think it's effectively new syntax -- it might look like a macro, but as far as I know the macro in something like
fn init_foo1(mut foo: partial_uninit!((b) Foo)) -> Foo {
can't expand to anything that currently exists. (After all, such a macro can't know anything about Foo because such information is not available at macro expansion time.)

With 'no new syntax' i mean it would not require authors of proc-macro crates and their dependencies (syn, quote etc.) to accommodate the new syntax. If it were some syntax like Foo(<initialized stuff here>) suggested by a prior attempt that would require some bigger changes in syn/quote and proc-macros dealing with types would have to account for that. Of course it will not be a macro like any other macro, but it would still expand to some type at macro evaluation time, the compiler just has full control over that type and allows for this type to be handled specially (automatic coercion etc.).

Yes this proposal would unfortunately make your custom Rc produce unsoundness^[1]. And you are also right about the breakage of a function using Arc::clone with a generic T, that seems to have slipped by my write-up... I do not yet see an easy way out, implicitly expecting T: FullyInit (similar to Sized) might work to prevent the {Arc,Rc}::clone and then one would only need to add T: ?FullyInit to functions supporting PI types. I think that this way existing code should still work, but smart pointers and other wrappers would need the additional bound relaxation. Is there another reason why the lang team is skeptical of new auto traits or is it to keep down the amount of auto trait clutter?

My main problem with the current APIs is the use of MaybeUninit<T>, it is reasonably fine when writing new software, but when you want interoperability with C and you do not want your safe wrapper to contain MaybeUninit<T> (because then you do not really have a safe wrapper), then you are out of luck. I do not know, if it is possible, but maybe adding the same kind of coercion that i described for the partial_uninit!(...) type variants to MaybeUninit<T> would alleviate this pain, but that still would require some kind of annotation in functions taking a pointer, thus being not really different from my proposal.

Thanks for the good feedback, but as i explained in my reply to @ekuber, changing types without automatic coercion is not really a viable option, as the interfaces need to work with already pinned types.

I think it is not UB, because std::mem::forget is safe and it only would allow you to initialize the same memory multiple times without dropping the value present, this can then lead to logical errors, but not UB. ↩︎
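The claim that overwriting a value without dropping it is a leak rather than UB can be checked with a drop counter. This is only a sketch of that one point, using `ptr::write` in place of the proposed assignment to an "uninit" field; `Noisy` and `overwrite_without_drop` are illustrative names.

```rust
use std::cell::Cell;
use std::ptr;

/// Increments a shared counter every time a value is dropped.
struct Noisy<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Creates two `Noisy` values but lets only one of them run its destructor.
fn overwrite_without_drop() -> u32 {
    let drops = Cell::new(0);
    {
        let mut slot = Noisy { drops: &drops };
        // SAFETY: `&mut slot` is valid and aligned. `ptr::write` overwrites
        // the old value without dropping it, so the first `Noisy` is leaked,
        // which has the same effect as `mem::forget`.
        unsafe { ptr::write(&mut slot, Noisy { drops: &drops }) };
        // `slot` goes out of scope here and is dropped exactly once.
    }
    drops.get()
}
```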

ekuber · April 8, 2022, 10:33pm

Would providing an unsafe API to construct a Builder from an existing MaybeUninit<T> help cater to that use-case? I'm picturing a workflow of "get unsafe thinghy, put it in safe box asap". Of course, that crate will not cover every use-case: restricting what you can do is how we get security by default, and unsafe for when you really need it but be careful. With all of that, I need to think about your use-case in more detail to build a better mental model.

y86-dev · April 8, 2022, 10:46pm

After a bit of thinking, i think i have got an idea to fix this by extending the library by @ekuber. When i get some progress i will write some more, and if it works well enough, i will give ekuber a pull request.

y86-dev · April 8, 2022, 10:48pm

The problem is rather, that one creates Ref<T> where Ref is a reference counted and pinned pointer to T, all of the APIs want Ref<T> and i would need a good way to transform a Ref<TBuilder> to Ref<T>, but i think i have got an idea

y86-dev · April 9, 2022, 10:37am

I experimented a little bit and found a design that seems to work, it would need some heavy extending of @ekuber's makeit crate, but in principle it should work:

use makeit::*;
use std::marker::*;
use std::mem::*;
use std::pin::*;
use std::ptr::*;
use std::sync::*;

#[derive(Builder)]
#[with_wrapper(
    Pin<Arc<Mutex<_>>> =>
    access: .lock().unwrap();
    creation: Arc::pin(Mutex::new(_));
)]
#[repr(C)]
struct ListNode {
    next: *mut ListNode,
    prev: *mut ListNode,
}

Expanded, this code would look something like this (I also included the code currently generated by makeit):

#![allow(dead_code)]
#![deny(unsafe_op_in_unsafe_fn)]

use makeit::*;
use std::marker::*;
use std::mem::*;
use std::pin::*;
use std::ptr::*;
use std::sync::*;

#[repr(C)]
struct ListNode {
    next: *mut ListNode,
    prev: *mut ListNode,
}

#[allow(non_snake_case)]
#[deny(unused_must_use, clippy::pedantic)]
mod ListNodeBuilderFields {
    use super::*;
    #[must_use]
    #[repr(transparent)]
    pub(super) struct ListNodeBuilder<FieldNext, FieldPrev> {
        inner: ::core::mem::MaybeUninit<ListNode>,
        __fields: ::core::marker::PhantomData<(FieldNext, FieldPrev)>,
    }
    pub struct NextSet;
    pub struct PrevSet;
    pub struct NextUnset;
    pub struct PrevUnset;
    impl ::makeit::Buildable for ListNode {
        type Builder = ListNodeBuilder<NextUnset, PrevUnset>;
        /// Returns a builder that lets you initialize `Self` field by field in a zero-cost,
        /// type-safe manner.
        #[must_use]
        #[allow(unused_parens)]
        fn builder() -> Self::Builder {
            let builder = ListNodeBuilder {
                inner: ::core::mem::MaybeUninit::<Self>::uninit(),
                __fields: ::core::marker::PhantomData,
            };
            builder
        }
    }
    impl ListNodeBuilder<NextSet, PrevSet> {
        /// Finalize the builder.
        #[must_use]
        pub fn build(self) -> ListNode {
            unsafe { self.unsafe_build() }
        }
    }
    impl<FieldPrev> ListNodeBuilder<NextUnset, FieldPrev> {
        #[must_use]
        pub fn set_next(mut self, value: *mut ListNode) -> ListNodeBuilder<NextSet, FieldPrev> {
            self.inner_set_next(value);
            let ptr = &self as *const ListNodeBuilder<NextUnset, FieldPrev>
                as *const ListNodeBuilder<NextSet, FieldPrev>;
            ::core::mem::forget(self);
            unsafe { ptr.read() }
        }
        fn inner_set_next(&mut self, value: *mut ListNode) {
            let inner = self.inner.as_mut_ptr();
            unsafe {
                addr_of_mut!((*inner).next).write(value);
            }
        }
    }
    impl<FieldNext> ListNodeBuilder<FieldNext, PrevUnset> {
        #[must_use]
        pub fn set_prev(mut self, value: *mut ListNode) -> ListNodeBuilder<FieldNext, PrevSet> {
            self.inner_set_prev(value);
            let ptr = &self as *const ListNodeBuilder<FieldNext, PrevUnset>
                as *const ListNodeBuilder<FieldNext, PrevSet>;
            ::core::mem::forget(self);
            unsafe { ptr.read() }
        }
        fn inner_set_prev(&mut self, value: *mut ListNode) {
            let inner = self.inner.as_mut_ptr();
            unsafe {
                addr_of_mut!((*inner).prev).write(value);
            }
        }
    }
    impl<FieldNext, FieldPrev> ListNodeBuilder<FieldNext, FieldPrev> {
        /// Returns a mutable pointer to a field of the type being built. This is useful if the
        /// initialization requires subtle unsafe shenanigans. You will need to call
        /// `.unsafe_build()` after ensuring all of the fields have been initialized.
        #[must_use]
        pub unsafe fn ptr_0(&mut self) -> *mut *mut ListNode {
            let inner = self.inner.as_mut_ptr();
            addr_of_mut!((*inner).next)
        }
        /// Returns a mutable pointer to a field of the type being built. This is useful if the
        /// initialization requires subtle unsafe shenanigans. You will need to call
        /// `.unsafe_build()` after ensuring all of the fields have been initialized.
        #[must_use]
        pub unsafe fn ptr_1(&mut self) -> *mut *mut ListNode {
            let inner = self.inner.as_mut_ptr();
            addr_of_mut!((*inner).prev)
        }
        /// HERE BE DRAGONS!
        ///
        /// # Safety
        ///
        /// You're dealing with `MaybeUninit`. If you have to research what that is, you don't
        /// want this.
        #[must_use]
        pub unsafe fn maybe_uninit(self) -> ::core::mem::MaybeUninit<ListNode> {
            self.inner
        }
        /// Only call if you have set a field through their mutable pointer, instead
        /// of using the type-safe builder. It is your responsibility to ensure that
        /// all fields have been set before doing this.
        #[must_use]
        pub unsafe fn unsafe_build(self) -> ListNode {
            unsafe { self.inner.assume_init() }
        }
    }

    ///////////////////////////////////////
    //////////// New code here ////////////
    ///////////////////////////////////////

    impl ::makeit::Buildable for ListNode {
        #[must_use]
        #[allow(unused_parens)]
        fn pin_arc_mutex_builder() -> PinnedArcMutexListNodeBuilder<NextUnset, PrevUnset> {
            let builder = PinnedArcMutexListNodeBuilder {
                inner: Arc::pin(Mutex::new(<ListNode as ::makeit::Buildable>::builder())),
            };
            builder
        }
    }

    // small helper macro to help with safer transmuting, you need to own the expression $what by
    // value and the types $from and $to need to be transmutible.
    macro_rules! transmute {
        (($what:expr): $from:ty => $to:ty) => {
            match $what {
                what => {
                    let ptr = &what as *const $from as *const $to;
                    ::core::mem::forget(what);
                    unsafe { ptr.read() }
                }
            }
        };
    }

    #[must_use]
    #[repr(transparent)]
    pub(super) struct PinnedArcMutexListNodeBuilder<FieldNext, FieldPrev> {
        inner: Pin<Arc<Mutex<ListNodeBuilder<FieldNext, FieldPrev>>>>,
    }

    impl PinnedArcMutexListNodeBuilder<NextSet, PrevSet> {
        /// Finalize the builder.
        #[must_use]
        pub fn build(self) -> Pin<Arc<Mutex<ListNode>>> {
            unsafe { self.unsafe_build() }
        }
    }
    impl<FieldPrev> PinnedArcMutexListNodeBuilder<NextUnset, FieldPrev> {
        #[must_use]
        pub fn set_next(
            self,
            value: *mut ListNode,
        ) -> PinnedArcMutexListNodeBuilder<NextSet, FieldPrev> {
            self.inner.lock().unwrap().inner_set_next(value);
            transmute!((self): _ => PinnedArcMutexListNodeBuilder<NextSet, FieldPrev>)
        }
    }
    impl<FieldNext> PinnedArcMutexListNodeBuilder<FieldNext, PrevUnset> {
        #[must_use]
        pub fn set_prev(
            self,
            value: *mut ListNode,
        ) -> PinnedArcMutexListNodeBuilder<FieldNext, PrevSet> {
            self.inner.lock().unwrap().inner_set_prev(value);
            transmute!((self): _ => PinnedArcMutexListNodeBuilder<FieldNext, PrevSet>)
        }
    }

    impl<FieldNext, FieldPrev> PinnedArcMutexListNodeBuilder<FieldNext, FieldPrev> {
        /// Returns a mutable pointer to a field of the type being built. This is useful if the
        /// initialization requires subtle unsafe shenanigans. You will need to call
        /// `.unsafe_build()` after ensuring all of the fields have been initialized.
        #[must_use]
        pub unsafe fn ptr_0(&mut self) -> *mut *mut ListNode {
            unsafe { self.inner.lock().unwrap().ptr_0() }
        }
        /// Returns a mutable pointer to a field of the type being built. This is useful if the
        /// initialization requires subtle unsafe shenanigans. You will need to call
        /// `.unsafe_build()` after ensuring all of the fields have been initialized.
        #[must_use]
        pub unsafe fn ptr_1(&mut self) -> *mut *mut ListNode {
            unsafe { self.inner.lock().unwrap().ptr_1() }
        }
        /// HERE BE DRAGONS!
        ///
        /// # Safety
        ///
        /// You're dealing with `MaybeUninit`. If you have to research what that is, you don't
        /// want this.
        #[must_use]
        pub unsafe fn maybe_uninit(self) -> Pin<Arc<Mutex<::core::mem::MaybeUninit<ListNode>>>> {
            transmute!((self.inner): Pin<Arc<Mutex<ListNodeBuilder<FieldNext, FieldPrev>>>> => Pin<Arc<Mutex<MaybeUninit<ListNode>>>>)
        }
        /// Only call if you have set a field through their mutable pointer, instead
        /// of using the type-safe builder. It is your responsibility to ensure that
        /// all fields have been set before doing this.
        #[must_use]
        pub unsafe fn unsafe_build(self) -> Pin<Arc<Mutex<ListNode>>> {
            transmute!((unsafe { self.maybe_uninit() }): _ => Pin<Arc<Mutex<ListNode>>>)
        }
    }
}

this would then allow one to write

fn new_init_list() -> Pin<Arc<Mutex<ListNode>>> {
     let list = ListNode::pin_arc_mutex_builder();
     let ptr = &mut *list.lock().unwrap() as *mut _ as *mut ListNode;
     list.set_prev(ptr).set_next(ptr).build()
}

however what still seems a little bit difficult is this:

#[derive(Builder)]
#[repr(C)]
struct SomethingWithList {
    list: ListNode,
    data: *const (),
}

but i feel like, that could be solved by nested builders and some other helper functions/macros.

chrefr · April 9, 2022, 9:56pm

I think the opposite: a language that includes so many concepts (maybe even too many) should avoid introducing new ones as much as it can (I also don't think those things count as "magic" but as language features; however, producing a macro-that-isn't-really-a-macro is magic).

No, that is the point: it cannot expand to a type at macro expansion time, it cannot expand at macro expansion time at all. The compiler cannot inspect types at macro expansion time - it's not just something your macros cannot do, it's something the compiler cannot do, and changing that is not an option. It may also not be possible to deduplicate the generated types at macro expansion time. It will have to exist at least on the HIR. At this point I much prefer it to be language syntax, because this way it's at least not magical.

y86-dev:

/// additional FullyInit bound to disable cloning Arc<!FullyInit>, because 
/// this way you could get two Arc<Mutex<T>> and 
/// Arc<Mutex<partial_uninit!((...) T)>> to refer to the same data, which 
/// is not sound, because you could initialize it a second time (not dropping the
/// value present)
impl<T: ?Sized + FullyInit> Clone for Arc<T> {

The problem is not the Arc, it's the interior mutability behind it (the UnsafeCell of the Mutex). The same problem arises with Rc<RefCell<T>>, or even with &RefCell<T>. And restricting UnsafeCell's Clone impl is not possible, because this impl does not exist (and for a good reason) - and even if it did, Arc/Rc/& don't restrict their cloneability to cases where UnsafeCell<T>: Clone, because they are clonable anyway. In short, this idea is incompatible with interior mutability, and that's a huge drawback.
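A small sketch of why a plain shared reference is enough to cause the problem: two `&RefCell<_>` aliases exist at once, and a write through one is visible through the other. Here `Option<String>` stands in for a field that would be "uninit" in the proposal, and the names `init_b` and `observe` are made up for the example.

```rust
use std::cell::RefCell;

struct Foo {
    b: Option<String>, // `None` plays the role of "not yet initialized"
}

/// Initializes `b` through a shared reference, using interior mutability.
fn init_b(foo: &RefCell<Foo>) {
    foo.borrow_mut().b = Some("hello world".to_owned());
}

/// Returns whether `b` looked initialized through the second alias
/// before and after initializing it through the first one.
fn observe() -> (bool, bool) {
    let cell = RefCell::new(Foo { b: None });
    let first = &cell;
    let second = &cell; // copying a `&` needs no `Clone` bound at all

    let before = second.borrow().b.is_some();
    init_b(first);
    let after = second.borrow().b.is_some();
    (before, after)
}
```

If `first` changed its static type on initialization, `second` would still carry the old type while pointing at the same data, and no bound on `Arc` or `Rc` can rule that out.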

chrefr · April 9, 2022, 9:59pm

Thinking about it again, it's not only a problem with interior mutability: the same problem happens with &mut references. How do you inform the underlying value it was changed through a &mut reference? It's not unsound, but it's certainly a footgun.

SkiFire13 · April 10, 2022, 8:11am

The problem with &Mutex<T> (you don't even need Arc), &Cell<T>, &RefCell<T>, etc. and &mut T references is related to variance. They're invariant over T, but you're trying to propose some coercion/subtyping mechanism that goes against that. I think this could be solved by requiring that:

to go from a type (which may also be not-fully initialized) to a less initialized type, then you need covariance with respect to that type, because a more initialized type is effectively a subtype of a less initialized one;
there may be a special rule for going from a less initialized type to a more initialized type, given that you initialized it, in some cases. For example I think this would be sound for &mut T because you have exclusive access, thus nobody can uninitialize it while you expect it to be initialized, though it isn't for &Mutex<T> because it is shared, so it needs something other than variance to work.
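Rust has no subtyping between "more initialized" and "less initialized" types today, so the closest existing analogy is lifetime subtyping. The sketch below shows covariance working for `&'a str`, with a comment on where invariance rejects the same conversion; treat it as an analogy rather than a model of the proposal. The function name is arbitrary.

```rust
/// Covariance: `&'static str` is a subtype of `&'a str`, so a static string
/// can be placed next to a shorter-lived one in the same array.
fn longest_covariant<'a>(local: &'a str) -> &'a str {
    let fallback: &'static str = "static";
    // `fallback` is implicitly converted from `&'static str` to `&'a str`.
    let choices: [&'a str; 2] = [local, fallback];
    if choices[0].len() >= choices[1].len() {
        choices[0]
    } else {
        choices[1]
    }
}

// By contrast, `&mut T`, `Cell<T>`, `RefCell<T>` and `Mutex<T>` are invariant
// in `T`: a `&mut &'static str` cannot be passed where a `&mut &'a str` is
// expected, because the callee could store a shorter-lived reference in it.
// The first rule in the list above asks for the covariant behavior; the
// second describes an exception that invariance alone cannot express.
```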

The FullyInit trait is definitely a breaking change if it isn't an implicit trait bound like Sized. Note that adding an implicit trait bound is a big bar for adding a feature: it wasn't done for things like Pin or Leak, so I doubt it will be done for this proposal.

y86-dev · April 18, 2022, 10:27am

Thanks everyone for their valuable feedback.

I have been working on a library solution and wanted to finish writing the first version of it. My suggested changes to the stdlib and compiler were not feasible, but luckily, similar ergonomics at the call site can be achieved using the existing type system. With more type abstractions, simple declarations of pinned initialized types can also be written completely without unsafe code. I would really appreciate some feedback on this library, but I am unsure if this is the right place for it. Here is the GH repo: GitHub - y86-dev/pinned-init

@chrefr your arguments are right, the problem of pointer transformation is difficult to handle with &mut, because the origin of that pointer might not be aware of the initialization.

My current problems with my library are (it is not yet finished):

transmuting SomeStruct<false> to SomeStruct<true> is only sound, if both have the same layout and i was not able to find such a guarantee and i opened an issue in the UCG repo. If I want to support even more flexible initialization, it might also be a good idea to require #[repr(C)] on the structs that need to be pin initialized. This way using the initialized variant of a type would be even more ergonomic, because the type is equivalent, not just some wrapper (see StaticUninit<T, true>). But this requirement then would disallow the use of Box, Arc, Rc and many more, because they are not #[repr(C)]. It is possible to write a custom ReprC{Box/Arc/Rc}, but that feels a little hacky and might be a source for unnecessary bugs.
adding parameters to the initialization: when my proc macro implements my PinnedInit trait, it would also need to know the correct type for the parameter (which would be a new associated type). This might result in parameters like this: ((), (), ((), ((), String)), *mut i32, ()) which are very unergonomic. I might be able to create a trait which would enable conversion of (String, *mut i32) to the previous type, but I haven't tested that yet.
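The first problem can be illustrated with a const-generic marker. This is a sketch, not code from the library: `Node` and its methods are invented for the example, and `#[repr(C)]` is used precisely because it gives both instantiations a defined, identical layout, which is the guarantee in question for the default representation.

```rust
use std::mem::MaybeUninit;

/// `INIT` records in the type whether `value` has been written.
/// `#[repr(C)]` fixes the field order and layout for every `INIT`.
#[repr(C)]
struct Node<const INIT: bool> {
    value: MaybeUninit<u64>,
    id: u32,
}

impl Node<false> {
    fn new(id: u32) -> Self {
        Node { value: MaybeUninit::uninit(), id }
    }

    /// Writes the missing field and flips the type-level flag.
    fn init(mut self, value: u64) -> Node<true> {
        self.value.write(value);
        // SAFETY: both types are `#[repr(C)]` with identical field types,
        // so they have the same layout, and `value` was just initialized.
        unsafe { std::mem::transmute::<Node<false>, Node<true>>(self) }
    }
}

impl Node<true> {
    fn value(&self) -> u64 {
        // SAFETY: a `Node<true>` is only produced by `init`, which wrote `value`.
        unsafe { *self.value.assume_init_ref() }
    }

    fn id(&self) -> u32 {
        self.id
    }
}

/// Checks that both instantiations agree on size and alignment.
fn same_layout() -> bool {
    std::mem::size_of::<Node<false>>() == std::mem::size_of::<Node<true>>()
        && std::mem::align_of::<Node<false>>() == std::mem::align_of::<Node<true>>()
}
```

In this small owned case the transmute could be replaced by rebuilding the struct field by field; it matters when the value sits behind a pinned pointer and cannot be moved.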

While writing this library, i also found pin-init, another library trying to achieve the same thing, but i ended up writing my own library, because storing not initialized types is not really possible (you could store the init closure, but that requires dyn) and initialization is not checked by the type system like my crate does it.

system · July 17, 2022, 10:28am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
