`std::marker::PhantomUnsized` marker type

in library/core/src/marker.rs:

/// A marker type without a compile-time or runtime defined size.
///
/// This is distinct from other dynamically-sized types as it does not have
/// pointer metadata. References/pointers to slices (`[T]`) and trait objects
/// (`dyn Trait`) carry extra metadata (length for slices, vtable pointer for
/// trait objects).
///
/// This is useful for creating immovable structs, or structs with trailing data.
///
/// Example:
///
/// ```rust
/// pub struct MyImmovableType {
///   a: u32,
///   b: f64,
///   c: String,
///   _unsized: PhantomUnsized,
/// }
///
/// // ...
///
/// // A reference to `MyImmovableType` is "thin" like references to sized types.
/// assert_eq!(size_of::<&MyImmovableType>(), size_of::<&()>());
/// ```
pub extern type PhantomUnsized;
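
For reference, the "thin" versus "fat" distinction the docs above rely on can be observed on stable Rust today. This is only a small sketch; `pointer_widths` is a name I made up for illustration and is not part of the proposal.

```rust
use std::fmt::Debug;
use std::mem::size_of;

/// Returns the sizes of a thin reference, a slice reference and a trait
/// object reference, in that order. (Hypothetical helper for illustration.)
pub fn pointer_widths() -> (usize, usize, usize) {
    (
        // Thin: just an address.
        size_of::<&u32>(),
        // Fat: address plus element count.
        size_of::<&[u32]>(),
        // Fat: address plus vtable pointer.
        size_of::<&dyn Debug>(),
    )
}
```

A reference to a type containing `PhantomUnsized` would fall in the first category, even though the pointee has no statically known size.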

To make it possible to construct these immovable types, add the following

in library/core/src/mem/mod.rs:

/// Return the minimum size of a type in bytes.
///
/// For unsized types, this is the size of the "leading" data before the unsized
/// field. For sized types, this is equivalent to [`size_of`].
///
/// ```rust
/// pub struct MyTypeWithDST<T: ?Sized> {
///   a: u32,
///   b: f64,
///   trailer: T,
/// }
///
/// assert_eq!(min_size_of::<MyTypeWithDST<[u32]>>(), 16);
/// assert_eq!(min_size_of::<MyTypeWithDST<dyn Trait>>(), 16);
/// assert_eq!(min_size_of::<MyTypeWithDST<PhantomUnsized>>(), 16);
/// ```
pub const fn min_size_of<T: ?Sized>() -> usize;

// `min_align_of` is not needed as `align_of` already returns "the ABI-required
// minimum alignment of a type". However, the type parameter on `align_of` needs
// to be relaxed to `?Sized`. The documentation needs to be updated to specify
// the behavior of `dyn Trait` since the actual alignment is defined at runtime.

// The trait bound on `size_of_val` must be changed from `T: ?Sized` to
// `T: MetaSized` so `size_of_val` on a `PhantomUnsized` results in a compiler error.
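
As a rough approximation of what `min_size_of` and a relaxed `align_of` would report, here is a sketch using only stable APIs. It assumes a `#[repr(C)]` struct with a zero-length array standing in for the unsized trailer; `Leading`, `leading_size` and `runtime_align` are made-up names, not proposed items.

```rust
use std::fmt::Debug;
use std::mem::{align_of_val, offset_of};

/// Stand-in for `MyTypeWithDST<[u32]>`: the zero-length array marks where the
/// trailing data would begin. (Hypothetical type for illustration.)
#[repr(C)]
pub struct Leading {
    pub a: u32,
    pub b: f64,
    pub trailer: [u32; 0],
}

/// Size of the "leading" data, i.e. the offset of the trailer.
/// With `#[repr(C)]` this is 16 on targets where `f64` is 8-byte aligned.
pub const fn leading_size() -> usize {
    offset_of!(Leading, trailer)
}

/// For `dyn Trait` the alignment is only known at runtime; `align_of_val`
/// already reads it from the vtable.
pub fn runtime_align(value: &dyn Debug) -> usize {
    align_of_val(value)
}
```

Note that `offset_of!` cannot be applied to an unsized field itself, which is why the sketch uses a sized placeholder.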

This replaces extern type; developers would use a newtype instead:

pub struct MyExternType(PhantomUnsized);
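
For comparison, this is roughly the opaque-type idiom people use on stable today in place of extern types (it follows the pattern described in the Rustonomicon's FFI chapter). `handle_is_thin` is a made-up helper for illustration.

```rust
use std::marker::{PhantomData, PhantomPinned};
use std::mem::size_of;

/// Opaque type that is only ever used behind a pointer.
#[repr(C)]
pub struct MyExternType {
    _data: [u8; 0],
    // Opts out of `Send`, `Sync` and `Unpin`.
    _marker: PhantomData<(*mut u8, PhantomPinned)>,
}

/// References to the opaque type are thin, like references to sized types.
pub fn handle_is_thin() -> bool {
    size_of::<&MyExternType>() == size_of::<&()>()
}
```

The weakness of this idiom is that `size_of::<MyExternType>()` compiles and returns 0, whereas with `PhantomUnsized` the type would have no statically known size at all.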

Attempting to "move" a runtime-unsized type with std::ptr::copy_nonoverlapping is generally undefined behavior except when library invariants are satisfied. For example, a "packet" type with a size field and trailing bytes cannot be moved safely by another library because its invariants (presence of trailing bytes within its allocation) may be violated.
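
To make the "packet" example concrete, here is a minimal sketch of such a type on stable Rust. Everything in it (`Packet`, `encode`, `payload`, `roundtrip`) is a hypothetical illustration. It goes through raw pointers on purpose, because a `&Packet` today only covers the one-byte header.

```rust
use std::ptr::addr_of;

/// Header of a packet whose payload bytes follow it in the same allocation.
#[repr(C)]
pub struct Packet {
    len: u8,
    data: [u8; 0],
}

/// Serializes a length-prefixed packet into a byte buffer.
pub fn encode(bytes: &[u8]) -> Vec<u8> {
    let len = u8::try_from(bytes.len()).expect("payload longer than 255 bytes");
    let mut buffer = vec![len];
    buffer.extend_from_slice(bytes);
    buffer
}

/// Returns the trailing payload.
///
/// # Safety
/// `packet` must point to a header followed by `len` readable bytes inside
/// the same allocation, all valid for `'a`.
pub unsafe fn payload<'a>(packet: *const Packet) -> &'a [u8] {
    unsafe {
        let len = usize::from((*packet).len);
        let data = addr_of!((*packet).data).cast::<u8>();
        std::slice::from_raw_parts(data, len)
    }
}

/// Encodes `bytes`, then reads them back through the packet header.
pub fn roundtrip(bytes: &[u8]) -> Vec<u8> {
    let buffer = encode(bytes);
    let packet = buffer.as_ptr().cast::<Packet>();
    // SAFETY: `buffer` holds the header followed by exactly `len` bytes.
    let bytes = unsafe { payload(packet) };
    bytes.to_vec()
}
```

Copying only `size_of::<Packet>()` bytes (one byte here) to a new location would keep `len` but lose the payload, which is exactly the invariant violation described above.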

References to runtime-unsized types will not have the dereferenceable LLVM attribute (or maybe they could if it isn't UB to read past the end of a dereferenceable ptr).

Isn't this just extern types? Also, you didn't provide the most important part: the motivation.

I’m assuming that the motivation is “provide the behavior of extern type with a Rustier syntax”.

Yes, but it doesn't expose new syntax (the change to core::marker uses an extern type, but part of the intent is to keep that syntax perma-unstable).

Having PhantomUnsized also documents the existence and behavior of this a bit better than stuffing it under the type keyword in the Rust documentation. And imo it makes sense for it to be a marker type.

As an extension, PhantomUnsized could gain a type parameter specifying the pointer metadata. The type must be Copy and have the same layout as usize (for now, assuming the compiler can't take types larger/smaller than usize).

For example you could define your own 2D plane type:

// NOTE: This would not be compatible with the current `usize` restriction.
// However, I'm assuming this restriction would exist because the compiler
// assumes pointer metadata has the same layout as a pointer.
#[derive(Clone, Copy)]
struct PlaneMeta {
  width: u32,
  height: u32,
}

struct Plane<T> {
  // Stride is stored in the Plane header because it is the same across all
  // pointers/references.
  stride: u32,
  // :)
  _align_hack: [T; 0],
  _phantom: PhantomData<T>,
  _unsized: PhantomUnsized<PlaneMeta>,
}

// `&Plane<u8>` now works as a 2D slice. This can also be extended to any
// number of dimensions.
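
For contrast, this is a sketch of what a 2D view has to look like on stable today, where the dimensions cannot live in the pointer metadata. `PlaneRef` and its methods are made-up names, not part of the proposal.

```rust
/// Borrowed 2D view over a flat slice. Width, height and stride are carried
/// in the view struct because a `&[T]` can only store a length.
pub struct PlaneRef<'a, T> {
    data: &'a [T],
    width: u32,
    height: u32,
    stride: u32,
}

impl<'a, T> PlaneRef<'a, T> {
    /// Returns `None` if the rows do not fit in `data`.
    pub fn new(data: &'a [T], width: u32, height: u32, stride: u32) -> Option<Self> {
        if width > stride {
            return None;
        }
        let needed = (stride as usize).checked_mul(height as usize)?;
        (data.len() >= needed).then_some(Self { data, width, height, stride })
    }

    /// Element at column `x`, row `y`, or `None` when out of bounds.
    pub fn get(&self, x: u32, y: u32) -> Option<&'a T> {
        if x >= self.width || y >= self.height {
            return None;
        }
        self.data.get(y as usize * self.stride as usize + x as usize)
    }
}
```

With `PhantomUnsized<PlaneMeta>`, the same information would travel inside an ordinary `&Plane<T>` instead of a separate wrapper type.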

Libraries could opt in to MetaSized with a new trait PointerLayout (probably under std::ops).

NOTE: I'm not sure if this should be implemented on the pointer metadata or the pointee.

NOTE 2: size_of_val_raw may not interact with this well as PointerLayout implementations can assume their pointers are dereferenceable. However the docs say that "this function is only safe to call" if the metadata is from a slice or trait object. For now I've added a const generic arg specifying whether the call is from *_of_val (using references) or *_of_val_raw (using ptrs).

in library/core/src/ops/mod.rs:

pub trait PointerLayout<T: ?Sized + Pointee<Metadata = Self>> {
  unsafe fn layout_of<const RAW: bool>(ptr: *const T) -> Layout;
}
// This implementation makes `Plane<T>` `MetaSized`.
impl<T> std::ops::PointerLayout<Plane<T>> for PlaneMeta {
  unsafe fn layout_of<const RAW: bool>(ptr: *const Plane<T>) -> Layout {
    // Panicking is free because of the const generic, but implementations do not
    // have to panic because `*_of_val_raw` is allowed to invoke UB.
    if RAW {
      panic!("raw plane pointers are not supported: the pointer must be dereferenceable to compute the layout");
    }
    let stride = unsafe { (&raw const (*ptr).stride).read() };
    let meta = std::ptr::metadata(ptr);
    // It is undefined behavior to construct invalid planes with sizes that
    // overflow `usize` so we are allowed to invoke UB "for speed".
    let trailing_len = unsafe { (stride as usize).unchecked_mul(meta.height as usize) };
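    let layout: Result<Layout, LayoutError> = try {
      Layout::new_minimum::<Plane<T>>()
        .extend(Layout::array::<T>(trailing_len)?)?.0
        // Match the padded layout used when allocating.
        .pad_to_align()
    };
    // SAFETY: a valid plane always has a representable layout.
    unsafe { layout.unwrap_unchecked() }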
  }
}
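
The layout arithmetic in `layout_of` can be tried out on stable with a sized stand-in for the header. In this sketch `PlaneHeader` and `plane_layout` are hypothetical names, and `Layout::new::<PlaneHeader>()` plays the role of the proposed `Layout::new_minimum::<Plane<T>>()`. It uses checked arithmetic rather than the unchecked variant above.

```rust
use std::alloc::Layout;

/// Sized stand-in for the leading part of `Plane<T>`.
#[repr(C)]
pub struct PlaneHeader {
    pub stride: u32,
}

/// Layout of a header followed by `stride * height` elements of `T`, plus the
/// byte offset of the first element. Returns `None` on overflow.
pub fn plane_layout<T>(stride: u32, height: u32) -> Option<(Layout, usize)> {
    let len = (stride as usize).checked_mul(height as usize)?;
    let items = Layout::array::<T>(len).ok()?;
    // `extend` inserts padding so the elements are properly aligned.
    let (layout, offset) = Layout::new::<PlaneHeader>().extend(items).ok()?;
    // Round the size up to the alignment, as for ordinary Rust types.
    Some((layout.pad_to_align(), offset))
}
```

The offset depends on the alignment of `T`, which is presumably what the `_align_hack: [T; 0]` field in `Plane<T>` is there to guarantee.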

This would then allow for creating boxed planes (with some unsafe code):

pub fn new_boxed(width: u32, height: u32, item: T) -> Box<Self>
  where T: Copy
{
  let Some(trailing_len) = (width as usize).checked_mul(height as usize) else {
    panic!("plane is too large to allocate ({width}x{height})")
  };
  let offset;
  let layout: Result<Layout, LayoutError> = try {
    let layout;
    (layout, offset) = Layout::new_minimum::<Plane<T>>()
      .extend(Layout::array::<T>(trailing_len)?)?;
    layout.pad_to_align()
  };
  let Ok(layout) = layout else {
    panic!("plane is too large to allocate ({width}x{height})")
  };

  // Assume `Box::new_uninit_unsized_layout` exists (I wish it did :( )
  let mut this = Box::new_uninit_unsized_layout::<Self>(
    layout,
    // Metadata needs to be passed when creating unsized types.
    // Alternatively this could always return `Box<[MaybeUninit<u8>]>`.
    PlaneMeta { width, height }
  );
  unsafe {
    (&raw mut (*this.as_mut_ptr()).stride).write(width);
  }
  let items = unsafe {
    std::slice::from_raw_parts_mut(
      this.as_mut_ptr()
        .byte_add(offset)
        .cast::<MaybeUninit<T>>(),
      trailing_len,
    )
  };
  items.fill(MaybeUninit::new(item));
  this.assume_init()
}
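
Since `Box::new_uninit_unsized_layout` does not exist, here is a sketch of the same allocation done by hand with `std::alloc` on stable. `PlaneBox` and `Header` are made-up names. Unlike the proposal, the handle has to remember the layout and offset itself and keeps the height in the header rather than in pointer metadata.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::marker::PhantomData;
use std::ptr::NonNull;

#[repr(C)]
struct Header {
    stride: u32,
    height: u32,
}

/// Owning handle to a header followed by `stride * height` elements.
pub struct PlaneBox<T: Copy> {
    ptr: NonNull<Header>,
    layout: Layout,
    offset: usize,
    _marker: PhantomData<T>,
}

impl<T: Copy> PlaneBox<T> {
    pub fn new(width: u32, height: u32, item: T) -> Self {
        let len = (width as usize).checked_mul(height as usize).expect("plane is too large");
        let items = Layout::array::<T>(len).expect("plane is too large");
        let (layout, offset) = Layout::new::<Header>().extend(items).expect("plane is too large");
        let layout = layout.pad_to_align();
        // SAFETY: the layout is never zero-sized because it contains the header.
        let raw = unsafe { alloc(layout) };
        let Some(ptr) = NonNull::new(raw.cast::<Header>()) else {
            handle_alloc_error(layout)
        };
        // SAFETY: the allocation holds the header at offset 0 and `len`
        // properly aligned elements starting at `offset`.
        unsafe {
            ptr.as_ptr().write(Header { stride: width, height });
            let first = raw.add(offset).cast::<T>();
            for i in 0..len {
                first.add(i).write(item);
            }
        }
        Self { ptr, layout, offset, _marker: PhantomData }
    }

    /// Element at column `x`, row `y`, or `None` when out of bounds.
    pub fn get(&self, x: u32, y: u32) -> Option<T> {
        // SAFETY: the header was initialized in `new`.
        let header = unsafe { self.ptr.as_ref() };
        if x >= header.stride || y >= header.height {
            return None;
        }
        let index = y as usize * header.stride as usize + x as usize;
        let first = unsafe { self.ptr.as_ptr().cast::<u8>().add(self.offset).cast::<T>() };
        // SAFETY: `index` is below `stride * height`, all initialized in `new`.
        Some(unsafe { first.add(index).read() })
    }
}

impl<T: Copy> Drop for PlaneBox<T> {
    fn drop(&mut self) {
        // `T: Copy` means the elements need no destructor; just free the block.
        unsafe { dealloc(self.ptr.as_ptr().cast::<u8>(), self.layout) };
    }
}
```

With a `PointerLayout` implementation, `Box<Plane<T>>` could recover the layout for deallocation from the pointer itself, so the extra `layout` and `offset` fields would not be needed.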

I've omitted the Drop impl for Plane. It is trivial to write (it just drops all elements in place).

But it is neither nicer, nor does it help resolve the problems with extern types that currently prevent their stabilization, so I don't see a benefit in this.