Summary
Allow Rust code to define dynamically sized types with custom thick (and thin)
pointers, and define slice functions in terms of these, instead of transmute.
Also, convert the CStr type to use this functionality,
and make it a thin pointer; this will allow use with FFI.
Motivation
As of right now, the lack of custom DSTs in Rust means that we can’t communicate
with C in important ways: we lack the ability to define a CStr in the
language such that &CStr is compatible with char const *,
and we lack the ability to communicate nicely with C code that uses
Flexible Array Members.
This RFC attempts to fix this, as well as introduce more correctness to existing practices.
Apart from FFI, it also has use cases for indexing and slicing 2-d arrays.
Guide-level explanation
There’s a new language trait in the standard library, under std::ops:
unsafe trait DynamicallySized {
type Metadata: 'static + Copy;
fn size_of_val(&self) -> usize;
fn align_of_val(&self) -> usize;
}
with an automatic implementation for all Sized types:
unsafe impl<T> DynamicallySized for T {
type Metadata = ();
fn size_of_val(&self) -> usize { size_of::<T>() }
fn align_of_val(&self) -> usize { align_of::<T>() }
}
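The shape of this trait can be approximated on stable Rust today as an ordinary user-defined trait. The sketch below is only an illustration: `DynSizedSketch` and its `sketch_*` methods are hypothetical names, the compiler does not consult them, and the slice impl with `Metadata = usize` is an assumption about how a thick-pointer type might look rather than something this RFC specifies.

```rust
use std::mem;

/// Hypothetical user-space mirror of the proposed trait (not a language item).
unsafe trait DynSizedSketch {
    type Metadata: 'static + Copy;
    fn sketch_size_of_val(&self) -> usize;
    fn sketch_align_of_val(&self) -> usize;
}

// Blanket impl: `T` is implicitly `Sized` here, so the metadata is `()`.
unsafe impl<T> DynSizedSketch for T {
    type Metadata = ();
    fn sketch_size_of_val(&self) -> usize {
        mem::size_of::<T>()
    }
    fn sketch_align_of_val(&self) -> usize {
        mem::align_of::<T>()
    }
}

// `[T]` is not `Sized`, so this impl does not overlap with the blanket one.
unsafe impl<T> DynSizedSketch for [T] {
    type Metadata = usize; // element count, as in today's slice pointers
    fn sketch_size_of_val(&self) -> usize {
        self.len() * mem::size_of::<T>()
    }
    fn sketch_align_of_val(&self) -> usize {
        mem::align_of::<T>()
    }
}

/// Size and alignment of a sized value, via the blanket impl.
fn sized_example() -> (usize, usize) {
    let value = 7u64;
    (
        <u64 as DynSizedSketch>::sketch_size_of_val(&value),
        <u64 as DynSizedSketch>::sketch_align_of_val(&value),
    )
}

/// Size and alignment of a three-element `[u16]`, via the slice impl.
fn slice_example() -> (usize, usize) {
    let array = [1u16, 2, 3];
    let slice: &[u16] = &array;
    (
        <[u16] as DynSizedSketch>::sketch_size_of_val(slice),
        <[u16] as DynSizedSketch>::sketch_align_of_val(slice),
    )
}
```

The two impls coexist for the same reason `ToOwned` can be implemented both for every `T: Clone` and for `str`: the blanket impl only covers `Sized` types.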
If you have a type which you would like to be unsized, you can implement this trait for your type!
#[repr(C)]
struct CStr([c_char; 0]);
unsafe impl DynamicallySized for CStr {
type Metadata = ();
fn size_of_val(&self) -> usize { unsafe { strlen(&self.0 as *const c_char) } } // strlen is an unsafe C function
fn align_of_val(&self) -> usize { 1 }
}
and automatically, your type will not implement Sized.
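To make the length computation concrete, here is a small stable-Rust sketch of what a `strlen`-style scan does, compared against today's `std::ffi::CStr`. The helper names `nul_terminated_len` and `lengths` are hypothetical. Note that a scan like this counts the bytes before the terminator, so it is one less than the number of bytes the string occupies in memory; which of the two a `size_of_val` implementation should report is a design question this sketch does not settle.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Counts the bytes before the first NUL, like C's `strlen`.
///
/// # Safety
/// `ptr` must point to a readable, NUL-terminated byte sequence.
unsafe fn nul_terminated_len(ptr: *const c_char) -> usize {
    let mut len = 0;
    // Walk forward one byte at a time until the terminator is found.
    while unsafe { *ptr.add(len) } != 0 {
        len += 1;
    }
    len
}

/// Returns (scanned length, length including the terminator).
fn lengths(s: &CStr) -> (usize, usize) {
    // SAFETY: a `CStr` always ends with a NUL byte.
    let scanned = unsafe { nul_terminated_len(s.as_ptr()) };
    (scanned, s.to_bytes_with_nul().len())
}
```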
The existing DynamicallySized types will continue to work;
if one writes a DynamicallySized type T,
and then wraps T into a struct, they’ll get the obvious semantics.
struct Foo {
x: usize,
y: CStr,
}
// size_of_val(&foo) returns size_of::<usize>() + size_of_val(&foo.y)
// same with align_of_val
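For comparison, this is how the same wrapping behaves for an existing DST on stable Rust. The `Wrapper` type is a hypothetical stand-in for `Foo`, with a slice tail instead of `CStr`. One detail worth noting: today's `size_of_val` also rounds the total up to the struct's alignment, so the plain sum in the comment above is best read as eliding trailing padding (an assumption on my part, since the text does not say).

```rust
use std::mem::{align_of_val, size_of, size_of_val};

/// Hypothetical wrapper with a possibly unsized last field.
#[repr(C)]
struct Wrapper<T: ?Sized> {
    x: usize,
    tail: T,
}

/// Returns (naive sum of field sizes, actual size, actual alignment).
fn wrapped_slice_sizes() -> (usize, usize, usize) {
    let sized = Wrapper { x: 1, tail: [0u8; 3] };
    // Unsizing coercion: `&Wrapper<[u8; 3]>` becomes `&Wrapper<[u8]>`.
    let unsized_ref: &Wrapper<[u8]> = &sized;
    let naive = size_of::<usize>() + size_of_val(&unsized_ref.tail);
    (naive, size_of_val(unsized_ref), align_of_val(unsized_ref))
}
```

On a typical 64-bit target the naive sum is 11 bytes while the reported size is 16, because the struct is aligned like `usize`.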
More Examples
Non-trivial types
For non-trivial types (i.e., those that have a destructor), Rust generates the obvious destructor
from the definition of the type itself - i.e., if you hold a Vec<T> in your type, Rust will destroy it.
However, if your type contains additional data that Rust doesn’t know about, you’ll have to destroy it yourself.
#[repr(C)] // we need this to be laid out linearly
struct InlineVec<T> {
capacity: usize,
len: usize,
buffer: [T; 0], // for offset, alignment, and dropck
}
unsafe impl<T> DynamicallySized for InlineVec<T> {
type Metadata = ();
fn size_of_val(&self) -> usize {
Self::full_size(self.capacity)
}
fn align_of_val(&self) -> usize {
Self::full_align()
}
}
impl<T> Drop for InlineVec<T> {
fn drop(&mut self) {
// drop_in_place lives in std::ptr and is unsafe to call
unsafe { std::ptr::drop_in_place(self.as_mut_slice()) };
}
}
impl<T> InlineVec<T> {
// internal
fn full_size(cap: usize) -> usize {
std::mem::size_of_header::<Self>() + cap * std::mem::size_of::<T>()
}
fn full_align() -> usize {
std::mem::align_of_header::<Self>().max(std::mem::align_of::<T>())
}
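pub fn new(cap: usize) -> Box<Self> {
let size = Self::full_size(cap);
let align = Self::full_align();
let layout = std::alloc::Layout::from_size_align(size, align).unwrap();
// allocating, writing through the raw pointer, and Box::from_raw are all unsafe
unsafe {
let ptr = std::raw::from_raw_parts_mut::<Self>(std::alloc::alloc(layout) as *mut (), ());
// write the header fields without creating references to uninitialized memory
std::ptr::addr_of_mut!((*ptr).capacity).write(cap);
std::ptr::addr_of_mut!((*ptr).len).write(0);
Box::from_raw(ptr)
}
}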
pub fn len(&self) -> usize {
self.len
}
pub fn capacity(&self) -> usize {
self.capacity
}
pub fn as_ptr(&self) -> *const T {
&self.buffer as *const [T; 0] as *const T
}
pub fn as_mut_ptr(&mut self) -> *mut T {
&mut self.buffer as *mut [T; 0] as *mut T
}
pub fn as_slice(&self) -> &[T] {
unsafe {
std::slice::from_raw_parts(self.as_ptr(), self.len())
}
}
pub fn as_mut_slice(&mut self) -> &mut [T] {
unsafe {
std::slice::from_raw_parts_mut(self.as_mut_ptr(), self.len())
}
}
// panics if it doesn't have remaining capacity
pub fn push(&mut self, el: T) {
assert!(self.len() < self.capacity());
let ptr = self.as_mut_ptr();
let index = self.len();
// in bounds: index < capacity was checked above
unsafe { std::ptr::write(ptr.add(index), el) };
self.len += 1;
}
// panics if it doesn't have any elements
pub fn pop(&mut self) -> T {
assert!(self.len() > 0);
self.len -= 1;
let ptr = self.as_mut_ptr();
let index = self.len();
// the slot at `index` held the last initialized element
unsafe { std::ptr::read(ptr.add(index)) }
}
}
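The size and alignment arithmetic in `full_size` and `full_align` can be cross-checked on stable Rust with `std::alloc::Layout`, which already knows how to place an array after a header. This is a sketch under assumptions: `Header` and `inline_vec_layout` are hypothetical names, and the header is modelled as a separate sized struct because the proposed `size_of_header` does not exist today.

```rust
use std::alloc::{Layout, LayoutError};

/// Hypothetical stand-in for the fixed-size part of `InlineVec<T>`.
#[repr(C)]
struct Header {
    capacity: usize,
    len: usize,
}

/// Returns the layout of a `Header` followed by `cap` elements of `T`,
/// together with the byte offset at which the elements start.
fn inline_vec_layout<T>(cap: usize) -> Result<(Layout, usize), LayoutError> {
    let header = Layout::new::<Header>();
    let elements = Layout::array::<T>(cap)?;
    // `extend` inserts any padding needed between header and elements.
    let (combined, offset) = header.extend(elements)?;
    // Round the total size up to the overall alignment, as C does.
    Ok((combined.pad_to_align(), offset))
}
```

The resulting `Layout` can be passed to `std::alloc::alloc` directly, and the returned offset is where a buffer pointer like the one from `as_ptr` would point.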
Reference-level explanation
In addition to the explanation given above,
we will also introduce three functions into the standard library,
in core::raw, which allow you to create and destructure these
pointers to DynamicallySized types:
mod core::raw {
pub fn from_raw_parts<T: DynamicallySized>(
ptr: *const (),
meta: <T as DynamicallySized>::Metadata,
) -> *const T;
pub fn from_raw_parts_mut<T: DynamicallySized>(
ptr: *mut (),
meta: <T as DynamicallySized>::Metadata,
) -> *mut T;
pub fn metadata<T: DynamicallySized>(
ptr: *const T,
) -> <T as DynamicallySized>::Metadata;
}
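Slices already have a stable special case of this API, which may help in picturing the general one: `std::ptr::slice_from_raw_parts` assembles a thick pointer from a thin pointer plus a `usize` length. The sketch below uses only that existing function; `thick_pointer_round_trip` is a hypothetical helper name, and the analogy to the proposed `from_raw_parts` and `metadata` is mine.

```rust
use std::mem::size_of;
use std::ptr;

/// Splits a slice into (thin pointer, length), reassembles it, and returns
/// the recovered length plus the thick-to-thin pointer size ratio.
fn thick_pointer_round_trip(data: &[u16]) -> (usize, usize) {
    let thin: *const u16 = data.as_ptr();
    let meta: usize = data.len();
    // Assemble a thick pointer from (thin pointer, metadata).
    let thick: *const [u16] = ptr::slice_from_raw_parts(thin, meta);
    // SAFETY: `thick` describes exactly the live slice `data`.
    let rebuilt: &[u16] = unsafe { &*thick };
    (
        rebuilt.len(),
        size_of::<*const [u16]>() / size_of::<*const u16>(),
    )
}
```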
We will also introduce two functions into core::mem, to help people write types with Flexible Array Members:
mod core::mem {
pub fn size_of_header<T: DynamicallySized>() -> usize;
pub fn align_of_header<T: DynamicallySized>() -> usize;
}
These functions return the size and alignment of the header of a type; in other words, the minimum possible size and alignment. For existing Sized types, they are equivalent to size_of and align_of, and for existing DSTs,
assert_eq!(size_of_header::<[T]>(), 0);
assert_eq!(align_of_header::<[T]>(), align_of::<T>());
assert_eq!(size_of_header::<dyn Trait>(), 0);
assert_eq!(align_of_header::<dyn Trait>(), 1);
// on 64-bit
assert_eq!(size_of_header::<RcBox<dyn Trait>>(), 16);
assert_eq!(align_of_header::<RcBox<dyn Trait>>(), 8);
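The "minimum possible" reading can be checked against today's `size_of_val` and `align_of_val` by looking at the smallest values of each existing DST. This is an illustrative sketch only; `empty_slice_layout` and `dyn_layout` are hypothetical helper names, and `std::fmt::Debug` stands in for an arbitrary trait.

```rust
use std::fmt::Debug;
use std::mem::{align_of_val, size_of_val};

/// The smallest `[u32]` is the empty slice: no bytes, element alignment.
fn empty_slice_layout() -> (usize, usize) {
    let empty: &[u32] = &[];
    (size_of_val(empty), align_of_val(empty))
}

/// Size and alignment of whatever concrete value sits behind the trait object.
fn dyn_layout(value: &dyn Debug) -> (usize, usize) {
    (size_of_val(value), align_of_val(value))
}
```

A trait object can wrap a zero-sized type such as `()`, which has size 0 and alignment 1; that is the lower bound the `dyn Trait` assertions above describe.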
Notes:
- names of the above functions should be bikeshed
- extern types do not implement DynamicallySized, although in theory one could choose to do this (that use case is not supported by this RFC).
- T: DynamicallySized bounds imply a T: ?Sized bound.
We will also change CStr to have the implementation from above.
On an ABI level, we promise that pointers to any type with
size_of::<Metadata>() == 0
&& align_of::<Metadata>() <= align_of::<*const ()>()
are ABI compatible with a C pointer - this is important, since we want to be able to write:
extern "C" {
fn printf(fmt: &CStr, ...) -> c_int;
}
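For contrast, this is roughly what the same kind of call looks like on stable Rust today, where a `&CStr` cannot appear directly in the C signature and the thin pointer has to be extracted with `as_ptr()`. The sketch assumes the platform C library provides `strlen` (true on the usual std targets) and uses the `unsafe extern` block syntax, which needs Rust 1.82 or later; `c_length` is a hypothetical helper name.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

unsafe extern "C" {
    // C declaration: size_t strlen(const char *s);
    // Assumed to be supplied by the platform C library.
    fn strlen(s: *const c_char) -> usize;
}

/// Asks C for the length of a Rust-owned C string.
fn c_length(s: &CStr) -> usize {
    // SAFETY: `as_ptr` yields a valid, NUL-terminated pointer that
    // outlives the call.
    unsafe { strlen(s.as_ptr()) }
}
```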
Unfortunately, we won’t be able to change existing declarations in libc without a new major version.
as casts continue to allow
fn cast_to_thin<T: DynamicallySized, U: Sized>(t: *const T) -> *const U {
t as *const U
}
so we do not introduce any new functions to access the pointer part of the thick pointer.
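The same cast already works for existing thick pointers, which is the behavior being relied on here. A minimal sketch with a trait object, where `read_through_thin` is a hypothetical helper name and `std::fmt::Debug` is just a convenient trait:

```rust
use std::fmt::Debug;

/// Erases a `u32` to a trait-object pointer, then recovers the data
/// address with an `as` cast and reads the value back.
fn read_through_thin(value: &u32) -> u32 {
    // Thick pointer: data address plus vtable.
    let thick: *const dyn Debug = value;
    // `as` drops the vtable metadata, leaving only the data address.
    let thin = thick as *const u32;
    // SAFETY: `thin` points at the live `u32` borrowed by `value`.
    unsafe { *thin }
}
```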
Drawbacks
- More complication in the language.
- Lack of a Sized type dual to these unsized types – the lack of a [u8; N] to these types’ [u8] is unfortunate.
- Inability to define a custom DST safely
- The size_of_val and align_of_val declarations are now incorrect; they should take T: DynamicallySized, as opposed to T: ?Sized.
Rationale and alternatives
This has been a necessary change for quite a few years. The only real alternatives are those which are simply different ways of writing this feature. We need custom DSTs.
We also likely especially want to deprecate the current ?Sized behavior of size_of_val and align_of_val, since people are planning on aborting/panicking at runtime when called on extern types. That’s not great. (link)
Prior art
- FAMs in C
- FAMs in C++ (unfinished proposal)
- Existing Rust which could use this feature:
- Other RFCs
- Niko’s Blog on DSTs
(you will note the incredible number of RFCs on this topic – we really need to fix this missing feature)
Unresolved questions
- How should these thick pointers be passed, if they are larger than two pointers? (repr(Rust) aggregate)
- Are std::raw::ptr and std::raw::ptr_mut necessary? You can get this behavior with as. (removed)
Future possibilities
- By overloading DerefAssign, we could add a BitReference type