Summary
Two functions are added to the standard library that allow querying specific type related information.
Motivation
I'm suggesting the standardization of two functions:
const fn trivial_drop<T>() -> bool;
const fn has_direct_interior_mutability<T>() -> bool;
The main motivation is to provide a standardized way to do something that is already being done, and to provide a sound foundation for optimizations surrounding types with direct interior mutability.
A specific example:
Summary
#[inline(always)]
unsafe fn merge_up<T, F>(
mut src_left: *const T,
mut src_right: *const T,
mut dest_ptr: *mut T,
is_less: &mut F,
) -> (*const T, *const T, *mut T)
where
F: FnMut(&T, &T) -> bool,
{
// This is a branchless merge utility function.
// The equivalent code with a branch would be:
//
// if !is_less(&*src_right, &*src_left) {
// ptr::copy_nonoverlapping(src_left, dest_ptr, 1);
// src_left = src_left.wrapping_add(1);
// } else {
// ptr::copy_nonoverlapping(src_right, dest_ptr, 1);
// src_right = src_right.wrapping_add(1);
// }
// dest_ptr = dest_ptr.add(1);
let is_l = !is_less(&*src_right, &*src_left);
let copy_ptr = if is_l { src_left } else { src_right };
ptr::copy_nonoverlapping(copy_ptr, dest_ptr, 1);
src_right = src_right.wrapping_add(!is_l as usize);
src_left = src_left.wrapping_add(is_l as usize);
dest_ptr = dest_ptr.add(1);
(src_left, src_right, dest_ptr)
}
#[inline(always)]
unsafe fn merge_down<T, F>(
mut src_left: *const T,
mut src_right: *const T,
mut dest_ptr: *mut T,
is_less: &mut F,
) -> (*const T, *const T, *mut T)
where
F: FnMut(&T, &T) -> bool,
{
// This is a branchless merge utility function.
// The equivalent code with a branch would be:
//
// if !is_less(&*src_right, &*src_left) {
// ptr::copy_nonoverlapping(src_right, dest_ptr, 1);
// src_right = src_right.wrapping_sub(1);
// } else {
// ptr::copy_nonoverlapping(src_left, dest_ptr, 1);
// src_left = src_left.wrapping_sub(1);
// }
// dest_ptr = dest_ptr.sub(1);
let is_l = !is_less(&*src_right, &*src_left);
let copy_ptr = if is_l { src_right } else { src_left };
ptr::copy_nonoverlapping(copy_ptr, dest_ptr, 1);
src_right = src_right.wrapping_sub(is_l as usize);
src_left = src_left.wrapping_sub(!is_l as usize);
dest_ptr = dest_ptr.sub(1);
(src_left, src_right, dest_ptr)
}
/// Merge v assuming the len is even and v[..len / 2] and v[len / 2..] are sorted.
///
/// Adapted from crumsort/quadsort.
unsafe fn parity_merge<T, F>(v: &[T], dest_ptr: *mut T, is_less: &mut F)
where
F: FnMut(&T, &T) -> bool,
{
// SAFETY: the caller must guarantee that `dest_ptr` is valid for v.len() writes.
// Also `v.as_ptr` and `dest_ptr` must not alias.
//
// The caller must guarantee that T cannot modify itself inside is_less.
// merge_up and merge_down read left and right pointers and potentially modify the stack value
// they point to, if T has interior mutability. This may leave one or two potential writes to
// the stack value un-observed when dest is copied onto of src.
// It helps to visualize the merge:
//
// Initial:
//
// |ptr_data (in dest)
// |ptr_left |ptr_right
// v v
// [xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx]
// ^ ^
// |t_ptr_left |t_ptr_right
// |t_ptr_data (in dest)
//
// After:
//
// |ptr_data (in dest)
// |ptr_left | |ptr_right
// v v v
// [xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx]
// ^ ^ ^
// |t_ptr_left | |t_ptr_right
// |t_ptr_data (in dest)
//
//
// Note, the pointers that have been written, are now one past where they were read and
// copied. written == incremented or decremented + copy to dest.
let len = v.len();
let src_ptr = v.as_ptr();
let len_div_2 = len / 2;
let mut ptr_left = src_ptr;
let mut ptr_right = src_ptr.wrapping_add(len_div_2);
let mut ptr_data = dest_ptr;
let mut t_ptr_left = src_ptr.wrapping_add(len_div_2 - 1);
let mut t_ptr_right = src_ptr.wrapping_add(len - 1);
let mut t_ptr_data = dest_ptr.wrapping_add(len - 1);
for _ in 0..len_div_2 {
(ptr_left, ptr_right, ptr_data) = merge_up(ptr_left, ptr_right, ptr_data, is_less);
(t_ptr_left, t_ptr_right, t_ptr_data) =
merge_down(t_ptr_left, t_ptr_right, t_ptr_data, is_less);
}
let left_diff = (ptr_left as usize).wrapping_sub(t_ptr_left as usize);
let right_diff = (ptr_right as usize).wrapping_sub(t_ptr_right as usize);
if !(left_diff == mem::size_of::<T>() && right_diff == mem::size_of::<T>()) {
panic_on_ord_violation();
}
}
This function is a fundamental piece of fast data-parallel sorting algorithms. However it is not safe for types that can modify their stack value as part of the user provided comparison function is_less
. One example for such a problematic type would be Mutex<Option<Box<str>>>
. As part of a call to is_less
it could replace the option with None
freeing the inner String. If this modification is not observed in the slice the user has access to after the sort completes, dropping the slices backing storage would free the String again -> UB. A large percentage of types, have no direct interior mutability, eg. u64
, String
. It would be safe to do this optimization for such types. However currently there is no way to query this information, even though the compiler has access to it. This leads to a situation where people are using, functional but inappropriate ways to approximate this property. For example using unsafe specialization to figure out if a type is Copy
or not. As UnsafeCell
does not implement Copy
, at least for now.
Guide-level explanation
const fn trivial_drop<T>() -> bool;
This function returns true
for types that have a trivial Drop
implementation, false
otherwise. Trivial Drop
implementation means, no code can be executed as part of Drop
.
const fn has_direct_interior_mutability<T>() -> bool;
This function returns true
for types that have direct interior mutability, false
otherwise. Direct interior mutability is the ability for a type to modify its stack value through a shared reference. u64
, String
, and Box<RefCell<u64>>
would all return false
, in contrast RefCell<u64>
, Cell<String>
and Mutex<u64>
would all return true
.
Reference-level explanation
The compiler has access to the relevant information, the functions would have to implemented with intrinsics.
Drawbacks
This information could be used to build unsound abstractions, and may embolden people to do incorrect assumptions.
Rationale and alternatives
Some alternatives would be:
- A) Ignore the possible optimizations.
- B) Adjust the code so it is safe for types with direct interior mutability.
- C) Add an unsafe marker trait.
A, seems like a bad fit for Rust, a language that markets itself as an efficient way to exploit hardware resources. B might not always be feasible, for the provided example it is possible but would make the code significantly slower and complex. That's not a property that should be ignored, especially because often in such performance critical code, unsafe code is being used. C, is really problematic too. It creates this two class 'society', where builtin u64 ... sort()
, and #[derive(Copy)] struct X{u32: a, u32: b} ... sort_by_key(|x| x.a)
would have vastly different performance.
Regarding Copy
specialization, this check is already being done in the standard library for optimizations purposes see as approximation for trivial Drop
. The fact that this exists is testament to:
- Such optimizations are important enough that they are done in practice
- Crude approximations can and will go wrong see
TrustedRandomAccessNoCoerce
What is the impact of not doing this?
Rust being slower and more power-inefficient than it could be. And people using wrong approximations, which may break code with future updates, and or stop the future improvements to Rust because incorrect assumptions have proliferated.
If this is a language proposal, could this be done in a library or macro instead? Does the proposed change make Rust code easier or harder to read, understand, and maintain?
Currently with the exception of proc macros that disassemble generated code and analyze optimizations, there is no way to surface said information in user code.
Prior art
There was a RFC with a similar goal, the Freeze
auto trait https://github.com/rust-lang/rfcs/pull/2944. The main motivation for declining said PR, was a worry that it may add another possibility for library authors to inadvertently break SemVer. This new proposed design avoids this problem by adding functions that can be used for optimizations purposes, but not for specialization. Similar to mem::size_of
and mem::align_of
which also query type related information, and are also used to inform algorithmic optimization decisions.
Unresolved questions
- Should these functions be
const
functions? One motivation to have functions instead of auto traits, was to avoid specializations inadvertently breaking with updates. I could imagine a future whereconst
functions influence code compilation in a way that makes it fail based on properties calculated within them. But arguably that ship has already sailed with many existingconst
functions creating the same potential problems.
Future possibilities
It may make sense to group these functions in a module for example core::type_info
, as they may not be the last type properties deemed important enough that it can be queried.
Disclaimer, I'm not attached to those names, I'm sure better ones can be found.