Feature Name: from_bits
Start Date: (fill me in with today’s date, YYYY-MM-DD)
RFC PR: (leave this empty)
Rust Issue: (leave this empty)
Summary
This RFC proposes to add two traits to the std library convert module, FromBits and IntoBits, as well as implementations of these traits for some of the std library types.
These traits are used to implement bit-pattern preserving conversions between types. Currently, the easiest way to perform these conversions is via unsafe code by means of mem::transmute.
These two traits allow users to express for which pairs of types every bit-pattern of the input type is also a valid bit-pattern of the output type, and thus a safe, infallible, and lossless conversion exists.
Motivation
The std library From and Into traits are used to express infallible conversions. In the context of numeric types, std follows the convention that these conversions must preserve numeric values:
assert_eq!(f64::from(13_i32), 13.0_f64);
Another common operation for these types is to perform a conversion that, instead of preserving the numeric value, preserves the bit-pattern of the value. The floating-point types have some inherent methods for this:
assert_eq!(f32::from_bits(0x5F3759DF_u32), 1.3211836e19_f32);
However, these methods are not generic, and as a consequence the following fails to compile:
assert_eq!(f32::from_bits(0x0_i32), 0.0_f32);
This isn’t much of a problem if one only has a couple of types to convert between. For example, we could add a f32::from_bits_i32 method and call it a day. However, in the context of SIMD vector types, bit-pattern preserving conversions are incredibly common. For converting between architecture-specific vector types like __m256, __m256i, and __m256d, the number of _::from_bits_{...} conversion functions would remain reasonably small. But this is a list of all portable packed SIMD vector types whose bit-pattern often needs to be converted to __m256: i8x32, u8x32, i16x16, u16x16, i32x8, u32x8, f32x8, i64x4, u64x4, and f64x4.
Now consider adding the same number of bitwise conversions for __m256i and __m256d, and then think about 64-bit, 128-bit, and 512-bit wide portable vectors and their architecture-specific types. The total number of _::from_bits_xyz methods quickly exceeds 50.
Users might be tempted to reach for unsafe { mem::transmute(...) } in these cases, but not having to write any unsafe code is actually one of the main advantages of the portable packed SIMD vector types because, as opposed to the std::arch intrinsics, their API is safe. This is how one ARM NEON stdsimd test looks without these traits:
unsafe {
    let a = i16x8::new(1, 2, 3, 4, 5, 6, 7, 8);
    let b = i16x8::new(8, 7, 6, 5, 4, 3, 2, 1);
    let e = i16x8::new(9, 9, 9, 9, 9, 9, 9, 9);
    let r: i16x8 = mem::transmute(vaddq_s16(mem::transmute(a), mem::transmute(b)));
    assert_eq!(r, e);
}
and the same test with the traits:
let a = i16x8::new(1, 2, 3, 4, 5, 6, 7, 8);
let b = i16x8::new(8, 7, 6, 5, 4, 3, 2, 1);
let e = i16x8::new(9, 9, 9, 9, 9, 9, 9, 9);
let r: i16x8 = vaddq_s16(a.into_bits(), b.into_bits()).into_bits();
assert_eq!(r, e);
This RFC is one potential solution to this problem.
Guide-level explanation
With the traits proposed by this RFC, the currently-rejected snippet of code shown above:
assert_eq!(f32::from_bits(0x0_i32), 0.0_f32);
would compile and produce the correct result. The following currently-rejected snippets of code would also work correctly:
assert_eq!((0x0_u32).into_bits(), 0.0_f32);
assert_eq!((0x0_i32).into_bits(), 0.0_f32);
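The snippets above can be tried out today with local stand-ins for the proposed traits. The following is a minimal sketch, not the proposed std implementation: the traits are declared locally, and the conversion bodies (built on to_ne_bytes and from_ne_bytes) are an assumption made for illustration. Because the path f32::from_bits currently resolves to the existing inherent method, the sketch calls the trait through into_bits or fully qualified syntax.

```rust
// Local stand-ins for the proposed traits; they are not part of std today.
pub trait FromBits<T>: Sized {
    fn from_bits(t: T) -> Self;
}

pub trait IntoBits<T>: Sized {
    fn into_bits(self) -> T;
}

// FromBits implies IntoBits, mirroring the From/Into relationship.
impl<T, U> IntoBits<U> for T
where
    U: FromBits<T>,
{
    fn into_bits(self) -> U {
        U::from_bits(self)
    }
}

// Assumed implementation: reinterpret the four bytes of the integer as an f32.
impl FromBits<u32> for f32 {
    fn from_bits(t: u32) -> Self {
        f32::from_ne_bytes(t.to_ne_bytes())
    }
}

impl FromBits<i32> for f32 {
    fn from_bits(t: i32) -> Self {
        f32::from_ne_bytes(t.to_ne_bytes())
    }
}

/// Converts an i32 bit-pattern to f32 via the blanket IntoBits impl.
pub fn f32_from_i32_bits(x: i32) -> f32 {
    x.into_bits()
}
```

In this sketch the target type is given by an annotation or a return type, since into_bits alone does not determine which output type is meant.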
Reference
This RFC introduces two traits to core::convert analogous to From/Into that are used to provide a safe wrapper over bitwise preserving conversions:
pub trait FromBits<T>: marker::Sized {
    fn from_bits(t: T) -> Self;
}
pub trait IntoBits<T>: marker::Sized {
    fn into_bits(self) -> T;
}
// FromBits implies IntoBits:
impl<T, U> IntoBits<U> for T
where
    U: FromBits<T>,
{
    fn into_bits(self) -> U {
        U::from_bits(self)
    }
}
// FromBits (and thus IntoBits) is reflexive
impl<T> FromBits<T> for T {
    fn from_bits(t: Self) -> Self {
        t
    }
}
as well as implementations for the following equally-sized types:
impl FromBits<i8> for u8;
impl FromBits<u8> for i8;
impl FromBits<i16> for u16;
impl FromBits<u16> for i16;
impl FromBits<u32> for f32;
impl FromBits<f32> for u32;
impl FromBits<i32> for f32;
impl FromBits<f32> for i32;
impl FromBits<u32> for i32;
impl FromBits<i32> for u32;
impl FromBits<u64> for f64;
impl FromBits<f64> for u64;
impl FromBits<i64> for f64;
impl FromBits<f64> for i64;
impl FromBits<u64> for i64;
impl FromBits<i64> for u64;
impl FromBits<isize> for usize;
impl FromBits<usize> for isize;
impl FromBits<i128> for u128;
impl FromBits<u128> for i128;
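The list above only gives impl headers. One way the bodies might be written, sketched here under the assumption that a byte-level reinterpretation via to_ne_bytes and from_ne_bytes is acceptable, is a small macro that emits both directions for a pair of types. The macro name impl_from_bits_pair is hypothetical, and the trait is redeclared locally so the example is self-contained.

```rust
// Local stand-in for the proposed trait.
pub trait FromBits<T>: Sized {
    fn from_bits(t: T) -> Self;
}

// Hypothetical helper: implements FromBits in both directions for a pair of
// types. The byte arrays must have the same length, so pairing types of
// different sizes is rejected at compile time.
macro_rules! impl_from_bits_pair {
    ($a:ty, $b:ty) => {
        impl FromBits<$a> for $b {
            fn from_bits(t: $a) -> Self {
                <$b>::from_ne_bytes(t.to_ne_bytes())
            }
        }
        impl FromBits<$b> for $a {
            fn from_bits(t: $b) -> Self {
                <$a>::from_ne_bytes(t.to_ne_bytes())
            }
        }
    };
}

// A subset of the pairs listed in the RFC.
impl_from_bits_pair!(i8, u8);
impl_from_bits_pair!(u32, f32);
impl_from_bits_pair!(i32, f32);
impl_from_bits_pair!(u32, i32);
impl_from_bits_pair!(isize, usize);
impl_from_bits_pair!(i128, u128);
```

With these impls, `<u8 as FromBits<i8>>::from_bits(-1_i8)` yields `255_u8`: the bits are kept and only their interpretation changes.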
Drawbacks
It adds a new pair of traits to std, which might be painful.
Coherence
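If crate A exposes the type AT, and crate B exposes the type BT, a third crate C cannot implement FromBits<A::AT> for B::BT, because the orphan rules require the implementing crate to define either the trait or one of the types involved.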
Rationale and alternatives
Equally-sized types restriction
The proposed implementations are restricted to equally-sized types.
This is, however, not a requirement, since, for example, the following implementation would also be safe:
impl FromBits<i32> for i64;
The problem is that there are many ways to extend an i32 to an i64, e.g., zero-extend, sign-extend, etc., so no single one is the obvious choice.
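To make the ambiguity concrete, here is a small sketch of two equally plausible widenings from i32 to i64. The function names are hypothetical. Both keep the low 32 bits intact, but they disagree on the upper 32 bits for negative inputs.

```rust
/// Sign-extension: the upper 32 bits copy the sign bit of the input.
pub fn sign_extend(x: i32) -> i64 {
    x as i64
}

/// Zero-extension: the upper 32 bits are always zero.
pub fn zero_extend(x: i32) -> i64 {
    // Going through u32 first makes the widening cast fill with zeros.
    (x as u32) as i64
}
```

For `-1_i32`, `sign_extend` returns `-1_i64` while `zero_extend` returns `0xFFFF_FFFF_i64`.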
FromBits is not Bijective
That is: FromBits<T> for U does not imply FromBits<U> for T. This is by design.
In the context of stdsimd we have vector masks, like b8x8, a 64-bit wide type containing eight 8-bit masks, where the bits of each mask are either all set or all cleared. That is, each lane can only contain one of two values: 0 or u8::max_value(). Therefore, FromBits<b8x8> for u8x8 is a safe and correct operation, since every valid bit-pattern of the mask is also a valid u8x8 bit-pattern. However, its inverse, FromBits<u8x8> for b8x8, is not correct, since there are many u8x8 bit-patterns that aren’t valid b8x8 bit-patterns.
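The asymmetry can be sketched with simplified stand-in types. B8x8 and U8x8 below are hypothetical array-backed models, not the real stdsimd vectors, and mask_from_lanes is an illustrative helper that shows why the reverse direction needs a check and therefore cannot be an infallible FromBits impl.

```rust
// Local stand-in for the proposed trait.
pub trait FromBits<T>: Sized {
    fn from_bits(t: T) -> Self;
}

/// Simplified model of u8x8: any eight bytes are valid.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct U8x8(pub [u8; 8]);

/// Simplified model of b8x8. Invariant: every lane is 0 or u8::MAX.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct B8x8([u8; 8]);

impl B8x8 {
    /// The only public constructor, so the invariant always holds.
    pub fn new(lanes: [bool; 8]) -> Self {
        B8x8(lanes.map(|set| if set { u8::MAX } else { 0 }))
    }
}

// Mask to vector: every valid mask bit-pattern is a valid U8x8.
impl FromBits<B8x8> for U8x8 {
    fn from_bits(m: B8x8) -> Self {
        U8x8(m.0)
    }
}

/// Vector to mask: only some bit-patterns qualify, so this is fallible.
pub fn mask_from_lanes(v: U8x8) -> Option<B8x8> {
    if v.0.iter().all(|&lane| lane == 0 || lane == u8::MAX) {
        Some(B8x8(v.0))
    } else {
        None
    }
}
```

A vector such as `U8x8([1, 0, 0, 0, 0, 0, 0, 0])` has no corresponding mask, which is why only the mask-to-vector direction implements FromBits here.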
Prior art
A version of this trait is currently used to provide easy .into_bits() conversions both between the portable packed SIMD vector types themselves and between those types and the architecture-specific vector types.
Unresolved questions
TBD.