- Feature Name: bit-twiddling
- Start Date: 2018-03-16
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)
Summary
Add a few common bit manipulation operations for integer types.
Motivation
Bit manipulation (“twiddling”) is a very common operation in systems programming, especially operating systems and embedded systems. Bit manipulations tend to come up in the following important use cases:
-
Constructing bit-packed representations of data. For example, pixel color values and Unix-style permission bits are often represented compactly as bit-packed
u8
s. Often, bit-packed representations are used with unions to create really memory-efficient representations of values (e.g. for a struct that we have lots of). -
Hardware interfaces. Embedded systems and operating systems often have to deal with hardware interfaces that want certain bits in certain places. For example, on x86, the Interupt Desciptor Table entries have this format.
Unfortunately, bit twiddling tends to be unergonomic. There tends to be a lot of nested parentheses and casting between different-width integers. This makes the code painful to write and painful to read. This RFC aims to improve the situation by introducing a few helpful-but-generic functions/methods.
Guide-level explanation
Explain the proposal as if it was already included in the language and you were teaching it to another Rust programmer. That generally means:
- Introducing new named concepts.
- Explaining the feature largely in terms of examples.
- Explaining how Rust programmers should think about the feature, and how it should impact the way they use Rust. It should explain the impact as concretely as possible.
- If applicable, provide sample error messages, deprecation warnings, or migration guidance.
- If applicable, describe the differences between teaching this to existing Rust programmers and new Rust programmers.
This RFC adds a bunch of methods to libcore
and re-exports them through
libstd
. The reason they are added to libcore
is that they should be
available in no_std
environments, which are common in embedded systems
setups.
To generate a bit mask, you can use the mask
method of your favorite integer type:
assert_eq!(0b00_111_000u8, u8::mask(5, 3));
assert_eq!(0b00_111_000_00000000i16, u8::mask(13, 11));
To set the i
-th bit of an integer, you can use the set_bit
method:
let mut x = 0u8;
assert_eq!(x, 0);
x.set_bit(3);
assert_eq!(x, 0b1000u8);
To get the i
-th byte of an integer, use the get_byte
method:
assert_eq!(0xADu8, 0xDEADBEEF.get_byte(2));
Often, it is useful to copy some range of one integer into another range of
another integer. This can be done with the set_bits
function:
let mut a = 0xDEADBEEFu32;
let b = 0xABu8;
set_bits(&mut a, 16, b, 4, 8);
assert_eq!(a, 0xDE0ABEEF);
Finally, sometimes it is useful to truncate the bit representation of an integer
into a smaller integer type. For this, the Truncate
trait is useful:
assert_eq!(0xBEEFu16, 0xDEADBEEFu32.truncate());
Reference-level explanation
This RFC adds some methods and functions to libcore
and re-exports them
through libstd
. For each integer type $i
(signed and unsigned, 8- to
128-bit), the following are added:
impl $i {
/// Returns a bit mask with the given range of bits set:
/// - `h`: most significant bit (inclusive)
/// - `l`: least significant bit (inclusive).
///
/// 0 is the index of the least significant bit.
fn mask(h: usize, l: usize) -> Self {
let m1 = (1 << (h + 1)) - 1;
let m2 = (1 << (l)) - 1;
m1 & !m2
}
/// Set the given bit to the given value:
/// - `i`: the index of the bit to set.
/// - `b`: `true` if the bit should be set to 1. `false` if it should be cleared to 0.
///
/// 0 is the index of the least significant bit.
fn set_bit(&mut self, i: usize, b: bool) {
let mask = $i::mask(i,i);
let b = if b { mask } else { 0 };
*self = (*self & !mask) | b;
}
/// Get the `i`-th byte of the value.
///
/// 0 is the index of the least significant byte.
fn get_byte(self, i: usize) -> u8 {
let mask = $i::mask((i+1)*8-1, i*8);
((val & mask) >> (i*8)) as u8
}
}
Moreover, the set_bits
function is also added:
/// Extracts bits `lb+len` to `lb` for `b` and uses them to set the bits
/// `la+len` to `la` of `a`.
///
/// 0 is the index of the least significant bit. `a` must be wider than `b`
///
/// # Example
///
/// ```rust
/// let mut a = 0xDEADBEEFu32;
/// let b = 0xABu8;
///
/// set_bits(&mut a, 16, b, 4, 8);
///
/// assert_eq!(a, 0xDE0ABEEF);
/// ```
pub fn set_bits<I1, I2>(a: &mut I1, la: usize, b: I3, lb: usize, len: usize)
where
I1: Copy,
I1: Mask,
I1: BitOr<Output = I1>,
I1: BitAnd<Output = I1>,
I1: Shl<usize, Output = I1>,
I1: Shr<usize, Output = I1>,
I1: Not<Output = I1>,
I1: std::convert::From<I2>,
I2: Mask,
I2: BitOr<Output = I2>,
I2: BitAnd<Output = I2>,
I2: Shl<usize, Output = I2>,
I2: Shr<usize, Output = I2>,
{
let mask_a = I1::mask(la + len - 1, la);
let mask_b = I1::mask(lb + len - 1, lb);
// Clear the bits of a we wish to replace
let a_bits = *a & !mask_a;
let b: I1 = b.into();
let b_bits = if la == lb {
b & mask_b
} else if la < lb {
(b & mask_b) >> (lb - la)
} else {
(b & mask_b) << (la - lb)
};
// combine them
*a = a_bits | b_bits;
}
Finally, the following trait is added:
trait Truncate {
type Output;
/// Returns a version of `self` where the most significant bits are
/// truncated to get the output.
///
/// # Example
/// ```rust
/// assert_eq!(0xBEEFu16, 0xDEADBEEFu32.truncate());
/// ```
fn truncate(self) -> Self::Output;
}
And the following implementations are added:
u16: Truncate<Output = u8>
u32: Truncate<Output = u8>
u64: Truncate<Output = u8>
u128: Truncate<Output = u8>
u32: Truncate<Output = u16>
u64: Truncate<Output = u16>
u128: Truncate<Output = u16>
u64: Truncate<Output = u32>
u128: Truncate<Output = u32>
u128: Truncate<Output = u64>
i16: Truncate<Output = i8>
i32: Truncate<Output = i8>
i64: Truncate<Output = i8>
i128: Truncate<Output = i8>
i32: Truncate<Output = i16>
i64: Truncate<Output = i16>
i128: Truncate<Output = i16>
i64: Truncate<Output = i32>
i128: Truncate<Output = i32>
i128: Truncate<Output = i64>
Drawbacks
It adds a larger surface to the standard library.
Rationale and Alternatives
Do nothing
The status quo is roughly the experience you get in C, C++, and many other languages. We could stay where we are. Of course, that doesn’t address the motivation.
Implement as a crate
We could implement all of these changes in another crate, rather than libstd
or libcore
Unresolved questions
-
It’s not clear how much should be done with traits and generics rather than macros.
-
Does it make sense to add impls for
Truncate
where the input is signed and the output unsigned? Or vice versa?