A while ago someone proposed .lo() and .hi() for splitting an integer into its high and low halves, a proposal that seems to have gotten a cool reception from the community:
While reading it, I thought of a point that wasn't considered in that thread: namely, that Clippy at the pedantic level will emit cast_possible_truncation warnings everywhere, which can result in ugly code like:
let num: u128 = todo!();
u64::try_from(num % (1 << u64::BITS)).unwrap()
To give a real-world example, the fastrand crate has a function which looks like:
// Returns the high 128 bits of the full 256-bit product a * b.
fn mul_high_u128(a: u128, b: u128) -> u128 {
    // Split each operand into 64-bit halves, widened back to u128.
    let a_lo = a as u64 as u128;
    let a_hi = (a >> 64) as u64 as u128;
    let b_lo = b as u64 as u128;
    let b_hi = (b >> 64) as u64 as u128;
    // Schoolbook multiplication on the halves, propagating carries upward.
    let carry = (a_lo * b_lo) >> 64;
    let carry = ((a_hi * b_lo) as u64 as u128 + (a_lo * b_hi) as u64 as u128 + carry) >> 64;
    a_hi * b_hi + ((a_hi * b_lo) >> 64) + ((a_lo * b_hi) >> 64) + carry
}
I believe this would be more readable if rewritten like so:
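For concreteness, here is a sketch of what that rewrite might look like. Since .lo() and .hi() don't exist in std, the sketch supplies them via a hypothetical extension trait (the trait name Halves is made up for illustration):

```rust
// Hypothetical extension trait standing in for the proposed .lo()/.hi().
trait Halves {
    type Half;
    fn lo(self) -> Self::Half;
    fn hi(self) -> Self::Half;
}

impl Halves for u128 {
    type Half = u64;
    fn lo(self) -> u64 { self as u64 }
    fn hi(self) -> u64 { (self >> 64) as u64 }
}

/// Returns the high 128 bits of the full 256-bit product `a * b`.
fn mul_high_u128(a: u128, b: u128) -> u128 {
    let (a_lo, a_hi) = (a.lo() as u128, a.hi() as u128);
    let (b_lo, b_hi) = (b.lo() as u128, b.hi() as u128);
    let carry = (a_lo * b_lo).hi() as u128;
    let carry = ((a_hi * b_lo).lo() as u128 + (a_lo * b_hi).lo() as u128 + carry).hi() as u128;
    a_hi * b_hi + (a_hi * b_lo).hi() as u128 + (a_lo * b_hi).hi() as u128 + carry
}
```

Every `>> 64` and truncating `as u64` cast from the original becomes an explicit .hi() or .lo(), which says what the code means rather than how it's done.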
The crate can easily define these functions for its own purposes. These are rather niche for general use.
The low 64 bits can already be gotten with the as u64 cast, although there has been discussion of deprecating that in favor of wrapping_cast.
I have a somewhat Socratic line of questioning in response to this. Do you think the crate maintainers should create functions for this in the interest of readability? If so, why haven't they done so?
If the answer is something like "because they never thought of it", it seems like std is worthwhile just for that.
Yes. You just showed how the code is cleaner when changed to use these functions. Small functions are good. And you have additionally mentioned there are multiple other instances in the same crate where the functions would be useful, making the case even more clear-cut.
I'd write it as something like:
fn split_128_to_64(a: u128) -> [u64; 2]
and variants of it for other sizes.
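A minimal sketch of that function and its inverse; the [low, high] (little-endian) element order here is my assumption, not a settled design:

```rust
/// Splits a u128 into [low, high] 64-bit halves, in little-endian
/// order: the index corresponds to the exponent of the base 2^64.
fn split_128_to_64(a: u128) -> [u64; 2] {
    [a as u64, (a >> 64) as u64]
}

/// Inverse: merges [low, high] halves back into a u128.
fn merge_64_to_128([lo, hi]: [u64; 2]) -> u128 {
    (lo as u128) | ((hi as u128) << 64)
}
```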
I don't know. Perhaps because of a mix of laziness and lack of experience, or just a difference of opinion of what they consider clean code.
Not really. The argument is that the standard library should come with "batteries included", with all kinds of features that people might want to use in their programs. That's roughly how Python approaches it.
Rust has a different philosophy. A reasonable approach would be to create a crate for such functions. If it turns out that the crate is widely used, then perhaps that's an argument for putting it in std (although even that is debatable).
I mean, if you don't think std should be "batteries included", that ship has long sailed given the existence of high-level features like HashMap and VecDeque (JS doesn't even have that). The alloc crate also has its "batteries" like String and Rc. Rust would be significantly less usable if Strings and UTF-8 handling required an external crate. And clearly those are much higher-level than chopping up an integer.
You have a point, but in these cases the extra argument is that these are used for communication between crates. Different crates have to agree on a common standard to use in APIs. Hence the value of forcing a common standard for these types.
I agree that chopping up integers into smaller ones and merging integers into larger ones could conceivably be in std. Still, a good way to achieve that is by first implementing these as an external crate to gain experience, consider alternatives, and gather usage statistics. Once it's in std it's impossible to change the design because std never gets a major version bump.
FWIW, I wouldn't write it like that, because that type signature makes it easy to forget which one is low and which one is high. I think .lo() and .hi() are clearer than that.
I might be biased because to me little-endian is clearly correct-endian xd (because then the index corresponds to the exponent of the base). Perhaps it should be called split_128_to_64_le.
I agree that that's the "right" order, but it's one of those things where a different name can help people process it faster. Personally, I'd call it split_128_low_high, and then you instantly know that it returns (low, high). (But I'd still want it as a method, to_low_high(), rather than a standalone function.)
I think if this is added at all, it should either do as widening_mul does and return a tuple or, if we're really worried about people getting the order wrong, a struct with pub named fields. In the same order as widening_mul with the same types. Anyone performing this operation really does just want halves. The arrays of decreasing element size formulation is cute, but I've never seen any code that wanted it.
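A sketch of the named-fields variant (the struct and field names here are made up for illustration), keeping widening_mul's (low, high) order:

```rust
/// Hypothetical return type whose named fields make the order explicit.
struct Halves {
    lo: u64,
    hi: u64,
}

/// Hypothetical splitting function returning named halves instead of a
/// tuple, so call sites can't silently swap low and high.
fn split(a: u128) -> Halves {
    Halves { lo: a as u64, hi: (a >> 64) as u64 }
}
```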
If we are that worried about getting the order wrong, then widening_mul should probably return the same struct. The fact that it doesn't already suggests that we don't consider it that big of a problem.