A while ago someone proposed .lo() and .hi() for splitting an integer into its high and low halves, a proposal that seems to have gotten a cool reception from the community:
While reading it, I thought of a point that wasn't considered in that thread: namely, that Clippy at the pedantic level will emit cast_possible_truncation warnings everywhere, which can result in ugly code like:
let num: u128 = todo!();
u64::try_from(num % (1 << u64::BITS)).unwrap()
To give a real-world example, the fastrand crate has a function which looks like:
// Returns the high 128 bits of the full 256-bit product a * b.
fn mul_high_u128(a: u128, b: u128) -> u128 {
    // Split each operand into 64-bit halves, widened back to u128.
    let a_lo = a as u64 as u128;
    let a_hi = (a >> 64) as u64 as u128;
    let b_lo = b as u64 as u128;
    let b_hi = (b >> 64) as u64 as u128;
    // Schoolbook multiplication on the halves, propagating carries upward.
    let carry = (a_lo * b_lo) >> 64;
    let carry = ((a_hi * b_lo) as u64 as u128 + (a_lo * b_hi) as u64 as u128 + carry) >> 64;
    a_hi * b_hi + ((a_hi * b_lo) >> 64) + ((a_lo * b_hi) >> 64) + carry
}
I believe this would be more readable if rewritten like so:
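For concreteness, here is a sketch of what that rewrite might look like. Since .lo() and .hi() don't exist in std, the sketch supplies them via a hypothetical extension trait (the trait name Halves is made up for illustration):

```rust
// Hypothetical extension trait standing in for the proposed .lo()/.hi().
trait Halves {
    type Half;
    fn lo(self) -> Self::Half;
    fn hi(self) -> Self::Half;
}

impl Halves for u128 {
    type Half = u64;
    fn lo(self) -> u64 { self as u64 }
    fn hi(self) -> u64 { (self >> 64) as u64 }
}

/// Returns the high 128 bits of the full 256-bit product `a * b`.
fn mul_high_u128(a: u128, b: u128) -> u128 {
    let (a_lo, a_hi) = (a.lo() as u128, a.hi() as u128);
    let (b_lo, b_hi) = (b.lo() as u128, b.hi() as u128);
    let carry = (a_lo * b_lo).hi() as u128;
    let carry = ((a_hi * b_lo).lo() as u128 + (a_lo * b_hi).lo() as u128 + carry).hi() as u128;
    a_hi * b_hi + (a_hi * b_lo).hi() as u128 + (a_lo * b_hi).hi() as u128 + carry
}
```

Every `>> 64` and truncating `as u64` cast from the original becomes an explicit .hi() or .lo(), which says what the code means rather than how it's done.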
The crate can easily define these functions for its own purposes. These are rather niche for general use.
The low 64 bits can already be gotten with the as u64 cast, although there has been discussion of deprecating that in favor of wrapping_cast.
I have a somewhat Socratic line of questioning in response to this. Do you think the crate maintainers should create functions for this in the interest of readability? If so, why haven't they done so?
If the answer is something like "because they never thought of it", it seems like std is worthwhile just for that.
Yes. You just showed how the code is cleaner when changed to use these functions. Small functions are good. And you have additionally mentioned there are multiple other instances in the same crate where the functions would be useful, making the case even more clear-cut.
I'd write it as something like:
fn split_128_to_64(a: u128) -> [u64; 2]
and variants of it for other sizes.
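A minimal sketch of that function and its inverse; the [low, high] (little-endian) element order here is my assumption, not a settled design:

```rust
/// Splits a u128 into [low, high] 64-bit halves, in little-endian
/// order: the index corresponds to the exponent of the base 2^64.
fn split_128_to_64(a: u128) -> [u64; 2] {
    [a as u64, (a >> 64) as u64]
}

/// Inverse: merges [low, high] halves back into a u128.
fn merge_64_to_128([lo, hi]: [u64; 2]) -> u128 {
    (lo as u128) | ((hi as u128) << 64)
}
```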
I don't know. Perhaps because of a mix of laziness and lack of experience, or just a difference of opinion of what they consider clean code.
Not really. The argument is that the standard library should come with "batteries included", with all kinds of features that people might want to use in their programs. That's roughly how Python approaches it.
Rust has a different philosophy. A reasonable approach would be to create a crate for such functions. If it turns out that the crate is widely used, then perhaps that's an argument for putting it in std (although even that is debatable).
I mean, if you don't think std should be "batteries included", that ship has long sailed given the existence of high-level features like HashMap and VecDeque (JS doesn't even have that). The alloc crate also has its "batteries" like String and Rc. Rust would be significantly less usable if Strings and UTF-8 handling required an external crate. And clearly those are much higher-level than chopping up an integer.
You have a point, but in these cases the extra argument is that these are used for communication between crates. Different crates have to agree on a common standard to use in APIs. Hence the value of forcing a common standard for these types.
I agree that chopping up integers into smaller ones and merging integers into larger ones could conceivably be in std. Still, a good way to achieve that is by first implementing these as an external crate to gain experience, consider alternatives, and gather usage statistics. Once it's in std it's impossible to change the design because std never gets a major version bump.
FWIW, I wouldn't write it like that, because that type signature makes it easy to forget which one is low and which one is high. I think .lo() and .hi() are clearer than that.
I might be biased because to me little-endian is clearly correct-endian xd (because then the index corresponds to the exponent of the base). Perhaps it should be called split_128_to_64_le.
I agree that that's the "right" order, but it's one of those things where a different name can help people process it faster. Personally, I'd call it split_128_low_high, and then you instantly know that it returns (low, high). (But I'd still want it as a method, to_low_high(), rather than a standalone function.)
I think if this is added at all, it should either do as widening_mul does and return a tuple or, if we're really worried about people getting the order wrong, a struct with pub named fields. In the same order as widening_mul with the same types. Anyone performing this operation really does just want halves. The arrays of decreasing element size formulation is cute, but I've never seen any code that wanted it.
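A sketch of the named-fields variant (the struct and field names here are made up for illustration), keeping widening_mul's (low, high) order:

```rust
/// Hypothetical return type whose named fields make the order explicit.
struct Halves {
    lo: u64,
    hi: u64,
}

/// Hypothetical splitting function returning named halves instead of a
/// tuple, so call sites can't silently swap low and high.
fn split(a: u128) -> Halves {
    Halves { lo: a as u64, hi: (a >> 64) as u64 }
}
```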
If we are that worried about getting the order wrong, then widening_mul should probably return the same struct. The fact that it doesn't already suggests that we don't consider it that big of a problem.