[Pre-RFC] `.lo()` and `.hi()` methods for splitting integers into their halves

Here's some anecdata.

I tend to work pretty close to the metal (deeply embedded systems, network protocol stacks, crypto, etc.) and I do a lot of bit manipulation.

I never think in terms of the "halves" of an integer. In fact, I clicked into this thread because lo() and hi() suggest, to me, either least/most-significant bit or byte access, and I was unclear on what half an integer would mean.

I am significantly more likely to need:

  • Bits 19:15 of this integer, zero-extended.
  • A particular byte, or every byte for serialization.
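For concreteness, the two operations in the list above can be sketched roughly as follows. The helper names `bit_range` and `byte_at` are hypothetical and exist only for illustration; only `to_le_bytes` is an actual std method.

```rust
/// Extracts bits `hi..=lo` of `value` (bit 0 is the least significant),
/// zero-extended into the low bits of the result.
/// `bit_range` is a hypothetical helper, not a std API.
fn bit_range(value: u32, hi: u32, lo: u32) -> u32 {
    assert!(lo <= hi && hi < 32, "bit range out of bounds");
    let width = hi - lo + 1;
    // Build the mask in u64 so that a full 32-bit width does not overflow the shift.
    let mask = ((1u64 << width) - 1) as u32;
    (value >> lo) & mask
}

/// Returns byte `index` of `value`, where index 0 is the least significant byte.
/// `byte_at` is a hypothetical helper; it panics if `index` is 4 or more.
fn byte_at(value: u32, index: usize) -> u8 {
    value.to_le_bytes()[index]
}
```

With these, "bits 19:15" is `bit_range(x, 19, 15)`, and "every byte for serialization" is just `x.to_le_bytes()` or `x.to_be_bytes()`, depending on the wire format.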

I note that all of the motivating examples you sent are actually accessing bytes in 16-bit integers, not halves per se. I believe byte access is a natural thing to need, and the fact that your byte accesses happen to map to halves is an artifact of your emulating a (mostly) 16-bit platform — not an indicator of the intrinsic value of "integer halves." Were you emulating a 32-bit system I think you'd have less desire for halving integers (because new_addr.lo().hi() isn't nearly as pretty).

I don't think these operations (lo()/hi()) are sufficiently general/primitive to merit the API surface area in std or core that you're suggesting. Maybe implement them in your own codebase first?
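Trying this in your own codebase might look something like the sketch below. The `Halves` trait is an assumption made up for illustration, not the API from the proposal, and nothing like it exists in std or core.

```rust
/// Hypothetical extension trait; neither the trait nor its methods exist in std.
trait Halves {
    type Half;
    /// Least significant half of the integer.
    fn lo(self) -> Self::Half;
    /// Most significant half of the integer.
    fn hi(self) -> Self::Half;
}

impl Halves for u16 {
    type Half = u8;
    // A truncating cast keeps only the low 8 bits.
    fn lo(self) -> u8 {
        self as u8
    }
    fn hi(self) -> u8 {
        (self >> 8) as u8
    }
}

impl Halves for u32 {
    type Half = u16;
    fn lo(self) -> u16 {
        self as u16
    }
    fn hi(self) -> u16 {
        (self >> 16) as u16
    }
}
```

On a `u16` the halves are bytes, so `addr.hi()` reads naturally. On a `u32`, getting at a single byte takes a chain such as `addr.lo().hi()`, which is the awkwardness mentioned above.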


I’m building a Game Boy emulator and I want exactly the same thing. However, I agree with cbiffle: this is more likely an artefact of building an emulator than a general use case, similar to my issues with unsigned overflow.

I think the idea of bit/byte access could be more general purpose and worthy of consideration. What would it look like?

I will join @cbiffle on this. I do both crypto and bare metal, and in my experience I haven’t seen much need for .lo() and .hi().

In crypto you often need to convert, say, u64 to [u8; 8], or [u8; 64] to [u64; 8], and back, using little or big endianness, preferably without unsafe code, without unnecessary copies, and with as little runtime cost as possible. Converting as little-endian on a little-endian machine would then be equivalent to a simple transmute, while on a big-endian machine it would rewrite the array with swapped bytes. But to implement this properly we need type-level numerics, which are unfortunately not on the horizon right now.
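A minimal safe sketch of the fixed-size case, assuming a toolchain where the integer `to_le_bytes`/`from_le_bytes` methods are available. The function names are hypothetical, and this version always copies rather than being zero-copy on matching endianness.

```rust
/// Reads eight little-endian u64 words from a 64-byte block.
/// `words_from_le_block` is a hypothetical name used for illustration.
fn words_from_le_block(block: &[u8; 64]) -> [u64; 8] {
    let mut words = [0u64; 8];
    for (word, chunk) in words.iter_mut().zip(block.chunks_exact(8)) {
        // chunks_exact(8) only yields 8-byte slices, so copy_from_slice cannot panic.
        let mut bytes = [0u8; 8];
        bytes.copy_from_slice(chunk);
        *word = u64::from_le_bytes(bytes);
    }
    words
}

/// Writes eight u64 words back out as a 64-byte little-endian block.
/// `block_from_le_words` is likewise a hypothetical name.
fn block_from_le_words(words: &[u64; 8]) -> [u8; 64] {
    let mut block = [0u8; 64];
    for (chunk, word) in block.chunks_exact_mut(8).zip(words.iter()) {
        chunk.copy_from_slice(&word.to_le_bytes());
    }
    block
}
```

The single-integer case is just `x.to_le_bytes()` and `u64::from_le_bytes(bytes)`. Swapping in the `_be_` variants gives the big-endian versions.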

As for bare metal, there was a discussion about bitfields somewhere here (can’t find it right now). I think bit-packed structures would be a good addition to the language.


Given the objections to .lo() and .hi() expressed here, and the reluctance to add a dependency on such a trivial crate, I feel this area would be well served by something like the itertools crate: perhaps a bitops crate or similar. That would allow experimentation with the API, and I think people would be more willing to depend on a crate that provides a range of bit-twiddling operations than on one that provides just two specialized ones. Information on how people actually use it could then inform a proposal to add the most useful operations to libstd or libcore.

Also note that the conversion to and from &[u8] in a given endianness can generally be handled by the byteorder crate. I guess that doesn’t give you the zero-copy if the endianness matches, but I’m not sure how important that is.


I think people just use Into and as instead, because they are already part of the language. Personally, I love the conv crate and think that as should be deprecated for numerical casts.


I agree the current situation with the usage of "as" isn't acceptable in a safe language such as Rust. Having dangerous conversions as the default is bad.
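To illustrate what "dangerous" means here, a small sketch comparing `as` with the checked conversion in std. The function names are hypothetical; `u8::try_from` is the actual std API.

```rust
use std::convert::TryFrom;

/// Narrowing with `as`: silently keeps only the low 8 bits.
fn narrow_with_as(x: u16) -> u8 {
    x as u8
}

/// Narrowing with `TryFrom`: returns None when the value does not fit in a u8.
fn narrow_checked(x: u16) -> Option<u8> {
    u8::try_from(x).ok()
}
```

For example, `narrow_with_as(300)` quietly yields 44, while `narrow_checked(300)` yields `None`. Lossless widening can already be written as `u32::from(x)` or `x.into()` without `as`.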


This seems common enough to be worthy of inclusion in the standard library, so it definitely gets my vote.
