#1

# Summary

Add two methods, `.lo()` and `.hi()`, to all unsigned integer types except `u8` and `usize`. They return the least (`lo`) or most (`hi`) significant half of the integer.

# Motivation

Low-level code that interacts closely with hardware often needs to manipulate integers in specific ways. One of the most common operations is getting the low or high half of an integer. Currently, this can be accomplished by shifting and converting (masking) the integer:

``````rust
let my_int: u32 = 0xABCDEF55;

let high: u16 = (my_int >> 16) as u16;
let low: u16 = my_int as u16;
``````

To make this common operation shorter and easier to use, we can instead provide two simple methods on integers which do the same thing:

``````rust
let my_int: u32 = 0xABCDEF55;

let high: u16 = my_int.hi();
let low: u16 = my_int.lo();
``````

# Detailed design

Two methods are added to `u16`, `u32`, `u64` and `u128`. `u8` is omitted because it can't be split into smaller integers. `usize` is omitted because its size isn't uniform on all architectures, so the result of the split would also be different on each architecture, possibly causing Rust code to compile on one architecture, but fail to compile on another.

The implementation in `libcore` could look like this:

``````rust
impl u16 {
    fn lo(&self) -> u8 { *self as u8 }
    fn hi(&self) -> u8 { (*self >> 8) as u8 }
}

impl u32 {
    fn lo(&self) -> u16 { *self as u16 }
    fn hi(&self) -> u16 { (*self >> 16) as u16 }
}

impl u64 {
    fn lo(&self) -> u32 { *self as u32 }
    fn hi(&self) -> u32 { (*self >> 32) as u32 }
}

impl u128 {
    fn lo(&self) -> u64 { *self as u64 }
    fn hi(&self) -> u64 { (*self >> 64) as u64 }
}
``````

Here is an implementation on the Rust playground using a `LoHi` trait instead of an inherent impl.
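Since the playground link is not preserved here, a minimal sketch of what such a trait-based version might look like follows; the trait name `LoHi` comes from the post, while the associated-type name `Half` and the exact bounds are assumptions made for illustration:

``````rust
// Hedged sketch of a `LoHi` trait; only the u32 impl is shown,
// the other widths would follow the same pattern.
trait LoHi {
    type Half;
    fn lo(self) -> Self::Half;
    fn hi(self) -> Self::Half;
}

impl LoHi for u32 {
    type Half = u16;
    fn lo(self) -> u16 { self as u16 }
    fn hi(self) -> u16 { (self >> 16) as u16 }
}

fn main() {
    let x: u32 = 0xABCD_EF55;
    // hi takes the upper 16 bits, lo the lower 16 bits.
    assert_eq!(x.hi(), 0xABCD);
    assert_eq!(x.lo(), 0xEF55);
    println!("hi = {:#06x}, lo = {:#06x}", x.hi(), x.lo());
}
``````

The trait form requires an import at every use site, which is exactly the friction discussed in the replies below.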

# Drawbacks

The methods add a bit of API surface to `core`, which needs to be maintained and documented.

# Alternatives

• Continue the bit-fiddling like before

# Unresolved questions

• Should signed integers implement the methods, too?
• Should `usize` also get a `lo` and `hi` method?
• Should the methods be named `low` and `high` instead of `lo` and `hi`?

#2

Why can't this just be implemented in a crate?

#3

For reference, here are a few places where this would've been useful in an emulator I wrote.

#4

It can, of course. But importing an external crate and a trait (which you currently have to do) is more work than just writing `.lo()`, and considering how simple the bit-shift version is, I doubt that many people would import a crate just for this. If it's in `core`, however, there isn't really any excuse not to use them. (I also believe that these methods are pulling their weight for many users - see the links I posted above - so I think they deserve to be in `core`.)

#5

But those arguments are broadly true for all small methods.

For something to go in `std` (let alone `core`), it should be something that's either impossible to do at a higher layer, or is so broadly useful that it's worth freezing in stone forever and having to be maintained until at least the next major release.

If this is really so useful, then you should be able to put it into a crate, and then point at the long list of reverse dependencies, and impressive download stats. If nothing else, this gives you actual evidence that the functionality is desired.

I mean, I figured people would love `conv`. It's the sort of thing that should be in `std`! …but almost no one uses it, which demonstrates rather tragically that most people don't care. Functionality that can be done in an external crate, and which most people don't care about, doesn't deserve to be in `std`; it'd just be a burden on the core devs (for maintenance) and users (larger downloads).

#6

Why is `next_power_of_two()` in the stdlib? Why is `trailing_zeros` in the stdlib? Are they really broadly used?

I really don't understand this level of stubbornness about external crates. Would adding `hi()` and `lo()` really be a maintenance burden? Are the additional hundreds of octets of download really relevant?

#7

What I do know, however, is that if I needed `hi()` and `lo()` in a project, I'd put them in a `utils.rs` file that I'll never touch again in my life. Maintenance burden: zero.

This is exactly what I did earlier today with the `lerp()` function. I put it alongside the `clamp()` function that I've been using as well.

Because at some point, when it takes me more time to search for a crate that contains a function than to write it myself, I don't even bother. If you add up the time it takes to find the function's documentation and to add the crate to your Cargo.toml and your main.rs, you have the time to write it ten times over.

#8

An argument for putting something in std is that it is so trivial that no one will bother to import an external crate for it. This definitely fits that bill. If I need the high 16 bits of a `u32`, I'm not going to bother looking for an external crate that will give me that functionality; I'll just use `(x >> 16) as u16`. But using `.hi()` might be slightly nicer.

#9

Which is also why I think removing the functions that mapped `&T -> &[T; 1]` and `&mut T -> &mut [T; 1]` was such a bad idea, especially since they require `unsafe` to implement.
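For context, a sketch of what those removed helpers presumably looked like, assuming the usual pointer-cast implementation; the names `ref_array` and `mut_array` are made up here:

``````rust
// Hedged sketch: turn a reference to one value into a reference to
// a one-element array, which requires `unsafe` to implement.
fn ref_array<T>(r: &T) -> &[T; 1] {
    // Safety: a `&T` points to exactly one `T`, and `[T; 1]` has
    // the same size and alignment as `T`.
    unsafe { &*(r as *const T as *const [T; 1]) }
}

fn mut_array<T>(r: &mut T) -> &mut [T; 1] {
    // Safety: same layout argument as above; the unique borrow is
    // carried through unchanged.
    unsafe { &mut *(r as *mut T as *mut [T; 1]) }
}

fn main() {
    let mut x = 7u32;
    assert_eq!(ref_array(&x), &[7]);
    mut_array(&mut x)[0] = 9;
    assert_eq!(x, 9);
}
``````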

#10

Because they were added before 1.0, when the standard library was less picky, and somehow survived the great batteries removal. It's unfortunate that libstd is in maintenance, stabilization and polishing status now and effectively frozen for new stuff. I suspect this somehow correlates with the number of people on the libs team who are not super busy with other work (i.e. 0).

#11

For `trailing_zeros` the answer probably is “because it is implemented as an LLVM intrinsic”.
Not sure about `next_power_of_two`, but I'd guess it's because it is used by stdlib collections (to figure out the next bigger size).

#12

TIL about `conv`!

#13

I have two bitwise manipulation crates (both unfinished).

• bitintr is supposed to be in std someday, because otherwise it won't ever be usable on stable (it uses LLVM intrinsics directly, target feature, extern-intrinsic), just like the SIMD crate.

• bitwise implements “higher level” bit manipulation algorithms; it will depend on bitintr for the low-level primitives.

FWIW, since `hi` and `lo` aren't “low level” (as in, they don't map to one asm instruction on any platform) and don't depend on rustc/LLVM intrinsics, I would put them in bitwise, so I guess that means I think they belong outside of std.

#14

They do on x86 in many cases, with effectively zero instructions: given a value you already have loaded into a register, you can access a smaller component of that register (e.g. `ax` is the low 16 bits of `eax`) and use it in the appropriate following instruction.

#15

I like the general concept, and I don't want to let the perfect be the enemy of the good here, but this feels like something better addressed via vector-ish/SIMD types.

I'd love to see Rust support types like “vector of 4 16-bit values stored in a 64-bit value”, or “vector of 8 16-bit values stored in a 128-bit value”. And given such support, many operations will make sense, including access to the individual components (or combinations of components).
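Absent language support, the “vector of 4 16-bit values stored in a 64-bit value” view can be approximated with plain shifts; `lane16` below is a hypothetical helper, with lane 0 arbitrarily chosen as the least significant half-word:

``````rust
// Hedged sketch: index one 16-bit lane out of a u64 "vector".
fn lane16(v: u64, i: usize) -> u16 {
    assert!(i < 4, "a u64 holds four 16-bit lanes");
    (v >> (16 * i)) as u16
}

fn main() {
    // Lanes, least significant first: 1, 2, 3, 4.
    let v: u64 = 0x0004_0003_0002_0001;
    assert_eq!(lane16(v, 0), 1);
    assert_eq!(lane16(v, 3), 4);
}
``````

`.lo()`/`.hi()` would then just be the two-lane special case of such component access.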

#16

with effectively zero instructions

TIL, thanks for pointing this out.

There are some algorithms like `umul` that map to `mulx`, which for 64-bit unsigned integer multiplication returns the high- and low-order bits in two different registers, but for accessing the low-order or high-order bits in a normal register I just thought that the common thing was to use a bitmask.
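That kind of split-result multiply can be sketched portably via `u128` rather than an intrinsic; `umul_hi_lo` is a name invented here, and whether the compiler lowers it to a single widening-multiply instruction is an assumption, not a guarantee:

``````rust
// Hedged sketch: full 64x64 -> 128-bit multiply, returning the
// high and low 64-bit halves separately, as mulx-style
// instructions do.
fn umul_hi_lo(a: u64, b: u64) -> (u64, u64) {
    let wide = (a as u128) * (b as u128);
    ((wide >> 64) as u64, wide as u64)
}

fn main() {
    // (2^64 - 1) * 2 = 2^65 - 2, so hi = 1 and lo = 2^64 - 2.
    let (hi, lo) = umul_hi_lo(u64::MAX, 2);
    assert_eq!((hi, lo), (1, u64::MAX - 1));
}
``````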

#17

To implement this functionality in a nice way we need type-level integers. That would allow adding nice bit-vectors to bitwise, and a better interface to the SIMD crate. I don't know whether it is worth moving anything like that into std before then. I can see the advantages: we could be using functionality like this right now with a less nice interface, and adding a nicer interface later on is not a breaking change.

#18

A problem with such utility functions in a crate is that people have to be aware of the crate, and the time they might need to look for it is around the time they need to implement the functions themselves. Also, a crate which bundles “a bunch of bit operations” might feel like quite a heavy import just for `hi`/`lo`. Though I don't think this on its own is a reason to include it in core. It might be interesting to open some kind of poll to check how many people actually use this functionality in a project. (I would guess, at least, all kinds of serialization, emulation, some-binary-format and embedded-system crates.)

Independent of whether or not adding the functionality is a good idea, I don't like the names `hi`, `lo`. The names `high`, `low` or maybe something else would be preferable, I think.

EDIT: Uh, why wasn't I aware of `conv` before… Thanks for mentioning it.

#19

I feel a trait is actually more useful than inherent impls for this, as it extends the operations to code that is generic over integer size. For example, I implemented a `Halveable` trait to allow splitting and recombining generic integers, which means `roaring-rs` can support bitmaps of anywhere from `u16` to `u64` (including `usize` on 32/64-bit machines) with no runtime cost (assumedly; liberal inlining and basic arithmetic optimizations should remove all the extra code) and no additional code (other than the basic trait impl, and the horribleness the genericity requires in some constraints).
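A minimal sketch of such a trait follows; the real `Halveable` in `roaring-rs` differs, and the method names, bounds and the sole `u32` impl here are assumptions for illustration:

``````rust
// Hedged sketch of a Halveable-style trait: split an integer into
// halves and recombine them, generically over width.
trait Halveable: Copy {
    type Half;
    fn split(self) -> (Self::Half, Self::Half); // (hi, lo)
    fn join(hi: Self::Half, lo: Self::Half) -> Self;
}

impl Halveable for u32 {
    type Half = u16;
    fn split(self) -> (u16, u16) { ((self >> 16) as u16, self as u16) }
    fn join(hi: u16, lo: u16) -> u32 { (hi as u32) << 16 | lo as u32 }
}

// Generic code can now take the high half of any Halveable integer
// without knowing its concrete width.
fn container_key<T: Halveable>(v: T) -> T::Half {
    v.split().0
}

fn main() {
    let (hi, lo) = 0xABCD_EF55u32.split();
    assert_eq!((hi, lo), (0xABCD, 0xEF55));
    assert_eq!(u32::join(hi, lo), 0xABCD_EF55);
    assert_eq!(container_key(0xABCD_EF55u32), 0xABCD);
}
``````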

Similarly, despite `{trailing,leading}_zeros` being implemented in std, I can't use them as they're only inherent impls. I have to include the `num` crate to have a trait that allows access to the functions.

BTW, `num` is a good contender for somewhere that might accept a feature like this if std doesnâ€™t.

#20

Used by some collections.

Maps to a machine instruction / LLVM intrinsic => not feasible to implement in a stable crate.

I feel like a better API (though it does not consider BE vs LE) would be one which returns a `[T; 2]` (where `T` has halved bitwidth), or something similar that would be a no-op to convert into (i.e. has the same representation as the value being split up itself).
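A sketch of that array-returning shape; `split_halves`/`join_halves` are names invented here, and index 0 is arbitrarily chosen as the low half (leaving the BE-vs-LE question the post raises open):

``````rust
// Hedged sketch: split a u32 into a two-element array of u16
// halves, and recombine. [u16; 2] has the same size as u32, so a
// transmute-style conversion could in principle be a no-op.
fn split_halves(v: u32) -> [u16; 2] {
    [v as u16, (v >> 16) as u16] // [lo, hi]
}

fn join_halves([lo, hi]: [u16; 2]) -> u32 {
    (hi as u32) << 16 | lo as u32
}

fn main() {
    let halves = split_halves(0xABCD_EF55);
    assert_eq!(halves, [0xEF55, 0xABCD]);
    assert_eq!(join_halves(halves), 0xABCD_EF55);
}
``````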