Add methods that return the number of bits necessary to represent an integer in binary to the standard library

Hi internals,

Currently, the number of bits necessary to represent an integer in binary to can be obtained by:

For unsigned integers:

let answer: u16 = 42; // 0b10_1010
assert_eq!(answer.checked_ilog2().map_or(0, |n| n + 1), 6);

For signed integers (excluding the sign bit and leading zeros):

let n: i16 = -37; // -0b10_0101
assert_eq!(n.unsigned_abs().checked_ilog2().map_or(0, |n| n + 1), 6);

Methods for this purpose appear to exist in the standard library of several languages:

The above methods can achieve the goal, but I think they are a bit complicated.

I think there is some demand for this, so it would be useful to have methods for this purpose.

For unsigned integers:

assert_eq!(u16::MIN.bit_len(), 0); // 0b0
assert_eq!(1_u16.bit_len(), 1); // 0b1
assert_eq!(42_u16.bit_len(), 6); // 0b10_1010
assert_eq!(u16::MAX.bit_len(), 16); // 0b1111_1111_1111_1111

For signed integers, returns the number of bits necessary to represent an integer in binary, excluding the sign and leading zeros (-37 is considered 37), similar to Python's int.bit_length():

assert_eq!(i16::MIN.bit_len(), 16); // -0b1000_0000_0000_0000
assert_eq!((-37_i16).bit_len(), 6); // -0b10_0101
assert_eq!((-1_i16).bit_len(), 1); // -0b1
assert_eq!(i16::default().bit_len(), 0); // 0b0
assert_eq!(1_i16.bit_len(), 1); // +0b1
assert_eq!(37_i16.bit_len(), 6); // +0b10_0101
assert_eq!(i16::MAX.bit_len(), 15); // +0b111_1111_1111_1111

Comments to this suggestion are welcome.

Thanks.

bits.Len is just ilog2, and bit_length is just ilog2(abs(x)), as far as i can tell.

Note that if you want to concretely make a proposal, you'll want an ACP: https://std-dev-guide.rust-lang.org/development/feature-lifecycle.html#suitability-for-the-standard-library

You can of course also do u16::BITS - u16::leading_zeros(answer), if you want to avoid the map_or.

Why would you say that 0 takes 1 bit to represent? Why isn't it zero bits?

Because I thought that at least one bit is needed to represent an integer. But this is probably wrong and I think your point is correct.

I removed the method that returns the number of bits necessary to represent an integer in binary, including the sign bit. Because there is probably no such method in the standard library of other languages.

The number of bits of a positive integer n is the integer part of 1 + log2(n). So I think they represent number_of_bits - 1.

I thought about this again, the existing API might be enough. However, I don't think everyone knows that they can get the number of bits using 1 + log2(n). So it might be useful to have a mention of this in the descriptions of ilog2() and checked_ilog2(). If the mention exists, then it is probably not necessary to add methods for this purpose.

That's just... what a logarithm is, though? Is it up to the api docs to teach people what a logarithm is? I think not.

But, if a developer doesn't know what a logarithm is, or doesn't understand that positional notation is based on exponents, how are they supposed to find that method? This type of problem solving seems better suited to something like stackoverflow.

I think you need to give a stronger argument on the motivation. Refering to other languages standard libraries is not enough. Especially, if they aspire to be all-batteries-included. Rust already offers leading_zeros and friends which correspond to x86 bit manipulation instructions. u16::BITS - x.leading_zeros() is already very simple code to get what you want. You have to consider that adding more methods to the primitive interger types makes them harder to discover.

1 Like

I think precedence across several standard libraries is pretty compelling[1], especially as Rust is, itself, fairly batteries included. But being literally one off ilog2() does seem like it's reasonably disqualifying, I think the only cases where that sort of thing has shown up is too avoid out of bounds which doesn't apply here.

Perhaps if something like num-traits got pull in at some point? Depending on representation, required bits might not exactly match 1+ ilog2(), but that's its own issue.

[1] Being just a random user, though, I couldn't say whether it's sufficiently compelling, of course.

  1. Since you need to handle ilog2(0) case, it's not just ilog2(n) + 1, there is some value in this function. I also like, that unlike ilog2 it works well for all inputs, without special cases.

  2. I'm not convinced it should be implemented for signed integers, since it's not clear what the bit-length of a negative integer is supposed to be. OP chose to use bit_len(-n), while I though about bit_len(!n), i.e. the number of bits, excluding leading 1s.

  3. I think len is too generic a name for what it does, and perhaps even bit_len is a bit too generic still. Not sure what the best name would be.

    Perhaps used_bits,used_bit_count or used_bit_len?

Offtopic, but I've always found the expm1 family of functions so weird. I mean, I get why they exists, but it still feels wrong. why not introduce sinm1, cbrtm1 and absm1 while we're at it.

Iirc it's a numeric precision thing? I've never had a use myself but I remember something about that looking it up at some point

1 Like

It's because sin(x) - 1.0 is numerically stable for most x whereas exp(x) - 1.0 is not. cosm1 is potentially useful because cos(x) - 1.0 is unstable for small x.

4 Likes