[Pre-RFC]: leading_ones_trailing_ones

yoshuawuyts · November 4, 2018, 5:37pm

Hi! I wrote my very first RFC today! I’m posting it here on internals first to answer two questions:

Are these additions too small to write an RFC for?
Is the language here good enough for an RFC?

I think I’ve formulated this as an RFC because I’m more comfortable drafting text than submitting a PR to the compiler. So ummm, I hope this is good! Keen to receive feedback!

Feature Name: leading_ones_trailing_ones
Start Date: 2018-11-04
RFC PR: (leave this empty)
Rust Issue: (leave this empty)

Summary

We propose adding the .leading_ones(), and .trailing_ones() methods to numbers as counterparts to .leading_zeros(), and .trailing_zeros().

Motivation

Symmetry

In Rust there are currently 4 operations that count ones and zeros inside the binary representation of numbers:

<num>::count_ones()
<num>::count_zeros()
<num>::leading_zeros()
<num>::trailing_zeros()

count_zeros() has a counterpart in count_ones(), but all other methods do not have any counterparts.

In the Rust ecosystem it’s easy to get used to the symmetry in methods. For example: Option::is_none() is balanced with Option::is_some(), and <num>::rotate_left() is balanced with <num>::rotate_right().

Adding counterparts to .leading_zeros() and .trailing_zeros() would create symmetry in those APIs, which seems like a natural fit for Rust’s stdlib.

Expressiveness & Reducing Errors

To count the number of trailing ones in a number, you’d currently have to write something similar to:

fn do_something(i: u8) {
  let x = (!i).count_zeros();
  // (...)
}

This is not ideal for a few reasons. The binary negation operator (!) might not be familiar for a lot of people. In languages such as JavaScript it would evaluate to a boolean, and for people coming from those languages it might be a little confusing at first sight what’s going on.

Also there’s some operator precedence to learn here: the negation operator is evaluated after the method call. This requires the addition of braces to ensure the statements are evaluated in the right order.

Instead this would be more expressive, and less prone to errors:

fn do_something(num: u8) {
  let x = i.count_ones();
  // (...)
}

Guide-level explanation

`<num>::trailing_ones()`

The .trailing_ones() method would return the number of trailing ones in the binary representation of the number.

let n = 0b1001011u8;
assert_eq!(n.trailing_ones(), 2);

This method would be available on all numerical types.

`<num>::leading_ones()`

The .leading_ones() method would return the number of leading ones in the binary representation of the number.

let n = 0b1001011u8;
assert_eq!(n.leading_ones(), 1);

This method would be available on all numerical types.

Reference-level explanation

`<num>::trailing_ones()`

The .trailing_ones() method would be implemented as part of libcore/num/mod.rs, and added to each numerical type:

#[rustc_const_unstable(feature = "const_int_ops")]
#[inline]
pub const fn trailing_ones(self) -> u32 {
    (!self).trailing_zeros()
}

This implementation would be similar to count_zeros().

The difference in (optimized) x86 assembly would be a single instruction:

trailing_zeros:
  movzx   eax, dil
  or      eax, 256
  tzcnt   eax, eax
  ret

trailing_ones:
  not     dil       ; added
  movzx   eax, dil
  or      eax, 256
  tzcnt   eax, eax
  ret

`<num>::leading_ones()`

The .leading_ones() method would be implemented as part of libcore/num/mod.rs, and added to each numerical type:

#[rustc_const_unstable(feature = "const_int_ops")]
#[inline]
pub const fn leading_ones(self) -> u32 {
    (!self).leading_zeros()
}

This implementation would be similar to count_zeros().

The difference in (optimized) x86 assembly would be a single instruction:

leading_zeros:
  movzx   eax, dil
  lzcnt   eax, eax
  add     eax, -24
  movzx   eax, al
  ret

leading_ones:
  not     dil       ; added
  movzx   eax, dil
  lzcnt   eax, eax
  add     eax, -24
  movzx   eax, al
  ret

Drawbacks

API Surface Growth

Adding this feature would increase the overall API of Rust by introducing two new methods to every number type. However this increase only adds symmetrical counterparts to existing APIs, so there should be a minimal amount of extra learning needed.

No Direct Mapping To Intrinsics

The addition of these methods doesn’t have a direct mapping to any Rust intrinsics, and/or LLVM intrinsics.

For .trailing_zeros there’s the cttz intrinsic, which in LLVM maps onto llvm.cttz. A similar mapping exists in .leading_zeros() through ctlz. This mapping would not exist for the two APIs we’re proposing in this RFC.

However there is prior art for a non-direct mapping in .count_zeros() and .count_ones(). .count_zeros() is implemented as a negation of .count_ones(), which would be similar to what we’re proposing:

pub const fn count_zeros(self) -> u32 {
  (!self).count_ones()
}

Rationale and alternatives

Do Nothing

This RFC doesn’t introduce anything that wasn’t possible before, but instead intends to codeify patterns into methods. This lowers the bar for accessing this functionality, and provides a more streamlined experience.

Introduce As A `NumExt` library

It might be argued that convenience methods should live in user land, as part of libraries that provide extension traits to stdlib.

However, that would seem like it would have the same discovery problems as learning the (!<num>).<method> pattern, and could arguably be considered too small of a library which probably would see little usage. Making it effectively the same as doing nothing.

Naming

Arguably using the *_ones() suffix might lead to confusing. When reading .trailing_ones() in code without context, it might be reasonable to ask: “Which ones are trailing?”, and “Do they feel lonely?”.

An alternative naming scheme might include a *_1s() suffix, removing some of the ambiguity. However this would break the precedent set in methods such as .count_ones(). Therefor it seems best to follow the existing scheme of pairing zeros with ones.

Prior art

Rust’s Stdlib

Rust’s <num>::count_zeros() method is implemented as a binary negation of <num>::count_ones(), which maps to the popcnt instruction.

In x86 assembly, the difference between the two methods is a single not instruction:

count_ones:
  movzx   eax, dil
  popcnt  eax, eax
  ret

count_zeros:
  not     dil        ; added
  movzx   eax, dil
  popcnt  eax, eax
  ret

For the methods we’re proposing, the difference would be similar.

C++

We’re currently not aware of any prior art for this method in other languages. Most of the prior art we’ve found has been through access to compiler intrinsics such as gcc’s __builtin_clz and msvc’s popcnt.

A quick search on the internet brings up examples to manually combine these methods such as:

#ifdef __GNUG__
int count_ones (unsigned int a) {
  return __builtin_popcount(a) ;
}
#endif // __GNUG__

#ifdef _MSC_VER
int count_ones (unsigned int a) {
  return __popcount(a);
}
#endif // _MSC_VER

This seems to have a reasonable amount of boilerplate, and is less convenient than having a single, portable method available.

Dat Protocol

The Dat protocol makes use of a data structure called “Flat in-order trees”, where entries have a “depth” property that can be calculated by counting the amount of trailing ones in the binary representation of numbers.

 0─┐
   1─┐
 2─┘ │
     3
 4─┐ │
   5─┘
 6─┘

5 in binary = 101 (one trailing 1)
3 in binary = 011 (two trailing 1s)
4 in binary = 100 (zero trailing 1s)

It seems reasonable to assume that more algorithms exist that make use of similar number properties.

Unresolved questions

None.

Future possibilities

While researching how to most efficiently count trailing ones, we found some interesting operations that currently don’t seem to be exposed in Rust.

For example Intel’s bit_scan_* range of instructions might be interesting to add.

There’s probably a balance to be had, as there’s probably such a thing as “too many methods available”. But it might still be interesting for Rust to look at which other methods could be exposed, as it could enable some interesting optimizations for algorithms that might otherwise require some guesswork to achieve.

This would be similar to how SIMD intrinsics have recently be introduced, as a no-guesswork counterpart to auto-vectorization.

H2CO3 · November 4, 2018, 6:33pm

Very nice writeup for a first RFC!

sfackler · November 4, 2018, 6:53pm

Smallish API extensions like these don’t need an RFC. These seem pretty reasonable to add IMO!

Centril · November 4, 2018, 7:25pm

I'm not opposed to this addition; however, this bit can be fixed by writing i.not().count_zeros().

scottmcm · November 4, 2018, 11:40pm

These do have intrinsics, for example https://doc.rust-lang.org/std/intrinsics/fn.ctlz_nonzero.html, introduced in

github.com/rust-lang/rust

Get LLVM to stop generating dead assembly in next_power_of_two

rust-lang:master ← scottmcm:ctz-nz

opened 05:26AM - 09 Jun 17 UTC

scottmcm

+96 -3

It turns out that LLVM can turn `@llvm.ctlz.i64(_, true)` into `@llvm.ctlz.i64(_…, false)` ([`ctlz`](http://llvm.org/docs/LangRef.html#llvm-ctlz-intrinsic)) where valuable, but never does the opposite. That leads to some silly assembly getting generated in certain cases. A contrived-but-clear example https://is.gd/VAIKuC: ```rust fn foo(x:u64) -> u32 { if x == 0 { return !0; } x.leading_zeros() } ``` Generates ```asm testq %rdi, %rdi je .LBB0_1 je .LBB0_3 ; <-- wha? bsrq %rdi, %rax xorq $63, %rax retq .LBB0_1: movl $-1, %eax retq .LBB0_3: movl $64, %eax ; <-- dead retq ``` I noticed this in `next_power_of_two`, which without this PR generates the following: ```asm cmpq $2, %rcx jae .LBB1_2 movl $1, %eax retq .LBB1_2: decq %rcx je .LBB1_3 bsrq %rcx, %rcx xorq $63, %rcx jmp .LBB1_5 .LBB1_3: movl $64, %ecx ; <-- dead .LBB1_5: movq $-1, %rax shrq %cl, %rax incq %rax retq ``` And with this PR becomes ```asm cmpq $2, %rcx jae .LBB0_2 movl $1, %eax retq .LBB0_2: decq %rcx bsrq %rcx, %rcx xorl $63, %ecx movq $-1, %rax shrq %cl, %rax incq %rax retq ```

I'd love to find a good way to expose them, since last I checked LLVM is unable to simplify ctlz(x, false) to ctlz(x, true) when it knows x != 0.

They could always be exposed as an _unchecked method, but it feels like there's something better to find. I was pondering putting them on NonZero*, for example, but those have no such methods yet, so I'm not sure that's right...

yoshuawuyts · November 6, 2018, 1:17am

Okay, yay! I’ve started working on a patch against rustc. It’s a bit scary, but kind of fun!

Will update this thread here once I have a PR

yoshuawuyts · November 6, 2018, 11:52am

Wrote a PR! https://github.com/rust-lang/rust/pull/55715

system · March 25, 2019, 8:31am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Pre-RFC: ergonomics around NonZeroU* and literals language design	11	1679	March 25, 2019
This week's older RFCs	3	1227	March 25, 2019
Idea: In the next edition, stop accepting `0.` as a valid float literal	32	3888	January 15, 2020
Pre-RFC: Extended array literal syntax language design	5	894	September 21, 2019
pre-RFC: Leading commas	24	2205	March 25, 2019

[Pre-RFC]: leading_ones_trailing_ones

Summary

Motivation

Symmetry

Expressiveness & Reducing Errors

Guide-level explanation

<num>::trailing_ones()

<num>::leading_ones()

Reference-level explanation

<num>::trailing_ones()

<num>::leading_ones()

Drawbacks

API Surface Growth

No Direct Mapping To Intrinsics

Rationale and alternatives

Do Nothing

Introduce As A NumExt library

Naming

Prior art

Rust’s Stdlib

C++

Dat Protocol

Unresolved questions

Future possibilities

Related topics

`<num>::trailing_ones()`

`<num>::leading_ones()`

`<num>::trailing_ones()`

`<num>::leading_ones()`

Introduce As A `NumExt` library