Less bug-prone bitflags macro, match annotations for higher performance


#1

I have seen this pull request: https://github.com/rust-lang/rust/pull/36993

It replaces an enum with some bitflags like:

bitflags! {
    #[derive(Debug)]
    flags NodeState: u8 {
        #[allow(non_upper_case_globals)] const Pending = 1 << 0,
        #[allow(non_upper_case_globals)] const Success = 1 << 1,
        #[allow(non_upper_case_globals)] const Waiting = 1 << 2,
        #[allow(non_upper_case_globals)] const Done    = 1 << 3,

In some cases the flags are a little strange, so you need to specify them manually. But in many cases you just need the standard 1,2,4,8,16,… progression. For such common case I think it’s better to have something like a bitflags_default macro that assigns the powers of two automatically. I think this could avoid you some bugs.


The patch notes:

Replaces all the hot matches involving NodeState with if/else chains that ensure that cases are handled in the order of frequency.

The changes are like:

-        match node.state.get() {
-            NodeState::Pending | NodeState::Done => {},
-            NodeState::Waiting | NodeState::Error | NodeState::OnDfsStack => return,
-            NodeState::Success => {
-                node.state.set(NodeState::Waiting);
-            }
+        let state = node.state.get();
+        // For rustc-benchmarks/inflate-0.1.0 this state test is very hot.
+        // Dominant occurrences: `Waiting` 64.2%, `Success` 35.5%
+        if state.intersects(NodeState::Waiting | NodeState::Error | NodeState::OnDfsStack) {
+            return;
+        } else if state == NodeState::Success {
+            node.state.set(NodeState::Waiting);
+        } else if state.intersects(NodeState::Pending | NodeState::Done) {
+            // do nothing
+        } else {
+            unreachable!();

Instead of fixing the problem just for this situation isn’t it better to find a more general solution? I mean, perhaps we can introduce some annotation like:

#[likely=5]

on the branches of the match{}, to tell the back-end what the dominant occurrences are, so the compiler is able to compile the match{} in a way that’s as much efficient as the chain of if/then.


Reviving the bit-data discussion
#2

How much slower would it be to shift at runtime:

let state = node.state.get();
let mask = 1 << state; // The only shift at runtime (the rest will be constant bitmaps).
if (mask & ((1 << NodeState::Waiting) | (1 << NodeState::Error) | (1 << NodeState::OnDfsStack))) != 0 {
    return;
} else if state == NodeState::Success {
    node.state.set(NodeState::Waiting);
} else if (mask & ((1 << NodeState::Pending) | (1 << NodeState::Done))) != 0 {
    // do nothing
} else {
    unreachable!();
}

This looks like an optimization that the compiler could make for small enums (compile a match to the above bit-twiddling). We’d still need some way to state which branch is likely but that’s a different issue.