Can the compiler stuff booleans into a bitmask since it has the borrow checker?

I was thinking about how I was told booleans each take up entire word size to be represented in a program. if we have the borrow checker and know they will only be mutated one at a time or copied, can we push all booleans in a program into one bitmask? In the best case the value would never be copied and would be loaded at each comparison or mutation from a single register for the entirety of a program.

It seems like it would be a vectorizer pass similar to AVX but for bitmasking operations. Does this already exist? It seems possible even without the borrow checker.

1 Like

This sounds like it'd require atomics around any bool access (because who knows if another thread will be doing something with a neighboring bit. CPUs with bit addressing modes are few and far between and I suspect those tend to lack threads so maybe they don't care, but such things are beyond the Rust language spec IMO.

I think if you know you can bitpack, there are crates for you already to manage them (with appropriate lack of access via any &bool or &mut bool handles).

1 Like

The threading model would be on top of the primitive type anyways because of the borrow checker. You need AtomicBool or Arc already for bool so it should be the same as bool thanks to the borrow checker. Essentially for some set of Bool types if they are all bool and not wrapped in another type such as Arc, therefore following the borrow checker, they should be able to be vectorized into a "boolmask" type of size word. its not really bitpacking so much as auto-vectorization of "bit words" to architecture word size.

this seems to be the closest crate in relation to this I could find https://crates.io/crates/bitvec

Because size_of::<bool>() is 1, reference need to be able to replace the whole byte.

One way that packing like you describe could be legal would be something like "copy-only fields" (A #[repr] to allow an enum containing enums to use a single discriminant - #5 by scottmcm) so there's always the opportunity to run code, and thus it's work fairly easily.

6 Likes

It takes one byte, not a word (which would be 2 bytes AFAIK).

No. The reason that the borrow checker can avoid data races is because values in rust are always multiple of bytes, so even if two values are mutated from two different threads they won't touch the same byte. This would no longer be true with your proposal since a bool would be smaller than a byte, so mutating two adjacent bools from two different thread may result in mutating the same byte, thus in a data race.

Maybe an example would make this cleaner:

let mut bools = [true, true];
let (first, second) = bools.split_at_mut(1);
let first = &mut first[0];
let second = &mut second[0];
// Now `first` and `second` are two references that point to adjacent bools

// Using `rayon::scope` to send non-`'static` data
rayon::scope(|scope| {
    // One thread mutates `first`
    scope.spawn(|_| {
        *first = false;
    });
    // The other mutates `second`
    scope.spawn(|_| {
        *second = false;
    });
});

dbg!(first, second);

If bools was a bitmask, the first and second would point to the same byte, and thus modifying them at the same time would result in a data race. Note that this code doesn't use atomics right now, because bool being 1 byte makes this safe even without them. Your proposal however would require every bool to become an AtomicBool.

Not to mention all the code that currently expects bool to have size 1, and thus would break if this was changed.

2 Likes

How will a code like *b = true; compiled? We cannot know the bit pattern for true for *b ahead of time!

hmmm maybe for all Booleans within a scope? seems like this could happen alot with object oriented programming. If a struct contains several booleans its object will be whats borrow checked. But yes this is admittedly not as generalizable as I first thought. If its a tricky enough edge case it should probably just be in a crate.

Obviously boolean pointers would have to become fatter to hold the byte address and 3 bits of index. This could even be done in a library type today. Most of the problems being presented here are not intractable, and even the far-reaching problems can be solved with just sometimes making bools 8 bits anyway (e.g. for atomics).

@Corallus-Caninus bool is currently the smallest addressable size. This keeps it simple (it's like C), makes references to it simple, and it usually doesn't use up that much extra space. Compare it to how often you use a u64/usize when a smaller integer type would have worked. Or how often your structs have literally unused padding bytes. There are performance issues, doing a mask operation every time you dereference may not be what you want. And there are aliasing issues of having multiple mutable references to the same byte (tractable but probably not fun).

Even applying this space optimization only within a function body, only for bools which do not get referenced, would probably just lose a small amount of performance. It would be worth it only if you had exhausted almost all other avenues of compressing stack space.

When you do want to make these tradeoffs though, they can be done explicitly. Bitflags don't (really) exist in Rust yet, but they probably should, and this would allow you to pack a single struct's fields and opt into masking with either fat references or inability to reference. bitvec is there for when you want just a ton of booleans in one place, probably one of the big spots that 8×ing memory isn't helpful.

1 Like

This is exactly what bitvec does, and more. @myrrlyn has written the best bit addressing library out there, in any language.

3 Likes

I've actually personally written code that could benefit from compiler compaction of booleans into a single register. They weren't bools, though, they were enum discriminators...

struct LandmarkFileColumns {
    addr: usize,
    lon: usize,
    lat: usize,
    dist: usize,
}

fn select_columns(header: &csv::StringRecord, location: &str)
    -> Result<LandmarkFileColumns, LandmarkFileMissingColumnsError> {

    let mut addr: Option<usize> = None;
    let mut lon:  Option<usize> = None;
    let mut lat:  Option<usize> = None;
    let mut dist: Option<usize> = None;

    for (index, field) in header.iter().enumerate() {
        match field {
            "addr"             => { addr = Some(index); },
            "longitude"        => { lon  = Some(index); },
            "latitude"         => { lat  = Some(index); },
            f if f == location => { dist = Some(index); },
            _                  => {},
        }
        if let (Some(addr), Some(lon), Some(lat), Some(dist)) =
            (addr, lon, lat, dist) {
                return Ok(LandmarkFileColumns { addr, lon, lat, dist });
            }
    }
    return Err(LandmarkFileMissingColumnsError(/* ... */));
}

Because usize has no niches, at the assembly level each of the mutable Option<usize> variables requires two words, and (on x86-64) we run out of registers.

2 Likes

But then you break pointer casts (&*(&b as *const bool as *const u8 as *const bool)).

You can pack it inside one word by using the unused high bits, but this is only possible on 64 bit platforms.

I mean, it breaks a lot of things for sure if this is applied globally, yes. I guess I am interpreting @Corallus-Caninus's question as "can we have a space optimization pass that packs booleans into bitmasks". My answer would be "yes definitely", with some caveats mentioned in this thread. Reifying bitreferences seems pretty easy, this is already done explicitly in bitvec. Framed as an optimization pass we would not, as you say, want to make this observable.

LLVM doesn't do this optimization even with -Os, as far as I can tell, I assume for some good reason(s). But maybe it would make sense as part of a suite of -Oregister_starved passes? Probably this isn't something you would build into rustc specifically.