Optimizing branched assignments based on possible states

Hi. I tried this code on Godbolt:

pub enum Bool {
    False,
    True,
}

pub fn b(b: &mut bool) {
    if *b == false {
        *b = true;
    }
}

pub fn B(b: &mut Bool) {
    if let Bool::False = *b {
        *b = Bool::True;
    }
}

I was expecting something like this as the assembly output:

example::B:
        mov     byte ptr [rdi], 1
        ret

But the compiler doesn't optimize it and doesn't output the above assembly. I don't really know the internals of the bool type, but I know that the compiler is able to see the possible variants, and therefore the possible states, of enums and act accordingly. Since invalid discriminants for enums are also UB, I was expecting the compiler to perform the above optimization. Is there something I'm not aware of that blocks this kind of optimization, or is it just not implemented? If there is no such obstacle, should we implement this in the compiler?
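Since Bool has only two inhabited states, the conditional store is semantically equivalent to an unconditional one. A small sketch (the function names `conditional` and `unconditional` are mine, not from the post) checks that the two forms agree on every reachable state:

```rust
// Two-variant enum from the post above.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Bool {
    False,
    True,
}

// The conditional form as written in the post.
pub fn conditional(b: &mut Bool) {
    if let Bool::False = *b {
        *b = Bool::True;
    }
}

// The unconditional form the hoped-for optimization would emit:
// the only state where the branch is skipped already holds Bool::True,
// so always storing Bool::True cannot change the observable result.
pub fn unconditional(b: &mut Bool) {
    *b = Bool::True;
}

fn main() {
    // Exhaustively check both reachable initial states.
    for init in [Bool::False, Bool::True] {
        let (mut a, mut b) = (init, init);
        conditional(&mut a);
        unconditional(&mut b);
        assert_eq!(a, b);
        assert_eq!(a, Bool::True);
    }
}
```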

While I believe this would be a legal implementation of the function, it's not clear to me that this is an optimization? This does an unconditional memory write, whereas the optimized code from LLVM does a read followed by a conditional write. I don't think either of these two implementations is strictly better than the other.

I don't know which is better for sure either; I thought that a branchless write would be more efficient most of the time.

Given the complexity of modern CPU microarchitectures, I don't think you can make such a claim generally.
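The trade-off can be made concrete with a small, hypothetical instrumented flag that counts stores (the `CountingFlag` type below is an illustration I made up, not anything from the thread): the conditional form never dirties memory that already holds the target value, which can matter for, e.g., a shared cache line, while the unconditional form pays a store every time but avoids the read and the branch.

```rust
use std::cell::Cell;

// Hypothetical wrapper that records how many stores are issued.
struct CountingFlag {
    value: Cell<bool>,
    writes: Cell<u32>,
}

impl CountingFlag {
    fn new(v: bool) -> Self {
        CountingFlag { value: Cell::new(v), writes: Cell::new(0) }
    }
    fn set(&self, v: bool) {
        self.writes.set(self.writes.get() + 1);
        self.value.set(v);
    }
    fn get(&self) -> bool {
        self.value.get()
    }
}

// Read first, write only if needed (the shape LLVM keeps).
fn set_true_conditional(f: &CountingFlag) {
    if !f.get() {
        f.set(true);
    }
}

// Always write (the shape the original poster expected).
fn set_true_unconditional(f: &CountingFlag) {
    f.set(true);
}

fn main() {
    let a = CountingFlag::new(true);
    set_true_conditional(&a);
    assert_eq!(a.writes.get(), 0); // already true: no store issued

    let b = CountingFlag::new(true);
    set_true_unconditional(&b);
    assert_eq!(b.writes.get(), 1); // store happens regardless
}
```

Neither count is "better" in the abstract; which one wins depends on how often the flag is already set and on the surrounding memory traffic.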


I came across this issue on GitHub, and it seems like the reason this code doesn't optimize into an unconditional assignment has to do with the generated LLVM IR: