# Feature request: #[repr(bool)]

There are values which are fundamentally boolean in nature, but which can be easy to misinterpret. For example, with the Miller-Rabin probabilistic primality test, I might want a type that more clearly explains what each result means:

``````
#[repr(bool)] // proposed attribute; it does not exist in Rust today
#[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
pub enum Primality {
    DefinitelyComposite = false,
    ProbablyPrime = true,
}

impl Primality {
    pub(crate) fn new(value: bool) -> Primality {
        // SAFETY: under the proposal, `Primality` has exactly the layout of `bool`.
        unsafe { core::mem::transmute::<bool, Primality>(value) }
    }

    #[inline(always)]
    pub fn is_definitely_composite(&self) -> bool {
        !(*self as bool)
    }

    #[inline(always)]
    pub fn is_probably_prime(&self) -> bool {
        *self as bool
    }
}
``````

If the function just returns `true` or `false`, it's easy to forget that it doesn't mean definitely prime or definitely composite, both for authors and readers.

Constants such as `const PROBABLY_PRIME: bool = true` don't fix that problem: a programmer who sees `-> bool` on the function will most likely treat the result as a regular `bool` and ignore the constants.

An `enum` with `#[repr(bool)]` has several benefits:

1. The type is very explicit about what its values mean
2. That explicitness will generally be reflected in the code programmers write
3. `#[repr(bool)]` ensures zero-work conversions to `bool`
4. `#[repr(bool)]` allows you to write `Variant1 = true` instead of `Variant1 = 1` as with `#[repr(u8)]`, which better communicates meaning
5. `#[repr(bool)]` enforces a maximum of two variants
6. `#[repr(bool)]` allows explicitly using `value as bool` if desired, instead of something more complicated to understand like `value as u8 != 0` with `#[repr(u8)]`
7. If there ever comes along an architecture where `bool` is more efficiently represented as something other than `0_u8` and `1_u8` (perhaps the machine language requires `u8::MAX` for `true`), programs which use `#[repr(bool)]` will automatically adapt and will already be optimized compared to things like `value as u8 != 0`, which might require a comparison instead of no work at all
8. If the programmer knows how `#[repr(u8)]` and so on work, `#[repr(bool)]` has an obvious interpretation
9. There are no backward compatibility issues with this
10. Works nicely with FFI by allowing us to "wrap" a C boolean in an `enum` and vice versa
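
For comparison, here is a minimal sketch of the closest equivalent that stable Rust accepts today, using `#[repr(u8)]`. It only illustrates the numeric discriminants and the `as u8 != 0` conversion that items 4 and 6 contrast with the proposal; it is not meant as the recommended way to write this type.

```rust
/// Stable-Rust approximation of the proposed enum, using `#[repr(u8)]`.
#[repr(u8)]
#[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
pub enum Primality {
    DefinitelyComposite = 0, // `= false` is rejected: the discriminant must be an integer
    ProbablyPrime = 1,       // `= true` is rejected for the same reason
}

impl Primality {
    /// The conversion that item 6 calls harder to read than `value as bool`.
    pub fn is_probably_prime(self) -> bool {
        self as u8 != 0
    }

    pub fn is_definitely_composite(self) -> bool {
        self as u8 == 0
    }
}
```

Nothing here stops a third variant such as `Unknown = 2` from being added later, which is the gap that item 5 points at.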

What are your opinions on this feature request?


It's probably best to have a separate explicit list of just what is different from using `#[repr(u8)]`.

I think it'd just be

• compiler-enforced variant count ≤ 2
• variant assignment `= true` instead of `= true as u8`
• theoretical platforms where `sizeof(_Bool)` is not `sizeof(uint8_t)`
• theoretical platforms where `transmute(true) != true as u8`
  • Rust defines the `as` to portably produce 0/1 independent of what the byte representation is
• `as bool` support on the enum
  • note that `as` is generally disliked and `From`/`Into` should be derived/used instead
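
As a rough illustration of the last two points, the sketch below writes the discriminants as `= true as u8` and uses `From` impls in place of `as` casts. The impls are hand-written for this example; nothing in the standard library derives them.

```rust
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Primality {
    // What `#[repr(u8)]` requires today instead of `= false` / `= true`.
    DefinitelyComposite = false as u8,
    ProbablyPrime = true as u8,
}

impl From<bool> for Primality {
    fn from(value: bool) -> Self {
        if value {
            Primality::ProbablyPrime
        } else {
            Primality::DefinitelyComposite
        }
    }
}

impl From<Primality> for bool {
    fn from(value: Primality) -> Self {
        matches!(value, Primality::ProbablyPrime)
    }
}
```

With these in place, call sites can write `bool::from(p)` or `Primality::from(flag)` and never need an `as` cast.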

Could you provide more use cases? I think as long as you can represent a value as two variants, it's okay to represent it directly as a bool, where you just need to have a function called "is_xxx".

In your specific case, the reason why you want to have a standalone enum is that there is actually a third case (which you didn't include): definitely prime. So this enum should actually contain three variants.

That's incorrect. The probabilistic primality test is called that because it's probabilistic, not deterministic. There is no third variant for the result it returns. It can tell you either that it's definitely composite or probably prime and can provide no further information.

Misunderstandings like this are a primary reason why this is useful.
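
To make the two outcomes concrete, here is a minimal sketch of a Miller-Rabin test for `u64` that returns a plain two-variant enum. The signature `miller_rabin(n, bases)` and the helpers `mul_mod` and `pow_mod` are assumptions made for this illustration, not code from the thread; the caller supplies the bases to try.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Primality {
    DefinitelyComposite,
    ProbablyPrime,
}

/// Computes `(a * b) % m` without overflow by widening to `u128`.
fn mul_mod(a: u64, b: u64, m: u64) -> u64 {
    ((a as u128 * b as u128) % m as u128) as u64
}

/// Computes `base^exp % m` by square-and-multiply.
fn pow_mod(mut base: u64, mut exp: u64, m: u64) -> u64 {
    let mut result = 1 % m;
    base %= m;
    while exp > 0 {
        if exp & 1 == 1 {
            result = mul_mod(result, base, m);
        }
        base = mul_mod(base, base, m);
        exp >>= 1;
    }
    result
}

/// Runs one Miller-Rabin round per entry of `bases`. Requires `n >= 2`.
pub fn miller_rabin(n: u64, bases: &[u64]) -> Primality {
    assert!(n >= 2, "0 and 1 are neither prime nor composite");
    if n < 4 {
        return Primality::ProbablyPrime; // 2 and 3
    }
    if n % 2 == 0 {
        return Primality::DefinitelyComposite;
    }
    // Write n - 1 as d * 2^s with d odd.
    let s = (n - 1).trailing_zeros();
    let d = (n - 1) >> s;
    'bases: for &a in bases {
        let a = a % n;
        if a < 2 || a == n - 1 {
            continue; // trivial base, carries no information
        }
        let mut x = pow_mod(a, d, n);
        if x == 1 || x == n - 1 {
            continue; // this base does not expose n as composite
        }
        for _ in 1..s {
            x = mul_mod(x, x, n);
            if x == n - 1 {
                continue 'bases;
            }
        }
        return Primality::DefinitelyComposite; // `a` is a witness
    }
    Primality::ProbablyPrime
}
```

For example, `miller_rabin(2047, &[2])` returns `ProbablyPrime` even though 2047 = 23 × 89, because 2047 is a strong pseudoprime to base 2. Adding base 3 exposes it as `DefinitelyComposite`. A `ProbablyPrime` result can be wrong; a `DefinitelyComposite` result cannot.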


Which part is incorrect?

There is a certain range in which integers can be checked deterministically with Miller-Rabin tests. If you don't want to include that branch, usually people will use an `is_probable_prime` name for the primality check.

Yes, it's true for just about all probabilistic algorithms that the uncertain outputs for some inputs can be correctly used as if they were certain. That extra knowledge of which inputs that happens with doesn't change the fact that the algorithm has exactly two possible outputs and that they have the meanings that I mentioned.

The `is_probable_prime` name is not sufficiently clear. What does `false` mean? Does it mean that it's much more improbable for it to be a prime or does it mean that it's definitely not prime?

If you want to improve on the algorithm itself by providing a third variant, that's fine, but there are benefits to not doing that extra work when it can't apply (for example, RSA key generation can't use facts about small integers). An implementation that skips that work is acceptable in those cases, and the `enum` I gave earlier is useful there.


What's the main goal here: allowing `foo as bool` for bool-like enums? If so, I definitely don't want this. stdlib has been trying to move away from `as` because it can be error-prone in many situations. I don't think expanding it is what we want/need.


The main goal is to give a clearer meaning to values that are traditionally represented as `bool`s. A secondary goal is to represent them as `bool`s so that, for example, they can be given via FFI to C or received via FFI with no problems and no work at all as far as conversions.


The name is not that clear by itself, but the documentation could be sufficient to make it clear. It doesn't seem worth it to me to add a language feature for this.

As far as I understand it, the entire goal of `enum`s is specifically for cases like this. The documentation could be sufficient to make it clear that 1 stands for a red light on a stoplight in code that controls an actual stoplight, but that's not what we do.

To use an `enum` instead brings several benefits as far as making code more explicit to aid understanding of what's going on.

It also explicitly reinforces the meaning in the code itself, providing less need to memorize the documentation's exact details.


The documentation for `size_of` in `std::mem` states that `size_of::<bool>()` is `1` in Rust, though, so I think `repr(bool)` would have to be `1` as well, regardless of what C says.
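
These guarantees can be checked directly. The helper name `bool_layout` below is made up for this sketch; `size_of` and `align_of` are the real `std::mem` functions.

```rust
use std::mem::{align_of, size_of};

/// Returns `(size, alignment, value of true as u8)` for Rust's `bool`.
pub fn bool_layout() -> (usize, usize, u8) {
    (size_of::<bool>(), align_of::<bool>(), true as u8)
}
```

On every target Rust supports, `bool_layout()` returns `(1, 1, 1)`.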


maybe all you want is

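``````
#[derive(Clone, Copy, PartialEq, Eq)]
pub struct Primality(bool); // newtype wrapper implied by `Self(true)` below

impl Primality {
    pub const ProbablyPrime: Primality = Self(true);
    pub const DefinitelyComposite: Primality = Self(false);
}
``````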

with

``````
impl std::fmt::Display for Primality {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "{}", if self.0 { "ProbablyPrime" } else { "DefinitelyComposite" })
    }
}
``````

Not really. That can't, for example, be used easily in `match` statements. I really do mean an `enum`.

It appears I was wrong. This is a solution if the `#[repr(bool)]` suggestion isn't accepted. I can use `#[repr(transparent)]` on it and then put `#[allow(non_upper_case_globals)]` on the constants.

This seems to have some issues that `enum`s don't, though. It's harder to write the `use` line for them, for example. You can't just do `use crate_name::Primality::*;` like you can with an `enum`.

It seems a bit like trying to pretend to be an `enum` because we want it to work like one, but not quite having the real thing.
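
A minimal sketch of that workaround, with both attributes applied, might look like the following. The `describe` function is a hypothetical helper added only to show the constants used as `match` patterns.

```rust
/// Newtype over `bool`; `#[repr(transparent)]` gives it `bool`'s layout.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Primality(bool);

#[allow(non_upper_case_globals)]
impl Primality {
    pub const ProbablyPrime: Primality = Primality(true);
    pub const DefinitelyComposite: Primality = Primality(false);
}

/// Hypothetical helper: associated constants work as `match` patterns
/// because the type derives both `PartialEq` and `Eq`.
pub fn describe(p: Primality) -> &'static str {
    match p {
        Primality::ProbablyPrime => "probably prime",
        _ => "definitely composite",
    }
}
```

The constants are associated items rather than variants, so they always have to be written with the `Primality::` prefix; there is no glob import that brings them into scope.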


In 2018, we decided that `bool` is ABI- and layout-compatible to C `_Bool`, and that `sizeof _Bool == 1` on all platforms Rust currently targets.

If my memory serves me well, we deferred the decision of what exactly this means for targets where `sizeof _Bool != 1`. IIRC, the options are that Rust doesn't support those platforms, or that `bool` is still `_Bool` and whatever downsides apply. Those downsides apply equally to C, which also mandates that `_Bool` acts as an integer containing either 0 (false) or 1 (true); given C23's mandate of two's complement representation, this may mandate that `_Bool` is byte-equivalent to an integer of the same size storing 0 or 1.

I'm not seeing the motivating use case here. Why write `value as bool` when you could write `value == Primality::ProbablyPrime`? The latter is clearer at the use-site, which is the main point of using an enum in the first place.

You could also easily define an `as_bool()` method for your enum, although it would probably be better to name it `is_probably_prime()` (i.e. just implement your definition of `is_probably_prime()` manually instead of leaning on a new Rust feature). In the former case, `repr(bool)` would be a minor convenience at best, and seems like it would mainly just tempt the user into a less-readable format.

Unfortunately, this is more complicated than you might think, because different C compilers don't agree on whether a bool is 1 byte or 4 bytes. The unstable `core::ffi` module doesn't even define a "c_bool" type, presumably for this reason. So you'll always need an explicit conversion (which is probably what you want to do anyway, for the semantics reasons).


As far as I am aware, it is still the case that C99 `_Bool` is one byte on all targets Rust supports (where `_Bool` exists, at least; it may not on some of the retro targets).

What is common is to see a `BOOL` macro from pre-C99 which is `int`.

Props to C99 for finally settling on an answer, but I doubt there will ever come a time where Rust FFI is no longer used to interface with old legacy code, so it'll always be in our interest to avoid making features where you can easily miss a possible gotcha.

Ah, you skipped a step there.

• Rust's `bool` always has size/align 1.
• Rust's `bool` always matches C's `_Bool` in ABI, not any legacy `BOOL` (of which there wouldn't necessarily be one standard one per target anyway).
• C `_Bool` does not have size/alignment 1 on all platforms (the one I know is 32-bit PowerPC Macs, where its size and alignment were 4).
• Rust does not currently support any such targets.

None of these say anything about targets that don't have a C99 `_Bool`, because Rust doesn't support any such targets. But if it did, `bool` matching some arbitrary C typedef wouldn't break anything more than a non-size-1-align-1 `_Bool` could.

I agree with the original poster that being able to define an enum whose representation is ABI-compatible with `_Bool` is interesting and useful, and that `repr(bool)` is a reasonable way to spell that.

EDIT: Because C++11 allows `enum foo_t : bool` (their equivalent of `repr(primitive)`), this could be considered a C++ interop feature.


What's wrong with this?

``````
#[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
pub enum Primality {
    DefinitelyComposite,
    ProbablyPrime,
}

impl Primality {
    pub(crate) fn new(value: bool) -> Primality {
        match value {
            true => Self::ProbablyPrime,
            false => Self::DefinitelyComposite,
        }
    }

    #[inline(always)]
    pub fn is_definitely_composite(&self) -> bool {
        // `Self::` is required: a bare identifier here would be a catch-all binding.
        matches!(self, Self::DefinitelyComposite)
    }

    #[inline(always)]
    pub fn is_probably_prime(&self) -> bool {
        matches!(self, Self::ProbablyPrime)
    }
}
``````

Just as readable, doesn't require any new language features.


One of Rust's selling points is "Empowering everyone to build reliable and efficient software." One of the ways it does that is by using C++'s idea of zero-cost abstractions.

It's more efficient to deal with something that already has the exact same representation as a `bool` so that there are no conversion costs at all. This fits with the desire for zero-cost.

This is the same reason that, for example, `#[repr(u8)]` exists: so that the representation of the type and the exact values of each variant can be controlled by the programmer for efficiency reasons (for example, by eliminating conversion costs to and from an "unwrapped" `u8` value).

Without `#[repr(u8)]`, you can still use a few `match` statements to convert back and forth, but it can have a cost if the underlying variants aren't the same bit patterns as their `u8` version or, even if those are the same, if the compiler doesn't realize it can eliminate the conversion.
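
The two conversion styles can be put side by side. This is a sketch under the assumption that the discriminants are chosen to match `bool`'s byte values (0 and 1); the function names are invented for the example.

```rust
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Primality {
    DefinitelyComposite = 0,
    ProbablyPrime = 1,
}

/// Safe conversion; relies on the optimizer to notice that it is a no-op.
pub fn from_bool_match(value: bool) -> Primality {
    match value {
        false => Primality::DefinitelyComposite,
        true => Primality::ProbablyPrime,
    }
}

/// Bit-for-bit conversion with no branch in the source.
pub fn from_bool_transmute(value: bool) -> Primality {
    // SAFETY: `bool` is one byte holding 0x00 or 0x01, and both values are
    // valid discriminants of this one-byte `#[repr(u8)]` enum.
    unsafe { core::mem::transmute::<bool, Primality>(value) }
}
```

Both functions return the same variant for the same input. The difference is that the `transmute` version needs `unsafe` and silently becomes unsound if someone later changes the discriminants, which is the kind of bookkeeping a `#[repr(bool)]` attribute would hand to the compiler.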