Enums should magically get a trait for getting `discriminant` and `variant_count`

Both discriminant and variant_count have the following disclaimer in their docs:

If T is not an enum, calling this function will not result in undefined behavior, but the return value is unspecified.

Why not avoid this pitfall by automatically adding a trait to all enums that identifies them as such? Making discriminant require that trait would break backward compatibility, but the trait itself can have its own versions of discriminant and variant_count, and functions that use these methods can require the trait so that the compiler enforces that they are not used with non-enums.
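
To make the idea concrete, here is a rough sketch of what such a trait could look like. The names `Enum`, `VARIANT_COUNT`, `Shape` and `same_variant` are made up for illustration, and the trait is implemented by hand here, whereas the proposal is for the compiler to provide the impl for every enum.

```rust
use std::mem::{discriminant, Discriminant};

/// Hypothetical trait (not in std). The proposal is for the compiler to
/// implement it for every enum; here it is implemented by hand.
pub trait Enum: Sized {
    /// Stand-in for `variant_count`.
    const VARIANT_COUNT: usize;

    /// Same as `std::mem::discriminant`, but only reachable for `Enum` types.
    fn discriminant(&self) -> Discriminant<Self> {
        discriminant(self)
    }
}

#[allow(dead_code)]
enum Shape {
    Circle(f32),
    Rectangle(f32, f32),
}

// Hand-written impl; assumed to be compiler-generated in the proposal.
impl Enum for Shape {
    const VARIANT_COUNT: usize = 2;
}

/// Generic code opts in with a bound, so passing `&&Shape` or `&u32`
/// is a compile error instead of an unspecified result.
fn same_variant<T: Enum>(a: &T, b: &T) -> bool {
    a.discriminant() == b.discriminant()
}

fn main() {
    let a = Shape::Circle(1.0);
    let b = Shape::Circle(7.0);
    let c = Shape::Rectangle(2.0, 3.0);
    println!("{} {}", same_variant(&a, &b), same_variant(&a, &c));
    println!("{}", Shape::VARIANT_COUNT);
}
```

The free function `std::mem::discriminant` stays untouched in this sketch, so existing code keeps compiling.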

The search term you're looking for is AsRepr:

// Automatically implemented for all data-free enums with an explicit repr attribute.

I think it's for something different. It talks about "primitive enums", specifically ones with #[repr], and a way to convert them to their simple integer representation. It talks about the discriminant because even if an enum has data in some of its variants, the discriminant itself does not and can thus be considered a "primitive enum".

I'm talking about something different - about the way to get the discriminant from enums, primitive or otherwise.

Yeah, that's part of the conversation. Part of the problem is that we don't actually want people to get discriminants from every enum, since nothing says they should be meaningful. That's why mem::Discriminant doesn't support PartialOrd, for example.
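
As a small illustration of that point: `Discriminant` values can be compared for equality (and hashed), but not ordered. The `Level` enum and `same_kind` helper below are invented for the example.

```rust
use std::mem::discriminant;

#[allow(dead_code)]
enum Level {
    Low(u8),
    High(u8),
}

/// True when both values are the same variant, whatever their payload.
fn same_kind(a: &Level, b: &Level) -> bool {
    discriminant(a) == discriminant(b)
}

fn main() {
    println!("{}", same_kind(&Level::Low(1), &Level::Low(200)));
    println!("{}", same_kind(&Level::Low(1), &Level::High(1)));

    // Ordering is deliberately unavailable; this line would not compile:
    // let _ = discriminant(&Level::Low(1)) < discriminant(&Level::High(1));
}
```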

Thus the big conversation (such as rust-lang/lang-team issue #134, "Plan for enum discrim, repr, and casting") is about having something that can be derived to offer this kind of thing for enums where it makes sense.

I think the better resolution for discriminant is probably just to specify its behavior for non-enum types as returning an arbitrary but consistent (and non-introspectable) value, and for variant_count to return 1.

Indeed that's consistent with treating a struct as sugar for a single-variant enum, which tbf it may as well be?

It's not quite sugar because of visibility - struct is the only way to have non pub fields. But I think that's the only semantic difference between structs and univariant enums. They even use the same code path in the compiler for computing layout, iirc.

Is this ever the right decision though?

In my specific use case, I want to go through one code path when the variant changes, and another when it doesn't but the fields inside the variant may have changed. If someone uses my function with a non-enum type, the variant will be constant and they'll always get the second behavior, which is probably not what they expect.
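
Roughly, the use case looks like the sketch below. `Change`, `State` and `classify` are illustrative names, not the actual code.

```rust
use std::mem::discriminant;

#[derive(Debug, PartialEq)]
enum Change {
    /// The value moved to a different variant.
    VariantChanged,
    /// Same variant; the fields inside may or may not differ.
    SameVariant,
}

#[allow(dead_code)]
enum State {
    Idle,
    Running(u32),
}

/// Picks the code path by comparing discriminants only.
/// Nothing stops `T` from being a non-enum, in which case the
/// result of `discriminant` is unspecified.
fn classify<T>(old: &T, new: &T) -> Change {
    if discriminant(old) != discriminant(new) {
        Change::VariantChanged
    } else {
        Change::SameVariant
    }
}

fn main() {
    println!("{:?}", classify(&State::Idle, &State::Running(1)));
    println!("{:?}", classify(&State::Running(1), &State::Running(2)));
}
```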

I don't know if it's representative of the common use case for discriminant, but I'd imagine that if someone ends up using this function with a non-enum it's because they've made a mistake (probably a cascading one, since passing a non-enum directly to it would have been easy to spot). And the fact that its behavior for non-enums is currently unspecified supports this opinion (better than undefined, but still not something one should rely on). I thought I'd have to work hard to construct an example of this pitfall, but it turned out simpler than anticipated:

use std::collections::HashSet;
use std::mem::discriminant;

#[derive(Clone)]
enum Shape {
    Circle(f32),
    Rectangle(f32, f32),
    Triangle(f32, f32, f32),
}

fn count_different<T>(items: impl IntoIterator<Item = T>) -> usize {
    items
        .into_iter()
        .map(|item| discriminant(&item))
        .collect::<HashSet<_>>()
        .len()
}

fn main() {
    let shapes = &[
        Shape::Circle(1.0),
        Shape::Rectangle(2.0, 3.0),
        Shape::Triangle(4.0, 5.0, 6.0),
        Shape::Circle(7.0),
    ];
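    // T = &Shape here, so discriminant is called on a reference, not on the enum.
    println!("Wrong answer: {}", count_different(shapes));
    // T = Shape here, so each variant yields its own discriminant.
    println!("Right answer: {}", count_different(shapes.to_owned()));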
}

count_different is only meaningful when T is an enum. But when I used count_different(shapes), T was a reference to an enum - so discriminant gave a constant value and it did not do its job.

Should a reference to an enum be treated like an enum or like a struct? I say neither - the Rusty standards dictate that such code should be rejected, and that the compiler errors will guide me to write the correct code that does what I meant to do.

I'm not saying that there shouldn't be some Enum trait (for count_different it would make sense), just that at least for the free function discriminant there's an obviously correct semantic which could be given to non-enum types. A compile-time restriction is better, but count_different can (assuming variant_count always returns 1 for non-enum) include an assert!(variant_count::<T>() > 1), panicking if the result would be meaningless. (And with some trickery, it's already possible to turn a const variant_count into a post-monomorphization error for this.)
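
One possible shape of that trickery, as a sketch only: since `std::mem::variant_count` is unstable, the hypothetical `VariantCount` trait below stands in for it, and `RequireMultipleVariants` is an invented helper. The associated constant is evaluated per concrete `T`, so a failing assert becomes a build error rather than a runtime panic.

```rust
use std::collections::HashSet;
use std::marker::PhantomData;
use std::mem::discriminant;

/// Hypothetical stand-in for the unstable `std::mem::variant_count`,
/// so that the sketch builds on stable.
trait VariantCount {
    const COUNT: usize;
}

#[allow(dead_code)]
struct RequireMultipleVariants<T>(PhantomData<T>);

impl<T: VariantCount> RequireMultipleVariants<T> {
    /// Evaluated once per concrete `T` that is actually used.
    const CHECK: () = assert!(T::COUNT > 1, "T must have more than one variant");
}

fn count_different<T: VariantCount>(items: impl IntoIterator<Item = T>) -> usize {
    // Referencing the constant forces its evaluation during monomorphization.
    let () = RequireMultipleVariants::<T>::CHECK;
    items
        .into_iter()
        .map(|item| discriminant(&item))
        .collect::<HashSet<_>>()
        .len()
}

#[allow(dead_code)]
enum Shape {
    Circle(f32),
    Rectangle(f32, f32),
    Triangle(f32, f32, f32),
}

impl VariantCount for Shape {
    const COUNT: usize = 3;
}

fn main() {
    let shapes = vec![
        Shape::Circle(1.0),
        Shape::Rectangle(2.0, 3.0),
        Shape::Circle(7.0),
    ];
    println!("{}", count_different(shapes));
    // A type whose COUNT is 1 would fail to build as soon as
    // count_different is instantiated with it.
}
```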

For integers and floating point types, would the discriminant just be the underlying bits of the value? Is it an issue that the number of variants of u64 (not to mention u128) doesn't fit in a usize?

And what about floating points? One could argue that IEEE 754 is essentially an enum:

pub enum f32 {
    Finite {
        base: i24,
        exponent: i8,
    },
    PositiveInfinity,
    NegativeInfinity,
    QuietNaN,
    SignalingNaN,
}

So should variant_count::<f32>() == 5? And should discriminant return the same value for all finite numbers but a distinct value for each special case?

I don’t think there’s any reasonable definition of the "discriminant" of a fundamental type except the same as for structs/single-variant enums, that is, as a zero-sized type ("a zero-bit integer"): i32 has a single variant just like a struct Foo(i32) does.

You're confusing "number of values" with "number of variants". All fundamental types have variant_count() == 1.

What about bool? Is it an enumerated type with two variants?

bool is a scalar with two values. enum Bool { True, False } is an enum with two variants. That they are isomorphic doesn't mean they are equivalent (in this way, while u32 and i32 are isomorphic, their semantics are not the same).

I thought the point of this discussion was that variant_count was undefined for fundamental types and it is hardly obvious to me at least that it should be 1. Forcing this function to be total constrains the compiler in the future. (I'm also not sure how useful a notion of isomorphism is when you have properties that aren't invariant under it.)

Isomorphism just means that there exists a pair of functions f :: A → B and g :: B → A such that f(g(b)) == b and g(f(a)) == a. It doesn't say that all such functions f and g need to participate in such a pairing.

People have wanted to do that for over 8 years, but it hasn't happened yet:

That's a bijection. Isomorphism requires that f and g preserve some properties. Which properties? That depends on the field of mathematics you are working in.
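
The i32/u32 case from earlier in the thread shows the difference. In this sketch, `f` and `g` are the casts in each direction: the round trips are the identity, but ordering is not preserved.

```rust
/// f : i32 -> u32, reinterpreting the same 32 bits.
fn f(a: i32) -> u32 {
    a as u32
}

/// g : u32 -> i32, the inverse reinterpretation.
fn g(b: u32) -> i32 {
    b as i32
}

fn main() {
    // Round trips are the identity, so f and g form a bijection.
    assert_eq!(g(f(-1)), -1);
    assert_eq!(f(g(u32::MAX)), u32::MAX);

    // But f does not preserve ordering: -1 < 0 as i32,
    // while f(-1) is u32::MAX, which is greater than f(0).
    assert!(-1_i32 < 0_i32);
    assert!(f(-1) > f(0));
}
```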

I know what an isomorphism is. Yes, you can cast between i32 and u32 and back and get the identity. Is this set-theoretic notion useful? You can abuse set theory to make String and Option<String> "isomorphic."

I am not trying to make a formal mathematical argument. Integers in general and bool in particular act a lot like enumerated types. You can match on them, exactly as you would with a type declared with enum, and they have finitely many values. There are times when it's convenient to think of them as having multiple variants, and it seems like declaring their variant counts to be one might block some changes in the future.
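
For instance, the compiler already checks exhaustiveness for bool and integer matches the same way it does for an enum. The function names and the `Bool` enum in this sketch are made up for illustration.

```rust
#[allow(dead_code)]
enum Bool {
    True,
    False,
}

/// Exhaustive match on a user-defined enum, no wildcard arm needed.
fn describe_enum(b: Bool) -> &'static str {
    match b {
        Bool::True => "yes",
        Bool::False => "no",
    }
}

/// Exhaustive match on the built-in bool, also without a wildcard arm.
fn describe_bool(b: bool) -> &'static str {
    match b {
        true => "yes",
        false => "no",
    }
}

/// Integer ranges covering every u8 value are accepted as exhaustive too.
fn describe_u8(n: u8) -> &'static str {
    match n {
        0 => "zero",
        1..=127 => "small",
        128..=255 => "large",
    }
}

fn main() {
    println!("{}", describe_enum(Bool::True));
    println!("{}", describe_bool(false));
    println!("{}", describe_u8(200));
}
```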