[Pre-RFC] Inferred Enum Type

The problem with that reasoning is that people will use it everywhere it's allowed. Unless some piece of code is a hard compiler error, programmers will invent creative excuses to (ab)use every construct, even if it's not recommended or explicitly frowned upon. People think they know better than years or decades of community or industrial experience about what ought to be avoided, and that's how bugs sneak in and complexity accumulates.

6 Likes

Counterexample: we already have constructs which would be even shorter and less readable, such as using a series of boolean arguments to a function, and yet people take steps to make code more readable, such as by creating dedicated self-documenting enums.
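To make the boolean-versus-enum comparison concrete, here is a minimal sketch. Every name in it (render_flags, render, Weight, Underline) is made up for illustration and not taken from any real crate.

```rust
// All names here are hypothetical and exist only for illustration.

/// Boolean-argument version: short to call, but the call site
/// `render_flags(true, false)` says nothing about what each flag means.
fn render_flags(bold: bool, underline: bool) -> String {
    format!("bold={bold} underline={underline}")
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Weight {
    Normal,
    Bold,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Underline {
    Off,
    On,
}

/// Enum version: longer to call, but every argument documents itself.
fn render(weight: Weight, underline: Underline) -> String {
    format!("weight={weight:?} underline={underline:?}")
}

fn main() {
    println!("{}", render_flags(true, false));
    println!("{}", render(Weight::Bold, Underline::Off));
    println!("{}", render(Weight::Normal, Underline::On));
}
```

The second form is the one people tend to reach for voluntarily, even though the first is always available and shorter to type.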

It's already possible to write obfuscated, write-only Rust, but people strive not to.

I do think we should avoid language features that always make the language less readable. And when choosing between language-based solutions to a problem, we should favor solutions that make it easier to write readable code and ideally not easier to write unreadable code.

I think this feature passes that criterion. It's not easier to write unreadable code with it; you can do that just as easily without.

The most compelling argument I've seen against this feature is that it may make code less greppable, since the enum name will appear in fewer places. And I do think that's an important argument to balance.

24 Likes

The impression I get, specifically regarding match expressions, is that:

  • Some people want to avoid the repetition of specifying the matched type for each match arm.
  • Some people want it to be clear, from the match expression alone, what type is being matched.

A potential way to address both of these concerns would be to require the first match arm to explicitly contain the type, then allow _ afterwards. This is sort of like a ditto mark.

Modifying the example from the OP:

pub fn associated_status_code(&self) -> Option<StatusCode> {
    match self {
        HeaderError::SpecificityInvalid => Some(StatusCode::BadRequest),
        _::DateInvalid(_) => Some(StatusCode::BadRequest),
        _::TransferEncodingUnnegotiable => Some(StatusCode::NotAcceptable),
        _::TransferEncodingInvalidEncoding(_) => Some(StatusCode::BadRequest),
        _::TraceContextInvalid(_) => Some(StatusCode::BadRequest),
        _::ServerTimingInvalid(_) => Some(StatusCode::BadRequest),
        _::TimingAllowOriginInvalidUrl(_) => Some(StatusCode::BadRequest),
        _::ForwardedInvalid(_) => Some(StatusCode::BadRequest),
        _::ContentTypeInvalidMediaType(_) => Some(StatusCode::BadRequest),
        _::ContentLengthInvalid => Some(StatusCode::BadRequest),
        _::AcceptInvalidMediaType(_) => Some(StatusCode::BadRequest),
        _::AcceptUnnegotiable => Some(StatusCode::NotAcceptable),
        _::AcceptEncodingInvalidEncoding(_) => Some(StatusCode::BadRequest),
        _::AcceptEncodingUnnegotiable => Some(StatusCode::NotAcceptable),
        _::ETagInvalid => Some(StatusCode::BadRequest),
        _::AgeInvalid => Some(StatusCode::BadRequest),
        _::CacheControlInvalid => Some(StatusCode::BadRequest),
        _::AuthorizationInvalid(_) => Some(StatusCode::BadRequest),
        _::WWWAuthenticateInvalid(_) => Some(StatusCode::BadRequest),
        _::ExpectInvalid => Some(StatusCode::BadRequest),
        _ => None, // Contextually, there are more which end up becoming InternalServerError.
    }
}

I'm imagining this could also be usable with structs, as well:

match some_struct {
    SomeStruct { foo, bar } if foo > bar => { /* ... */ },
    _ { foo, bar } if foo < bar => { /* ... */ },
    _ { foo, bar } => { /* ... */ },
}

8 Likes

For what it is worth, I think that this feature being limited to pattern position would be the best scenario.

Yes, it is a bit inconsistent, but with good error messages that could be almost completely mitigated.

The benefit of pattern-only positioning is that _ can already be used in some pattern-only locations to help with type inference, so this is an extension of that.

And not allowing this shorthand in expression position would mean that when constructing the value the whole name is used (barring specific imports).

1 Like

That addition would definitely alleviate the concerns I have around code just being fragile when adding new variants.

I think having this option available doesn't force people to use it.

This is unfortunately not necessarily true in practice, as people can and will leverage absolutely crushing peer pressure to make others conform to what they think is "good style". So what if this feature hits, and someone wants to add it as a clippy lint? What if they get their way, someone has their CI break because they don't use it, and unwisely run cargo clippy -- -D warnings or some equivalent (perhaps because someone else told them it was "good style")? And what if people start PRing it around to open source projects, and before you know it, this too is "good style" and anyone who doesn't use it is tittered at?

Maybe that doesn't happen. But it's within the realm of possibility. People really have gotten into screaming matches over tabs vs. spaces. I only hesitate to say "death threats" because I can't recall a specific instance, but people sure do like to "joke" about killing the "infidels" who use the "wrong" formatting style a lot. People like to say things like "I would hope that would never happen in the Rust community" but hope is not a plan.

Having said all that: Say it is desirable, if you like! It just seems to be untoward to do so without acknowledging the elephant in the room.

I think this feature passes that criterion. It's not easier to write unreadable code with it; you can do that just as easily without.

Adding more ways to make code less readable that do not themselves individually make code worse does not necessarily mean that it will not hurt the overall readability of Rust code. There could be something about this syntax that makes it more and uniquely appealing to write less legible code in, as compared to other forms you can cite. Or perhaps the way people write code looks more like "roll a die per syntactic quirk one can acquire, and if result is less than some P, acquire it, independently of other quirks you could have acquired", thus adding more opportunities to acquire syntactic quirks still increases the average peculiarity and decreases the average legibility of Rust code because they are independently determined.

It is not that I think we should not consider this, merely that when considering the cons, let us not dismiss them lightly and prematurely. This is a matter of modal logic, not Boolean conditions.

6 Likes

If anything, I'd write a clippy lint to reject this new form in some cases, particularly if you end up with foo(_::Yes, _::No). I would also like incorrect code where the variant is used directly to have a better diagnostic than it does today, by accepting all branches, typechecking successfully and providing a structured suggestion that rustfix/VSCode can cleanly apply.

3 Likes

Personally, I don't think this is important enough to include, mostly because you can accomplish something similar-enough with a use line:

enum MyEnum {
    First,
    Second,
    ...
}
fn do_stuff(my: MyEnum) {
    use MyEnum::*;
    match my {
        First => {},
        Second => {},
        ...
    }
}

5 Likes

That's certainly tolerable in some cases, especially for ones where it's an inherent method on the enum.
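For the inherent-method case, stable Rust already lets Self stand in for the enum name in patterns, which removes most of the repetition without a glob import. A rough sketch, borrowing two variant names from the example earlier in the thread; the plain u16 status codes and the Other variant are my own simplification:

```rust
// Simplified stand-in for the thread's `HeaderError`; the `u16` status codes
// and the `Other` variant are assumptions made for this sketch.
#[derive(Debug)]
enum HeaderError {
    DateInvalid(String),
    AcceptUnnegotiable,
    Other,
}

impl HeaderError {
    /// Inside an inherent method, `Self::` replaces the enum name in every arm.
    fn associated_status_code(&self) -> Option<u16> {
        match self {
            Self::DateInvalid(_) => Some(400),
            Self::AcceptUnnegotiable => Some(406),
            Self::Other => None,
        }
    }
}

fn main() {
    let err = HeaderError::DateInvalid("not a date".to_string());
    println!("{:?} -> {:?}", err, err.associated_status_code());
    println!("{:?}", HeaderError::AcceptUnnegotiable.associated_status_code());
    println!("{:?}", HeaderError::Other.associated_status_code());
}
```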

The critical part of what makes proposals in this space interesting to me, though, is that this takes advantage of inference. Like I don't know what type I passed in foo(stuff.collect()) without looking up the definition of foo, but that's ok. It could even be a type that's not in-scope for me right now -- with some trickery it's possible that it's even a type I cannot name. So maybe we can find some cases like those type inference ones for these uses that would also be reasonable.

Imagine something like this, for example:

let response = client.call(HttpMethod::Get, "https://rust-lang.org", RequestOptions { cookie_handling: CookieHandling::Skip, ..});

It might be good to allow something like

let response = client.call(.Get, "https://rust-lang.org", .{ cookie_handling: .Skip, .. });

Even though it's absolutely true that dbg!(.Get); or dbg!(.{ cookie_handling: .Skip, .. }) couldn't work and would be undesirable even if they could.
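The same split exists for inference today, which is roughly the analogy I'm leaning on. A small sketch; count_unique is a hypothetical function, not anything from a real API:

```rust
use std::collections::BTreeSet;

// Hypothetical function: its parameter type is what drives inference
// at the call site.
fn count_unique(items: BTreeSet<u32>) -> usize {
    items.len()
}

fn main() {
    let stuff = vec![3, 1, 3, 2];

    // The caller never names `BTreeSet`; the signature of `count_unique`
    // tells `collect` which collection to build.
    let n = count_unique(stuff.iter().copied().collect());
    println!("{n}");

    // By contrast, `dbg!(stuff.iter().copied().collect());` does not compile:
    // `dbg!` accepts any type, so nothing tells `collect` what to produce.
}
```

An inferred .Get would presumably follow the same rule: fine where the callee pins down the type, an error where nothing does.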

6 Likes

This seems like something we can address by not adding such a lint. Or, if anything, adding a lint for cases where people shouldn't use it, or ways for people to say "don't use it on this enum in particular".

This seems like a fairly plausible model. :smile:

Complete agreement. This is a tradeoff, not something with all upside and no downside.

2 Likes

Clippy already has lints that contradict each other: for example, mod_module_files and self_named_module_files. In general, the clippy::restriction lint category exists for lints that one might want to enforce but which don't make sense to apply to all Rust code in the world. None of these are enabled by default, and if you try to deny(clippy::restriction) you'll find that your program will never pass linting due to the contradictions.

Another notable example is pattern_type_mismatch — this lint is essentially “don't use match ergonomics”, and was contentious when originally proposed.

So, there's room for lints which do not reflect any sort of consensus or majority on style.

4 Likes

Mmm, that plus the example you have given entails a fair amount of "spooky action at a distance" via inference: that is, the type is beyond conventional name lookup but fields are available on it, therefore I can bind to the type's names in some way, and thereby determine the type in use in subtle ways. Subtle changes could cause me to invisibly bind to another type, which could break the characteristics of the code I want. This result has been fairly undesirable every time I have run into it, really.

3 Likes

This, in fact, already works with inference today in stable. We just don't have special syntax for it.

Here's a demo: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=ed060cd13f013414f43985b027053776

mod foo {
    mod bar { 
        #[derive(Default, Debug)]
        pub struct RequestOptions {
            pub value: i32,
            pub other: i32,
        }

        pub fn make_call(url: &str, options: RequestOptions) {
            dbg!((url, options));
        }
    }

    pub use bar::make_call; // Notably *not* importing `RequestOptions`
}

fn main() {    
    // This fails, because we can't name it at this scope:
    //let x = foo::bar::RequestOptions { value: 10, other: 20 };
    
    // But we can create it, and set fields, using the magic of inference:
    foo::make_call("YAY", {
        let mut tmp = Default::default();
        if false { tmp }
        else {
            tmp.value = 123;
            tmp.other = 456;
            tmp
        }
    });
}

Isn't this true for basically all cases of inference, though? Even something like foo(x.into()) is doing it.

I used to think that I needed to annotate all of my local variables, since if I didn't I wouldn't know what they were and they might be something I didn't expect. But I've since gotten used to auto and var and let, and found that I was worrying for nothing.

To take something from C#, I don't know what response.GetHeaders() returns. Maybe it's called TypedHeaders, maybe it's called HeaderDictionary, whatever. If I can still do .RetryAfter = TimeSpan.FromSeconds(5); on it, that's fine by me. (Any change would be semver-breaking, so that's not going to happen outside a major version upgrade anyway, but even then if setting the property on it works, the odds of me actually breaking at runtime while still compiling by it being another type are tiny.)

Or for another thing that works fine in rust already, I could do

response.update_headers(|x| {
    x.retry_after = Duration::from_secs(5);
});

without typing the name of that type nor having it in scope, but still being able to set fields on it.
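A self-contained sketch of that closure pattern, in case it helps. The http module, Response, and Headers are all invented for the example; the point is only that main sets a field on a type it has no path to:

```rust
use std::time::Duration;

// Everything in `http` is hypothetical and exists only for this sketch.
mod http {
    mod private {
        // Public type inside a private module: callers outside `http`
        // cannot name it, but they can still receive a reference to it.
        #[derive(Debug, Default)]
        pub struct Headers {
            pub retry_after: std::time::Duration,
        }
    }

    #[derive(Debug, Default)]
    pub struct Response {
        headers: private::Headers,
    }

    impl Response {
        pub fn update_headers(&mut self, f: impl FnOnce(&mut private::Headers)) {
            f(&mut self.headers);
        }

        pub fn retry_after(&self) -> std::time::Duration {
            self.headers.retry_after
        }
    }
}

fn main() {
    let mut response = http::Response::default();

    // The type of `x` is inferred from the signature of `update_headers`;
    // `http::private::Headers` is not nameable from here.
    response.update_headers(|x| {
        x.retry_after = Duration::from_secs(5);
    });

    println!("{:?}", response.retry_after());
}
```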

So I just don't see this as having any more "spooky action at a distance" than inference already has in lots of places in Rust. Is there something specific you're thinking would make this case particularly bad?


Actually, I really like that closure example. It gives a nice TLDR:

Since I could easily make this work:

client.call(|m| m.get, "https://rust-lang.org", |o| { o.max_retries = 4; });

What's really so bad about

client.call(.Get, "https://rust-lang.org", .{ max_retries: 4 });

?

(Or client.call(_::Get, "https://rust-lang.org", _ { max_retries: 4 }); or whatever the syntax.)

4 Likes

That's with a specific type, however. Nameless types are achieved in Rust via generics (specifically, impl Trait in arg position). Enabling field access (or variant access, for that matter) on a totally variable generic T would Seem Like A Bad Idea.

1 Like

Can you clarify "that"?

I agree, but that's not what this proposal would do. It would only work on a specific type that inference figured out.

That's why let _ = .{ max_retries: 4 }; wouldn't work -- there's no inference context information to get it to a specific type.

(aka that's why this is inferred types, not anonymous or structural types.)

4 Likes

I now see what the problem is with this phrasing: obviously neither argument applies to all Rust users. Of course, there are people who are trying to make code ever more beautiful, while at the other end of the spectrum, there are those who want to win the fast typing championship in production.

I'm arguing that we shouldn't be optimizing for the latter, and this proposal does exactly that.

There are two other serious issues with your analogy:

  1. The phenomenon you described (many meaningless boolean arguments) is the result of the interaction of many features. Function declarations, call expressions, the primitive boolean type and its literals, and how the constellation of all of these can result in unreadable code. In contrast, the proposed feature would enable unreadable code all by itself, in many different contexts simultaneously.
  2. It's not really practical to make a language without booleans or functions; they are pretty much necessary. In contrast, enum name inference is not indispensable; it would be a minor convenience. Therefore the potential downsides are contrasted with a much weaker argument in favor of it.

10 Likes

The part I don't get from your argument is that this doesn't change the status quo of "I want to write short code".

The "I want to write short code" developer is going to use HttpMethod::*; potentially at block scope. Then they're going to client.call(Get, "https://rust-lang.org").

Against that, client.call(_::Get, "https://rust-lang.org") is, imho, better, because it clearly indicates that Get is an enum variant of the enum type that is accepted, and not a constant.

Similarly, in a match pattern context, _::Get is an enum variant, Get is potentially a wildcard arm. (If you're just against expression use, not pattern use, it'd be good to clarify.)
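To illustrate the wildcard hazard with glob-imported variants, here is a sketch that compiles on stable (with warnings, silenced below). HttpMethod and both functions are hypothetical; the misspelling Gett is deliberate:

```rust
// Hypothetical enum and functions, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
enum HttpMethod {
    Get,
    Post,
}

fn describe(method: HttpMethod) -> &'static str {
    use HttpMethod::*;
    match method {
        Get => "get",
        Post => "post",
    }
}

// The lints are silenced so the example builds quietly; normally rustc
// would warn about the unused binding and the unreachable arm.
#[allow(non_snake_case, unused_variables, unreachable_patterns)]
fn describe_with_typo(method: HttpMethod) -> &'static str {
    use HttpMethod::*;
    match method {
        // `Gett` is not a variant, so this is a fresh binding that
        // matches every value, including `Post`.
        Gett => "get",
        Post => "post",
    }
}

fn main() {
    println!("{}", describe(HttpMethod::Post)); // prints "post"
    println!("{}", describe_with_typo(HttpMethod::Post)); // prints "get"
}
```

A path-shaped pattern such as HttpMethod::Gett would be a hard error instead, and presumably _::Gett would be as well, since it can only ever mean a variant.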

Personally, I quite like the comparison scottmcm provided:

Everywhere that an inferred type {value|pattern} would be allowed is already a strongly-known type location where inference is already done. The s! macro that's been mentioned before for alternate { ..Default::default() } semantics (move in the other direction) works for the struct today.


I do want to bring up one point, though: this makes "enum variant" even more special. This isn't a blocker by any means—enum variants are special, and are getting more special—but this would be another way in which faux C-style enums would be salted against compared to proper enums.

Specifically, I'm talking about the pattern of

#[repr(transparent)]
#[derive(Debug, Copy, Clone, Eq, PartialEq)]
struct HttpResponseCode {
    raw: u16,
}

impl HttpResponseCode {
    pub const OK: Self = c(200);
    pub const IM_A_TEAPOT: Self = c(418);
    // etc
}

Currently, per my understanding, an enum variant defines:

  • If it is a unit variant, an associated constant
  • If it is a tuple variant, an associated function
  • If it is a struct variant, an associated braced initializer (not a user definable concept, but macro imitatable)
  • An associated pattern (not a user definable concept)
  • These are glob importable, which is not the case for user-defined associated items

In the future, there are plans/proposals to also define:

  • An associated type (but with special, not user definable enum variant semantics and coercions)
  • Special inference-powered access rules (this proposal)

With a struct and associated constants, you can get what (if you squint at it right) kinda looks like an enum but with the C++ (explicitly typed discriminant enum) semantics where any discriminant is valid, not just the named ones. Since C-style enums necessarily are only unit variants, and you can match against consts, this mostly works like an enum (so long as you don't try to glob import the "variant"s).
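A runnable sketch of how such a faux enum behaves in a match. The constants are written out with Self { raw: .. } here instead of the c(..) helper, and describe is a made-up function:

```rust
#[repr(transparent)]
#[derive(Debug, Copy, Clone, Eq, PartialEq)]
struct HttpResponseCode {
    raw: u16,
}

impl HttpResponseCode {
    pub const OK: Self = Self { raw: 200 };
    pub const IM_A_TEAPOT: Self = Self { raw: 418 };
}

// Hypothetical function showing that the associated constants work as
// patterns, because the struct derives `PartialEq` and `Eq`.
fn describe(code: HttpResponseCode) -> &'static str {
    match code {
        HttpResponseCode::OK => "ok",
        HttpResponseCode::IM_A_TEAPOT => "teapot",
        // Unlike a real enum, any `u16` is a valid value, so the
        // catch-all arm is mandatory.
        _ => "unnamed",
    }
}

fn main() {
    println!("{}", describe(HttpResponseCode::OK));
    println!("{}", describe(HttpResponseCode::IM_A_TEAPOT));
    println!("{}", describe(HttpResponseCode { raw: 599 }));
}
```

Note that use HttpResponseCode::*; is rejected for this type, which is the glob-import gap mentioned above.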

It would be theoretically nice if it were possible to define enum in terms of a union desugar, and all of the variants' powers are available to user defined items. Obviously this is probably not really possible (patterns that aren't just consts being the big one, and the core value add of enum; the rest are theoretically exposable), but as we get further from faux enums being able to "quack" like enum, I want to be sure it's deliberate.

6 Likes

Exactly! So if the substitute for this proposal is use HttpMethod::*; at block/function scope, then that's clearly better, because then I know what enum Get is a variant of.

3 Likes

I meant to say file scope there, actually. The "I don't care about code readability, I want short to type code" developer you're invoking will put the glob import up at the top of the file, or a big function, or just off the screen from where it's being used. Even maybe through multiple levels of glob imports, if we're feeling especially naughty.

_::Get reduces the desire to use Enum::*; at any scope, so you should see it less.

The other thing I want to bring up again, though, is what I think is the strongest argument against: naming conventions.

Swift, which has had .Variant from early on, names its variants to be used with that in mind, e.g. .IgnoreCookies. Rust's naming conventions specifically discourage "stuttering," giving us instead CookieHandling::Ignore[1].

This is a valid, and fairly strong, argument against adding it in now: existing ecosystem and std APIs aren't designed with this in mind. It's very similar to the argument against adding in named arguments at this point in the game: it completely changes the API design calculus, and much of std would likely want to be redesigned to take advantage of the new functionality.

With that in mind (alongside the other arguments), I'm personally -¾ on allowing such a feature in expression position, but still +1 on generally allowing it in pattern position.


  1. This convention is in fact sometimes ignored for the express purpose of glob importing the variant names. Personally, I say that if you want freestanding-compatible names that are different, put use Enum::Variant as EnumVariant lines in a sibling module. ↩︎
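A sketch of the sibling-module approach from the footnote. The module names, the Store variant, and keeps_cookies are all invented for the example:

```rust
// All module, variant, and function names are hypothetical.
mod cookies {
    // The enum keeps the non-stuttering names the conventions ask for.
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub enum CookieHandling {
        Ignore,
        Store,
    }
}

// Sibling module offering freestanding-friendly aliases for callers
// who want short, unqualified names.
mod cookie_names {
    pub use super::cookies::CookieHandling::{Ignore as IgnoreCookies, Store as StoreCookies};
}

use cookie_names::{IgnoreCookies, StoreCookies};
use cookies::CookieHandling;

fn keeps_cookies(handling: CookieHandling) -> bool {
    // The aliases resolve to the variants, so they work as patterns too.
    match handling {
        IgnoreCookies => false,
        StoreCookies => true,
    }
}

fn main() {
    println!("{}", keeps_cookies(IgnoreCookies));
    println!("{}", keeps_cookies(CookieHandling::Store));
}
```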

8 Likes