[IDEA] Implied enum types

This has been mentioned as a drawback of this feature in previous conversations (and I tend to agree), but I just realized that this would be relatively trivial to address with a lint: check all arguments in a method call, collect all arguments that are enum types, check whether there are overlapping variant names, check if any of the arguments passed in are using the implied enum syntax on any of those. If so, complain and request the full qualified path be used.

3 Likes

Depending on how you write this lint, it's likely to be either noisy or not fire often enough.

If it lints because you've used a variant where two different enums are visible and have a variant of that name, you get a lot of noise from things like crate::dangerous::DangerousOptionsDoNotTouch has a variant Disabled, but so does the locally defined enum SafeThingsToPlayWith, and you really did mean to call set_the_switches(switch_position: SafeThingsToPlayWith) -> Result<(), ErrorType> with SafeThingsToPlayWith::Disabled.

If it only lints when there are multiple enum types in the function signature, then you get the problem of me typing set_the_switches(Disabled) meaning SafeThingsToPlayWith::Disabled, not realising that set_the_switches in scope here has signature set_the_switches(switch_position: crate::dangerous::DangerousOptionsDoNotTouch) -> Result<(), ErrorType>.

This is a lot of why I'd like the code to be explicit about anything non-local, with the IDE hiding the noise. One of my problems with C++ is that when reviewing patches, you end up having to look all over the place to see exactly what the compiler is going to choose to do when you do anything (thanks to Koenig lookup, copy constructors and other features to let you implicitly make things work). In contrast, Rust code tends to only let you elide things when the context is local, and thus I need only look up or down a couple of lines to see what's meant.

1 Like

I don't know why it's not explicit. Can you explain? I am not really an expert in compilers.

Take the following code snippet:

fn do_the_thing() {
    // This `let` line is implicit and local
    let x = "I am a string".to_string();

    // This line is explicit: about the parameter type for `foo_fn`, but
    // has an implicit return type.
    let y = foo_fn(FooEnum::Baz);

    // And this makes the return type explicit
    let z: FooEnum = foo_fn(FooEnum::Quux);

    // FooEnum's variants are too long to keep typing out,
    // so let's make the `FooEnum::` implicit
    use FooEnum::*;
    let a = foo_fn(Plugh);

    // Still OK because no ambiguity. Yet.
    let b = foo_fn(NotPresent);

    use std::env::VarError::*;
    // No longer allowed, because this could be FooEnum::NotPresent
    // or VarError::NotPresent
    let c = foo_fn(NotPresent);

    use FooEnum::NotPresent;
    // And now disambiguated by the explicit import
    let d = foo_fn(NotPresent);
}

Rust does allow some implicitness, where you don't specify things in the code, and the compiler knows what you mean, but that implicitness is used with care. Implicitness can be a benefit by making the code more concise - as in the let x line, where no experienced Rustacean would expect x to be any type other than String, but it can also make the code harder to understand, as in the let y line, where you cannot see what type y has from the code in front of you until you see the let z line.

It also lets you add more implicitness with use statements, where the result is unambiguous and where you think it makes the code more readable. If you use a wildcard in your use statement (as I did with use FooEnum::*), this only imports the names that are unambiguous, and you need to disambiguate when you're unclear.

C++ makes it worse with the ability to find constructors to call for you - if Rust adopted C++'s level of implicitness, I wouldn't even be able to see that foo_fn took a FooEnum as parameter, because C++ would implicitly add the equivalent of .into() calls to make it work out.

There's also a useful concept called "local reasoning" - in short, humans find it easier to reason things out by just looking at the stuff in front of them, and take shortcuts as you make them look further afield for the "right" answer. Local reasoning is extremely powerful, since the more local your reasoning can be, the better other humans are at repeating it (and hence at spotting errors in your code). Rust is heavy on making local reasoning work - where there's non-local action, you tend to have to invoke it deliberately (e.g. where C++ does implicit conversions via single-parameter constructors, Rust uses .into and ::from methods), which makes it clear to the reader that there's cleverness here.

4 Likes

Well imagine that a function can only have one type of enum which is true. The _:: operator represents the enum for that given function. It is only posible for the enum to be of the same type as the argument. I agree that c++ makes a lot of horrible implicit things like auto but this is not that:

auto foo(bool option) {
    if (option) { return 1; }
    return 2;
}

:face_vomiting::face_vomiting::face_vomiting:.

This is horrible because it is ambiguous, if you return a float in one of the return types by accident, it will give you an error. but because you use auto, you don't know if the return type is meant to be float or int.

When you would try to fix it, it leads to the float being trimmed but maybe you wanted to round it.

auto foo(bool option) {
    if (option) { return 1.5; }
    return 2.2;
}

int main() {
    // 1.5 -> 1
    int a = foo();
    return 0;
}

I think this is much different to the C++ auto and it's much better.

Counter-example, and thus why I think this should be an IDE feature:

I have a function set_wifi_state(config: RadioConfig). I realise I've fouled up my design because I forgot about the Zigbee and Bluetooth radios, and this function should be split into two: set_radio_state(config: RadioConfig) and set_wifi_state(config: WifiConfig), but the two enums have variants with the same name.

With the current world, the moment I change set_wifi_state to take a WifiConfig, all calls to set_wifi_state fail to compile. I can now look at each one, and determine the correct outcome: should set_wifi_state(RadioConfig::Disabled) in each location be set_wifi_state(WifiConfig::Disabled) or set_radio_state(RadioConfig::Disabled)?

With your suggested change, all of the calls to set_wifi_state(Disabled) are ambiguous; did I mean set_wifi_state(WifiConfig::Disabled), or did I mean set_radio_state(RadioState::Disabled)? The compiler will assume the former - which is wrong if it really was meant to be the latter.

I can't have this ambiguity in today's Rust; if I'd written use RadioConfig::* to make set_wifi_state(Disabled) work, then it fails to compile as I'd expected. If I add use WifiConfig::*, then Disabled is ambiguous, and Rust complains.

With your change, the compiler has an unambiguous and wrong resolution to the ambiguity I've introduced by fixing my design fault - if, in my earlier faulty design, I'd written set_wifi_state(Disabled), where the new design wants it to be set_radio_state(Disabled), the compiler happily adds bugs to my code, because it re-infers Disabled to be WifiConfig::Disabled, where I really, really meant RadioConfig::Disabled.

This is why I think this should be IDE functionality - you want one thing that can add the use Enum::*; statement for you, and thus make the code much shorter while keeping it clear, and another that hides the Enum:: prefix in Enum::Variant for people who prefer the code to look shorter.

6 Likes

Thats not a language problem, if you decide to refactor, you would need to fix it yourself? Also, If you don't want to use it, you don't have to.

If you use the feature, and I refactor later because your design was inadequate to today's problem, it is a language problem.

Without your change, I attempt my refactor, and the compiler complains that I've made a mistake - it's unambiguous to the compiler that I've got it wrong, because I've got a type error - and this applies even if you use the use RadioConfig::* trick to allow you to write set_wifi_state(RadioConfig::Disabled) as set_wifi_state(Disabled). With your suggested change, the compiler sees a correct variant of my new WifiConfig enum, and doesn't flag up to me that I've missed a spot.

This is fundamental to anything that makes things more implicit - you need only spend time on users.rust-lang.org to see users getting surprised when the various lifetime elision rules result in a lifetime surprising them - for example, the following generates a lifetime error that people get surprised by (playground link:

fn box_it<T: Trait>(it: T) -> Box<dyn Trait> {
    Box::new(it)
}

The explanation for the error is simple; this function has an implicit lifetime, and is in fact a short-hand way of writing:

fn box_it<T: Trait>(it: T) -> Box<dyn Trait + 'static> {
    Box::new(it)
}

But even this level of implicitness is confusing to some people - because while the compiler error suggests a fix (changing the bound from T: Trait to T: Trait + 'static), it doesn't explain why it suggests that change, or how you'd fix it if you wanted to allow it to be bounded by an arbitrary lifetime instead of 'static.

This is why I think this should be an IDE feature, not a language feature - my IDE of choice already knows how to add to what I'm typing (e.g. it knows to convert foo.letm to let mut = foo and put my cursor between the mut and the =), how to add in type inlays and display fn foo(&self) -> &str as fn<'self> foo(&'self self) -> &'self str, and it can also remove things from display - so it can display the code the way you want it, and make it easier to type.

Finally, "Also, If you don't want to use it, you don't have to." is not helpful - if anyone I collaborate with uses a language feature, I have to deal with it. If that's the only way you can handle an objection, then you're basically saying that the feature shouldn't be in the language because it can't be used at all.

But that is different from a language feature, that's a bug in your code. The compiler shouldn't tell you if you changed the wifi state instead of the radio state. You are just telling the computer what to do! This is out of the scope for the compiler. The implicit lifetimes are different because they effect compiles and memory safety.

I am completely confused about what you're trying to say here - I showed a language feature that leads to confusion already (elided lifetimes in Box<dyn Trait>), and I showed code that, in current Rust, is a compile error, and in your proposed change is compilable code with a bug in it.

Bearing in mind that "memory safety" is simply a name for a well-understood class of bug, I'm lost trying to understand you. You're saying that your proposed feature (which affects compilation and introduces a class of bug) is different to an existing feature because the existing feature affects compilation and a class of bug)' this doesn't make sense to me.

What I mean is that that is a logic problem that is out of scope of the compiler. I feel as if your saying the compiler should be checking if the function your calling is correct. These bugs are totally different. I don't know but it feels like you're saying:

fn add(num1: usize, num2: usize) -> usize {
    num1 * num2
    // ^ Error: did you mean add instead of multiply?
}

I think this is the crux of our difference of opinion. I, like a lot of people in the community (but not all), find Rust attractive because it's designed to make it easier to write correct code (bug-free code). If the compiler could issue the error you describe, that feels like a great thing to do - you've got a logic error, and the compiler is telling you that you've made a mistake so that you can fix it as early as possible in the process. Given the context you've supplied it in, am I right to believe that you think the compiler shouldn't issue an error in this case, even if it could detect that this was a logic bug? If so, that's the core difference of opinion.

And note that the entire design of unsafe is built around this assumption - memory safety bugs are just a type of logic errors that happen to have disastrous consequences for the program. If we thought the compiler shouldn't error on logic errors, then we'd get rid of unsafe, and just require you to not have logic errors, otherwise the compiler does what you told it to do.

3 Likes

I like this kind of inference and don't think it really adds bugs. It's kinda a fact of life that having enums of any kind either brakes namespacing, is much less useful than you'd like, or requires somewhat complicated inference rules like discussed here (see Niklaus Wirth's design rationale for omitting enums in Oberon-07). The fact that rust has use Enum::* at all makes them more useful than some other "strong" enum implementations.

I find this particularly egregious when matching over enums, generally once the match statement opens I always want the enumerators of the enum I'm matching over in-scope, and inferring things like this makes code more readable in most cases (you already know good and well what the enum type is since it's often in the match expressions head).

2 Likes

The case with a match makes some sense, but one of my reasons for preferring Rust over C++ is that C++ is hard to review in patch form.

If I give you:

fn do_the_thing() -> Result<(), Error> {
    set_radio_state(AllUnblocked)?;
    set_wifi_state(Disabled)?;
    set_zigbee_state(Channel5)?;
    Ok(())
}

and ask you what types are being passed to set_radio_state, set_wifi_state and set_zigbee_state, you're not going to be able to - you'll need to look at the function definitions to see what's going on.

In contrast, with:

fn do_the_thing() -> Result<(), Error> {
    use RadioState::*;
    use WifiState::*;
    use ZigbeeState::*;
    set_radio_state(AllUnblocked)?;
    set_wifi_state(Disabled)?;
    set_zigbee_state(Channel5)?;
    Ok(())
}

you have a reasonable chance of guessing what enums are being passed to which functions. And if I go fully explicit, you can see what I'm doing in isolation.

fn do_the_thing() -> Result<(), Error> {
    set_radio_state(RadioState::AllUnblocked)?;
    set_wifi_state(WifiState::Disabled)?;
    set_zigbee_state(ZigbeeState::Channel5)?;
    Ok(())
}

One thing that I think needs restating is that I am almost always on the side of the maintenance programmer trying to make sense of and fix the code years down the line over the person writing it to begin with - making it quicker to write code is only helpful if it also makes it easy for the maintenance programmer. It's not great if you make the original author's life a little easier at the expense of the reviewer and the maintenance programmers later on - in much the same way as it's not helpful if I use abbreviations only I understand here, just to save me time at the expense of all the people who read this comment.

6 Likes

I think there's a question of whether this is a useful question to ask in isolation, and whether it can be answered with existing language features.

For example, I'd also have to look at the function definition to answer it for set_radio_state(Default::default()), or set_radio_state(bar()), or set_radio_state(x.into()).

What is it about set_radio_state(.AllUnblocked)?; -- where I'm using my usual placeholder syntax for this, since I would expect there to be some marker to keep it from conflicting visually with things like struct AllUnblocked; -- that means that one needs to be answer the "what type is it?" question locally (without looking at signatures), when you already can't in other situations that exist in Rust today?

Since Rust has powerful inference, the base case answer is that "what's the type of this?" isn't something that can be answered locally: even something entirely normal like let x = foo(); bar(x) you can't know the type of x without looking up foo or bar. (Even in qux(0) I don't know what the argument type is!)

As a contrived example, if I was really annoyed at typing RadioState:: I can make this work:

fn main() -> Result<(), String> {
    set_radio_state(|x| x.all_unblocked)?;
    Ok(())
}

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=2c746457a7555a6219e5ba132d00d759

And in that I have no idea what the types are -- inference will let me use that all_unblocked field without me giving any local suggestion whatsoever of what type x is, nor what the type of that field is.

In comparison, set_radio_state(.AllUnblocked)?; feels positively restrained.

4 Likes

I see what you're saying, which is that I can obfuscate types even with current Rust, but I'm still not convinced that .AllUnblocked is in any way better than having you write use RadioState::AllUnblocked in cases where that's a commonly needed variant.

And I'd note that your thing is radically different - you've had to change not just your code, but also the shared library code to make it work (or provide your own private wrapper). Given:

#[derive(Debug)]
enum RadioState {
    AllBlocked,
    ZigbeeOnly,
    WifiOnly,
    AllUnblocked
}

fn set_radio_state(state: RadioState) -> Result<(), Error>
    dbg!(state);
    Ok(())
}

I can't see a way to fix your thing to work such that you can supply a closure to set_radio_state that infers the right enum variant for you, without introducing a second function or a wrapper type.

Basically, what I'm saying is that I'm not happy with the idea that we change the code to make it harder for the maintenance programmer just to allow the original developer to avoid writing use RadioState::AllBlocked in their file.

3 Likes

I'm trying to find a way to make it a general rule. When I'm next asked to :white_check_mark: an FCP or leave a concern, what's the rule that you want me to be using to decide whether a sugar is ok?

To go back to my set_radio_state(Default::default())?; example, one could easily argue

I'm not convinced that Default::Default() is in any way better than having you write RadioState::default(), just to avoid writing use some_crate::context::RadioState;.

But that's not how Rust works today.

Or, relatedly, we could have required that people say

let x: RadioState = whatever();
set_radio_state(x)?;

to make it easier for the maintenance programmer, but, again, we don't.

What's the general principle that applies to Rust today that would be violated by allowing set_radio_state(.AllBlocked)?;?

(Note that I actually do personally have some concerns about the general feature. But I'd like to focus the bounds on it as much as possible, so there's the possibility for someone to have a clever idea to address the original motivation without the downsides.)

3 Likes

I think for me the general guideline is: If a change in one piece of code, (e.g. changing the type of a parameter, or adding a parameter to a function) is likely to mean a change in contract (i.e. something I need to consider and possibly act on), does this feature make it less likely I will notice the change in contract I may need to act on?

[IDEA] Implied enum types - #30 by farnz has a great example of this - because the type of the parameter has changed, I may want to audit whether I have a change to make where I call a function.

From that perspective, changing:

#[derive(Default)]
enum RadioState {
  Disabled,
  #[default]
  Enabled,
}

fn set_radio_state(state: RadioState);

to

#[derive(Default)]
enum RadioToEnable {
  #[default]
  None,
  Bluetooth,
  Wifi,
}

fn set_radio_state(state: RadioToEnable);

would be an example of a place where set_radio_state(Default::default()) can be problematic (and note, there is a pedantic clippy lint about this). While we have to live with the choices we've made, it would be great to avoid adding more hazards like this.

set_radio_state(x.into()) is less scary to me, because an explicitly implemented conversion between types has probably had some thought given to it (e.g. I'd assume a From impl between RadioState and RadioToEnable would map Disabled and None because someone took the time to think about it).

Personally, I'm much less concerned with "can I tell what the type is just from reading the code without an IDE" and much more concerned with "do I need to re-consider my code based on a change made to some other part of code".

3 Likes

The Default::default() is IMHO a very unconvincing example because it's already allowed, and nothing terrible has happened. There's Type::default() for those who prefer to be more explicitly (and so is Type::Variant). And even then, the example where Default::default() could be problematic is also an example where the implied enum type would have worked fine: set_radio_state(Enabled) wouldn't compile and it'd have to be changed to some other option. So it's not a big deal, and not that related to the proposed syntax.

Library authors want to have clear APIs. Application authors want to write maintainable code. If Rust gets this feature, both will be aware of it. This is a cooperative situation, not an adversarial one. If a particular combination of names and syntax gives a terrible result, people will avoid it. If a variant name would be ambiguous, misleading, or have unexpectedly different meaning after an API change, library authors will pick different names and/or users will fall back to specifying the full type name.

So I'm not really worried that this could be used to make awful APIs that are unclear at the call site. This isn't that different from overuse of bool arguments anti-pattern. Somebody could design an API like call(.No, .Nope, .Nah) just like they could design call(false, false, false). But that is a bad practice, and people generally avoid it. It might happen sometimes anyway in poorly thought-out APIs, but that isn't a big deal, just like functions overusing bools weren't a reason to avoid having bool literals in the language.

6 Likes