Ideas around anonymous enum types

steffahn · August 4, 2020, 12:59pm

One extra thing to consider is that currently dyn Trait explicitly opts out of otherwise applicable blanket implementations. This feels almost like a minor form of specialization. Without the explicit listing of the trait, it would be hard to say what the correct rule is to apply here.

See the example below. I’m not sure yet whether there is any situation where this distinction between a blanket implementation and an automatic impl from an anonymous enum would be “relevant” beyond a diagnostic function like std::any::type_name, but I suspect it could cause real problems.

trait Trait {
    fn method(&self) {
        println!("{}", std::any::type_name::<Self>())
    }
}
impl<T: ?Sized> Trait for T {}
trait OtherTrait {}
impl<T: ?Sized> OtherTrait for T {}

fn main() {
    let x = true;
    let y: &bool = &x;
    let z1: &dyn Trait = y;
    let z2: &dyn OtherTrait = y;

    z1.method(); // prints: bool
    z2.method(); // prints: dyn playground::OtherTrait
}


robinm · August 4, 2020, 1:00pm

Thanks!

And I assumed that enum (/* anything */) impl Trait means that all types in anything, including those hidden by .., implement Trait. So even if you can't match on the concrete type(s) behind .., you can bind a variable and call any function or method that takes a Trait using this binding.

trait Trait { fn stuff(self) { /* ... */ } }
struct T;
struct U;
impl Trait for T {}
impl Trait for U {}

fn foo() -> enum (T, ..) impl Trait {
    if rand() % 2 == 0 {
         T as enum
    } else {
         // The concrete return type is `enum (T, U) impl Trait`, but the only
         // variant visible externally to a programmer is T. U is opaque to
         // other parts of the code, but not to the compiler.
         U as enum
    }
}
fn usage() {
    match foo() {
        t: T => t.stuff(),
        u: U => u.stuff(), // this line doesn't compile since `..` is opaque
        u: impl Trait => u.stuff(), // this line compiles, and will be monomorphised for every type hidden by `..` (only `U` here)
    }
}

steffahn · August 4, 2020, 1:26pm

Yep, you must not “override” any blanket implementations for Any, otherwise this would be a problem:

use std::any::Any;

fn main() {
    let x: enum(i32, bool) = 42;
    // ^^^ implements Any based on dispatch
    let y: &dyn Any = &x;
    // ^^^ unsize coercion: enum(i32, bool) is sized and implements Any
    let z: &i32 = y.downcast_ref::<i32>().unwrap();
    // ^^^ created internally by casting y, so we get an invalid reference (UB)
    // ^^^ succeeds since y.type_id() gives the TypeId of `i32` by dispatch
    // see: https://doc.rust-lang.org/src/core/any.rs.html#195-204
}
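
For comparison, here is a minimal sketch of how this behaves in today's Rust when a hand-written enum stands in for the proposed enum(i32, bool). The names IntOrBool and describe are hypothetical and not part of the proposal. Because the blanket Any impl is not overridden, the enum reports its own TypeId and the downcast to i32 fails safely instead of producing an invalid reference.

```rust
use std::any::{Any, TypeId};

// Hypothetical stand-in for the proposed `enum(i32, bool)`.
#[derive(Debug)]
enum IntOrBool {
    Int(i32),
    Bool(bool),
}

// Tries the enum type first, then a plain `i32`.
fn describe(value: &dyn Any) -> String {
    if let Some(e) = value.downcast_ref::<IntOrBool>() {
        format!("enum: {:?}", e)
    } else if let Some(i) = value.downcast_ref::<i32>() {
        format!("i32: {}", i)
    } else {
        String::from("unknown")
    }
}

fn main() {
    let x = IntOrBool::Int(42);
    let y: &dyn Any = &x;
    // The blanket `Any` impl reports the enum's own TypeId, not the payload's.
    assert_eq!(y.type_id(), TypeId::of::<IntOrBool>());
    // So this downcast is rejected rather than reinterpreting the enum as an i32.
    assert!(y.downcast_ref::<i32>().is_none());
    println!("{}", describe(y));
    println!("{}", describe(&IntOrBool::Bool(true)));
}
```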

Jon-Davis · August 4, 2020, 2:29pm

I think that is a good idea.

I took out the auto trait implementation. The reason it was wanted in the first place was so you could treat the enum as a generic Error when you didn't care about a specific error. But with trait matching this use case is handled. It can always be added back in, but in my opinion the primary use case for it is now handled by trait matching.

match fail() {
    Ok(_) => ...,
    Err(e) => log!(e), // e is an enum
}

match fail() {
    Ok(_) => ...,
    Err(e: impl Error) => log!(e), // e is a concrete type that implements Error
}

I have been under the assumption that the type is transparent to the caller and documentation. So you could still write specific type matches or trait matches on it.

If I was writing a match on a function that returned enum impl Error, and I noticed I was getting http::Errors and wanted to implement retry logic, it would be frustrating to have the compiler say, "I know it can be an http::Error, you know it can be an http::Error, but no, I won't let you catch the http::Error".

For that reason I think it should be transparent to the callers and documentation.

y is a reference to an enum(i32, bool), so why would the TypeId be equal to that of i32 and not that of its actual type?

Edit: Oh wait, I see what you mean. You were talking about overriding a trait implementation, my bad.

robinm · August 4, 2020, 2:39pm

If it's not visible in the signature, it shouldn't be transparent to the caller. Rust has no equivalent of C++'s decltype(auto), because it would require global analysis, and non-local changes could have a wider impact than expected.

fn foo() -> enum(..) impl Trait { /* ... */ return A as enum }
fn bar() -> enum(..) impl Trait { foo() }
fn baz() {
    match bar() {
        a: A => ..., // Is this line valid? You need to look transitively at the *implementations* of bar() and then foo() to answer this question
        _ => ...,
    }
}

If enum(..) is transparent, then changes to the implementation of foo() (like returning B as enum instead of A as enum) could have an impact on baz(), even if both A and B implement Trait and there is no direct relation between foo() and baz().
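
As a point of reference, today's impl Trait already behaves in the opaque way argued for here. The following is only a sketch in current Rust with hypothetical types A and B and a hypothetical name() method: the caller can use the trait interface but cannot name or match on the concrete type, so swapping A for B inside foo() cannot break baz().

```rust
trait Trait {
    fn name(&self) -> &'static str;
}

struct A;
#[allow(dead_code)]
struct B;

impl Trait for A {
    fn name(&self) -> &'static str {
        "A"
    }
}

impl Trait for B {
    fn name(&self) -> &'static str {
        "B"
    }
}

// Returning `B` here instead of `A` changes no signature anywhere.
fn foo() -> impl Trait {
    A
}

// Forwards the opaque value; its own return type is opaque as well.
fn bar() -> impl Trait {
    foo()
}

fn baz() -> &'static str {
    // Only the `Trait` interface is usable here.
    // Writing `let a: A = bar();` would be rejected as a type mismatch.
    bar().name()
}

fn main() {
    println!("{}", baz());
}
```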

robinm · August 4, 2020, 2:46pm

I agree that the main use case is handled, and I wouldn't be opposed to shipping it as is (if I had any veto power!). I just feel that the following snippet is a bit verbose. If safe traits can be auto-implemented, I think it should be part of this proposal to remove this unnecessary verbosity.

Jon-Davis · August 4, 2020, 3:05pm

I get this and don't disagree in principle, but for the error-handling case we have 3 desired behaviors:

The ability to match on Type
The ability to have a generic implementation for Error
The ability to add new Errors down the stack without updating all propagating callers signatures.

If one of the desired behaviors excludes the others, then I believe that is counterproductive. So I think enum impl Trait should be transparent to the caller, and specified in the documentation.

One solution could be to allow extraneous type and trait matches. If a type match doesn't match any type, it would be a warning rather than an error. So if a library removes an error type from a function, it wouldn't cause a user program to break, but would emit a warning that alerts the user to the change.

robinm · August 4, 2020, 3:32pm

Do you have concrete cases in mind where you returned a new concrete exception type from a function (in a language supporting exceptions) that you wanted to match several layers above in the call stack (and not just use the generic error type)? I feel that when I'm adding a new type of exception to a function and letting it bubble up the stack, I will just log and retry/ignore it, not try to catch it specifically.

I took the time to take another look at Any. Anonymous enums are something like the impl counterpart of &dyn Any. With this reasoning, I would agree that it could make sense to be able to match on the concrete type (using something like Any::downcast_ref).

@H2CO3 if you don't mind, I'd like to have your point of view. I feel that this would add a lot to the conversation since you are usually spot-on for this kind of proposal. The question being: “Should the concrete types of an anonymous enum be usable in a match statement by a caller, even if those types are not visible in the type signature (i.e. hidden behind .., which is the non-exhaustive list of types of the variants of the anonymous enum)?”

Jon-Davis · August 4, 2020, 4:39pm

I have used and seen it in a REST controller, catching the results of a service and handling specific errors, or mapping errors to status codes.

//pseudo rust, service that may speak to other services or databases
fn run_job_controller() -> HttpResponse<R> {
    match run_job_service() {
        Ok(t) => HttpResponse(t, 200),
        Err(h : HttpError) if h.status == 404 => HttpResponse((), 404),
        Err(h : HttpError) if h.status == 401 => reauthorize() then retry,
        Err(h : HttpError) if h.status == 503 => retry in a few seconds,
        Err(s : EmptyResultSetError) =>  HttpResponse((), 404),
        Err(s : PostgresConnectionError) => refreshPgConnectionDetails() then retry,
        Err(s : OracleConnectionError) => refreshOcConnectionDetails() then retry,
        Err(k : KubernetesError) if k == QuotaReached => submitToExternalQueue(k),
        Err(e : impl Error) =>  HttpResponse((), 500)
    }
}

I have also gone into code that hadn't been touched in 10 months and added new functionality to calls that perform complex behavior. If I had added a small update to a function and then had to propagate it through all the services and controllers, it would have added friction.

traviscross · August 4, 2020, 8:53pm

With respect to motivating the need for this kind of thing, perhaps the RFC should talk more about iterators, futures, and closures returning trait implementations in general.

Right now, e.g., Rust encourages and tempts you to write long chains of iterator method calls:

something
  .into_iter()
  [...]
  .flat_map(|x| ...) // closure returns impl Iterator
  .collect()

It's elegant! However, as soon as you need conditional logic in that flat_map closure -- perhaps x is an enum you need to match -- then the game changes completely. You can't return two different types of iterators from the closure. You either need to box them as trait objects, which is a bit gross, or you need to refactor the whole thing as a for loop.
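
To make the workaround concrete, here is a rough sketch in current Rust of what one has to write by hand instead of boxing. The Either wrapper is written out here (the either crate provides a similar type), and Shape and expand are hypothetical names used only for illustration. An anonymous enum would, in effect, generate this wrapper automatically.

```rust
// Hand-written two-variant wrapper that forwards `Iterator` to whichever side is present.
enum Either<L, R> {
    Left(L),
    Right(R),
}

impl<L, R> Iterator for Either<L, R>
where
    L: Iterator,
    R: Iterator<Item = L::Item>,
{
    type Item = L::Item;

    fn next(&mut self) -> Option<Self::Item> {
        match self {
            Either::Left(l) => l.next(),
            Either::Right(r) => r.next(),
        }
    }
}

// Hypothetical input type that the closure needs to match on.
enum Shape {
    Single(u32),
    Span(u32, u32),
}

fn expand(shapes: Vec<Shape>) -> Vec<u32> {
    shapes
        .into_iter()
        .flat_map(|s| match s {
            // The two arms build different iterator types (`Once<u32>` and
            // `Range<u32>`), so each is wrapped to give the closure one return type.
            Shape::Single(n) => Either::Left(std::iter::once(n)),
            Shape::Span(lo, hi) => Either::Right(lo..hi),
        })
        .collect()
}

fn main() {
    let out = expand(vec![Shape::Single(7), Shape::Span(1, 4)]);
    println!("{:?}", out);
}
```

The wrapper avoids the allocation and dynamic dispatch of Box<dyn Iterator>, at the cost of one new variant and match arm per additional iterator type.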

This is a major tripping hazard in Rust. Those who have spent enough time with the language see this pattern coming well in advance and avoid it. But everyone runs headlong into this first a few times. It makes closures a lot less useful than they could be.

Anything that fixes this would be a major improvement to the language.

Jon-Davis · August 4, 2020, 10:46pm

I agree. I'll add product safe traits back into the proposal, and add some examples revolving around enums as interfaces. I'll also add a section on how this feature could enable new patterns.

I can't really think of many good examples beyond enums as interfaces, but if anyone has a suggestion please post it.

I thought up this example of performing an incremental map.

let iter = (0..100).iter()
   .select<usize>()
   .map_if(|x| x % 3 == 0 && x % 5 == 0, |_| String::from("FizzBuzz!"))
   .map_if(|x| x % 3 == 0,|x| |_| String::from("Fizz")) 
   .map_if(|x| x % 5 == 0,|x| |_| String::from("Buzz"))
   .map_if(|_| true, |x| x.to_string())
   .select<String>()
   .for_each(|s| println!("{}", s))

robinm · August 4, 2020, 11:15pm

In another thread it was proposed that instead of adding comments on GitHub/IRLO in a linear way, we could try to use pull requests instead. I did an experiment and created 2 PRs. Feel free to vote/comment/update/modify/reject them.

steven099 · August 4, 2020, 11:24pm

I'm trying to figure out the typing here. Either you're returning an impl Iterator over an enum, in which case you need to match on x, or else you have an intermediate (Iterable?) type which isn't an iterator, and you need some sort of map_rest method to get an impl Iterator over a single type at the end, or else call iter to get an impl Iterator over the enum type. Seems like you could do this with an explicit Either enum, though, and it might be clearer.

Jon-Davis · August 5, 2020, 12:00am

Yeah, that post wasn't compliant with the proposal or with Rust. It was less an example of how enums would work and more a guess at the patterns that could be made with a new primitive.

You could do this with Either as well although with anonymous enums you could have a larger number of output destinations. For example maybe you could have the FizzBuzz, Fizz, and Buzz be &'static str, while the remainder were a String.

Maybe the pattern would look more like this.

(0..100).iter()
   .map(|x| x as enum(usize, &'static str, String))
   .map_if(|x| x % 3 == 0 && x % 5 == 0, |_| "FizzBuzz!" as enum.1)
   .map_if(|x| x % 3 == 0,|x| |_| "Fizz" as enum.1) 
   .map_if(|x| x % 5 == 0,|x| |_| "Buzz" as enum.1)
   .map_if(|_| true, |x| x.to_string() as enum.2)
   .for_each(|s| println!("{}", s))

For this example map_if takes the first variant of an enum, runs the check, if it passes it runs the map, but the map is actually outputting the same type, but potentially a different variant.
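
For readers who want to try the idea today, the following is a sketch of the same behavior with a hand-written enum in place of enum(usize, &'static str, String). Out, the free function map_if, and fizzbuzz are all hypothetical names; the real proposal would presumably make map_if an iterator adaptor rather than a function on a single value.

```rust
use std::fmt;

// Hand-written stand-in for the proposed `enum(usize, &'static str, String)`.
#[derive(Debug, PartialEq)]
enum Out {
    Num(usize),
    Static(&'static str),
    Owned(String),
}

impl fmt::Display for Out {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Out::Num(n) => write!(f, "{}", n),
            Out::Static(s) => write!(f, "{}", s),
            Out::Owned(s) => write!(f, "{}", s),
        }
    }
}

// Only the first variant is tested and mapped; every other variant passes
// through untouched. The output is the same enum type, possibly a new variant.
fn map_if(value: Out, pred: impl Fn(usize) -> bool, f: impl Fn(usize) -> Out) -> Out {
    match value {
        Out::Num(n) if pred(n) => f(n),
        other => other,
    }
}

fn fizzbuzz(n: usize) -> Out {
    let v = Out::Num(n);
    let v = map_if(v, |x| x % 3 == 0 && x % 5 == 0, |_| Out::Static("FizzBuzz!"));
    let v = map_if(v, |x| x % 3 == 0, |_| Out::Static("Fizz"));
    let v = map_if(v, |x| x % 5 == 0, |_| Out::Static("Buzz"));
    // Whatever is still a number is converted to an owned String.
    map_if(v, |_| true, |x| Out::Owned(x.to_string()))
}

fn main() {
    for n in 1..=15 {
        println!("{}", fizzbuzz(n));
    }
}
```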

I think there are patterns to be found, although I don't think the proposal should codify any of them. I also can't predict any of them.

Edit: I think with specialization, and getting rid of the map to String, it could look like this:

(0..100).iter()
   .map_if(|x| x % 3 == 0 && x % 5 == 0, |_| "FizzBuzz!" as enum.1)
   .map_if(|x| x % 3 == 0,|x| |_| "Fizz" as enum.1) 
   .map_if(|x| x % 5 == 0,|x| |_| "Buzz" as enum.1)
   .for_each(|x| println!("{}", x));

H2CO3 · August 5, 2020, 6:28am

Thanks for asking! I think the answer to that is a definitive "no".

First of all, it feels off that we first "hide" a type across an API boundary and then rely on it being a specific type later. This effectively introduces many disadvantages of dynamic typing into a static typing context.

The only place in Rust I know of where a concrete (literal or generic) type is not needed in a function signature is impl Trait, of which the purpose is exactly to hide the concrete literal type and expose only its interface. If the concrete type is (or the variant types of an enum are) to be used in downstream code, they should definitely be mentioned in the signature, otherwise breaking changes could be made to a function by merely changing its body without altering its signature.

Furthermore, as it is the case very often, proposals introducing similar kinds of inter-procedural, magical action-at-a-distance ignore the fact that this breaks the ability for local code analysis, which in turn breaks all sorts of other assumptions, making the life of the compiler and the human reader disproportionately more complicated. Global analysis is hard to do correctly and efficiently, and important parts of the Rust ecosystem (nota bene: unsafe code) heavily rely on the locally-analyzable nature of the language.

Finally, I do not fully understand how the "unmatched cases are not an error, only a warning" mechanism would work. Surely the fact that match arms aren't permitted to match non-exhaustively is not a mistake or an oversight in the design of the language. Since match expressions have a value, it would be impossible to assign a value to one if none of its arms matched the discriminant expression. It could "just" be specified that this situation is simply UB, but that is highly undesirable and unsound.

robinm · August 5, 2020, 9:27am

This was my initial reaction when I saw the initial idea being proposed. At the same time, I understand the motivation: to be able to bubble up the exact set of errors (like checked exceptions), without having to modify all the call chains (unlike checked exceptions).

So I tried to find a better middle ground. To give you the gist, you can transform an enum (..) impl Trait into:

an enum (A, B, C) impl Trait as long as A, B and C are the full set of variants of the anonymous enum
a partial enum (A, ..) impl Trait

Such a transformation would need to be done in the same crate. Therefore, adding or removing any type in the anonymous enum would be a source-breaking change for all places where the transformation is done, but since those must be in the same crate, the source-breaking change shouldn't be a real issue (downstream consumers are not affected).

@H2CO3 with this modification, is this still a definitive no?

H2CO3 · August 5, 2020, 10:43am

That's a clever solution to the problem, indeed, it alleviates the problem of breaking changes. I don't really like the same-crate restriction though, as it makes the construct feel somewhat like a hack. (This is more a stylistic/principle issue rather than a technical one, though.)

robinm · August 5, 2020, 11:10am

Thanks, it means a lot!

In my proposal, the public API of the crate is clearly exposed. Modifying the variants returned by failable() would result in a compile error in export_enum(). It is therefore easy for the programmer to know that such a change would require a major version bump if it were allowed to propagate through the public API. Such changes would be immediately obvious (you would get a compilation error), so they would be easy to fix since you would know what was just changed.

// the concrete variants are part of the private API of the crate
fn failable() -> Result<_, enum impl Error>;

fn call_failable() -> Result<_, enum impl Error> {
    // ...
    failable()
}

// public function visible outside of the crate
// its only purpose is to make the concrete variants part of the public API of the crate
pub fn export_enum() -> Result<_, enum(io::Error, sql::Error, http::Error)> {
    call_failable()
}

// downstream consumer of the above crate
fn usage() {
    // We can easily know what we can match on since the types are
    // part of the API (and thus documented)
    match export_enum() {
        // exhaustive match, no `Err(_: Error)` case
        Err(io: io::Error) => ...,
        Err(sql: sql::Error) => ...,
        Err(http: http::Error) => ...,
        Ok(ok) => ...,
    }
}

But if we remove the crate boundary restriction, the following snippet becomes valid. As such, modifying the concrete variants returned by failable() would be a source-breaking change for downstream users, even though that is not obvious at all.

// the concrete variants are part of the private API of the crate
fn failable() -> Result<_, enum impl Error>;

fn call_failable() -> Result<_, enum impl Error> {
    // ...
    failable()
}

// public function visible outside of the crate
// the variants of the anonymous enums (io::Error, sql::Error, http::Error)
// are part of the public API of the crate even if it's not obvious at all
pub fn export_enum() -> Result<_, enum impl Error> {
    call_failable()
}

// downstream consumer of the above crate
fn usage() {
    // We must "guess" the types that we can match on, since it's not part
    // of the public API, and we may not be familiar with the internals of
    // the above crate (at least since the match is exhaustive, we would get
    // a compilation error in case of mismatch)
    match export_enum() {
        Err(io: io::Error) => ...,
        Err(sql: sql::Error) => ...,
        Err(http: http::Error) => ...,
        Ok(ok) => ...,
    }
}

oooutlk · August 5, 2020, 11:37am

I don't quite understand the purpose of hiding concrete error types via enum impl Error. You still need to gather and enumerate these types in match arms. Without the help of the function signature, you need to gather them from the function body, which is worse.

robinm · August 5, 2020, 11:57am

It's for when you have a (rather long) call chain, and you want to add or remove an Error in a function at one of the leaves without having to modify the full call chain, even if the only place you are using it is at the root of the call chain. In the example above, when adding or removing errors from failable(), you don't need to modify call_failable(), which only propagates the errors.
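
The closest approximation available in current Rust is probably Box<dyn Error> combined with downcast_ref at the root. The sketch below assumes a hypothetical HttpError type and reuses the failable/call_failable/usage names from the snippets above. It shows the property being discussed (the intermediate function's signature never changes), though unlike the proposed enum impl Error it allocates and offers no exhaustiveness checking.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical leaf error; adding another error type only touches `failable`.
#[derive(Debug)]
struct HttpError {
    status: u16,
}

impl fmt::Display for HttpError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "http status {}", self.status)
    }
}

impl Error for HttpError {}

fn failable(input: &str) -> Result<u32, Box<dyn Error>> {
    if input.is_empty() {
        return Err(Box::new(HttpError { status: 503 }));
    }
    // `ParseIntError` is converted into `Box<dyn Error>` by `?`.
    Ok(input.parse::<u32>()?)
}

// Only propagates; its signature lists no concrete error types.
fn call_failable(input: &str) -> Result<u32, Box<dyn Error>> {
    let n = failable(input)?;
    Ok(n + 1)
}

// Root of the call chain: the only place that inspects concrete types.
fn usage(input: &str) -> String {
    match call_failable(input) {
        Ok(n) => format!("ok: {}", n),
        Err(e) => match e.downcast_ref::<HttpError>() {
            Some(h) if h.status == 503 => String::from("retry later"),
            Some(h) => format!("http failure: {}", h),
            None => format!("other failure: {}", e),
        },
    }
}

fn main() {
    println!("{}", usage("41"));
    println!("{}", usage(""));
    println!("{}", usage("abc"));
}
```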
