Pre-RFC: "Anonymous" associated types


#1

Hi Rustaceans!

I have an idea that’d be pretty useful for several cases (some of them are my unimplemented ideas). Do any of you consider this valuable? Please let me know and help me design this feature!

Summary

Allow defining an associated type that has no name other than Type::Associated.

An example will make it more obvious:

impl Foo for Bar {
    type Baz = struct { some_field: String };
    type Enum = enum { Variant0, Variant1 };
}

Since these types have no name outside the impl, one must access them as Bar::Baz and Bar::Enum. This avoids conflicts. If two traits in scope have an associated type with the same name, one can write <Bar as Foo>::Baz to disambiguate - this is also the recommended form for generated code. The type is still first-class, so you can do with it whatever you can with other types, including re-exporting it via use Bar::Baz as Alias; or type Alias = Bar::Baz;

Rationale

This change allows one to namespace types using the names of other types. This is particularly useful when auto-generating code with a custom derive or another mechanism. Without this change, the type name must be somehow generated, risking conflicts with other types.
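To make the conflict risk concrete, here is a minimal sketch of what a code generator has to emit today. The trait Foo and type Bar follow the example above; the method make, the helper field_of, and the generated name BarBaz are hypothetical and exist only for illustration.

```rust
// Hypothetical trait that a derive would implement.
trait Foo {
    type Baz;
    fn make(&self) -> Self::Baz;
}

struct Bar;

// Emitted by the (hypothetical) code generator. The struct needs *some*
// module-level name, so the generator has to invent one and hope the user
// has not already defined a `BarBaz` in the same module.
struct BarBaz {
    some_field: String,
}

impl Foo for Bar {
    type Baz = BarBaz;

    fn make(&self) -> Self::Baz {
        BarBaz { some_field: String::from("generated") }
    }
}

// Callers can reach the type through the trait and never need the invented name.
fn field_of(bar: &Bar) -> String {
    let value: <Bar as Foo>::Baz = bar.make();
    value.some_field
}
```

The invented name BarBaz is the only part that leaks into the user's module, and it is exactly the part the proposal would remove.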

Alternatives

  • Do nothing - force code generators to require users to provide names for associated types.
  • Introduce some kind of special scope for derived code - e.g. in the form of a module that can’t be named from the outside of the derive and never conflicts.

#2

My immediate knee-jerk reaction is:

  • “the type name must be somehow generated” comes up in a lot of other macros, and I’m pretty sure we actually need a way to generate a guaranteed-unique name for things. I’m not sure there’s any interesting design discussion here beyond bikeshedding the name of the function proc macros will call to get their unique name.

  • There have been other proposals for “anonymous” structs/enums that were not specific to associated types. If we do introduce anonymous whatevers, they should probably have the same syntax and semantics in associated types as they do everywhere else.

I’ve noticed a common issue with these suggestions is only specifying some of the syntax. For instance, what would a value of your anonymous struct/enum types look like?

I believe past proposals usually used syntax kinda like (A | B) for anonymous enum types and values, while you’re using the standard enum syntax with the name omitted. Your syntax is arguably more consistent and self-evident but probably cumbersome when you need to specify values of that anonymous type.


#3

Does this essentially propose proper structural typing? If so, that would be a pretty big addition.

If, however, you are just trying to solve a code generation problem, why aren’t you using an inline module instead?

mod whatever {
    // pub so that the impl outside the module can name the type
    pub struct Baz {}
}

impl Foo for Bar {
    type Baz = whatever::Baz;
}

The type names within the inline module won’t conflict with other names.


#4

“the type name must be somehow generated” - the type names of closures are generated as well - I don’t see any problem here.

I don’t think other anonymous types are related to this.

I’m not sure what you mean by values of that type. You could construct them easily like Bar::Baz { some_field: "Hello world!".to_owned() } or Bar::Enum::Variant0. Basically the only difference is that you need to prefix it with Bar::.
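For comparison, here is a rough sketch of how a value of a named associated type can be constructed in today's Rust, using the trait Foo and type Bar from the first post. As far as I know, stable Rust does not accept a qualified path like <Bar as Foo>::Baz directly in a struct expression (that sits behind the unstable more_qualified_paths feature), so the sketch goes through a type alias. NamedBaz, BarBaz and build are hypothetical names.

```rust
trait Foo {
    type Baz;
}

struct Bar;

// Stand-in for the type the proposal would leave unnamed.
struct NamedBaz {
    some_field: String,
}

impl Foo for Bar {
    type Baz = NamedBaz;
}

// An alias gives the projection a plain path that a struct expression accepts.
type BarBaz = <Bar as Foo>::Baz;

fn build() -> <Bar as Foo>::Baz {
    BarBaz { some_field: "Hello world!".to_owned() }
}
```

Under the proposal, the alias and the stand-in name would both disappear and the struct expression would be written against the associated type directly.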

The whole idea is very similar to C++, where you can define a type within a class.

Using an inline module is a nice solution too and has one nice advantage: the ability to control visibility. The problem with it right now is that module names could conflict. This could be solved by allowing pub(derive) mod whatever {} which would instruct the compiler to isolate the module from the rest of the code.
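One pattern that already gives generated items an isolated scope in today's Rust is an anonymous constant block. This is a different technique from the pub(derive) module suggested above, shown only as a sketch; the Default bound, the type Other and the helper default_len are assumptions made for the example.

```rust
trait Foo {
    type Baz: Default;
}

struct Bar;

// Everything the generator emits goes inside an anonymous constant. Items
// declared in the block are invisible to the surrounding module, so `Baz`
// cannot clash with a user-defined `Baz`.
const _: () = {
    #[derive(Default)]
    pub struct Baz {
        pub some_field: String,
    }

    impl Foo for Bar {
        type Baz = Baz;
    }
};

struct Other;

// A second generated block may reuse the same inner name without conflict.
const _: () = {
    #[derive(Default)]
    pub struct Baz {
        pub some_field: u32,
    }

    impl Foo for Other {
        type Baz = Baz;
    }
};

fn default_len() -> usize {
    // Outside the blocks the types are reachable only through the trait.
    let value: <Bar as Foo>::Baz = Default::default();
    value.some_field.len()
}
```

The trade-off is that nothing inside the block can be named directly from outside it, so users can only refer to the type through the associated-type path.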


#5

I haven’t yet formed an opinion about this feature, but syntactically I would suggest considering the following instead of type Baz = struct { some_field: String }, to clarify that structural typing is not what is going on here.

Note that consts and functions look something like this in top level scope.

const N: usize = 0;
fn f() {}

We write associated consts and associated functions using the same syntax.

struct Struct;
impl Struct {
    const N: usize = 0;
    fn f() {}
}

And if you have associated consts and associated functions in a trait, still the same syntax.

trait Trait {
    const N: usize;
    fn f();
}
impl Trait for Struct {
    const N: usize = 0;
    fn f() {}
}

So to me it feels syntactically less of a jump to go to associated structs / enums.

impl Struct {
    struct X {
        some_field: String,
    }
}

And from there to the same thing in traits.

trait Trait2 {
    type X;
}
impl Trait2 for Struct {
    struct X {
        some_field: String,
    }
}

#6

Huh. This sounds like a novel idea to me, and is not what I expected to see from the title.

I do a lot of code generation and type-level programming myself, and can’t immediately think of use cases in my own code, but that’s likely because I’ve never actually considered this solution. (Perhaps next time I’m writing a macro, I’ll find myself slamming my fist down and shouting eureka!)

I wholeheartedly agree with @dtolnay’s suggestion that this should more closely resemble a struct or enum item and not try to shove the type keyword in. After that, my gut reaction is that it seems… harmless. I can’t imagine it causing trouble for other proposed language additions, and it doesn’t add much to the complexity budget of learning the language (the implementation might be a different story).

The type is still first-class, so you can do with it whatever you can with other types, including re-exporting it via use Bar::Baz as Alias; or type Alias = Bar::Baz;

I was going to say some word of caution here, but then I discovered that the following code already compiles today:

trait Trait {
    fn boo(&self);
}

// yes.
impl Trait for <i32 as ::std::ops::Add>::Output {
    fn boo(&self) { println!("{}", self) }
}

fn main() {
    3i32.boo();
}

#7

I consider syntax suggested by @dtolnay interesting and I wouldn’t be opposed to it. (My priority is to have some solution and I don’t care much about details if they make sense. If someone notices important things, I’m happy about it.)


#8

I can help you with some examples:

trait Build {
    type Builder;
}

// builder with .set_bar() and .set_baz() is auto-generated
#[derive(Build)]
struct Foo {
    bar: String,
    baz: u32,
}
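For illustration, this is roughly what such a derive would have to emit by hand today. FooBuilder is an invented name, and the builder() function on the trait, the build() method and the Option-based fields are assumptions that are not part of the example above.

```rust
trait Build {
    type Builder;
    // Assumed entry point; the trait in the post only has the associated type.
    fn builder() -> Self::Builder;
}

#[derive(Debug, PartialEq)]
struct Foo {
    bar: String,
    baz: u32,
}

// What a hypothetical #[derive(Build)] might emit. Under the proposal this
// type would exist only as `<Foo as Build>::Builder`.
#[derive(Default)]
struct FooBuilder {
    bar: Option<String>,
    baz: Option<u32>,
}

impl FooBuilder {
    fn set_bar(mut self, bar: String) -> Self {
        self.bar = Some(bar);
        self
    }

    fn set_baz(mut self, baz: u32) -> Self {
        self.baz = Some(baz);
        self
    }

    // Returns None if any field was left unset.
    fn build(self) -> Option<Foo> {
        Some(Foo { bar: self.bar?, baz: self.baz? })
    }
}

impl Build for Foo {
    type Builder = FooBuilder;

    fn builder() -> Self::Builder {
        FooBuilder::default()
    }
}
```

Usage would look like Foo::builder().set_bar("x".to_owned()).set_baz(7).build(), and the caller never has to spell FooBuilder.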

Another example:

trait AsyncDeserialize {
    // Holds deserialization state
    type Deserializer: AsyncDeserializer;
}

Another would be improving my configure_me crate to use a derive instead of a build script - in that case, I’d need some kind of config builder (the same thing raw::Config is used for).


#9

I kinda like this.

The thing that worries me is that you would almost never want just this

trait Trait2 {
    type X;
}
impl Trait2 for Struct {
    struct X {
        some_field: String,
    }
}

So it would probably need to at least be

impl Trait2 for Struct {
    #[derive(Debug, Clone, Default, PartialEq, Eq)]
    struct X {
        some_field: String,
    }
}

But then you also want Serialize, at which point custom derives get involved, and now I’m scared how that’s supposed to work. (Can it put the generated impls in the impl block?)


#10

I believe this should be allowed. It seems to be the same problem as a derive producing a struct that itself carries #[derive(...)] - AFAIK that’s allowed.


#11

I find the syntax suggested by @dtolnay to be confusing; it is not at all clear to me that the associated type is being set simply because the names coincide. With fn and const, the keywords used for those also coincide, but not so with struct and enum.

The syntax proposed in the main post is more clear to me wrt. semantics.

@nikomatsakis and I briefly discussed this here.


#12

Well, it’s not quite the same.

derive producing a struct with derives:

#[derive(Trait)]
struct Foo;

expands to

struct Foo;

#[derive(Serialize)]
struct Bar;

expands to

struct Foo;

struct Bar;

impl Serialize for Bar { ... }

derive in impl

impl Trait for Foo {
    #[derive(Serialize)]
    struct Bar;
}

expands to

impl Trait for Foo {
    struct Bar;

    impl Serialize for Bar { ... }
}

which is currently illegal.
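For contrast, here is a sketch of the arrangement that is legal today, with the standard derives standing in for Serialize so that it needs no external crate. FooBar and default_debug are hypothetical names.

```rust
trait Trait {
    type Bar;
}

struct Foo;

// The derive sits on a module-level item, so the impls it generates
// (`impl Debug for FooBar`, `impl Clone for FooBar`, ...) also land at
// module level, where `impl` items are legal.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
struct FooBar {
    some_field: String,
}

impl Trait for Foo {
    // Only associated items may appear here; an `impl` block may not.
    type Bar = FooBar;
}

fn default_debug() -> String {
    let value: <Foo as Trait>::Bar = Default::default();
    format!("{:?}", value.clone())
}
```

For a struct declared inside the impl block, the derive output would need somewhere else to go, which is the open question here.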


#13

For some prior art, the struct Bar syntax in impls would correspond to some mixture between associated data and type families in Haskell, https://wiki.haskell.org/GHC/Type_families#Associated_family_declarations


#14

Good point. It’d have to be enabled probably.