Anonymous/inferred field types


#1

Is there any effort to support type inference for struct fields?

I mean, right now I can write:

let x = foo();
let y: _ = foo();

(for some typed function foo), but if I want to put the result in a struct I can’t do:

struct S { x: _ }  // a single-use struct
let s = S { x: foo() };

I can use generics, but this prevents me from doing anything with the field except via trait implementation:

struct S<X> { x: X }
let s = S { x: foo() };
// now I can't do anything with `s.x` even if I know the type!

Of course, this is non-trivial to do since the compiler must somehow prove that the struct’s type is well-defined (exactly one possible type), but I think with some restrictions it should be possible.

There are two motivations:

  • laziness (e.g. see this example)
  • ability to store unnameable types (i.e. closures)
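For reference, the closest stable-Rust workaround for the closure case is the generic approach with a bound attached. This is only a minimal sketch; `Scaler` and `apply` are made-up names for illustration:

```rust
// Hypothetical single-use struct. The closure's type cannot be written out,
// so it is carried as a generic parameter instead of an inferred field type.
struct Scaler<F> {
    f: F,
}

impl<F: Fn(i32) -> i32> Scaler<F> {
    // All that is known about `f` in here is the `Fn(i32) -> i32` bound.
    fn apply(&self, v: i32) -> i32 {
        (self.f)(v)
    }
}

fn main() {
    let offset = 10;
    let s = Scaler { f: move |v: i32| v + offset };
    println!("{}", s.apply(5));
}
```

The field type is still never named, but every use of it has to go through the declared bound.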

#2

(disclaimer: I’m just repeating stuff I’ve observed through the grapevine as it were)

There’s no intrinsic reason this cannot be supported; the algorithm that Rust implements for type inference is capable of full program inference. (I think it’s roughly equivalent to the one Haskell uses? @Centril probably knows better than I on that one. (I saw you typing.))

The choice to limit inference to locally within a function was one of locality. The idea being that since the function is the point of division between units of logic in your program, it makes sense to provide anchor points for the programmer to understand what’s going on.

It’s also about one of the key values of Rust, fearless refactoring. Changing the implementation of a function can only change inference within that function (type, lifetime, or otherwise), as long as you don’t touch the signature, so this maintains locality of concerns.


That said, there are some specific proposals (with associated limitations) that could help in many of the cases where you’d want this:

  • existential types, so you can give a name to the huge type that is then inferred
  • impl Trait types outside function headers, which could be seen as just a shortcut for using an existential type (or for a generic, which would make it problematic)

Though both of these do require explicit trait bounds which the usage abides by.
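A rough sketch of what that looks like with the part of impl Trait that is already stable (return position). `evens` and `Holder` are hypothetical names, and the struct still needs a generic parameter because impl Trait is not accepted in field position:

```rust
// Hypothetical function: callers only learn "some iterator of u32".
fn evens() -> impl Iterator<Item = u32> {
    (0u32..).step_by(2)
}

// Storing the result still takes a generic parameter with the same bound.
struct Holder<I: Iterator<Item = u32>> {
    iter: I,
}

fn main() {
    let h = Holder { iter: evens() };
    // Only `Iterator` methods are usable on `h.iter`.
    let first: Vec<u32> = h.iter.take(3).collect();
    println!("{:?}", first);
}
```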


When limited to private interfaces, though, I definitely see benefit in allowing (controlled) type inference for more things. I know the Haskell workflow @Centril has shared here before in related discussions is to write the impl and let the compiler provide the type (which is then added on), and stronger type inference could potentially be used for this.

To maintain locality, perhaps it wouldn’t be “full program” inference, but restrict the parameters to just the current module. I’m pretty sure that would cover most cases of private inference while still allowing an account of locality such that the programmer could still easily predict/understand the outcome.


#3

I agree, requiring that type inference succeeds within the current module is a sensible bound and would satisfy my use-case.

Interesting point that we may not want inferred types escaping the current crate. I don’t think this would cause any issues for my intended usage and sounds like a good thing.


#4

AFAIK Rust employs bidirectional type checking with Hindley-Milner based inference limited to analysis inside functions (except for closures…) and for impl Trait a limited global type inference is done. @eddyb or @nikomatsakis know more.

This is already in the works and exists in the nightly compiler (see https://github.com/rust-lang/rfcs/pull/2071). The main open question to resolve is the surface syntax. (See https://github.com/rust-lang/rfcs/pull/2515 for a proposal to resolve the surface syntax as type Foo = impl Trait;.) I could see permitting struct Foo { bar: impl Baz, } if we do that.

To describe my workflow more in-depth, I have two modes:

  1. Type driven mode: Here, I write the type signature first and then I try to get tooling to fill in the implementation for me (especially useful in a dependently typed language since more can be automatically filled in). In this case, I am developing to a specific API that I know ahead of time.
  2. Impl / generalization mode.

    In this case, I am not entirely sure what I want; I have a rough idea of the task I want to accomplish… Often, I have something in mind, but I want to generalize the function I wrote to retrieve the principal type of the function and make it as generic as it possibly can be. What I usually do in this mode is write 1-3 functions without any type signature. After that, I do ghci> :r to refresh my REPL and then ask for the types with ghci> :t myFunction. Then I simply paste the function signature onto the function. Now I have retrieved the most general type for the API. This is particularly useful when authoring libraries.

    To do this, global inference is necessary; but inside-module inference for private items would be sufficient to make it work. I can always make the items public after attaching signatures for them.

    In Haskell, this mode is great for exploratory rapid prototyping, but it requires a certain discipline to ensure that type signatures are eventually attached before committing anything to git.

This use case of a single use struct is probably better served by structural records which I’ve proposed in https://github.com/rust-lang/rfcs/pull/2584.


#5

My use-case requires custom trait implementations (currently achieved via a derive macro and custom attributes). I’m not sure your structural records accomplish that. Currently I use a macro to generate code like this:

{
    #[my_attr(...)]
    #[derive(Clone, Debug, MyTrait)]
    struct AnonStruct<A: MyTrait> {
        fixed_field: u8,
        gen_field: A,
    }

    impl<A: MyTrait> AnonStruct<A> {
        // custom impls passed into macro
    }

    AnonStruct {
        fixed_field: 0,
        gen_field: field_val,  // passed into macro by user
    }
}

Locally the struct has a name, but effectively the macro just returns an anonymous type with some trait bounds.

This approach works, except that within the impls the type of gen_field is merely A: MyTrait, which makes it more difficult to use than necessary.
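To make the limitation concrete, here is a stripped-down, hand-written version of the generated code. The method `describe` and the `u32` impl are invented for this sketch; the real trait and macro output differ:

```rust
// Stand-in for the real trait; `describe` is a hypothetical method.
trait MyTrait {
    fn describe(&self) -> String;
}

impl MyTrait for u32 {
    fn describe(&self) -> String {
        format!("u32 {}", self)
    }
}

struct AnonStruct<A: MyTrait> {
    fixed_field: u8,
    gen_field: A,
}

impl<A: MyTrait> AnonStruct<A> {
    fn summary(&self) -> String {
        // Only `MyTrait` methods are available on `gen_field` here.
        // Something like `self.gen_field + 1` is rejected, even if every
        // caller happens to pass a `u32`.
        format!("{}: {}", self.fixed_field, self.gen_field.describe())
    }
}

fn main() {
    let s = AnonStruct { fixed_field: 0, gen_field: 7u32 };
    println!("{}", s.summary());
}
```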

(Actually, I already have support for explicit typing by the user, but as mentioned that’s not always possible due to use of closures and often not simple.)


#6

Oh I see; in that case structural records would indeed not help much.

I think having field: impl Trait might be something within reach but I think it will be hard to convince others to accept field: _ (not personally against it; I see the value it has for macros…).

The closest proposal I have written about this is https://github.com/rust-lang/rfcs/pull/2524 which could be extended to fields directly.


#7

Ah! That’s exactly what I want!

For now I have worked around this by supporting arbitrary trait bounds and facilitating implementation of arbitrary traits. This adds quite a bit of complexity to the system, but probably has other uses too.


#8

Why not allow field type inference within functions/expression blocks only?


#9

I guess that would also work. You want to comment on the RFC? I think this thread has served its purpose.


#10

I think “limited global inference” isn’t specific enough to make it clear just how limited it is.

The Rust compiler always did, and still does, inference only in “function/constant body + nested closures” groups.
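A small illustration of such a group, as a sketch (the function name is made up): the closure's parameter type is not written anywhere in the closure itself and is only fixed by a later use in the enclosing body.

```rust
// Hypothetical helper, named only for this illustration.
fn infer_across_closure() -> String {
    // Nothing on this line says what type `x` has.
    let identity = |x| x;
    // The call in the enclosing body fixes `x` (and the result) to `String`.
    identity(String::from("hello"))
}

fn main() {
    println!("{}", infer_across_closure());
}
```

The function signature is the boundary: nothing outside `infer_across_closure` takes part in inferring `x`.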

impl Trait and still-unimplemented features like typeof, const X: _ and diagnostic-only -> _ share a certain property that makes them much simpler than global type inference:
Full type information is required to be locally inferred before being made available to the rest of the world, and cycles are forbidden (i.e. the type information dependencies between non-closure fn/const bodies forms a DAG).

Note that it also must be easy to tell, syntactically and/or by using name resolution results, the direction of edges on that DAG, e.g. whether a const's type is declared and used for checking its body (“inwards”), or whether the type obtained by checking the body should be used as the type of the const (“outwards”).

I’m not sure I understand the context for this. Are we talking about some impl Bar for S<_> that wants to see the type of x from its construction, i.e. foo()?

The main problem I see with a x: _ field is that it needs the body of the original function to fully type-check before it can (soundly) have a “globally available type”. We could make it so the original function sees the “in-progress type”, since it’s the one responsible for its inference (if we can even determine that’s the case).
We do something like this for closures, but those are more limited in their interactions than struct/enum fields.

A while back I “proposed” a syntax like this, for a similar purpose:

let s = struct {
    x: foo(),
    y: Default::default(),
} impl Bar {
    fn baz(&self) -> Self {
        Self {
            x: self.x.clone(),
            y: -1,
        }
    }
}

I included a field y that we could infer from the baz method into the original function, but we can only do this if we treat such an impl body like we treat closures, which requires that the struct is “globally unnameable”.

And this is where some subtlety comes into play: items defined inside a function body are still “globally relevant”, even if perhaps somewhat private - the trait system lets you do e.g.:

impl OutsideTrait for OutsideType {
    type AssocType = S;
}
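A fuller sketch of that leak, under some assumptions: the `Default + Debug` bounds on the associated type, the function names and the `u8` field are all added here just to make the effect observable. Recent compilers may warn about an impl like this inside a function body, but it is accepted.

```rust
use std::fmt::Debug;

trait OutsideTrait {
    // Bounds added for this sketch so the leaked type can be built and printed.
    type AssocType: Default + Debug;
}

struct OutsideType;

#[allow(dead_code)]
fn hidden() {
    // `S` can only be named inside this function body...
    #[derive(Default, Debug)]
    struct S {
        x: u8,
    }

    // ...but this impl applies everywhere in the crate.
    impl OutsideTrait for OutsideType {
        type AssocType = S;
    }
}

// Code outside `hidden` can obtain an `S` without ever naming it.
fn leaked_value() -> <OutsideType as OutsideTrait>::AssocType {
    Default::default()
}

fn main() {
    println!("{:?}", leaked_value());
}
```

So if `S` had an `x: _` field, code outside the defining function would depend on its inferred type through the trait system.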

I don’t even know that my proposed solution is fully workable but it’s very unlikely we’ll ever implement anything in rustc that doesn’t have its type information dependency DAG and inference groups simple and computed from syntax/name-resolutions (Haskell does something similar, after all, it’s just simpler in their case because name resolution is enough AFAICT).

Which means that soundness issues would be caught by “cyclic dependency” compiler errors, with no truly global mutable state in sight.


#11

@eddyb this is more-or-less what I’m after. Your struct { ... } impl { ... } syntax is, well, concise — but doesn’t leave room for application of derive macros, as far as I can tell.

Can a named struct private to a value-yielding block like in the example in my third post above not achieve the same thing, without new syntax?

I guess this requires an extra type-inference step within the function/block scope and successful construction of a DAG. I think however this can be sufficiently simple?


#12

Part of the problem is that it’s less syntactically “obvious” where the type information comes from: the enclosing function, or the methods on the struct?

You have to pick one of them, and then happen to be correct, otherwise you’ll get a cycle error.

As for derives combined with my approach, I expect #[derive(Clone)] struct { x, y } to work and generate:

struct {
    x, y,
} impl Clone {
    fn clone(&self) -> Self {
        Self {
            x: Clone::clone(&self.x),
            y: Clone::clone(&self.y),
        }
    }
}

#13

That won’t work without significant changes to the DeriveInput data structure, which currently expects type information to be present. Also bear in mind that several derive macros make use of macros on fields, e.g. serde and some macros I am working on (related to this).