Idea: Argument-position `impl Trait` inference

I’m a really big fan of impl Trait, especially in argument position. A lot of times there’s a type parameter for a function you really just don’t ever need anywhere else, and being able to avoid adding another parameter that is always inferred is great!

I propose adding some definition-time inference to impl Trait in argument position. Consider the following code:

struct Complex<T: Num>(T, T);

fn consume<T: Num>(z: Complex<T>) { .. }

With impl Trait we can get rid of T and simplify this to

fn consume(z: Complex<impl Num>) { .. }
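
For reference, here is a minimal sketch of those two spellings in today's Rust. The `Num` trait below is a hypothetical stand-in (the thread does not define it), and the function names and bodies are made up so there is something to call:

```rust
// Hypothetical stand-in for the `Num` trait used in this thread.
trait Num: Copy + PartialEq {}
impl Num for i32 {}
impl Num for f64 {}

struct Complex<T: Num>(T, T);

// Named type parameter: `T` appears in the generics list.
fn parts_equal_generic<T: Num>(z: Complex<T>) -> bool {
    z.0 == z.1
}

// Argument-position `impl Trait`: same bound, no named parameter.
fn parts_equal_impl(z: Complex<impl Num>) -> bool {
    z.0 == z.1
}
```

Both accept exactly the same arguments, e.g. `parts_equal_generic(Complex(1, 1))` and `parts_equal_impl(Complex(1.0, 2.0))`; the only difference is that the second has no name for the element type.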

I propose allowing the following even simpler form (actual syntax TBD):

fn consume(z: Complex<impl>) { .. }

Using impl without a bound after it in argument type position will infer the bound to be the weakest possible. In this case, it looks at Complex<?0> and realizes that ?0: Num, so it desugars to impl Num. Formally, impl in the type parameter or associated type position of a type constructor T desugars to impl Bound, where Bound is the sum of all of the where clauses of T involving that type parameter or assoc type. In code,

struct K<.., T, ..> where T: B1, T: B2, .. { .. }
fn foo(_: K<.., impl, ..>) { .. }
// desugars to
fn foo(_: K<.., impl Sized + B1 + B2 + .., ..>) { .. }

In particular, fn foo(_: impl) desugars to fn foo(_: impl Sized). If the parameter is T: ?Sized, we get

fn foo(_: &impl) { .. }
// desugars to
fn foo(_: &impl ?Sized) { .. }

as expected.
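
Both desugared forms can already be written by hand today. A small sketch, with hypothetical function names (`discard`, `byte_len`) chosen only for illustration:

```rust
use std::mem::size_of_val;

// What `fn foo(_: impl)` would desugar to: any sized value,
// taken by value and dropped at the end of the call.
fn discard(_: impl Sized) {}

// What `fn foo(_: &impl)` would desugar to under the rule above:
// the referent is allowed to be unsized (`str`, `[T]`, ...).
fn byte_len(x: &(impl ?Sized)) -> usize {
    size_of_val(x)
}
```

So `discard(String::from("gone"))` consumes its argument, and `byte_len` accepts `"hello"` (a `&str`), a slice such as `&[1u8, 2, 3][..]`, or a plain `&7u32`.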

This is useful for avoiding repeating the same bounds all over the place for a particular struct which are easily understood from context. @Centril’s parameterized modules come to mind here:

mod<Db: Database + Serialize + 'static> { struct K<T> { .. } }
fn frob(k: K<i32, impl>) { .. }
// instead of
fn frob(k: K<i32, impl Database + Serialize + 'static>) { ..}

This doesn’t introduce any particularly wild inference; looking up what impl infers is as easy as looking up the type it’s used in.

The form fn foo(_: impl) is also useful for when you want to take ownership of something but ignore its value.

In the presence of const generics, this becomes even more powerful, since const parameters usually involve a long <..> clause. For example,

const fn last<T, const n: usize>(xs: [T; n]) -> T { xs[n - 1] }
// becomes the following, recalling that [_; _]::len can be made `const`
const fn last<T>(xs: [T; impl]) -> T { xs[xs.len() - 1] }
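
As a point of comparison, a sketch of how this has to be spelled with const generics as they exist today. Two things here are my assumptions rather than part of the proposal: the `T: Copy` bound (so the element can be read out by indexing) and dropping `const` from the functions to keep the sketch minimal. The name `last_via_len` is made up.

```rust
// The length has to be a named const parameter, even if it is only
// used once. Panics if the array is empty.
fn last<T: Copy, const N: usize>(xs: [T; N]) -> T {
    xs[N - 1]
}

// Using `len()` in the body does not help today: `N` still has to be
// declared so that the type `[T; N]` can be written in the signature.
fn last_via_len<T: Copy, const N: usize>(xs: [T; N]) -> T {
    xs[xs.len() - 1]
}
```

For example, `last([1, 2, 3])` returns `3`. The second version is the one that `[T; impl]` would let you write without the `const N: usize` clause.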

While I think I am one of the more obnoxious “enforce readability with an iron fist” people, I think this is actually a readability win. Anywhere you can use impl, the implied bounds are a single code jump away! I could also imagine this is useful for by-example macros, which might not want to repeat a bunch of user-provided bounds all over the place.

I don’t feel like bikeshedding about syntax quite yet, but I’ll mention this anyway. An alternative syntax is any, to reflect that this represents the “most universal type” available in the given position. Thus, one might write

fn foo(z: Complex<any>) { .. }

I think it’s more illustrative of what I’m proposing, but I don’t think it’s worth arguing about yet.


Complex<impl _> would feel more natural to me.


let x: Complex<_> = ... already means “let the compiler figure it out” in type position. To color the bikeshed, just using that as “minimally constrained” makes sense to me. (Then you can even use <T: Debug + _> to say minimally constrained plus debug?)


I like this! Maybe this points to a more succinct solution: constraint inference. In particular, any place where a constraint can appear, you can write _ to mean “the weakest constraint available”. For example,

fn foo<T: _>()

will infer T: Sized (boring). However, I could imagine that writing

fn foo<T: _>(z: Complex<T>)

would infer T: Sized + Num. As a corollary to this, we now get impl _ pretty much everywhere it makes sense.

Now, I will point out a problem with this; one could interpret T: _ to mean “infer the necessary bounds for whatever I do in the function body” (which is way too powerful) since this is what let x: _ does in a function body, and what _ does in a turbofish.
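
To make the existing meaning of _ concrete, here is a small sketch of the two uses mentioned above (function names are made up for the example). In both, the placeholder is filled in from the surrounding function body and return type, which is exactly the kind of context T: _ in a signature should not be looking at:

```rust
fn squares() -> Vec<u32> {
    // `_` in a `let` annotation: the element type comes from the iterator.
    let v: Vec<_> = (1u32..=3).map(|n| n * n).collect();
    v
}

fn parsed() -> Vec<i64> {
    // `_` in a turbofish: the element type (and therefore what `parse`
    // produces) is worked out from the function's return type.
    "1 2 3"
        .split(' ')
        .map(|s| s.parse().unwrap())
        .collect::<Vec<_>>()
}
```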

Rust explicitly avoids inferring types for fn items for the sake of clarity. In fact, it would be possible to infer all the types (including generics and trait bounds) for the vast majority of functions, thanks to the fact that Rust uses a modified Hindley-Milner type system, and some languages like Haskell do it. But Rust is acting strict around global functions on purpose: knowing the exact type annotations at what tend to be API boundaries makes it easier for the reader of the code to know what to expect in and out of the function. That certainty and ease would get lost as soon as trait bounds are inferred. (Which is, IMO, highly undesirable to say the least.)


I am not advocating for global type inference; in fact, I'm very opposed to it because of the reasons you describe. There are two kinds of constraint inference one can describe here:

  1. Infer whatever is required by where clauses of types in the signature. This is a lot like C++ auto. This has a very minimal readability impact, since it's pretty rare for types to have where clauses, and when they do, they're often either pretty obvious from what the type actually is, or incredibly prevalent throughout the codebase anyways (crates using diesel are an example of the latter). This is what I'm proposing.
  2. Infer whatever is required by the type's uses in the function body. This is incredibly powerful, and a readability footgun. I am very opposed to such inference, and it is why, in an above post, I said I'm slightly nervous about using _ as a symbol (since in most places, _ is H-M inference).

To summarize in code, consider the following snippet:

fn dbg<T: _>(x: T) { println!("{:?}", x); }

With (1), this code fails to compile, since T: Debug cannot be proved from the inferred constraints; T: _ only infers that T: Sized, since it can't look inside the function body. With (2), T: _ infers T: Sized + Debug, since x is passed into an internal std::fmt function that requires T: Debug.
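
Here is a sketch of what (1) would and would not cover, written out in today's Rust. The `Tagged` type and the function names are hypothetical; the point is only which bounds come from where:

```rust
use std::fmt::Debug;

// Hypothetical type whose definition carries a bound.
struct Tagged<T: Clone>(T);

// Under (1), something like `fn dup(t: Tagged<impl _>)` would mean this:
// `Clone` is the only bound, because it is the only one `Tagged` requires.
// Today the bound has to be repeated by hand.
fn dup<T: Clone>(t: Tagged<T>) -> (T, T) {
    (t.0.clone(), t.0)
}

// A bound that only the body needs (`Debug` here) is not part of
// `Tagged`'s definition, so under (1) it would still be written out.
fn show<T: Clone + Debug>(t: Tagged<T>) -> String {
    format!("{:?}", t.0)
}
```

In other words, (1) would only ever fill in the `Clone` part; the `Debug` in `show` stays visible in the signature.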

This is actually not entirely true. While Rust does not infer types, it does infer another kind: lifetimes, via lifetime elision and in-band lifetime rules. Lifetimes behave very much like types in the context of type parameters, though they aren't quite as complicated.
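
A quick sketch of what lifetime elision does, for anyone following along (function names are made up). The two signatures below mean the same thing; in the first one the compiler fills in the lifetime by looking at the signature alone, never at the body:

```rust
// Elided: the output lifetime is tied to the single input reference.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// The same signature with the lifetime written out.
fn first_word_explicit<'a>(s: &'a str) -> &'a str {
    s.split_whitespace().next().unwrap_or("")
}
```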

This sort of confusion is something I've encountered a few times proposing new kinds of inference. I think it's a matter of me not communicating which parts of the source an inference context is allowed to see; in this case, I only want to allow constraint inference to see the where clauses of the types the constrained type is part of.

I guess I have to reject this idea in general, as opposed to what I usually do.

The main reason is that, as a reader of the code, I expect to be able to predict what my code has to provide when I call a function by looking only at its signature.

When I look at fn consume(z: Complex<impl Num>) above, I know it will work if I pass something that is a Num. On the other hand, what can I expect for fn consume(z: Complex<impl>)? I have no idea. Maybe I can only send a Drop? Or does it have to be a Debug? If I make a decision that looks OK today and tomorrow you change the implementation, my assumption may break.

So this kind of thing should at least NEVER be pub. If it is ever useful, I would say its scope has to be limited to the inside of a function body.


To simplify:

If you see fn consume(x: Complex<_>), then any Complex can be passed in. The only bounds that can apply are those that Complex implies on its own.

To call consume, you need to already own a Complex<T>; by mere virtue of this you have already satisfied the constraints, since we have Complex<T> where T: Num in the definition. Complex<impl _> would never require traits you can’t learn from looking at the where clauses of Complex<T>. By the time you’re calling consume, you don’t care what T is.

In this simple case that seems to be true, but I don't know about other, harder cases. Anyway, I have already seen some duplication of trait bounds between structs and impl blocks. I think the other way around is to imply the trait bounds that a given struct requires. But Complex<_> should at least be fully expanded by rustdoc.

Today, the general wisdom is to avoid bounding type parameters on structures. This is even if the type makes little to no sense without the bounds.

The reason for that is that functionality that doesn’t care about the type can avoid having to constrain the type. If you were able to use impl _ (or whatever syntax) to elide those meaningless bounds (maybe you don’t even get access to them since you didn’t write them?) I could see this changing to support bounding struct type arguments to the sensible types.

Obviously existing structures can’t be changed, but new ones could be designed with this in mind.
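
For anyone not familiar with that convention, a minimal sketch of it, with a made-up `Labeled` type: the struct carries no bound, and only the impl block that actually needs `Display` asks for it.

```rust
use std::fmt::Display;

// No bound on the struct itself...
struct Labeled<T> {
    label: &'static str,
    value: T,
}

// ...so functionality that doesn't care about `T` works for every `T`.
impl<T> Labeled<T> {
    fn into_value(self) -> T {
        self.value
    }
}

// Only the methods that need the bound ask for it.
impl<T: Display> Labeled<T> {
    fn render(&self) -> String {
        format!("{}: {}", self.label, self.value)
    }
}
```

With this layout, `Labeled { label: "v", value: vec![1] }.into_value()` compiles even though `Vec<i32>` is not `Display`. If the bound moved onto the struct, every function mentioning `Labeled<T>` would have to repeat it, which is the repetition an elided bound would remove.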
