[lang-team-minutes] Implied bounds


#1

Yesterday in the Rust lang team meeting we discussed an idea that has been floating around for some time. I think I first proposed it on my blog in 2014, using the term “implied bounds”. In fact, the idea exists in Rust today, but in limited form. (We briefly discussed other names, but I decided to just stick with “implied bounds”.) The discussion was about the implications of generalizing it.

Implied bounds aims to reduce the phenomenon where you have to copy-and-paste the same set of bounds all over the place again and again. In a nutshell, the idea is that whenever a type declares a bound, things that “use” that type shouldn’t have to repeat it. So, imagine that HashSet<K> required that K: Hash (at the moment, it doesn’t, but that’s another story):

struct HashSet<K: Hash> { ... }

In that case, if I write a function like use_set(), which takes a HashSet<K>, then I should not have to declare that K: Hash:

fn use_set<K: Hash>(set: HashSet<K>) { ... }
//            ^^^^ required today, but not under an implied bounds proposal

Similarly when I write impls, I should not have to add those bounds:

impl<K: Hash> Eq for HashSet<K> { }
//      ^^^^ required today, but not under an implied bounds proposal

The reasoning is that we already require that the types of all function arguments are “well-formed”, meaning that their basic correctness criteria are met (the burden of showing that the types are well-formed falls to the caller of the function; there is a soundness issue with this at the moment, but that is somewhat separate and I hope to close it by improving the trait system). We use similar rules for impls as well: the impl is permitted to assume its input types are well-formed, per RFC 1214.

In fact, as RFC 1214 notes, we have implied bounds today, but only for “outlives” relations like 'a: 'b, T: 'a, or T::Item: 'a. This rule was added around 2013, to address some issues that arose with trait impls. The basic idea, then, is to extend this approach to more – or maybe all – of the WF requirements for a type.
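Today's outlives-only implied bound can be seen in a small illustrative example (the function name `first` is made up for this sketch):

```rust
// `&'a T` appearing in the signature implies `T: 'a`; we never have to
// write `where T: 'a` ourselves -- the well-formedness of the argument
// type carries the bound into the function's environment.
fn first<'a, T>(x: &'a T, _y: &'a T) -> &'a T {
    // The body may rely on the implied `T: 'a`.
    x
}

fn main() {
    let (a, b) = (10u32, 20u32);
    assert_eq!(*first(&a, &b), 10);
}
```

Without this rule, `&'a T` would not even be a well-formed argument type unless `T: 'a` were declared explicitly.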

How it works today

I’ll briefly cover what we do today based on this example:

struct Foo<'a, T: 'a + Eq> { ... }

impl<'a, T> Foo<'a, T> {
    // no need to say `T: 'a`, it's implied; but omitting `T: Eq` is an error.
}

As I said, the impl gets to assume its “input types” (here, Foo<'a, T>) are well-formed. So the first step is to figure out what must be true for Foo<'a, T> to be well-formed. This is determined just by looking at the struct declaration and copying its where-clauses, of which there are two:

  • T: 'a
  • T: Eq

Next, today at least, we “elaborate” those by expanding out supertraits (e.g., T: Eq => T: PartialEq) and a few other things, and then we filter the results to extract outlives relations. This results in one predicate (T: 'a). This is what the impl is allowed to assume is true; we call that the “environment” E.

Finally, we then check that the input types are well-formed in this environment E. In other words, assuming that T: 'a is true, we try to prove that Foo<'a, T> is well-formed. To do that, we have to show that the two where-clauses are satisfied:

  • T: 'a is clearly satisfied, because it is in our environment, and hence we can assume it’s true.
  • T: Eq is not satisfied – it may or may not be true, depending on T (note that we did not add this to our environment).
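Putting the pieces together, here is a compilable sketch of the example (with `PhantomData` standing in for the elided fields): the impl may omit `T: 'a`, which is implied, but must still repeat `T: Eq`:

```rust
use std::marker::PhantomData;

struct Foo<'a, T: 'a + Eq> {
    marker: PhantomData<&'a T>,
}

// `T: 'a` is implied by the well-formedness of `Foo<'a, T>`, so it can
// be omitted; `T: Eq` is not implied today and must be repeated.
impl<'a, T: Eq> Foo<'a, T> {
    fn new() -> Self {
        Foo { marker: PhantomData }
    }
}

fn main() {
    let _foo: Foo<'static, u32> = Foo::new();
}
```

Dropping `T: Eq` from the impl is rejected today; under the generalized proposal it would be accepted.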

Just drop the filter?

The simplest version of this proposal just scraps the filtering step – instead of adding only the outlives requirements to the environment, we add everything. Interestingly, this makes the final check redundant: there is no need to check that the input types are well-formed, because we know they will be – we have assumed everything we would otherwise need to show.

But this has some side-effects that are not necessarily desirable. For example, imagine I have some type that is not Hash:

struct NotHash { ... }

and then I wrote a function that takes a HashSet<NotHash>:

fn foo(s: HashSet<NotHash>) { ... }

Following our procedure, we see that, for HashSet<NotHash> to be well-formed, NotHash: Hash must be true. So we add that to the environment (things we can assume are true). Now we can type-check the body under this assumption – but the assumption is false! This isn’t a soundness issue, but we won’t know anything is wrong until later, when somebody tries to call foo(): the call fails because the caller must prove NotHash: Hash (to show that HashSet<NotHash> is WF), and cannot. It’d be nicer to give this error at the definition of foo().
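A compilable sketch of the status quo, using a hand-rolled `RequiresHash` in place of the hypothetical bound-carrying `HashSet`:

```rust
use std::hash::Hash;
use std::marker::PhantomData;

// Stand-in for a `HashSet<K>` that declares `K: Hash`.
struct RequiresHash<K: Hash> {
    marker: PhantomData<K>,
}

struct NotHash;

// Today this definition is rejected on the spot ("the trait bound
// `NotHash: Hash` is not satisfied"). Under the unfiltered rule it
// would be accepted, and the error would surface only at call sites.
// fn foo(_s: RequiresHash<NotHash>) {}

fn ok(_s: RequiresHash<u32>) {} // fine: `u32: Hash`

fn main() {
    let _ = NotHash;
    ok(RequiresHash { marker: PhantomData });
}
```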

Interactions with inference

Another challenge with doing no filtering at all is that it can lead to bad interactions with some of our current inference rules. For example, imagine that I have this code today:

fn foo<'a, T: Hash>(s: HashSet<&'a T>) { 
    // implied bound today: `T: 'a`
}

If we use the unfiltered version, we would also have the implied bound that &'a T: Hash. That is true, but it’s less general than we could have. In particular, because T: Hash, we know that for<'x> &'x T: Hash – that is, &T is Hash no matter what lifetime it has. But we have this fact in our environment that says &'a T: Hash is true for some specific lifetime. This interacts poorly with our current inference rules, since it will cause us to sometimes overly constrain region variables, forcing them to be 'a when they could in fact be a shorter lifetime. I think we can solve this, but it’s an interesting interaction, and it’s complicated by our attempts to make the trait system agnostic to the specifics of what region equality etc means. (In fact, I did an experiment where I implemented implied bounds in an effort to use them instead of the current T: ?Sized defaulting approach. It ran aground on precisely these sorts of ambiguities – and it didn’t infer enough Sized bounds to be satisfactory, but that’s a separate issue.)
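The more general, higher-ranked fact can be stated and used directly today; a small sketch (`hash_ref` is an illustrative name):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// When `T: Hash`, the reference `&'x T` is `Hash` for *every* lifetime
// `'x`; the `for<'x>` bound states exactly that general fact, rather
// than `Hash` at one specific lifetime.
fn hash_ref<T>(t: &T) -> u64
where
    for<'x> &'x T: Hash,
{
    let mut h = DefaultHasher::new();
    t.hash(&mut h); // uses `&T: Hash` at whatever lifetime `t` happens to have
    h.finish()
}

fn main() {
    // `u32: Hash`, so `for<'x> &'x u32: Hash` holds.
    assert_eq!(hash_ref(&1u32), hash_ref(&1u32));
}
```

The concern in the text is that an unfiltered implied bound would instead put the less general `&'a T: Hash` (at one fixed `'a`) into the environment, which can over-constrain region inference.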

Do we want some filters?

In some sense, an unfiltered rule is kind of surprising anyhow. One might consider a filtering rule along the lines of “we will only keep bounds that apply to type parameters or projections”, though I don’t know exactly what such a rule would be. With any such rule, there’s a risk that it winds up surprising people – because they can’t predict which where-clauses they will need and which they won’t – but then a rule that is too open is also kind of surprising, as in the NotHash: Hash example.

One possible solution might be to keep the full implied bounds, but rephrase our WF check for impls. Instead of the last step being “prove that the input types are WF given our environment”, we might instead check, with no environment, whether there exist any types that could make the input types well-formed. In that case, impl<T> Foo for HashSet<Option<T>> would be allowed, because clearly Option<T>: Hash for some T, but impl Foo for HashSet<NotHash> would not, because NotHash: Hash is never true. I’d have to experiment a bit with this.

Questions, and drawbacks

Towards the end of the meeting we had a few open questions and drawbacks. Open questions:

- If we adjust the filter to add more things into the environment by default, is that backwards compatible (presuming we can resolve the inference issues)? Presumably yes, since that is the premise for considering this at all.
- What filters do we want? Can we find a way to get the error for `impl HashSet<NotHash>` while still not filtering? Maybe there is a way to ask "could there ever exist types that satisfy these bounds"? Seems maybe doable. (Discussed in the previous section.)

Drawbacks:

- Removing a bound from a struct becomes a breaking change
    - There is this perception that removing bounds, since it accepts *more* types, should not be a breaking change. This isn't actually true in the strictest sense, in that it can interact with inference, but it's particularly untrue for structs/enums/traits under this proposal, since it will require others to change their outward facing signatures to include the bound explicitly.
    - Technically, removing bounds from fns *can* break code today too, particularly around closures, and it can also affect inference.

Some other topics that came up

Const impls. @eddyb raised some thoughts they have had around const impls. In particular, they wanted a proposal like const impl Foo for HashSet<T> where T: Hash which would mean something like “this is a const impl if T: Hash has a const impl”. This seems to work well with the implied bounds, maybe?

Thorny thickets. @arielb1 left some notes in the etherpad asking about various corner cases. For example, there is a question of how much to elaborate the implied bounds. You can certainly have a set of bounds that grows forever:

struct Foo<T> where Bar<Box<T>>: Eq { }
struct Bar<T> where Foo<Box<T>>: Eq { }

The problem here is that for Foo<T> to be WF, Bar<Box<T>>: Eq must hold. Depending on just how we define things, this might mean we have to check that Bar<Box<T>> is WF, which requires Foo<Box<Box<T>>>: Eq, which in turn requires Bar<Box<Box<Box<T>>>>: Eq. And so on. These sorts of cycles are possible today of course and generally lead to recursion errors. I believe that for the purposes of implied bounds, we would evaluate until we reach a fixed point, and error out if we fail to do so (overflow), but I am not entirely sure. There are interactions with “coinduction checking” here as well, of course (which we support for auto traits). We’ll have to work this out, but it seems like a detail to be addressed later.

Thoughts?

Thanks all for listening.


Lang team proactive agenda
#2

IIRC, currently we don’t let “global” predicates enter the parameter environment anyway, unless you add things like regions. We need to improve our story there – you can encounter ICEs – and a “predicate will never hold” warning could go along with the solution.


#3

I had always thought of “implied bounds” as a way to capture trait constraints inside structs:

struct SomeStruct<T> { value: T, where T: SomeTrait }

where you have to explicitly pattern match SomeStruct to acquire the trait bound (much like how GADTs work in Haskell):

fn foo<T>(s: SomeStruct<T>) {
     // here: T is unconstrained
     match s {
         SomeStruct { value: t } => {
             // here: T acquires the constraint SomeTrait
         }
     }
}

This would have been a lot more powerful, as it would allow the user to carry around trait constraints as explicit term-level objects (you can pretty much emulate impl Trait and more), while avoiding surprises that might arise from a possibly looping constraint-elaboration algorithm, since the user retains full control over how the constraints get inferred.


#4

That is a different feature =)

I think the example becomes more interesting with enum variants, in which case this is commonly referred to as GADTs.


#5

I think it would preempt a lot of the common use cases for implied bounds though!

Are you aware of any existing threads on this GADT-like feature?


#6

Are all struct bounds going to be promoted to implied bounds, or will they be opt-in?


#7

The assumption was that they would all be promoted.


#8

Please no. Proposals like these make me feel very sad, as I fear there is no care for the readability and clarity of programs.

I think this feature will make programs less readable and increase maintenance costs. When I first found out that Rust makes you write these bounds explicitly, I really liked that. If you are too lazy to write this out, just let a tool do it!

Making a post because on this forum you can’t -1 posts.

I don’t have the time (nor will it be of any use, you’ll discard it anyway) to write a more lengthy post, as I guess you have already decided to do this and don’t really care about my standpoint that Rust should remain explicit and easy to understand.


#9

I’m in favor of some form of this, but there needs to be some sort of syntactic limit, or this will propagate totally unreasonably as I understand it:

struct Foo<T>(HashMap<T, String>); // look ma, no bounds

struct Bar<T>(Foo<T>); // even worse

Both of these seem clearly problematic, because the type which provides the bound is not visible in the public surface of the type. In general, I’m not in favor of propagating bounds across structs as this would imply.


I think the choice to use functions illustrates this rather poorly. To me, the really clear win is in impls, which are usually colocated & repetitive:

struct Foo<T: Bar + Baz> { ... }

impl<T: Bar + Baz> Foo<T> { }
impl<T: Bar + Baz + Clone> Clone for Foo<T> { }
impl<T: Bar + Baz> PartialEq<Foo<T>> for Foo<T> { }
// etc

This seems like a clear case where repeating this bound isn’t buying you anything.

Functions are more nuanced, but I think if the type appears in the surface API of the function and the parameter is fully abstract, its still a win.


#10

Both of these seem clearly problematic, because the type which provides the bound is not visible in the public surface of the type. In general, I’m not in favor of propagating bounds across structs as this would imply.

+1!

To me, the really clear win is in impls, which are usually colocated & repetitive

Sometimes they are not, though. You can have impls at the other end of the crate, or even in a different crate. So unless you want to add (complex and unnatural) restrictions to only allow implied bounds inside the same module, I think it should not be done.

Instead we could have something like:

for<T: Bar + Baz> {
    struct Foo<T> { ... }

    impl Foo<T> { }
    impl Clone for Foo<T> { }
    impl PartialEq<Foo<T>> for Foo<T> { }
}

Or with a different keyword if for would make things ambiguous.


#11

I’m not concerned about textual locality even so long as the bounds are being inferred because of the receiver type of the impl. In my experience I tend to carry the knowledge of the bounds on that type with me as I think about it, and it always feels like busy work to elaborate them.

The for keyword is not a great idea here IMO because of its dual use with HRTB. However, Niko has also mentioned to me the (very unformed) idea of allowing namespaces to take params which are shared by all of their elements. I am concerned about the extent to which that starts to look like ML modules. (I know a lot of people love ML modules, but I don’t know how we have room for them in our complexity budget.)


#12

When elaborating something gets too burdensome for me, I make a macro and am happy.


#13

Unfortunately, the macro hygiene rules often make that difficult.

And besides, macros should be a last resort.


#14

And besides, macros should be a last resort.

I think macros as such are really great, so I disagree that they should be a last resort. Also, changing the language for everybody can potentially do far more harm than using a macro inside your own code.


#15

Having constraint aliases could at least partially alleviate this. Perhaps you could use a short private alias and just have rustdoc elaborate them later.

type constraint TC = Bar + Baz;

struct Foo<T: TC> { ... }

impl<T: TC> Foo<T> { }
impl<T: TC> Clone for Foo<T> { }
impl<T: TC> PartialEq<Foo<T>> for Foo<T> { }

Not as clean as using for, I suppose.
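For what it’s worth, something close to a constraint alias can be emulated today with a helper trait plus a blanket impl (`TC`, `Bar`, and `Baz` are placeholder names, as above):

```rust
trait Bar {}
trait Baz {}

// Emulated constraint alias: `T: TC` holds exactly when `T: Bar + Baz`,
// thanks to the blanket impl below.
trait TC: Bar + Baz {}
impl<T: Bar + Baz> TC for T {}

struct Foo<T: TC> {
    value: T,
}

impl<T: TC> Foo<T> {
    fn new(value: T) -> Self {
        Foo { value }
    }
}

impl Bar for u32 {}
impl Baz for u32 {}

fn main() {
    let f = Foo::new(7u32);
    assert_eq!(f.value, 7);
}
```

Since `Bar` and `Baz` are supertraits of `TC`, code bounded on `T: TC` can use both; the downside is that the alias shows up in rustdoc as `TC` rather than the elaborated bounds.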


#16

If there are a lot of type parameters or you want to give them descriptive names rather than the usual single characters, even listing the names can be busywork-ish. In that case the for idea starts to look better…

But it wouldn’t be as widely applicable as implied bounds.


#17

Given the trouble caused in the Scala language by implicits, if this Rust feature gets implemented I suggest keeping it unstable (nightly-only) for a very long time, like two years (or more).


#18

It seems to me there exist alternative designs that would deserve to be compared to this proposal. I’m thinking for example about something like:

// bound is intended to be a keyword
bound<T> where T : Bar + Foo<T> {
  // T is constrained by the bound above whenever used in the current scope
  trait Trait<T> { … }

  impl<T,U> Struct<T, U> { … }

  impl<T,U> Trait<T> for Struct<T, U> { … }

  …
}

#19

I don’t see the connection between this proposal and Scala implicits.


#20

The connection is “implicitness” 🙂