Yesterday in the Rust lang team meeting we discussed an idea that has been floating around for some time. I think I first proposed it on my blog in 2014, using the term “implied bounds”. In fact, the idea exists in Rust today, but in limited form. (We briefly discussed other names, but I decided to just stick with “implied bounds”.) The discussion was about the implications of generalizing it.
Implied bounds aims to reduce the phenomenon where you have to copy-and-paste the same set of bounds all over the place again and again. In a nutshell, the idea is that whenever a type declares a bound, things that “use” that type shouldn’t have to repeat it. So, imagine that `HashSet<K>` required that `K: Hash` (at the moment, it doesn’t, but that’s another story):

```rust
struct HashSet<K: Hash> { ... }
```

In that case, if I write a function like `use_set()`, which takes a `HashSet<K>`, then I should not have to declare that `K: Hash`:

```rust
fn use_set<K: Hash>(set: HashSet<K>) { ... }
//            ^^^^ required today, but not under an implied bounds proposal
```
Similarly when I write impls, I should not have to add those bounds:
```rust
impl<K: Hash> Eq for HashSet<K> { }
//      ^^^^ required today, but not under an implied bounds proposal
```
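To make the copy-paste concrete, here is a small self-contained sketch of what today’s rules require, using a stand-in `MyHashSet` type that does declare the bound (the real `std::collections::HashSet` does not, as noted above):

```rust
use std::hash::Hash;
use std::marker::PhantomData;

// Stand-in for the hypothetical `HashSet<K: Hash>` above.
struct MyHashSet<K: Hash> {
    _marker: PhantomData<K>,
}

// Today, every function that mentions `MyHashSet<K>` must repeat `K: Hash`;
// deleting the bound here is rejected with "the trait bound `K: Hash` is not
// satisfied".
fn use_set<K: Hash>(_set: MyHashSet<K>) {}

// ...and so must every impl whose input types mention it:
impl<K: Hash> Clone for MyHashSet<K> {
    fn clone(&self) -> Self {
        MyHashSet { _marker: PhantomData }
    }
}
```

Under the proposal, the `K: Hash` on `use_set` and on the impl could be dropped, since `MyHashSet<K>` being well-formed already implies it.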
The reasoning is that we already require that the types of all function arguments are “well-formed”, meaning that their basic correctness criteria are met (the burden of showing that the types are well-formed falls to the caller of the function; there is a soundness issue with this at the moment, but that is somewhat separate and I hope to close it by improving the trait system). We use similar rules for impls as well: the impl is permitted to assume its input types are well-formed, per RFC 1214.
In fact, as RFC 1214 notes, we have implied bounds today, but only for “outlives” relations like `'a: 'b`, `T: 'a`, or `T::Item: 'a`. This rule was added around 2013, to address some issues that arose for trait impls. The basic idea then is to extend this approach to more – or maybe all – of the WF requirements for a type.
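As a quick illustration of the outlives case that already works, here is a small sketch (my own example, not from the meeting): the `T: 'a` declared on the struct does not need to be repeated on the function that consumes it.

```rust
struct Ref<'a, T: 'a> {
    value: &'a T,
}

// No `where T: 'a` here: it is implied from the fact that `Ref<'a, T>`
// appears in the signature and must be well-formed.
fn get<'a, T>(r: Ref<'a, T>) -> &'a T {
    r.value
}

fn main() {
    let x = 10;
    let r = Ref { value: &x };
    assert_eq!(*get(r), 10);
}
```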
How it works today
I’ll briefly cover what we do today based on this example:
```rust
struct Foo<'a, T: 'a + Eq> { ... }

impl<'a, T> Foo<'a, T> {
    // no need to say `T: 'a`, it's implied; but `T: Eq` is missing.
}
```
As I said, the impl gets to assume its “input types” (here, `Foo<'a, T>`) are well-formed. So the first step is to figure out what must be true for `Foo<'a, T>` to be well-formed. This is determined just by looking at the struct declaration and copying its where-clauses, of which there are two:

- `T: 'a`
- `T: Eq`

Next, today at least, we “elaborate” those by expanding out supertraits (e.g., `T: Eq => T: PartialEq`) and a few other things, and then we filter the results to extract outlives relations. This results in one predicate (`T: 'a`). This is what the impl is allowed to assume is true; we call that the “environment” E.

Finally, we then check that the input types are well-formed in this environment E. In other words, assuming that `T: 'a` is true, we try to prove that `Foo<'a, T>` is well-formed. To do that, we have to show that the two where-clauses are satisfied:

- `T: 'a` is clearly satisfied, because it is in our environment, and hence we can assume it’s true.
- `T: Eq` is not satisfied – it may or may not be true, depending on `T` (note that we did not add this to our environment).
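So to make that impl compile today, `T: Eq` has to be written out by hand, while `T: 'a` can still be left off thanks to the existing outlives rule. A sketch of the corrected version (with a concrete field added so it compiles on its own):

```rust
struct Foo<'a, T: 'a + Eq> {
    r: &'a T,
}

// `T: Eq` must be repeated; `T: 'a` is already implied and can be omitted.
impl<'a, T: Eq> Foo<'a, T> {
    fn get(&self) -> &'a T {
        self.r
    }
}
```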
Just drop the filter?
The simplest version of this proposal just scraps the filtering in that second step – instead of only adding outlives requirements to the environment, we add everything. Interestingly, this makes the third step redundant. There is no need to check if the input types are well-formed, because we know that they will be, since we have assumed everything that we need to show.
But this has some side-effects that are not necessarily desirable. For example, imagine I have some type that is not `Hash`:

```rust
struct NotHash { ... }
```

and then I write a function that takes a `HashSet<NotHash>`:

```rust
fn foo(s: HashSet<NotHash>) { ... }
```

Following our procedure, we see that, for `HashSet<NotHash>` to be well-formed, `NotHash: Hash` must be true. So we add that to the environment (things we can assume are true). Now we can type-check the body under this assumption – but the assumption is false! This isn’t a soundness issue, but we won’t know anything is wrong until later, when somebody tries to call `foo()`. They will fail because they will try to prove that `NotHash: Hash` (so they can show that `HashSet<NotHash>` is WF) and they will fail. It’d be nicer to give this error at the definition of `foo()`.
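To see where the error lands today versus under the unfiltered rule, here is a sketch using the same stand-in `MyHashSet<K: Hash>` as above; as written, today’s compiler rejects it at the definition, which is exactly the early error the unfiltered rule would give up:

```rust
use std::hash::Hash;
use std::marker::PhantomData;

struct MyHashSet<K: Hash> {
    _marker: PhantomData<K>,
}

struct NotHash;

// Today: rejected right here with error[E0277]: the trait bound
// `NotHash: Hash` is not satisfied.
//
// Under the unfiltered implied-bounds rule sketched above, `NotHash: Hash`
// would instead be assumed in `foo`'s environment, and the error would only
// show up at each call site that tries to prove `MyHashSet<NotHash>` is WF.
fn foo(_s: MyHashSet<NotHash>) {}
```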
Interactions with inference
Another challenge with doing no filtering at all is that it can lead to bad interactions with some of our current inference rules. For example, imagine that I have this code today:
```rust
fn foo<'a, T: Hash>(s: HashSet<&'a T>) {
    // implied bound today: `T: 'a`
}
```
If we use the unfiltered version, we would also have the implied bound that `&'a T: Hash`. That is true, but it’s less general than what we could have. In particular, because `T: Hash`, we know that `for<'x> &'x T: Hash` – that is, `&T` is `Hash` no matter what lifetime it has. But we have this fact in our environment that says `&'a T: Hash` is true for some specific lifetime. This interacts poorly with our current inference rules, since it will cause us to sometimes overly constrain region variables, forcing them to be `'a` when they could in fact be a shorter lifetime. I think we can solve this, but it’s an interesting interaction, and it’s complicated by our attempts to make the trait system agnostic to the specifics of what region equality etc. means.
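The `for<'x>` fact mentioned above can be checked directly today; here is a small sketch of my own, just to illustrate the higher-ranked version of the bound:

```rust
use std::hash::Hash;

// Holds only if `&'x T: Hash` for *every* lifetime `'x`.
fn assert_hash_for_any_lifetime<T>()
where
    for<'x> &'x T: Hash,
{
}

fn demo<T: Hash>() {
    // Accepted: std's blanket `impl<T: Hash + ?Sized> Hash for &T` is generic
    // over the lifetime, so `T: Hash` is enough to discharge the bound.
    assert_hash_for_any_lifetime::<T>();
}

fn main() {
    demo::<String>();
}
```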
(In fact, I did an experiment where I implemented implied bounds in an effort to use them instead of the current `T: ?Sized` defaulting approach. It ran aground on precisely these sorts of ambiguities – and it didn’t infer enough `Sized` bounds to be satisfactory, but that’s a separate issue.)
Do we want some filters?
In some sense, an unfiltered rule is kind of surprising anyhow. One might consider a filtering rule that says something like “we will only keep bounds that apply to type parameters or projections” or something. I don’t know exactly what such a rule would be. With any such rule, there’s a risk that it winds up surprising people – because they can’t predict which where-clauses they will need and which they won’t – but then a rule that is too open is also kind of surprising, as in the example of `NotHash: Hash`.
One possible solution might be to keep the full implied bounds, but rephrase our WF check for impls. Instead of the last step being “prove that the input types are WF given our environment”, we might have a step like “with no environment, check whether there exist any types that could make the input types well-formed”. In that case, `impl<T> Foo for HashSet<Option<T>>` would be allowed, because clearly `Option<T>: Hash` for some `T`, but `impl Foo for HashSet<NotHash>` would not, because `NotHash: Hash` is never true. I’d have to experiment a bit with this.
Questions, and drawbacks
Towards the end of the meeting we had a few open questions and drawbacks. Open questions:
- If we adjust the filter to add more things into the environment by default, is that backwards compatible (presuming we can resolve the inference issues)? Presumably yes, since that is the premise for considering this at all.
- What filters do we want? Can we find a way to have the error for `impl HashSet<NotHash>` while still not filtering? Maybe there is a way to ask "could there ever exist types that satisfy these bounds"? Seems maybe doable. (Discussed in previous section)
Drawbacks:
- Removing a bound from a struct becomes a breaking change
  - There is this perception that removing bounds, since it accepts *more* types, should not be a breaking change. This isn't actually true in the strictest sense, in that it can interact with inference, but it's particularly untrue for structs/enums/traits under this proposal, since it will require others to change their outward-facing signatures to include the bound explicitly.
  - Technically, removing bounds from fns *can* break code today too, particularly around closures, and it can also affect inference.
Some other topics that came up
Const impls. @eddyb raised some thoughts they have had around const impls. In particular, they wanted a proposal like `const impl Foo for HashSet<T> where T: Hash`, which would mean something like “this is a const impl if `T: Hash` has a const impl”. This seems to work well with the implied bounds, maybe?
Thorny thickets. @arielb1 left some notes in the etherpad asking us about various corner cases. For example, there is a question of how much to elaborate the implied bounds. You can certainly have a set of bounds that grows forever:
```rust
struct Foo<T> where Bar<Box<T>>: Eq { }
struct Bar<T> where Foo<Box<T>>: Eq { }
```
The problem here is that for `Foo<T>` to be WF, `Bar<Box<T>>: Eq` must hold. Depending on just how we define things, this might mean we have to check that `Bar<Box<T>>` is WF. For that to be true, `Foo<Box<Box<T>>>: Eq` would have to hold, which in turn requires `Bar<Box<Box<Box<T>>>>` to be WF. And so on. These sorts of cycles are possible today of course and generally lead to recursion errors. I believe that for the purposes of implied bounds, we would evaluate until we reach a fixed point, and error out if we fail to do so (overflow), but I am not entirely sure. There are interactions with “coinduction checking” here as well, of course (which we support for auto traits). We’ll have to work this out but it seems like a detail to be addressed later.
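To get a feel for how such ever-growing obligation chains behave today, here is a sketch of the analogous situation with trait impls rather than struct WF requirements (my own reduction; the exact diagnostics may differ):

```rust
trait Grow {}

// Proving `Box<T>: Grow` requires `Box<Box<T>>: Grow`, which requires
// `Box<Box<Box<T>>>: Grow`, and so on without end.
impl<T> Grow for Box<T> where Box<Box<T>>: Grow {}

fn assert_grow<T: Grow>() {}

fn main() {
    // error[E0275]: overflow evaluating the requirement `Box<Box<u32>>: Grow`
    assert_grow::<Box<u32>>();
}
```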
Thoughts?
Thanks all for listening.