So, RFC 213 was accepted some time ago, and included provisions for integrating type parameter defaults into inference. These are hotly desired by some on the libs team (@Gankra, I’m looking at you) for enabling more ergonomic use cases. @jroesch recently implemented the missing support (with one bugfix pending review). Before we ungate this feature, though, I wanted to raise the question of what the proper interaction with integral fallback ought to be, to make sure we are all in agreement.
This decision is interesting not only because we want to avoid as many surprises as possible, but also because it has some impact on backwards compatibility. That said, the crater runs we have done indicate zero regressions (by which I mean neither new compile errors nor changed runtime behavior). Read on for details.
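For context, type parameter defaults on type definitions are already stable today; the open question is only how defaults feed into inference. For instance, `HashMap`'s hasher parameter defaults to `RandomState`, so it can usually be omitted:

```rust
use std::collections::HashMap;
use std::collections::hash_map::RandomState;

fn main() {
    // HashMap is declared (roughly) as HashMap<K, V, S = RandomState>,
    // so the third type parameter rarely needs to be spelled out.
    let explicit: HashMap<String, u32, RandomState> = HashMap::new();
    let defaulted: HashMap<String, u32> = HashMap::new(); // same type
    assert_eq!(explicit.len(), defaulted.len());
}
```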
Example the first
In this first example, we have a user-defined fallback of `u64` used with an integer literal:

```rust
fn foo<T=u64>(t: T) { ... }
//     ~~~~~
//     |
//     Note the presence
//     of a user-supplied default here.

fn main() { foo::<_>(22) }
//                ^
//                |
//                What type gets inferred here?
```
The question at hand is what type gets inferred for the type parameter `T`. On the one hand, the user specified a default of `u64`. On the other, integer literals typically fall back to `i32`. So which should we pick?
There are a couple of possibilities here:
1. Error. The most conservative route would be to report an error if there are multiple defaults and they are not all the same type. This might be unfortunate, since one of the reasons people want type default fallback is to help inform integer literal inference a bit.
2. Prefer the integer literal default, `i32`. We could give `i32` preference. Nobody I’ve spoken to actually expects this behavior, but it does have the virtue of being backwards compatible. (More on backwards compatibility below.)
3. Prefer the user default, `u64`. In informal polls, this is what everyone expects. It is also what the RFC specifies.
The branch as currently implemented takes option 3, preferring the user default. I think that for this specific example, `u64` is definitely the less surprising result – as I said, at least for the people I’ve spoken to, it is universally what is expected. An error is, however, the most conservative option.
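To make the options concrete, here is a rough sketch of what each resolution would mean, written with explicit turbofish instead of the (still feature-gated) default syntax:

```rust
fn foo<T: std::fmt::Debug>(t: T) {
    println!("{:?}", t);
}

fn main() {
    // Option 3 (prefer the user default): foo::<_>(22) resolves as if written:
    foo::<u64>(22); // 22 is typed as u64
    // Option 2 (prefer the literal default): it would instead resolve as:
    foo::<i32>(22); // 22 falls back to i32
}
```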
Example the second
OK, let’s consider a twist on the previous example. In this case, the user-defined fallback is not an integral type:
```rust
fn foo<T=char>(t: T) { ... }
//     ~~~~~~
//     |
//     Note the presence
//     of a user-supplied default here.

fn main() { foo::<_>(22) }
//                ^
//                |
//                What type gets inferred here?
```
Now the question is a bit different. The type variable has one default, `char`, but it is also connected to an integer literal type (with fallback `i32`). Integer literals are naturally incompatible with `char`.
So, again there are several choices:
1. Error due to multiple defaults. Again, the most conservative route would be to error, as there are multiple defaults (`char`, `i32`) that apply to a single unresolved variable.
2. Prefer the integer literal default (`i32`). This is perhaps somewhat less surprising than it was before, given that `char` is clearly not a good choice.
3. Error due to preferring the user-defined default. If we were to indiscriminately prefer the user-defined default, then we’d get an error, because the type of an integer literal cannot be `char`. This is what the RFC chose, both because it seemed like a clearer strategy to reason about and because of concerns about future compatibility with more flexible literals (see the section below).
I’m not sure which outcome is less surprising in this example. For one thing, I didn’t do a lot of polling. =) I can imagine that people expect `i32` as the answer here. However, the concerns about more flexible literals (discussed below) are perhaps valid as well.
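As a concrete sketch again (with explicit turbofish standing in for the inference result), preferring the user default yields a type error here, while preferring the literal default compiles:

```rust
fn foo<T>(t: T) -> T {
    t
}

fn main() {
    // Option 3 (prefer the user default) would amount to the following,
    // which is rejected: an integer literal cannot have type char.
    // foo::<char>(22); // ERROR: mismatched types

    // Option 2 (prefer the literal default) would amount to this:
    let x = foo::<i32>(22); // compiles; 22 falls back to i32
    assert_eq!(x, 22);
}
```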
Implementation strategies
There are various impl strategies we might adopt. Here are the outcomes for each example:
| Strategy       | Example 1 | Example 2 |
| -------------- | --------- | --------- |
| Unify all      | Error     | Error     |
| Prefer literal | `i32`     | `i32`     |
| Prefer user    | `u64`     | Error     |
| DWIM           | `u64`     | `i32`     |
- Unify all: always unify the variables with all defaults. This is the conservative choice, in that it gives an error if there is any doubt.
- Prefer literal: always prefer the integer literal default (`i32`). This is the maximally backwards compatible choice, but I think it leads to very surprising outcomes.
- Prefer user: always prefer the user-defined default. This is simple from one point of view, but does lead to a potentially counterintuitive result for example 2.
- DWIM: At one point, @nrc proposed a rule that we would prefer the user-defined default, except in the case where the variable is unified with an integer literal and the user-defined default is non-integral. This is complex to state, but it leads to sensible results on both examples (see the sketch after this list).
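For concreteness, here is the DWIM rule as a small self-contained sketch; the `Ty` enum and `resolve_default` helper are invented for illustration and bear no relation to the compiler’s actual machinery:

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum Ty {
    I32,
    U64,
    Char,
}

fn is_integral(ty: Ty) -> bool {
    match ty {
        Ty::I32 | Ty::U64 => true,
        Ty::Char => false,
    }
}

/// Pick the fallback for an unresolved variable, given an optional
/// user-supplied default and whether the variable is unified with an
/// integer literal (whose own fallback is i32).
fn resolve_default(user_default: Option<Ty>, has_int_literal: bool) -> Option<Ty> {
    match (user_default, has_int_literal) {
        // No literal involved: the user default (if any) wins.
        (Some(user), false) => Some(user),
        // Literal involved and the user default is integral: prefer the user default.
        (Some(user), true) if is_integral(user) => Some(user),
        // Literal involved but the user default is non-integral (e.g. char):
        // fall back to the literal default rather than erroring.
        (Some(_), true) => Some(Ty::I32),
        // No user default: a plain integer literal falls back to i32.
        (None, true) => Some(Ty::I32),
        (None, false) => None,
    }
}

fn main() {
    assert_eq!(resolve_default(Some(Ty::U64), true), Some(Ty::U64)); // example 1: u64
    assert_eq!(resolve_default(Some(Ty::Char), true), Some(Ty::I32)); // example 2: i32
}
```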
Backwards compatibility and phasing
You might reasonably wonder what the impact of this change will be on existing code. This is somewhat worrisome because changing fallback could lead to existing programs silently changing behavior (like now using a `u64` instead of an `i32`) rather than failing to compile. We did a crater run with the “unify all” strategy. This strategy has the virtue of causing a compilation error if there is any ambiguity at all, so code cannot change semantics. No regressions were found. From this I conclude that the danger is minimal to nil, but YMMV.
Nonetheless, when phasing in the change, it would probably be good to start with a warning cycle that warns if code might change semantics (or, depending on what strategy we choose, become an error) in the next release. This can be achieved by simulating the “unify all” strategy.
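Here is a rough sketch of what that simulation might look like (again over a toy `Ty`; none of this reflects real compiler internals): gather every default that applies to a variable and warn whenever they disagree, since those are exactly the cases where “unify all” would error:

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum Ty {
    I32,
    U64,
    Char,
}

/// Simulate "unify all": succeed only if every applicable default agrees.
fn unify_all(defaults: &[Ty]) -> Option<Ty> {
    match defaults.split_first() {
        Some((&first, rest)) if rest.iter().all(|&t| t == first) => Some(first),
        _ => None,
    }
}

fn main() {
    // Example 1's defaults (u64 from the user, i32 from the literal) disagree,
    // so "unify all" fails; during the warning cycle this becomes a
    // forward-compatibility warning rather than a hard error.
    if unify_all(&[Ty::U64, Ty::I32]).is_none() {
        println!("warning: multiple applicable defaults; \
                  the inferred type may change in a future release");
    }
    assert_eq!(unify_all(&[Ty::U64, Ty::U64]), Some(Ty::U64));
}
```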
Future, more liberal forms of literals
One of the reasons that the original RFC opted to prefer user-defined defaults is that, in the future, I expect we may try to change integer literals so that they can be inferred not only to integral types but also to user-defined types like `BigInt`. At that point, any rule that attempts to differentiate between an “integral” type and any other user-defined type becomes rather more complicated, probably involving a trait lookup of some kind. Adding trait lookups into the processing of defaults seems like it would push an already unfortunately complex system rather over the edge, to me.
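Purely as a hypothetical illustration (the `FromIntegerLiteral` trait and this `BigInt` are invented names, not a proposed design):

```rust
// Hypothetical: a trait that user-defined types could implement so that
// integer literals may be inferred to them.
trait FromIntegerLiteral {
    fn from_literal(n: u64) -> Self;
}

struct BigInt {
    digits: Vec<u64>, // toy representation
}

impl FromIntegerLiteral for BigInt {
    fn from_literal(n: u64) -> Self {
        BigInt { digits: vec![n] }
    }
}

fn main() {
    // Under such a scheme, `let x: BigInt = 22;` might desugar to:
    let x = BigInt::from_literal(22);
    assert_eq!(x.digits, vec![22]);
    // ...and "is this default integral?" becomes a trait lookup, which is
    // the complexity the RFC wanted to keep out of default processing.
}
```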
My conclusion
I’ve personally not made up my mind, but I think I roughly order the choices like so:
- Always use the user-defined default, as specified in the RFC
- Always error when there is any ambiguity
- DWIM
- Prefer `i32`
What pushes me over the edge is that the first two have the virtue of being extensible later. That is, we can convert the error cases into the “DWIM” rule if we decide to do so, but we cannot change back the other way. I am somewhat concerned that the “always error” approach would rule out a lot of use cases, and hence I lean towards the option espoused in the RFC.