So @aturon did a good job of saying the big points. I just wanted to walk through in a bit more detail what the options are with regard to upgrading. I was going to write this post just about `match`, but in writing it I realized that a lot of the same things apply to many cases, so let me just go over a few examples. They start with the easiest and get progressively harder.
How would we introduce a new keyword like `catch`?
Leaving aside whether you like this keyword, one can assume we will sometimes want to introduce new keywords. Ultimately, this is the most straightforward case, though there are some subtle bits. In its simplest form, we can deprecate using `catch` as an identifier. At the change of epoch, we are then free to re-use that identifier as a keyword. By and large, this transition can be automated. Some form of `rustfix` can rename local variables etc. to `catch_` or whatever we choose.
Where things get a bit tricky is with things like bits of public API. In that case, there could be Epoch 1 crates that (e.g.) have a public type named `catch` that cannot be written in Epoch 2. This could be circumvented by introducing some form of “escape” that allows identifiers with more creative names (e.g., I think that Scala uses backticks for this purpose).
So the overall flow here:
- In Epoch 1:
  - we introduce a new feature for “escaped” usage
  - we deprecate `catch` used as an identifier (or type-name, whatever)
  - and supply a `rustfix` tool that renames local variables, using the “escape” form for public APIs
- In Epoch 2:
  - we are free to re-use `catch` as a keyword
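
To make that flow a bit more concrete, here is a small sketch of what code might look like after the Epoch 1 `rustfix` pass. It assumes the “escape” is spelled `r#name` (the raw-identifier syntax that Rust later adopted; the discussion above only says “some form of escape”). The module `old_api` and the function `caller` are hypothetical names used purely for illustration.

```rust
// Hypothetical Epoch 1 crate whose public API uses the soon-to-be keyword as a name.
// The escaped spelling `r#catch` refers to the same item as plain `catch`.
pub mod old_api {
    pub fn r#catch(x: i32) -> i32 {
        x + 1
    }
}

// Downstream code after the automated rewrite.
pub fn caller() -> i32 {
    // Was `let catch = 41;`. Locals are private, so the tool can simply rename them.
    let catch_ = 41;
    // Public names cannot be renamed without breaking other crates, so the
    // call site switches to the escaped form instead.
    old_api::r#catch(catch_)
}
```

The split matters: renaming is only safe for names nobody else can see, while the escaped form keeps existing public names reachable from newer code.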
How would we transition the meaning of bare trait?
Let’s suppose for a second that we want to repurpose “bare trait”, so that `fn foo(Iterator)` means `fn foo(impl Iterator)`. Let’s leave aside for a second if this is a good idea (I think it’s unclear, though I lean yes), and just think about how we might achieve it without breaking compatibility.
To make this transition, we would do it as follows:
- In Epoch 1, we deprecate `Iterator` as a type and suggest people migrate to `dyn Iterator`.
- In Epoch 2, we can then change the meaning of `Iterator` to `impl Iterator`.
This all makes sense, but it does raise a question: what do we do with `impl Iterator`? If we stabilize that syntax in Epoch 1, then perhaps we will deprecate it in Epoch 2 and suggest people remove the (no longer needed) `impl` keyword. The advantages of stabilizing it early are that (a) people can use `impl Trait` sooner, which we obviously want, and (b) some of those uses of `Iterator` as an object may well be better expressed with `impl Iterator`, and we can enable that.
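
As a rough illustration of the two explicit spellings involved, here is a sketch using the `dyn Trait` and argument-position `impl Trait` syntax. The function names are made up for the example. Under the transition described above, the bare form (`Iterator` used directly as a type) would be deprecated in Epoch 1 in favor of something like the first function, and in Epoch 2 it would take on the meaning of the second.

```rust
// What the Epoch 1 deprecation would steer existing trait-object code toward:
// an explicit `dyn`, dispatched dynamically at runtime.
fn total_dyn(items: &mut dyn Iterator<Item = u32>) -> u32 {
    let mut total = 0;
    for x in items {
        total += x;
    }
    total
}

// The explicit `impl Trait` form: generic over the concrete iterator type,
// so the call is statically dispatched.
fn total_impl(items: impl Iterator<Item = u32>) -> u32 {
    items.sum()
}
```

Both accept the same data; the difference is whether the caller hands over a reference to a trait object (`total_dyn(&mut it)`) or any concrete iterator by value (`total_impl(it)`).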
The key components here:
- We issued deprecation warnings for existing code in the earlier epoch:
  - Whenever we issue a deprecation, we need to provide people with a way to fix the deprecation explicitly.
    - In this case, by adopting `dyn Trait` (or, perhaps, `impl Trait`).
  - We can readily automate the fix for these deprecations via some `rustfix` tool.
- That deprecated code becomes illegal in the new epoch, so its meaning can change.
- Interestingly, the explicit `impl Trait` form presumably also becomes deprecated.
  - So we would want to automate the fix for that too – but unlike before, these changes can’t be applied until the new epoch.
match
The idea of the match ergonomics RFC is basically to make it unnecessary to say `ref x` – instead, when you have a binding `x` in a match, we look at how `x` is used to decide if it is a reference or a move (much as we do with closure upvars). Again, leaving aside the desirability of this change, can we make this transition?
Changes to execution order. This change can have a subtle effect on execution order in some cases. Consider this example:

    {
        let f = Some(format!("something"));
        match f {
            Some(v) => println!("f={:?}", v),
            None => { }
        }
        println!("hello");
    }

Today, the string stored in `f` will be dropped as we exit the `match` (i.e., before we print `hello`). If we adopted the Match Ergonomics RFC, then the string would be dropped at the end of the block (i.e., after we print `hello`).
The reason is that binding to `v` today always triggers a `move`, but under that RFC `v` would be a move only if it had to be, based on how it was used, and in this case there is no need to move (a `ref` suffices). (This is much like how closure upvar inference works.)
So clearly there is some change to semantics here. That is, the same code compiles in both versions, but it does something different. In this example, I would argue, the change is irrelevant and unlikely to be something you would even notice (my intuition here is that it is rare that dropping one variable has side effects relative to the rest of execution, and rarer still that someone was using a match to trigger an otherwise unnecessary drop). But you can craft examples where the change is significant (e.g., the value being dropped has a custom drop with observable side-effects, and it is important that those side-effects occur before `hello` is printed).
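
Here is one way such an example might be crafted. This is only a sketch: `Noisy` and `Log` are hypothetical names, and the explicit `ref v` version stands in for what the proposed inference would pick for this code, since `v` is only read. Each function returns the order in which events happened.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Shared event log so that the order of side effects can be observed.
type Log = Rc<RefCell<Vec<&'static str>>>;

// Hypothetical type whose destructor has an observable side effect.
struct Noisy(Log);

impl Drop for Noisy {
    fn drop(&mut self) {
        self.0.borrow_mut().push("drop");
    }
}

// Current behavior: `v` is moved out of `f` and dropped when the arm ends.
fn by_move() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let f = Some(Noisy(log.clone()));
        match f {
            Some(v) => {
                v.0.borrow_mut().push("arm");
            }
            None => {}
        }
        log.borrow_mut().push("hello");
    }
    let events = log.borrow().clone();
    events
}

// Stand-in for the proposed behavior: `v` only borrows from `f`,
// so the value lives until `f` goes out of scope at the end of the block.
fn by_ref() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let f = Some(Noisy(log.clone()));
        match f {
            Some(ref v) => {
                v.0.borrow_mut().push("arm");
            }
            None => {}
        }
        log.borrow_mut().push("hello");
    }
    let events = log.borrow().clone();
    events
}
```

With the move binding the log reads `arm`, `drop`, `hello`; with the reference binding it reads `arm`, `hello`, `drop`. If the destructor released a lock or flushed a file, that reordering is exactly the kind of change that could matter.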
What makes this change tricky. A couple of things make this change tricky:
- Hard to have a targeted deprecation
- No clear canonical form in some cases
Let’s review those. Clearly, we can issue warnings for code whose semantics may change. But it’s hard to target those deprecations narrowly. Ideally, we’d only issue a warning if all three of these conditions hold:
- There is a binding that, in Epoch 1, is a move, and in Epoch 2, is a reference.
- The value in that binding has a `Drop` impl.
- Executing that `Drop` impl at a different time matters to the code.
That last part cannot be fully detected. We can probably use a variety of heuristics to remove a bunch of false positives. But, if we are correct that this change in order will almost never matter, almost everything we do report will be wrong, which is annoying. And it’s a subtle problem to explain to the user in the first place.
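
To show how narrow (and how coarse) such a check would be, here is a toy sketch of the decision a lint might make. Everything in it is hypothetical: `BindingInfo`, `BindingMode`, and `should_warn` are illustrative names, not compiler APIs. Note that it can only encode the first two conditions; the third has no field to consult, which is where the false positives come from.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum BindingMode {
    Move,
    Ref,
}

// Hypothetical summary of what a lint pass might know about one pattern binding.
struct BindingInfo {
    epoch1: BindingMode,
    epoch2: BindingMode,
    has_drop_impl: bool,
}

// Warn only when the binding changes from a move to a reference AND the value
// has a destructor. Whether the destructor's timing actually matters to the
// program is not something this check can see.
fn should_warn(b: &BindingInfo) -> bool {
    b.epoch1 == BindingMode::Move && b.epoch2 == BindingMode::Ref && b.has_drop_impl
}
```

Under this sketch, the `Option<String>` example above would be flagged (a `String` frees memory on drop), even though nobody would care about the reordering there.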
The other problem is that there is no clear canonical form that we can encourage users to migrate to. In other words, suppose I get a deprecation warning, and I understand what it means. How will I modify my code to silence the warning? Ideally, we’d have a way that is better than just adding a `#[allow]` directive.
There are really two cases. Either I want to preserve the existing execution order, or I don’t care. We believe the first one will be rare, but unfortunately it’s the easy case to fix. One can force an early drop by adding a call to `mem::drop()`. As a bonus, your code becomes clearer:

    {
        use std::mem; // brings mem::drop into scope

        let f = Some(format!("something"));
        match f {
            Some(v) => {
                println!("f={:?}", v);
                mem::drop(v); // <-- added to silence warning
            }
            None => { }
        }
        println!("hello");
    }

But if we don’t care about the drop, what should we do? Probably the best choice is to encourage people to change `v` into a `ref` binding – but of course that’s precisely the form that we aim to remove in Epoch 2! (This has some interesting parallels with `impl Trait`, I think, where we might be encouraging people to use `impl Trait`, even though we aim to deprecate it.)
The other option, of course, would be to have some form of “opt-in” to the new semantics in Epoch 1 (e.g., something like the stable feature gates proposed here). That has the same set of previously discussed advantages/disadvantages (e.g., it muddies the water about what code means and, if this option is used frequently, it raises the specter of there being many Rust dialects, rather than just Rust code from distinct eras).
Conclusion
Sorry this is long. I wanted to really work through all these issues, both for myself and for the record. I guess that the TL;DR is roughly this. First, we assume that we’re trying to repurpose some syntax in some way (as in both of these cases). This will generally be true, because if that is not happening, then there is no need to use an Epoch; we can just deprecate the old pattern and encourage the new pattern (e.g., `try!` into `?`). When we are repurposing syntax, the transition has the following form:
- Some kind of deprecation in Epoch 1:
  - As targeted as you can make it, ideally with an automated tool that will make changes that preserve semantics.
  - Need to ensure that people can migrate to some new, preferred syntax:
    - catch keyword: new identifier or escaped form
    - bare trait: `impl Trait`
    - ergonomic match: `mem::drop(x)` or `ref x`
- Deprecated code becomes illegal in Epoch 2, freeing up the existing syntax for a new purpose.
- Sometimes, the new, preferred syntax from Epoch 1 becomes deprecated.
  - e.g., `impl Trait` or `ref x`
  - this transition can again be automated
  - perhaps this syntax is removed in the next Epoch (if ever)
I think the key questions to ask of any such transition:
- To what extent can it be automated?
- How targeted are the deprecations?
UPDATE: I realized that we can fully, but conservatively, automate the transition for match ergonomics. This may want to be a hard rule.