This is a continuation of Relative paths and Rust 2018 `use` statements.
One of the final things that needs to get resolved for Rust 2018 is the precise contours of the module system changes. The current design is described here. We've been wanting feedback and discussion specifically around the details of the path syntax and people's experience with it. That part of the design is summarized as follows:
- `use` statements take fully qualified paths.
  - A fully qualified path starts with the name of an external crate, or a keyword: `crate`, `self`, or `super`.
- Outside of `use` statements:
  - Fully qualified paths work, and have the same meaning as in `use`, unless a local declaration has shadowed an external crate name.
  - Paths may also start with the name of any in-scope item, i.e. an item declared or `use`d in the current module.
One anecdotal report from a few folks who have tried out the 2018 preview is that the path changes don't quite go far enough -- they still often find themselves forgetting to write a leading `self::` in situations like the following:
// trying to use an item defined in a submodule -- missing `self::`
use foo::Bar;
mod foo;

// a common mistake when trying to bring variants into scope -- missing `self::`
use MyEnum::*;
enum MyEnum {
    Variant1,
    Variant2,
}
These situations are particularly bad in Rust 2015 because the code works without `self::` at the top level module, but not elsewhere. Rust 2018's current design helps by making the code not work anywhere. This post proposes a way to make the code work everywhere.
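For comparison, here is a minimal sketch of the spelling that the current design does accept inside a submodule, with the leading `self::` written out. The names `outer` and `from_byte` are illustrative and not from the preview documentation; `foo`, `Bar`, and `MyEnum` follow the snippet above.

```rust
pub mod outer {
    // Inside a submodule, both imports carry the leading `self::`
    // that people tend to forget.
    use self::foo::Bar;
    use self::MyEnum::*;

    mod foo {
        pub struct Bar(pub u8);
    }

    #[derive(Debug, PartialEq)]
    pub enum MyEnum {
        Variant1,
        Variant2,
    }

    // Hypothetical helper so the imported names are actually used.
    pub fn from_byte(byte: u8) -> MyEnum {
        let bar = Bar(byte);
        if bar.0 == 0 { Variant1 } else { Variant2 }
    }
}
```

Dropping either `self::` here is exactly the mistake described above.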
At the same time, a few folks from the lang team have been exploring a variant of the 2018 design that would help address these issues. That's what I want to talk about here.
A uniform treatment of paths
One of the unsatisfying things about the 2018 design (which is also true of Rust 2015) is that paths work differently in `use` statements than they do elsewhere. This is perhaps most visible with the `self::` issue mentioned above: within a function, you can freely say `MyEnum::Variant1`, but `use MyEnum::*` doesn't work in Rust 2018 (since `MyEnum` is interpreted as a crate name). The mismatch between these two styles of paths is a frequent paper cut, and also makes the language less uniform.
But it turns out that we can alter the 2018 design to make paths work the same way everywhere. In this "uniform" approach, we break down paths as follows:
- Starts with `crate`, `self`, or `super`: the path is interpreted as starting at the current crate's root module, the current module, or the parent module respectively.
- Starts with `::`: the first name in a path is an external crate name.
- Starts with an identifier:
  - If the identifier is in scope (e.g. declared or `use`d within the current module), resolve it to the corresponding declaration
  - If the identifier is the same as an external crate name, resolve it to that crate
  - If the identifier is a name in the prelude, resolve it to the corresponding item
This is roughly the Python2 model of paths (more on that later), and the model used by shells and similar programs when doing path searches. Start with what's immediately in scope, and otherwise look at external crates and the prelude. In a sense, this treats the external crates and then the prelude as though they're in scope, just lower-priority than names declared in the current module.
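As a rough illustration of the leading-identifier rules, the sketch below has one `use` for each of an in-scope module, an in-scope enum, and an external crate. The names `shapes`, `Circle`, `Color`, `describe`, and `radii` are made up for the example. It is written at the crate root, where Rust 2015 accepts the same lines; the point of the uniform approach is that these imports would read the same way in any module.

```rust
// Illustrative names only; none of these come from the proposal itself.
mod shapes {
    pub struct Circle {
        pub radius: f64,
    }
}

pub enum Color {
    Red,
    Green,
}

// Leading identifier is the in-scope module `shapes`.
use shapes::Circle;
// Leading identifier is the in-scope enum `Color`.
use Color::*;
// Leading identifier is the external crate `std`.
use std::collections::HashMap;

pub fn describe(color: Color) -> &'static str {
    match color {
        Red => "red",
        Green => "green",
    }
}

pub fn radii() -> HashMap<&'static str, f64> {
    let unit = Circle { radius: 1.0 };
    let mut map = HashMap::new();
    map.insert("unit", unit.radius);
    map
}
```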
Beyond the uniformity (`use` and other items all work the same way!), this design also:
- Retains the benefits of the current Rust 2018 design: makes the top-level module and submodules work the same way; makes referencing external crates more ergonomic (since they don't have to be `use`d in submodules to refer to them).
- Makes importing from local items more ergonomic, both in the sense of eliminating the common mistake mentioned at the beginning of the post (forgetting to write `self::`), and also making the paths more concise.
- Allows arbitrary hoisting: anywhere in your code, if you have paths like `a::b::c`, you can take any prefix of those paths, such as `a::b`, hoist it up to a `use a::b`, and then substitute the last component (`b`) for the prefix in all your paths (`b::c`). That's a natural transformation to shorten such paths.
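The hoisting transformation in the last bullet can be sketched as follows. The module tree mirrors the `a::b::c` path from the text; the second function `d` and the two `sum_*` functions are hypothetical additions so there is more than one path to shorten.

```rust
// Hypothetical module tree named after the `a::b::c` path in the text.
mod a {
    pub mod b {
        pub fn c() -> u32 {
            1
        }
        pub fn d() -> u32 {
            2
        }
    }
}

// Before hoisting: every call spells out the full path.
pub fn sum_long() -> u32 {
    a::b::c() + a::b::d()
}

// After hoisting the prefix `a::b` into a `use`, its last component `b`
// stands in for the whole prefix.
use a::b;

pub fn sum_short() -> u32 {
    b::c() + b::d()
}
```

The `use a::b` line starts with the same leading identifier as the paths it replaces, which is what lets the prefix be lifted out unchanged.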
On the other hand, relative to the current 2018 design, you can't always tell whether a given `use` statement is importing from an external crate or a local item (which, notably, is also the case in Rust 2015). The mitigation is that the external crates and the local declarations of a module are all relatively "nearby" things when reading code or keeping it in your head.
What to do about name conflicts?
The explanation of path resolution above includes a series of "if"s for the leading identifier case, but there's a question of what to do when more than one of them applies.
Let's say a path is ambiguous if it starts with a leading identifier, and that identifier could be two or more of: a local declaration, a crate name, or a prelude item.
Outside of `use` statements, we would resolve ambiguous paths in the following order: local name, external crate, prelude. In other words, much like Rust 2015 except that we add external crate names in front of the prelude. This is a core part of the Rust 2018 design, and is pretty much the universal expectation.
However, within `use` statements things are trickier, because of potential circularities when macros or glob imports are in play. While it may ultimately be possible to apply the same disambiguation order for `use`, the implementation is much more challenging, and it's not obvious that it's desirable. So instead, we can make it a hard error to write an ambiguous `use` statement, and instead recommend using a leading `self` or `::` to disambiguate.
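To make the two behaviors concrete, here is a toy model of the lookup for a leading identifier. It is only a sketch of the rules as stated in this post, not how the compiler's resolver is written, and every name in it (`Source`, `UseError`, `Scope`, and the methods) is invented for illustration.

```rust
// Toy model of the lookup order described above; not the compiler's resolver.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Source {
    Local,
    ExternCrate,
    Prelude,
}

#[derive(Debug, PartialEq, Eq)]
pub enum UseError {
    NotFound,
    Ambiguous(Vec<Source>),
}

/// The three name sets a leading identifier is checked against.
pub struct Scope {
    pub locals: Vec<&'static str>,
    pub extern_crates: Vec<&'static str>,
    pub prelude: Vec<&'static str>,
}

impl Scope {
    /// Every place `ident` could come from, listed in priority order.
    pub fn candidates(&self, ident: &str) -> Vec<Source> {
        let mut found = Vec::new();
        if self.locals.iter().any(|n| *n == ident) {
            found.push(Source::Local);
        }
        if self.extern_crates.iter().any(|n| *n == ident) {
            found.push(Source::ExternCrate);
        }
        if self.prelude.iter().any(|n| *n == ident) {
            found.push(Source::Prelude);
        }
        found
    }

    /// Outside `use`: take the highest-priority match.
    pub fn resolve_in_path(&self, ident: &str) -> Option<Source> {
        self.candidates(ident).into_iter().next()
    }

    /// Inside `use`: more than one match is a hard error.
    pub fn resolve_in_use(&self, ident: &str) -> Result<Source, UseError> {
        let found = self.candidates(ident);
        match found.len() {
            0 => Err(UseError::NotFound),
            1 => Ok(found[0]),
            _ => Err(UseError::Ambiguous(found)),
        }
    }
}
```

In this model, a module that declares a local `log` while also depending on a crate named `log` gets `Local` from `resolve_in_path` but an `Ambiguous` error from `resolve_in_use`. In real code, the fix for that case would be to write `self::log` or `::log`.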
Other edge cases
One possibly surprising thing that follows from uniform paths is that this works:
use std::collections;
use collections::HashMap;
What's happening here is that the first `use` brings `collections` into scope, and the second `use` then imports from that in-scope item. (You can also see this as the first `use` adding a private `collections` item to the module, and the second `use` importing it as a relative path.)
This is an unavoidable consequence of having a uniform notion of paths. But I'd propose that we include a warn-by-default style lint, suggesting rewriting the above as:
use std::collections::{self, HashMap};
Relationship to the current Rust 2018 preview
It's possible to make the current preview forward-compatible with this proposal by simply implementing the hard error on conflicts in `use` statements. However, the module system changes are a defining part of Rust 2018, so it seems best to instead ship with something closer to the final design if possible.
It should be easy to implement this proposal under a separate feature flag and include it in the feedback cycle we've kicked off -- the whole point of which is to gain experience with the new features and their variants.
A couple alternatives to mention
- We could of course pursue implementing disambiguation for `use` statements immediately, but I don't think it makes sense to block Rust 2018 on it (since it's a corner case with easy workarounds, and forward-compatible with adding it later).
- We may want to avoid re-purposing leading `::` as proposed here. We may instead want to preserve the existing behavior of `::` from Rust 2015. This is a relatively minor point, and I'd like to focus on the overall thrust of the ideas in this post first.
The Python2 -> Python3 story
Finally, it's well known that Python 2 had a pretty similar story around paths as what's proposed here, and that Python 3 moved to something more like the current 2018 design.
You can read the relevant PEP here, but the thrust of the rationale is:
As Python's library expands, more and more existing package internal modules suddenly shadow standard library modules by accident. It's a particularly difficult problem inside packages because there's no way to specify which module is meant.
Things look quite different in Rust. For one, conflicts are not about the contents of `std` (which isn't growing much anyway), but rather with explicitly declared external crates. Further, the proposal includes a simple and ergonomic means of disambiguation (`::` or `self::`). And of course, Rust's type checking (and the hard error on conflicts) means that name clashes become evident immediately.