UPDATE: I’ve got a summary comment here for the thread up to comment 76. =)
Hello all,
Welcome to another episode of the Great Modules Adventure! When last we met, our brave adventurers had at long last come to the fabled land, having accepted RFC #2126. There they took a brief respite, and made ready to begin with the Impl Period. In that period, a great much work was done, and indeed the outlines of the module system came to be implemented. But lo there were a few niggling questions to resolve still… and that brings us to this discussion thread.
In less flowery language: during the impl period, we made a lot of progress implementing the “Clarify and streamline paths and visibility” RFC (along with a number of related RFCs). In fact, we made almost too much progress – we implemented a couple of different variants, and I’d like to get a discussion started about what we really want.
Overriding goals
I thought it’d be good to start by reminding ourselves of the overriding goals:
- Make it simple to see where an import is coming from
  - Today, it’s not always obvious what comes from external crates and what is internal.
  - use foo::bar could of course refer to either one.
- Allow paths that appear in use to also be used in fn bodies
  - A common mistake today is to use paths like std::cmp::min from inside a function.
  - (Not just for beginners; I do this fairly regularly -nmatsakis)
  - To add to the confusion, this works if you are at the crate root, but not otherwise.
  - It’d be nice if we had a guarantee that one could copy a path from a use somewhere else in the program and it would always resolve to the same thing.
- Maintain backwards compatibility and avoid fallback
  - For implementation reasons, it is strongly preferred if we can avoid any notion of “fallback”
- Avoid redundant extern crate declarations
  - When using Cargo, adding to Cargo.toml should be enough. One less thing to do wrong.
- Streamline and clarify visibility rules
  - For most crates, we should be able to declare things in one of three modes:
    - private to the module (default)
    - private to the crate (use crate keyword)
    - public to the world (use pub keyword)
  - Lints can check when things are declared as pub but are not in fact reachable from the crate root
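To make the three visibility modes concrete, here is a minimal sketch in syntax that I believe current stable compilers accept. It assumes `pub(crate)` as a stand-in for the proposed bare `crate` keyword, and the names `shapes`, `square`, `area_of_square` and `describe` are made up for illustration.

```rust
// Three visibility levels, spelled with stable syntax.
// Assumption: `pub(crate)` stands in for the proposed bare `crate` keyword.
pub mod shapes {
    /// Private to this module (the default): only code in `shapes` can call it.
    fn square(x: u32) -> u32 {
        x * x
    }

    /// Private to the crate: callable from anywhere in this crate, not exported.
    pub(crate) fn area_of_square(side: u32) -> u32 {
        square(side)
    }

    /// Public to the world: part of the crate's external API.
    pub fn describe(side: u32) -> String {
        format!("square with area {}", area_of_square(side))
    }
}
```

From elsewhere in the same crate, `shapes::area_of_square(3)` and `shapes::describe(2)` both resolve, while `shapes::square(3)` is rejected as private.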
Outline
I am going to propose a few different variants.
- First I will describe “extern paths”, which achieves all of our objectives, but at the cost of verbose syntax.
- Next, I will toss out various other syntaxes that are equivalent to “extern paths” that have been brought up from time to time.
- Finally, I’ll discuss the “absolute paths begin with crate name” variant that is closer to what the RFC proposed, but with various technical problems solved. This variant fails to achieve the objective that paths in use should also be usable in fns. (I’ll also try to compare this against what I think the RFC was proposing.)
I’ve also implemented a sample project in the two distinct styles, so you can get a feeling for how they look in practice.
Variant 1: extern paths
A “picture” is worth a thousand words, I suppose. Here is an example source file showing how this scheme works. The key point is that a use extern::{..} path selects code from other crates, and a use crate::{...} path selects code from this crate, so the imports section looks like:
use crate::file_read::for_each_line;
use extern::{
    regex::Regex,
    std::{
        env,
        process,
        io::{self, Write}
    },
};

fn main() {
    ...
}
In this formatting structure, I am leaning on the “nested import groups” RFC to create a block of “extern imports” (which fits with many style guides). Of course it’d also be legal to do use extern::std::env as a standalone line.
Details:
- In this version, absolute paths always begin with a keyword:
  - use crate::foo::bar – selects foo::bar from the crate root
  - use extern::regex::Regex – selects Regex from the crate regex (no extern crate needed)
  - use self::foo::bar – selects foo::bar from the current crate, starting at the current module
  - use super::foo::bar – selects foo::bar from the current crate, starting at the parent module
- In-scope paths like foo::bar work as they do today
  - As today, they are not permitted in use P or in pub(in P) positions.
- Deprecations:
  - Absolute paths like ::foo::bar still work, but are deprecated. They are equivalent to crate::foo::bar.
  - Writing use P where P does not begin with an absolute path keyword is deprecated but considered equivalent to use crate::P.
- Backwards compatibility:
  - This proposal is fully backwards compatible and does not require opt-in, though all existing code would be using the deprecated style.
  - A rustfix utility could trivially convert paths.
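As a rough illustration of how mechanical that conversion could be, here is a sketch that rewrites the path of a single legacy use item into the Variant 1 style. The function name `to_extern_paths_style` is hypothetical and is not part of any real rustfix API. Mapping names declared via extern crate onto `extern::` is an assumption of this sketch; a real tool would ask the compiler's name resolver rather than take a list of crate names.

```rust
/// Hypothetical helper (not a real rustfix API): rewrite the path of a legacy
/// `use` item into the keyword-prefixed form of Variant 1.
///
/// `extern_crates` lists the names brought in by `extern crate` items. Mapping
/// those names to `extern::` is an assumption of this sketch.
pub fn to_extern_paths_style(path: &str, extern_crates: &[&str]) -> String {
    // Paths that already start with an absolute-path keyword are left alone.
    let first = path.split("::").next().unwrap_or("");
    match first {
        "crate" | "extern" | "self" | "super" => return path.to_string(),
        _ => {}
    }

    // A leading `::` is the deprecated spelling of "start at the crate root".
    let rest = path.trim_start_matches("::");
    let head = rest.split("::").next().unwrap_or("");

    if extern_crates.iter().any(|name| *name == head) {
        // The first segment names another crate.
        format!("extern::{}", rest)
    } else {
        // Everything else is resolved from the local crate root.
        format!("crate::{}", rest)
    }
}
```

For instance, with `["std", "regex"]` as the known crates, `regex::Regex` becomes `extern::regex::Regex`, while both `foo::bar` and `::foo::bar` become `crate::foo::bar`.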
Variant 1b: other syntaxes
There were various other syntaxes proposed around explicit absolute paths. The key idea here is to introduce new syntax for absolute paths. Some other proposals, along with my cutesy names for them:
Proposal | Extern Crate | Local Crate
--------- | ------------------ | -----------
@crate | use @regex::Regex | use @crate::foo::bar
@:: | use @regex::Regex | use @::foo::bar
[crate] | use [regex]::Regex | use [crate]::foo::bar
[] | use [regex]::Regex | use []::foo::bar
: | use regex:Regex | use crate:foo::bar
Some of these may be ambiguous in expression position, I don’t know. I’m terrible at finding ambiguities. @petrochenkov will have to tell you. =)
Variant 2: absolute paths begin with a crate name
Here is an example source file showing how this scheme works. The key point is that in use foo::bar, foo is assumed to be a crate name, and use crate::foo is used to select from the current crate, so the imports section looks like:
use crate::file_read::for_each_line;
use regex::Regex;
use std::{
    env,
    process,
    io::{self, Write}
};

fn main() {
    ...
}
Details:
- In this version, absolute paths are not syntactically distinguished, but instead distinguished by where they appear.
  - As today, use P and pub(in P) would be absolute paths, but other paths are in-scope.
- Absolute paths are assumed to begin with either the keyword crate or a crate name:
  - use crate::foo::bar – selects foo::bar from the crate root
  - use regex::Regex – selects Regex from the crate regex (no extern crate needed)
  - use self::foo::bar – selects foo::bar from the current crate, starting at the current module
  - use super::foo::bar – selects foo::bar from the current crate, starting at the parent module
- In-scope paths work as today.
  - You would be able to “switch” from an in-scope path to an absolute path by using the :: prefix:
    - e.g., ::crate::foo::bar or ::regex::Regex.
  - Note that crate::foo paths cannot be used in a fn body – rather you type ::crate::foo
    - see the “A niggly ambiguity” section below for an explanation
  - Open question: What should happen with self and super paths? For consistency, I might expect ::self and ::super, but of course the older forms exist and are unambiguous.
- Backwards compatibility:
  - This proposal requires opt-in. Code like use foo::bar or ::foo::bar changes meaning under this proposal, since foo is assumed to be a crate name.
  - In the lead-up to the new epoch, we can deprecate paths like use foo::bar where foo is not a crate and offer a rustfix-like tool to convert to use crate::foo.
- How is Variant 2 different from the RFC? Two differences:
  - No use of fallback in name resolution. Fallback is technically challenging to implement, though it may be worth revisiting, since its absence is what makes the transition harder.
  - Using ::crate::foo within functions and not crate::foo. This wasn’t clear in the RFC, actually, but I think the assumption was that we would do the latter. This however ran afoul of a niggly syntactic ambiguity, discussed in the next section.
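The Variant 2 rules above can be sketched as a small classifier over the first segment of a use path, plus the `::` switch needed to write the same path inside a fn body. The names `PathRoot`, `classify_use_path` and `in_fn_body` are hypothetical and exist only for this illustration. Leaving self and super paths unchanged in fn bodies is an assumption, since the post lists that as an open question.

```rust
/// Where a `use` path starts under Variant 2 (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
pub enum PathRoot<'a> {
    /// `crate::...`: the root of the local crate.
    LocalCrate,
    /// `self::...`: the current module.
    CurrentModule,
    /// `super::...`: the parent module.
    ParentModule,
    /// Any other first segment is assumed to be the name of a crate.
    ExternCrate(&'a str),
}

/// Classify a path as it would be read in `use` position.
pub fn classify_use_path<'a>(path: &'a str) -> PathRoot<'a> {
    let first = path.split("::").next().unwrap_or("");
    match first {
        "crate" => PathRoot::LocalCrate,
        "self" => PathRoot::CurrentModule,
        "super" => PathRoot::ParentModule,
        name => PathRoot::ExternCrate(name),
    }
}

/// Spell the same path for use inside a fn body (an in-scope context).
pub fn in_fn_body(path: &str) -> String {
    match classify_use_path(path) {
        // Absolute paths need the `::` prefix to escape in-scope resolution.
        PathRoot::LocalCrate | PathRoot::ExternCrate(_) => format!("::{}", path),
        // Assumption: keep the existing spelling for `self` and `super`.
        PathRoot::CurrentModule | PathRoot::ParentModule => path.to_string(),
    }
}
```

So a path copied out of use crate::foo::bar has to be written as ::crate::foo::bar in a fn body, which is exactly the asymmetry this variant accepts.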
Variant 3: leading ::
(Proposed by @matklad, added as an update)
A kind of blend of Variant 1 and Variant 2 – in this version, absolute paths look like:
- ::<crate>::path, where <crate> is either the keyword crate (local crate) or the name of a crate (e.g., ::std::cmp::min)
- self:: and super:: paths expand to ::crate::path::to::self::or::super::module, basically.
A use statement uses absolute paths, so you write use ::std::cmp::min or use ::crate::foo (or paths relative to self and super). Here is an example:
use ::crate::file_read::for_each_line;
use ::regex::Regex;
use ::std::{
    env,
    process,
    io::{self, Write}
};
This has the advantage of having paths in a body be a subset of paths in a use. It also has relatively concise absolute paths (e.g., ::std::fmt::Debug vs extern::std::fmt::Debug), which can be useful.
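The “same path in a use and in a body” property can be tried out for the standard library half of this variant, since a leading `::` before a crate name is, as far as I know, accepted by current stable compilers. The sketch below uses made-up names (`stats`, `clamp_to_range`) and does not show the `::crate::...` half, which I do not believe stable compilers accept.

```rust
// The same kind of `::`-prefixed path, once in a `use` and once in a fn body.
pub mod stats {
    // Absolute path in `use` position.
    use ::std::cmp::min;

    /// Clamp `value` into the inclusive range `low..=high`.
    pub fn clamp_to_range(value: i32, low: i32, high: i32) -> i32 {
        // `min` comes from the `use` above; `max` is spelled out in place with
        // the same style of path, even though this is not the crate root.
        ::std::cmp::max(low, min(value, high))
    }
}
```

Calling `stats::clamp_to_range(15, 0, 10)` goes through both spellings of the path and yields the upper bound.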
A niggly ambiguity
One thing that came up is that there is a parsing ambiguity around the crate visibility keyword and the proposed crate::foo paths. In particular, consider these tokens:
struct Foo(crate :: foo);
Is this meant as a field with crate visibility whose type is ::foo? Or a private field with type crate::foo? The various schemes I proposed above lead to different interpretations here, I think.
- In variant 1 (extern paths), this would be parsed as a private field of type crate::foo. This is because ::foo paths are deprecated, so there is no reason for us to prefer the crate (::foo) interpretation.
- In variant 2, in contrast, this is an in-scope context, and hence paths cannot begin with crate here. Therefore, we should parse that as crate (::foo) – i.e., a field of type ::foo with crate visibility.
- In the original RFC, where you can write crate::foo in an in-scope context, there is no clearly correct interpretation for this ambiguity, but we could pick one arbitrarily.
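To see what is at stake, here are the two readings written out unambiguously, as a sketch in syntax I believe stable compilers accept. Two things are assumptions of the sketch: `pub(crate)` stands in for the bare crate visibility keyword, and `crate::foo` stands in for the type `::foo`, because the meaning of a leading `::` differs between compiler editions. The names `fields`, `PrivateField` and `CrateVisibleField` are made up, and the code is meant to sit at the crate root.

```rust
// A type deliberately named `foo` at the crate root, to mirror the tokens above.
#[allow(non_camel_case_types)]
pub struct foo(pub u32);

pub mod fields {
    // Reading 1: a private field whose type is `crate::foo`.
    pub struct PrivateField(crate::foo);

    // Reading 2: a crate-visible field. Assumption: `pub(crate)` replaces the
    // bare `crate` keyword, and `crate::foo` replaces `::foo`.
    pub struct CrateVisibleField(pub(crate) crate::foo);

    pub fn private(n: u32) -> PrivateField {
        PrivateField(crate::foo(n))
    }

    pub fn crate_visible(n: u32) -> CrateVisibleField {
        CrateVisibleField(crate::foo(n))
    }

    impl PrivateField {
        /// Accessor needed because the field is private outside this module.
        pub fn get(&self) -> u32 {
            (self.0).0
        }
    }
}
```

Outside the `fields` module, `fields::crate_visible(7).0` is readable directly, whereas `fields::private(5).0` is rejected and has to go through `get()`. That difference is what the parser would be choosing between.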
My personal opinion
I prefer some version of Variant 1, but maybe not extern. The clarity and simplicity of the proposal is very appealing, as is the fully backwards compatible aspect. I love the idea that there aren’t “two kinds” of paths in Rust.
Also, please note that variants 1 and 2 both exist and are usable in the Nightly compiler today. So you can try converting your project to these styles (or make a new project) and see how it feels!