The Great Module Adventure Continues

I tried to put together a question along the lines of “the learning friction seems to come from submodules behaving differently to the root module, can we just make submodules behave the same?” but I couldn’t word it nicely. This seems like a good starting point!

There’s one thing that the root mod does which isn’t listed here, which is add modules to the namespace root as well. This is where the “I have to refer to a module I just declared via self!?” shock comes from:

mod foo {
    pub struct S;
}
use foo::S;
// ^^^^^^^-- `foo` is always in scope because it's in the root module.

mod not_the_root {
    mod bar {
        pub struct T;
    }
    use bar::T;
    // ^^^^^^^-- `bar` isn't in scope here! Need to `use self::bar::T`.
    use foo::S;
    // ^^^^^^^-- This will work though.
}
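
For reference, here is a minimal sketch of a form of the snippet above that should compile as written. The nested module is reached through `self::` and the root-level module through `super::`; I picked those relative forms only because they do not depend on which path model ends up being chosen. The `describe` function is a hypothetical helper, added so there is something to call.

```rust
mod foo {
    #[derive(Debug)]
    pub struct S;
}

mod not_the_root {
    mod bar {
        #[derive(Debug)]
        pub struct T;
    }
    // `bar` is a child of this module, so it is named relative to `self`.
    use self::bar::T;
    // `foo` lives one level up, so `super::` reaches it.
    use super::foo::S;

    // Hypothetical helper: returns the debug names of both imported types.
    pub fn describe() -> (String, String) {
        (format!("{:?}", S), format!("{:?}", T))
    }
}
```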

I think the migration path for “include extern crates in all namespaces” and “include all local mods” could work: in both cases you had a use thing or use self::thing which is now redundant, except for disambiguation. Haven’t thought through it thoroughly though.

Regarding #[no_std] environments, I think the concept of “make every mod do what the root mod does” translates pretty directly - did you have any particular concerns @mark-i-m?

These two sound very similar, and the Java analogy really sells me on the idea. I think it might work best when combined with leading-crate, like @matklad describes.

This should have minimal actual churn, much like leading-crate. All external use statements remain the same; the mental model around dependencies is the same. The mental models around internal paths and paths in expressions change, but they're simplified and brought into line with dependencies so as a whole things get easier to use.

Implementation-wise, the absolute-or-local approach sounds much simpler than the full relative use version that IIUC was what turned out to be unimplementable. With the "add dependencies to the prelude" framing and crate it shouldn't even need fallback, at least not with a flag-day transition. It may also be possible to enable full 1path?

I haven’t had time to really digest this latest direction, but here’s some stream-of-consciousness reaction.


For clarity, I want to re-state what I think is the combined proposal you have in mind:

  • Absolute paths always begin with a crate name, where crate is the way to refer to the current crate.
  • In the new epoch, use statements must use absolute paths.
  • Outside of use statements, you can freely use in-scope names or absolute paths; in-scope names shadow external crate names.

Thus, any use path can be freely transported into an expression with the same meaning (modulo shadowing).
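
A rough sketch of how the bullets above might look in practice, assuming the leading-`crate` spelling. The module names `shapes` and `report` and their functions are made up for illustration; nothing here is taken from an RFC.

```rust
mod shapes {
    pub struct Square(pub u32);

    impl Square {
        pub fn area(&self) -> u32 {
            self.0 * self.0
        }
    }
}

mod report {
    // Absolute path into the current crate: starts with `crate`.
    use crate::shapes::Square;
    // Absolute path into another crate: starts with that crate's name.
    use std::collections::BTreeMap;

    // Maps each side length to the area of the corresponding square.
    pub fn areas(sides: &[u32]) -> BTreeMap<u32, u32> {
        let mut out = BTreeMap::new();
        for &side in sides {
            out.insert(side, Square(side).area());
        }
        out
    }

    // The same absolute path used in the `use` above also works inline.
    pub fn largest_area(sides: &[u32]) -> u32 {
        sides
            .iter()
            .map(|&s| crate::shapes::Square(s).area())
            .max()
            .unwrap_or(0)
    }
}
```

The point of `largest_area` is only that `crate::shapes::Square` means the same thing in the `use` and in the expression.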

I don’t agree with the “minimal churn” point, for the same reasons as previously stated: almost every file will have to change, regardless. I think we should set that particular point aside.

If we take the “flag day” approach (as in my bullets above), then the experience when copying from StackOverflow will come down to our error messages when you fail to write a leading crate on an import; I suspect we can make those quite good. Notably, however, that’s the equivalent of implementing fallback.

We could consider retaining leading :: as a disambiguator for referring to extern crates that are shadowed; like today it would mean “this is an absolute path”. But normally it would no longer be needed.

I imagine some folks will be nervous about the possibility of shadowing, but the chances of compilation succeeding on an accidental shadowing are vanishingly small. The fact that Java follows this model does suggest that it’s probably a pretty safe bet.

re: implementation, we essentially already have this with the prelude today, which is treated as a fallback for resolution. We could quite literally inject into the prelude, as @matklad originally suggested.

Overall this seems like a reasonable addition to the design space. I suspect it would lead to a somewhat decreased use of use, in favor of e.g. just writing futures::Future in-line.

The loss of 1path here doesn’t seem like a deal-breaker to me. As I argued in the initial RFC on this topic (with leading crate), the fact that all use statements will use absolute paths, and in particular references within the crate start with crate, will be a very easy reminder when moving up a relative path that you need to add a leading self:: (or crate::).

IOW, I think I want to step back on the full-on commitment to bi-directional 1path, and consider the more global tradeoffs including ergonomics. That’s not to say that I prefer this proposal yet, but just that I’m open to it :slight_smile:

When I have more time, I’ll try to write a head-to-head comparison of what, I think, are the two “leading” proposals in each “camp”.

Originally I thought that we could allow relative paths there as well, which should give us 1-path.

That is, I think the following could work:

/*
Something like

extern crate futures;

is injected by the prelude
*/

use futures::Future; // using the futures crate, "absolute" path
use my_future::NiceFuture; // importing a name from a child module, relative path

mod my_future;

fn foo() {
    let f: futures::Future /* "absolute" path */ = my_future::MyFuture::new() /* relative path */ ;
}

One point there is that, while Java doesn’t allow relative paths in import statements, that’s possibly because they wouldn’t even make sense: Java packages don’t nest. They do have hierarchical-looking names, but that’s just a convention: import a.* only imports classes from package a, not other packages. a.b isn’t actually part of a.

I suspect this also has a large impact on the level of confusion, and probably in a good way. It translates to Rust as an inability to write mod foo; fn f() { foo::bar() }, which might lessen the instinct to write relative paths in use statements altogether.

I don’t really see a way to get there backwards compatibly, though, and it would probably be rather annoying with current idiomatic module structures. Just an aspect of Java’s design that makes for an interesting comparison.

What does "all extern crates" mean? This is an open set.

Thus any reasonable system should work with extern crates in an "on demand" way: if we have a name foo in the source code, we 1) search for this name in the source code somehow (that may include the compiler's command line) and, if that search fails, 2) go to the filesystem and search for a crate file named foo.rlib or something.

Btw, if all relative paths can fall back to extern crates, then we will have to search the filesystem on every new unresolved name.
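
To make the two-step lookup concrete, here is a toy model of it. This is not how rustc works internally; `Resolver`, `Found` and the probe counter are all hypothetical, and the "filesystem" is just a set of names.

```rust
use std::collections::HashSet;

/// Where a name was found in this toy model.
#[derive(Debug, PartialEq)]
pub enum Found {
    Source,
    ExternCrate,
    Nowhere,
}

/// Hypothetical resolver: `in_source` stands for names declared in the code
/// or on the command line, `on_disk` for crate files in the search path.
pub struct Resolver {
    pub in_source: HashSet<String>,
    pub on_disk: HashSet<String>,
    /// Counts simulated filesystem searches.
    pub disk_probes: usize,
}

impl Resolver {
    pub fn new(in_source: &[&str], on_disk: &[&str]) -> Self {
        Resolver {
            in_source: in_source.iter().map(|s| s.to_string()).collect(),
            on_disk: on_disk.iter().map(|s| s.to_string()).collect(),
            disk_probes: 0,
        }
    }

    pub fn resolve(&mut self, name: &str) -> Found {
        // Step 1: search the source first; no filesystem access needed.
        if self.in_source.contains(name) {
            return Found::Source;
        }
        // Step 2: only a name that is still unresolved triggers a
        // (simulated) filesystem search.
        self.disk_probes += 1;
        if self.on_disk.contains(name) {
            Found::ExternCrate
        } else {
            Found::Nowhere
        }
    }
}
```

In this model a name found in the source never touches the disk, while every unresolved name costs one probe, including names that turn out not to exist anywhere.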

Also, scope-relative paths (this includes prelude and primitive types like u8, but also e.g. local variables) and module-relative paths (to the current module, or some other module) are very different things.

So far, scope-relative resolution hasn't interacted with use resolution in any way: use is module-relative "by definition", and only names "rooted" in some module may interact with import resolution.

(To clarify, prelude names are *not* rooted in every module and thus can't be used, so we can't reuse this existing mechanism.)

Ah, I've completely forgotten that rustc has a "search path" concept :angry: ! Is it really required for anything besides the sysroot stuff though :slight_smile: ?

Can we inject only the crates passed via compiler flags plus the standard library facade crates, and remove the search path concept from the public API of rustc?

Certainly! This is simple and this is what https://github.com/rust-lang/rfcs/pull/2088#issuecomment-319226633 did.
The drawback is that it creates a mismatch between crates passed via --extern and (at least) crates from the standard distribution (including std and core).

This sounds super bad, how do I import things with a relative path with that?

With self like today. I don't believe @aturon intended to exclude that. I certainly didn't.

This also interacts with people not using cargo... I'm not really sure about that side of the story. (That is, how onerous is it to expect them to supply crates by name on the command line?)

No more onerous than building without cargo in the first place, I think.

So I was talking to @jseyfried today about fallback, because indeed it seems like fallback is one of the "lynchpins" on which this decision rests. It's worth highlighting that there are two kinds of fallback. Let's say we're resolving ::foo::bar in the new epoch:

  • "Fallback to new semantics": If there is a plausible foo resolution in the root module, use that but issue a warning/error. If not, look for a crate.
  • "Fallback to old semantics": If there is a foo crate available, use that. Otherwise, check in the root module for something. If found, issue a warning/error.

The second version is quite clean: the set of available crates is fixed, either by the Cargo.toml or the file system. This makes the fallback simple to implement, because the first test can be done anytime.
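
The two orders described above can be written down as a toy decision function each. This is only a sketch of the resolution order for the first segment of `::foo::bar`; the `Resolution` type and both function names are hypothetical, and real name resolution has far more cases.

```rust
/// Outcome of resolving the first segment of `::foo::bar` in this toy model.
#[derive(Debug, PartialEq)]
pub enum Resolution {
    /// Resolved to an item in the root module; `warned` records the lint.
    RootItem { warned: bool },
    ExternCrate,
    Unresolved,
}

/// "Fallback to new semantics": a plausible root-module item wins, with a
/// warning/error; only if there is none do we look for a crate.
pub fn fallback_to_new(in_root: bool, crate_available: bool) -> Resolution {
    if in_root {
        Resolution::RootItem { warned: true }
    } else if crate_available {
        Resolution::ExternCrate
    } else {
        Resolution::Unresolved
    }
}

/// "Fallback to old semantics": an available crate wins; otherwise a
/// root-module item is used, with a warning/error.
pub fn fallback_to_old(in_root: bool, crate_available: bool) -> Resolution {
    if crate_available {
        Resolution::ExternCrate
    } else if in_root {
        Resolution::RootItem { warned: true }
    } else {
        Resolution::Unresolved
    }
}
```

The two functions only disagree when both a root item and a crate named `foo` exist, which is exactly the ambiguous case the rest of this post is about.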

However, simply implementing "fallback to new semantics" is not compatible with the "hard constraints" section of the epoch RFC, which states:

There are only two things a new epoch can do that a normal release cannot:

  • Change an existing deprecation into a hard error
    • This option is only available when the deprecation is expected to hit a relatively small percentage of code.
  • Change an existing deprecation to deny by default, and leverage the corresponding lint setting to produce error messages as if the feature were removed entirely.

There might though be some way to make a variant that complies, wherein pre-epoch we resolve through the root module but also check for an available crate. If a crate is available, but we found something in the root module that is not the corresponding extern crate, then we report a warning. This means we warn only if the path would change semantics.

In the newer epoch, then, we would resolve to the crate, but make it an error if there is any item in the root module (i.e., any plausible resolution) and there is a crate by the same name, unless that root item is an extern crate. This avoids some of the concerns that @josh had about ambiguity, since there is only one possible meaning of ::foo. It does mean that -- at least initially -- you cannot have a root module named foo and an external crate named foo.
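
Here is the variant from the last two paragraphs as a toy model, in case it helps to see the cases side by side. All names (`RootEntry`, `Outcome`, the two functions) are hypothetical. One branch is my own assumption and is marked as such: the description above does not say what the newer epoch does when no crate named `foo` is available.

```rust
/// What the root module contains under the name `foo` in this toy model.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RootEntry {
    Nothing,
    /// An `extern crate foo;` declaration.
    ExternCrateDecl,
    /// Any other item (module, function, ...).
    OtherItem,
}

#[derive(Debug, PartialEq)]
pub enum Outcome {
    RootItem { warned: bool },
    ExternCrate,
    Error,
    Unresolved,
}

/// Pre-epoch: resolve through the root module, but warn when a crate of the
/// same name is available and the root item is not its `extern crate`.
pub fn resolve_old_epoch(root: RootEntry, crate_available: bool) -> Outcome {
    match root {
        RootEntry::Nothing => Outcome::Unresolved,
        RootEntry::ExternCrateDecl => Outcome::RootItem { warned: false },
        // Warn only if the path would change meaning in the newer epoch.
        RootEntry::OtherItem => Outcome::RootItem { warned: crate_available },
    }
}

/// Newer epoch: resolve to the crate, but error out when a non-`extern crate`
/// root item shares the crate's name.
pub fn resolve_new_epoch(root: RootEntry, crate_available: bool) -> Outcome {
    match (root, crate_available) {
        (RootEntry::OtherItem, true) => Outcome::Error,
        (_, true) => Outcome::ExternCrate,
        // Assumption: not covered by the description; treated as unresolved.
        (_, false) => Outcome::Unresolved,
    }
}
```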

I don't know though that this is strictly compatible. I think there could be e.g. a macro expansion in the root module that used to resolve to something from within the crate but which now resolves to something from an external crate. In the older semantics, that macro might have generated items at the root level which would have shadowed an external crate -- but now that the macro is resolving via an external crate, it does not. (But, I guess for that to happen, the path to the macro itself must be ambiguous in the new semantics, so maybe there is an induction argument here?)

Sorry if I'm retreading ground here. It's hard to remember all the territory we covered. Maybe we need to make a kind of "fallback summary" post as well, covering the various ways we can do fallback, and the conditions to be concerned about? (I remember @rpjohnst in particular making a comment earlier referencing the detailed examination of fallback cases from the original RFC thread.)

This alternative is bad because it steps into build system territory and makes untrackable changes somewhere in the filesystem to be able to change results of name resolution (https://github.com/rust-lang/rfcs/pull/2126#issuecomment-328079126).

There is absolutely nothing about a fallback system that can be “clean”. The whole “let’s change the module system” effort is supposed to simplify things; having a fallback to some different set of rules is the exact opposite of this.

Wait, if 'old semantics' means 'current semantics' then I think I'm missing something. If I have the following setup right now:

mod cc {
    pub extern crate cc;
    //  ^^^^^^^^^^^^^^^
    // Needs to be here, not at root, otherwise we get 'cc defined twice'
    // errors.
}

mod foo {
    use ::cc::Build;
}

I won't see a successful lookup for Build in the cc crate - instead I get an error that there's no item named Build in the cc mod. Where would I see an absolute ::foo::bar path resolve to an extern crate instead of a root module in current Rust?

Also, testing that snippet above is how I learned that extern crate declarations need privacy annotations, but I haven't seen any discussion about how that interacts with the "insert all extern crates into the prelude or at the path root" ideas. Are we mostly just dropping it?

Yeah, good point. To avoid touching the filesystem, we could limit it to cases where the crates are supplied on rustc command line (the "normal case") and expect extern crate for the "search directories" case.

(Ok, I've definitely got to make a kind of summary issue tracking the pros/cons of various fallback options -- will tackle later.)

Just to clarify, the fallback system exists to ease the transition to the new system. In the new epoch, it would (by default, at least) only give errors for deprecated usage.

The new system would be either leading-:: or leading-crate, meaning that absolute paths have the form ::<crate>::foo::bar always (and you do ::crate::foo::bar to select from the current crate).

That said, a downside of the system I described is that it would affect you even if you were just using the new system, because you would be forbidden from having a module in the current crate that has the same name as an extern crate, even though -- using the new-style paths -- there is no ambiguity there.

Anyway, I gotta run, so no more comments from me for now -- this is the hazard of leaving comments on the weekends!

I think this is the correct behavior regardless of what syntax is chosen. I'm surprised the RFC didn't specify this; I don't think it would be good behavior to scan the library path for a name unless the code specifically says to do so somehow (e.g. with an extern crate declaration).

I am in general excited about something along the lines of this 'Java-style' approach to the syntax.