The Great Module Adventure Continues

One further point about @josh’s proposal. I think the fact that it’s such a minimal – and very natural! – delta over today’s module system should be given significant weight.

In particular, the path changes are without a doubt the most dramatic change in the new epoch. If we can limit them to something that is a very natural, back-compat evolution of our current path system, that will go far to limiting the risk of people thinking “Rust is breaking everything!!” And the fact that we can do so while still achieving all of our other primary goals is very appealing indeed.

4 Likes

It seems to me that widespread use of ::foo as an alias for extern::foo in submodules would in practice mean that the confusion around writing e.g. regex::Regex in the crate root and ::regex::Regex elsewhere would still be there. Yes, you can write the former in submodules with the right use statement, and you can write the latter in the crate root, but if that pattern remains in widespread use, the learnability challenges associated with it will likely remain. Perhaps there could be a coding standard against this kind of inconsistent usage, though I'm not sure whether that would mean encouraging ::regex at crate root or discouraging ::regex in submodules.
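To make the root-versus-submodule asymmetry concrete, here is a minimal sketch. It uses std as a stand-in for an external crate like regex (an assumption, so the snippet needs no dependencies), it assumes the snippet is the crate root, and the function and module names are hypothetical.

```rust
// Sketch only: `std` stands in for an external crate such as `regex`.
// Assumes this snippet is the crate root. All names are hypothetical.

/// At the crate root, a dependency can be named without any prefix.
pub fn smaller_at_root(a: u32, b: u32) -> u32 {
    std::cmp::min(a, b)
}

pub mod nested {
    /// In a submodule, the leading `::` makes the path absolute, so it
    /// does not depend on what the current module has in scope.
    pub fn smaller_in_submodule(a: u32, b: u32) -> u32 {
        ::std::cmp::min(a, b)
    }
}
```

Both functions reach the same item; the learnability concern is that readers see two spellings for it depending on where the code lives.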

1 Like

To be clear, I wasn't intending to endorse this style. Just saying it was possible. I personally think I would lean towards having an "extern block" in code I write.

That's precisely what I was about to write. The old discussion often had people proposing "just add extern:: paths and then we can just replace extern crate foo with use extern::foo, it's a minimal change!"

But that defeats the entire purpose of changing paths at all- we don't want just any old minimal change, we want the minimal change that clearly solves path confusion.

@josh's variant is a little better than those old proposals because it also includes "forcing" a leading :: in use paths, but the 1path property then undermines this by encouraging people to write ::foo in expressions as well!

One of the benefits of crate:: is that it mitigates the problems of "use is an item." Because any absolute paths to crate-local items start with crate::, crate::foo is not an obvious thing to try and is an obviously-weird thing to write when foo is use extern::foo.

Yeah, certainly other styles could be encouraged. I'm more pointing out that the drawback of 'small delta' is that the present style—with its associated challenges—is more likely to persist. I guess that's sort of the inherent tradeoff. If ‘use is an item’ were (partially?) phased out through warnings, this wouldn't be an issue, but I think that was rejected last time around.

It also doesn't have the obvious advantage over extern::foo that ::foo does. It's just that, circularly, that also means that ::local has an advantage over crate::local. It's an unfortunate loop which is closed by ‘use is an item’.

IMO, we should just deprecate "use is an item" and be done with it.

3 Likes

This one feels even worse to me than the leading-extern variant (with both extern:: and crate::). :frowning:

I genuinely do not understand what you mean by this. The way this system would be used is very different from today- an entirely new system that doesn't overlap with the old one at all is precisely what makes me think "Rust is breaking everything!" What's worse, it feels very unprincipled and ad-hoc so everything's being broken for very little gain.

  • It's even more unbalanced than any other options, making it feel very unnatural to me.
  • If old code continues to work but is completely deprecated, people will still get the wrong ideas out of reading old code because the new mental model can't apply to it (unlike in the flag day scenario for leading-crate, for example).
  • Adding std to the prelude in this system makes path confusion even worse. It might be nice, but I would prefer to do it under a different system if it's done at all, since removing it helps with path confusion.

This is not at all unique to this variant. Leading crate and leading :: do this just as well, and arguably more effectively since they also split up different dependency crates rather than lumping them all into extern.

And yes, the repetition argument also applies to crate::, at some level. I believe it's far less of an issue there:

  • External uses seemed more common than internal ones, last time this was investigated. I remember having to argue that internal+absolute uses were worth considering at all!
  • "Absolute paths all start with a crate" means that only paths from a single crate get extended/repeated, rather than paths from all dependencies.
  • The extern:: block seems liable to group imports from all dependencies into a single block. Leading-crate and leading-:: seem to encourage a single block per dependency instead, because the first path segment differs between them.

To expand on my "minimal delta" disagreement: I care far more about the delta in mental model and actual usage than I do the delta in implementation and compatibility. "We replaced extern crate with extern:: paths" feels like it's optimizing too much for the latter and not at all for the former. rustfix can't change people's mental models or old blog posts/etc. At the risk of becoming a broken record, this is what I imagine the migration process being for leading-crate:

  • Learn about crate:: paths by analogy to existing some_dep:: paths and update your code. This change might be large, but it's pretty localized.
  • Learn about extern crate being removed and update your code. This change is minuscule because you already had Cargo.toml anyway.
  • Your mental model around dependencies remains identical; you still write e.g. use regex::Regex; 90% of example code (i.e. that doesn't have internal uses) still works without warnings and with the same mental model.
  • Your mental model around internal paths has to change, but it changes to match your mental model around dependencies, which also matches your mental model around imports in Python, #includes in C++, usings in C# and Java, etc. You rarely run into this in example code, and when you do it's probably in examples of the module system itself.
  • Your mental model around paths in expressions remains identical, and is strengthened by the removal of dependencies from the namespace of the root module. Even without full 1path, this might be enough to train yourself to write ::std::fmt::Debug instead of std::fmt::Debug all the time! (And if not, std or even just the traits you pass to derive all the time could be added to the prelude.)
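As a rough illustration of the leading-crate migration described in the list above, the sketch below shows one dependency import that stays as it is and one crate-local import that gains a leading crate::. The module and item names (shapes, report, Square, describe) are hypothetical, and the snippet assumes it is the crate root.

```rust
// Sketch of the "leading-crate" style; module and item names are hypothetical.

pub mod shapes {
    pub struct Square {
        pub side: u32,
    }

    impl Square {
        pub fn area(&self) -> u32 {
            self.side * self.side
        }
    }
}

pub mod report {
    // Dependency import: written exactly as it is today.
    use std::fmt::Write;
    // Crate-local import: gains a leading `crate::`, mirroring the
    // `some_dep::` form already used for dependencies.
    use crate::shapes::Square;

    pub fn describe(square: &Square) -> String {
        let mut out = String::new();
        // Writing to a `String` cannot fail, so the result is ignored.
        let _ = write!(out, "square with area {}", square.area());
        out
    }
}
```

Only the second use line differs from what one would write today, which is the localized change the first bullet refers to.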

And this is what I imagine as the migration process for this proposal:

  • Learn about extern:: paths as a replacement for extern crate items and update your code. This change is enormous, and affects almost all of your imports.
  • Learn about use paths no longer accepting unprefixed paths and update your code. This change is also enormous, and affects almost all your remaining imports.
  • Your mental model around dependencies has to change to something completely new, that doesn't exist in any other language you've used and has no analogy to anything else in Rust. All example code is suddenly also broken, so you have to translate it to the new model in your head when you read it.
  • Your mental model around internal paths has to change; now they look like absolute paths except they're not because they can't get to dependencies; except they can for backwards compatibility; except they shouldn't, because use-as-an-item is deprecated. You also run into this when reading example code, because its mostly-external imports happen to look a lot like the new internal import syntax.
  • Your mental model around paths in expressions has to change; now ::-prefixed ones only work for crate-local items, and you have to write extern:: on dependencies, except for std because that's in the prelude, but you could have gotten that without all the rest of these changes!
8 Likes

We discussed this some in the meeting. I definitely agree with @rpjohnst that there are different ‘perspectives’ in terms of how to weigh the “feeling” of the scale of a change – being able to express the delta simply is one way, but not the only way. Some other things to consider (basically echoing @rpjohnst’s list):

  • Affected LOC
  • Stack overflow answers
  • “Muscle memory”

There are also other things to consider. For example, the default setting of any “transition lint” (allow, warn, deny) probably makes a big difference.


So how about the various proposals?

I am not convinced that any of the proposals do better than the others in terms of “affected LOC”. In my previous summary comment, I described the difference as “low, medium, or high” – but in reality, it’s more like choosing between an A-, A, or A+. After all, with leading crate, all use statements within a crate still change. So basically every file is affected.

In terms of the impact on stack overflow answers, though, leading crate probably does better, since I imagine that most imports there concern external crates. Here, the difference between warn-by-default and deny-by-default also seems key: if you copy-and-paste some text, and it tells you "this style is deprecated, prefer use extern::regex", but the code otherwise still works, that may be a good sight better than getting errors. I’m not really sure how best to think about that.

(On the other hand, if there is ever a time to transition from warn-to-deny, it’s at the epoch point.)

In terms of muscle memory, I don’t know how to judge. I think that @aturon’s original notion of minimality seems important – the more you can “ingrain” the rules, the better – but that might not be correct.

A lot of this seems to come back to how important ‘1path’ is. If we want 1path, we have to accept that every use statement will change. Personally, I am not convinced this is a lot worse than every crate-local use statement changing, and I think that 1path is a great property to have (even though we won’t get it immediately). But it’s not an obvious trade-off.

2 Likes

I kind of hinted at this in my last post, but maybe rather than thinking directly in terms of unification between use paths and absolute paths elsewhere, we could look at how the motivations for 1path are addressed by the proposals?

My impression from earlier discussion is that the biggest draw of 1path is less in the literal sameness across different kinds of paths, and more in the way it encourages people to instinctively write the correct form when they reach for an absolute path outside of an import.

Further, you once mentioned that you usually do this for a very specific set of names from std, rather than for anything from any dependency.

Given these two points, are there any alternatives to 1path that would solve the same problem? I suggested that removing std from the root might help, since people would stop writing un-prefixed std::cmp::min or std::fmt::Debug in examples or small exploratory crates. I also mentioned the disconnect between traits being un-imported and un-prefixed in #[derive], which is often what std::fmt::Debug is transitioning from, so we might add some of those to the prelude.

Maybe we can bend the curve we've set up for ourselves here, to simultaneously solve the problems 1path addresses and preserve mental models by preserving the majority of use paths. Because personally I think those mental models are much more important than 1path itself.

2 Likes

I strongly believe that most of the code written in Rust is still in the future, not yet written - hoping that Rust will be the go-to systems language for many decades to come.

And so, the amount of LOC changed by this epoch transition is almost irrelevant, imho. What matters is the long-term effect on how easy the language is to understand, learn and use, and ultimately the number of minds won. Most Rust programmers have not yet been introduced to the language.

Existing examples, blogs and documentation are the bedrock for growing knowledge and a community of new Rustaceans. We do not want to undermine this investment and wealth of information.

11 Likes

Hm, this is a very neat idea! I am a little bit worried that extern crates and std work differently as a result of this: the fact that std is special-cased is one of the drawbacks of the current system.

However, what if we add all extern crates to the prelude? The system would work like this:

  • There are no absolute paths at all, every path starts relative to the current module.

  • By default, the namespace of each module includes:

    1. all extern crates
    2. std (which is a special case of 1)
    3. crate pointing to the root of the current crate
    4. super pointing to the parent module
  • Paths in use and in function bodies are the same relative paths.

  • There's no syntactic form for absolute paths, no ::, no extern, no :.

  • Usual shadowing rules apply.

I am not at all sure that this is a great idea, and I have not thought out the transition paths, but it seems to be a proposal not mentioned before, and I find the "absolute paths are hard? Get rid of absolute paths" approach neat :slight_smile:
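A rough sketch of what code might look like under rules of this kind. It assumes the 2018 edition or later, whose name resolution is close to (but not the same as) the list above, since that edition still has a leading :: form. All module and function names are hypothetical.

```rust
// Sketch assuming the 2018 edition or later, and assuming this snippet
// is the crate root. All module and function names are hypothetical.

pub fn base_fee() -> u32 {
    10
}

pub mod billing {
    pub fn surcharge() -> u32 {
        5
    }

    pub mod invoice {
        pub fn total(amount: u32) -> u32 {
            // `std` is in scope here like any other dependency.
            let floor = std::cmp::max(amount, 1);
            // `crate` names the root of the current crate.
            let fee = crate::base_fee();
            // `super` names the parent module.
            floor + fee + super::surcharge()
        }
    }
}
```

Every path in the function body starts from a name that is in scope in the current module, with no dedicated absolute-path syntax involved.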

4 Likes

This would make all the module system changes even worse than they currently are, losing even consistency.

My enthusiasm for extern paths was dampened recently by this thought:

Currently, we print out “absolute paths” in our error messages. This is ungreat, and I’d like to give relative paths, but it is always a useful thing to be able to do. Right now, we print something like std::option::Option<std::string::String>. If this were to become extern::std::option::Option<extern::std::string::String>, it would be that much worse.

This is not new, it’s sort of a proxy for “using paths in code”, but I think it’s an interesting and non-obvious interaction.
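To get a feel for how much longer printed types would become, here is a small sketch. The helper with_extern_prefix is hypothetical and only does string substitution on an already-printed path; it is not how rustc formats types in diagnostics.

```rust
// Hypothetical helper: rewrites a printed type path the way the
// `extern::` proposal would spell it. This is plain string
// substitution for illustration, not how rustc prints types.
pub fn with_extern_prefix(printed: &str) -> String {
    printed.replace("std::", "extern::std::")
}
```

Applied to the Option<String> path quoted above, each std:: segment picks up eight extra characters, and the cost grows with every generic parameter in the type.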

11 Likes

Recently I was revisiting the way that Java handles modules – to see if it could have anything to offer us – and I wanted to leave a few notes. As you may recall, in Java, imports are always absolute paths (java.util.Vector); outside of an import, you can use either a relative path Vector or an absolute path (java.util.Vector).

Java’s system has “half” of the 1-path property: paths from imports work outside of imports, but not vice versa. That is, you can’t write import Vector, nor import Vector.SomeClass (iirc). However, the other half in Java is not particularly natural, since there the only way to name a package (iirc) is via an absolute path. That is, if you are in the util package, you cannot access java.util.concurrent.ArrayBlockingQueue by doing concurrent.ArrayBlockingQueue (as you can in Rust). You must do an absolute import.

“Ported” to Rust, Java’s system feels very similar to leading-crate, except that instead of writing ::foo::bar inside of a fn body, you would just write foo::bar.

Along many dimensions, if we could achieve a system like this, it feels like an overall win. From an ergonomics and learnability perspective, I think it’s largely a win: It has no sigils anywhere; it supports convenient absolute paths that are as short as they can be (shorter than today). It also achieves half of 1-path.

However, I am concerned that the other half of 1-path would still be a problem here. That is, to import from a module that is a child of yours, you still have to write use self::foo::bar (or use an absolute path). As I wrote above, this doesn’t arise in Java, and it may be part of why Java’s scheme feels less confusing.
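The two ways of importing from a child module can be sketched as follows. The names (util, concurrent, Queue) are hypothetical, loosely echoing the Java example, and the snippet assumes it is the crate root.

```rust
// Sketch with hypothetical names: importing from a child module.

pub mod util {
    pub mod concurrent {
        pub struct Queue {
            pub capacity: usize,
        }
    }

    // Relative form: `self::` starts the path at the current module.
    use self::concurrent::Queue;

    pub fn small_queue() -> Queue {
        Queue { capacity: 4 }
    }

    // Absolute form: spell out the whole path from the crate root.
    pub fn large_queue() -> crate::util::concurrent::Queue {
        crate::util::concurrent::Queue { capacity: 1024 }
    }
}
```

The self:: prefix is the extra step that has no counterpart in Java, where a package can only be named by its absolute path.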

Anyway, one other concern is with implementability. Obviously, a scheme like this relies intrinsically on fallback. It might not even be feasible. The main case that would be a concern would be a macro invocation like foo! { ... } that occurs at the top-level, since that could be referring to a foo macro defined in the root or a macro defined locally (perhaps a macro that will be created by some future macro expansion).

Anyway, I don’t know that this is viable, but I thought it was worth writing out my notes for future reference somewhere that I could find them again.

3 Likes

I can honestly say that the literal sameness, and the ability to copy-paste bidirectionally between use lines and code referencing names with full paths, is the critical property of 1path for me. I don't want to have one set of rules for use and a different set of rules for non-use.

4 Likes

My concern is that this would make Rust's portability story more complicated, which I personally would find disappointing. How would such a system interact with no_std? It makes that whole mental model a bit more complicated.

I actually used to do this before I understood the module system better. It is surprisingly really annoying.

This is somewhat tangential, but writing Java imports by hand is exceedingly rare with today's IDE tooling and I doubt anyone would cite it as a pain point. Typically someone will type part of ArrayList and auto-complete the rest of the declaration, including adding the import.

4 Likes

I tried to put together a question along the lines of “the learning friction seems to come from submodules behaving differently to the root module, can we just make submodules behave the same?” but I couldn’t word it nicely. This seems like a good starting point!

There’s one thing that the root mod does which isn’t listed here, which is add modules to the namespace root as well. This is where the “I have to refer to a module I just declared via self!?” shock comes from:

mod foo {
    pub struct S;
}
use foo::S;
// ^^^^^^^-- `foo` is always in scope because it's in the root module.

mod not_the_root {
    mod bar {
        pub struct T;
    }
    use bar::T;
    // ^^^^^^^-- `bar` isn't in scope here! Need to `use self::bar::T`.
    use foo::S;
    // ^^^^^^^-- This will work though.
}

I think the migration path for “include extern crates in all namespaces” and “include all local mods” could work: in both cases you had a use thing or use self::thing which is now redundant, except for disambiguation. Haven’t thought through it thoroughly though.

Regarding #[no_std] environments, I think the concept of “make every mod do what the root mod does” translates pretty directly - did you have any particular concerns @mark-i-m?

2 Likes

These two sound very similar, and the Java analogy really sells me on the idea. I think it might work best when combined with leading-crate, like @matklad describes.

This should have minimal actual churn, much like leading-crate. All external use statements remain the same; the mental model around dependencies is the same. The mental models around internal paths and paths in expressions change, but they're simplified and brought into line with dependencies so as a whole things get easier to use.

Implementation-wise, the absolute-or-local approach sounds much simpler than the full relative use version that IIUC was what turned out to be unimplementable. With the "add dependencies to the prelude" framing and crate it shouldn't even need fallback, at least not with a flag-day transition. It may also be possible to enable full 1path?

2 Likes

I haven’t had time to really digest this latest direction, but here’s some stream-of-consciousness reaction.


For clarity, I want to re-state what I think is the combined proposal you have in mind:

  • Absolute paths always begin with a crate name, where crate is the way to refer to the current crate.
  • In the new epoch, use statements must use absolute paths.
  • Outside of use statements, you can freely use in-scope names or absolute paths; in-scope names shadow external crate names.

Thus, any use path can be freely transported into an expression with the same meaning (modulo shadowing).
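A minimal sketch of that "transport" property for a crate-local item, with hypothetical names throughout and assuming the snippet is the crate root. Under the combined proposal, a path beginning with a dependency's name would presumably follow the same pattern.

```rust
// Sketch with hypothetical names. Both modules reach the same item,
// once through a `use` and once by pasting the `use` path inline.

pub mod geometry {
    pub fn area(width: u32, height: u32) -> u32 {
        width * height
    }
}

pub mod via_import {
    use crate::geometry::area;

    pub fn floor_area() -> u32 {
        area(3, 4)
    }
}

pub mod via_inline_path {
    pub fn floor_area() -> u32 {
        // Same path as the `use` above, moved into an expression.
        crate::geometry::area(3, 4)
    }
}
```

Because the path text is identical in both places, it can be copied from an import into a function body without edits.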

I don’t agree to the “minimal churn” point for the same reasons as previously stated: almost every file will have to change, regardless. I think we should set that particular point aside.

If we take the “flag day” approach (as in my bullets above), then the experience when copying from StackOverflow will come down to our error messages when you fail to write a leading crate on an import; I suspect we can make those quite good. Notably, however, that’s the equivalent of implementing fallback.

We could consider retaining leading :: as a disambiguator for referring to extern crates that are shadowed; like today it would mean “this is an absolute path”. But normally it would no longer be needed.

I imagine some folks will be nervous about the possibility of shadowing, but the chances of compilation succeeding on an accidental shadowing are vanishingly small. The fact that Java follows this model does suggest that it’s probably a pretty safe bet.

re: implementation, we essentially already have this with the prelude today, which is treated as a fallback for resolution. We could quite literally inject into the prelude, as @matklad originally suggested.

Overall this seems like a reasonable addition to the design space. I suspect it would lead to a somewhat decreased use of use, in favor of e.g. just writing futures::Future in-line.

The loss of 1path here doesn’t seem like a deal-breaker to me. As I argued in the initial RFC on this topic (with leading crate), the fact that all use statements will use absolute paths, and in particular references within the crate start with crate, will be a very easy reminder when moving up a relative path that you need to add a leading self:: (or crate::).

IOW, I think I want to step back on the full-on commitment to bi-directional 1path, and consider the more global tradeoffs including ergonomics. That’s not to say that I prefer this proposal yet, but just that I’m open to it :slight_smile:

When I have more time, I’ll try to write a head-to-head comparison of what, I think, are the two “leading” proposals in each “camp”.

6 Likes