Relative paths and Rust 2018 `use` statements

Update

After discussion here and elsewhere, there’s a new post with a more complete proposal and a fresh comment thread. Please head that way!


I had a lengthy conversation with @josh last week about the Rust 2018 module design, and we uncovered a new variant that might be worth considering (and implementing) as we gain experience with the preview.

The design as it stands

To recap, in Rust 2018’s current design:

  • use statements take fully qualified paths.
  • A fully qualified path starts with the name of an external crate, or a keyword: crate, self, or super.
  • Outside of use statements:
    • Fully qualified paths work, and have the same meaning as in use, unless a local declaration has shadowed an external crate name.
    • Paths may also start with the name of any in-scope item, i.e. an item declared or used in the current module.
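A minimal sketch of what these rules look like in practice. The module and function names (`geometry`, `report`, `doubled_area`, `describe`, and so on) are hypothetical; the point is only that every `use` path begins with a crate name or with `crate`, `self`, or `super`, so its root is visible in the path itself.

```rust
// Hypothetical crate layout, written inline so the sketch is self-contained.
mod geometry {
    pub mod square {
        pub fn area(side: u32) -> u32 {
            side * side
        }
    }

    // `self::` roots the path at the current module (`geometry`).
    use self::square::area;

    pub fn doubled_area(side: u32) -> u32 {
        2 * area(side)
    }
}

mod report {
    // `crate::` roots the path at the top of the current crate.
    use crate::geometry::doubled_area;
    // A path that starts with a crate name (`std` here) is also fully qualified.
    use std::fmt::Write;

    pub fn describe(side: u32) -> String {
        let mut out = String::new();
        write!(out, "doubled area = {}", doubled_area(side)).unwrap();
        out
    }

    pub mod nested {
        // `super::` roots the path at the parent module (`report`).
        use super::describe;

        pub fn shout(side: u32) -> String {
            describe(side).to_uppercase()
        }
    }
}

// Outside of `use`, a path may start with any in-scope name, such as `report`.
pub fn summary(side: u32) -> String {
    report::describe(side)
}

pub fn loud_summary(side: u32) -> String {
    report::nested::shout(side)
}
```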

I won’t give the full rationale here, as it’s not vital for understanding or weighing the new idea.

A possible variant

The main insight is that we can make relative paths in use more ergonomic in a simple way:

  • use statements take module-local paths.
  • A module-local path starts with the name of an immediate submodule (not any other declaration), the name of an external crate, or the keyword crate or super (no self).
    • If a submodule and external crate name conflict, the submodule shadows the external crate. We can provide an additional syntax (like leading extern) to instead get the external crate, if necessary, but the situation should be rare.
  • Paths outside of use work the same as in the current design.

In this variant, paths in use statements and paths outside them work almost the same way; the main difference is that the “module-local paths” in use statements start with a crate or submodule name, whereas more general paths can also start with anything else that happens to be in scope. For example:

use std::collections;

// this doesn't work, since `collections` is neither a crate nor a submodule of this module
use collections::VecDeque;

// this works, though, because `collections` is in scope
fn foo() -> collections::VecDeque<u8> { .. }
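For contrast, here is a minimal sketch of the case the variant is meant to allow: a `use` path that starts with the name of an immediate submodule, with no `self::` in front. The names `parser`, `parse_flag`, and `flags_enabled` are hypothetical, and the code is assumed to sit at the crate root.

```rust
// `parser` is an immediate submodule of the module containing the `use` below.
mod parser {
    pub fn parse_flag(input: &str) -> bool {
        input == "on" || input == "true"
    }
}

// Module-local path: it begins with the submodule name, not with `self::`.
use parser::parse_flag;

// Counts how many of the given strings parse as an enabled flag.
pub fn flags_enabled(inputs: &[&str]) -> usize {
    inputs.iter().filter(|s| parse_flag(**s)).count()
}
```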

Tradeoffs

Benefits of module-local paths

  • Ergonomics. It’s common to forget self:: when writing a relative import, I think in part because it’s common to expect relative paths to work – especially since they do work outside of use statements.
  • Familiarity. This setup is much like paths at the CLI: relative by default, with a set of “prelude-style” names always in scope (including the actual prelude and extern crate names), much like $PATH. It’s also an approach taken by other languages, including Python 2.
  • Uniformity. While this doesn’t go the full distance toward making use paths the same as paths everywhere else, it addresses the most common remaining source of friction when trying to “hoist” paths into use statements.

Costs of module-local paths

  • Non-locality. In the current Rust 2018 design, use paths are always fully qualified, meaning that the path alone tells you 100% of the information needed to interpret it. There’s never any question where the path is rooted. Given that use statements are a key way of understanding the bindings in a module, this clarity seems potentially valuable. (The counter-argument is that one is generally aware of submodules when working in a module, just as one is aware of other local declarations.)
  • Shadowing. Because module-local paths are not fully qualified, there’s an opportunity for conflict (between a crate name and a submodule name). We’d almost certainly want to shadow the crate by the submodule (matching expectations from other similar path systems), but that then entails a means of disambiguation. OTOH, we can plausibly deprecate self:: at that point.

Name clashes in general

One particular issue to draw out: in general, the way that use paths and paths elsewhere differ can cause confusion when there is overlap in names. For example, in the current Rust 2018 design, we have:

// This is in the current Rust 2018 design

// this refers to `MyVec` from the *external crate* `collections`
use collections::MyVec;

mod collections {
    // `pub` so that `collections::MyVec` can be named from the parent module
    pub struct MyVec { .. }
}

// this refers to `MyVec` from the *submodule* `collections`
fn foo() -> collections::MyVec { .. }

Probably we should lint against any situation in which a local declaration shadows an external crate name.

The variant design using module-local paths, OTOH, resolves the example above (because the use statement would now refer to the submodule), but still suffers the problem with other kinds of declarations:

// This is in the variant design

use std::collections;

// there's no submodule `collections`, so this refers to a `collections` crate
use collections::MyVec;

// OTOH this refers to the `std::collections` submodule
fn foo() -> collections::VecDeque<u8> { .. }

It’s not clear to me how much we should be worried about these kinds of cases – we can lint against all of them. However, if we wanted, there are a few ways we could avoid these issues:

  • Make “fully qualified” be fully explicit. We could have all use statements start with an explicit designator of the root, e.g. extern::, crate::, etc, so that they would all have precisely the same meaning outside of use statements. But of course that would be far more verbose, just to avoid a weird edge case.

  • 1path: we could fully unify the paths for use statements and otherwise. That has the downside that you can “cascade” use statements, like use std::collections; use collections::VecDeque; which is confusing in its own right. Not to mention the implementation challenges.

A conservative route?

It seems worth considering whether there’s a tweak to the current design that would be forward-compatible with this one. The main problem is with conflicts between submodule and crate names, which under the current design would lead to use paths being interpreted as referring to external crates, and in the variant would be shadowed by the submodule name. To leave open forward-compatibility, we’d have to produce an error in such cases, and require use of some disambiguating syntax.

Speaking now for myself: first of all, I’m sorry for re-raising the spectre of module system discussions :joy:! But OTOH, it’s bound to happen as we come down the home stretch of stabilization; we’re going to need to rehash the discussion and examine our experiences, and it seems prudent to consider this new, nearby design.

From a personal perspective, I’m not entirely sure how I feel about the new variant. I really liked the complete locality of use paths in the current Rust 2018 design. But I have a feeling that in practice, the new variant is more intuitive and ergonomic, and unlikely to actually create confusion around clashes (especially if we lint).

If others think it’s worth at least trying, we might consider implementing this variant with a separate flag so we can experience it first-hand.

I’d have to think about this more carefully, but my off-the-cuff reaction is that this loses a key invariant that made name resolution tractable, which is that we always know the “full path” that is being imported.

In particular, the challenge is when you have a module like this:

use foo::bar;
...
baz!(...);

and you know that the foo crate is available, you still have to wait until you expand baz! to know whether it is going to generate a mod foo that would have shadowed foo.

Now, sometimes we handle this with a kind of “time travel violation” rule, where we go ahead and resolve to the crate but come back with an error if a module arises later that shadows – but you said that the module would take precedence, so that is harder.
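To make the concern concrete, here is a minimal sketch of a macro that introduces a module. The macro name `define_module` and the module name `helpers` are hypothetical. Nothing in the source text says that a `helpers` module exists until the invocation has been expanded, which is why a `use` path whose first segment may be either a crate or a submodule cannot be settled before expansion. The sketch uses `self::` so that the path is rooted explicitly.

```rust
// Hypothetical macro: expands to a module whose name is chosen by the caller.
macro_rules! define_module {
    ($name:ident) => {
        mod $name {
            pub fn source() -> &'static str {
                "module generated by a macro"
            }
        }
    };
}

// Until this invocation is expanded, nothing in the source text shows that
// a module called `helpers` exists in this scope.
define_module!(helpers);

// `self::` states the root explicitly, so the meaning of the path does not
// depend on whether an external crate also happens to be called `helpers`.
use self::helpers::source;

pub fn where_from() -> &'static str {
    source()
}
```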

That said, I’m a bit out of date on how things work; @petrochenkov may be able to weigh in on the latest tricks.

(I’d love to create a formal model of the name resolution system, as an aside, analogous to Chalk…)

I would always prefer explicitness over magic. I think both variants are equally on the wrong side of this. I like the idea of making non-relative paths explicit in all cases:

  • use extern::<crate name>::<rest of path>
  • use crate::<rest of path>
  • use super::<rest of path>
  • use <relative path>

Same for paths elsewhere. Make it explicit for non-relative paths and relative otherwise. No ambiguity (I think) ever.

Thank you very much for writing this up, @aturon!

I’d like to call attention to the mental model that this proposal encourages, in particular.

As described in the current Rust 2018 preview edition documentation, external crates are still always in your local scope, effectively as part of every prelude.

So, the proposal here is that just like you can always write foo::bar whether foo is an external crate or local submodule, you can always write use foo::bar whether foo is an external crate or local submodule. Effectively, use always resolves its first component via local name resolution, and external crates happen to always be available as local names.

IMO the collision disambiguator here should be ::, not extern::. This would mean repurposing leading-:: rather than merely deprecating it, but a key point to make here is that it’s merely “switching which half” stops working.

That is, today a path ::foo can refer to either a dependency or a top-level module. Under the current 2018 design, it’s deprecated, and it can only refer to a top-level item (unfortunately including any potential use foo, thanks to use-as-item…). Under a repurposed ::, it can only refer to a dependency.

That would break existing uses of ::foo today, though. I don’t think breaking that particular backward compatibility is worth it for a shadowing/collision situation that should rarely arise in practice.

Verbosity in service of avoiding "weird edge cases" sounds like a feature, not a bug.

Not if the edge case is one we can trivially lint about to catch that ambiguity.

We already have that edge case in the current 2018 edition preview. If you have an external crate foo and a local submodule foo, and you write foo::bar outside a use statement, that’s ambiguous. What’s especially confusing in the 2018 edition preview is that use foo::bar resolves that differently than foo::bar elsewhere.

We already need to resolve that ambiguity, and ideally lint about it. This proposal introduces a way to do so, but we’re also expecting you’ll rarely need it.

That said, I think we intended to lint against modules that shadow crates, right? In that case, the behavior in this edge case is probably less important. One could imagine a future edition ruling out such shadowing altogether in some form.

640k is all you'll ever need. Famous last words.

I know, I know, this isn't actually an accurate depiction of the original quote, but it does illustrate something. There is always a tendency to underestimate how much of a headache "Corner Cases" or "Exceptions" will be.

Or we could make it an error (in the new edition, say) to have use foo if foo winds up being shadowed. (I don’t think we can do it in the old edition.) That’s a bit of a pain because we’d need some migration lints around it. :slight_smile:

I think we need to always allow for such shadowing, for compatibility. You might have situations in which you need that shadowing in order to preserve semver.

Suppose, for instance, your crate provides a module foo as part of its API, and then you need to pull in a new extern crate in order to implement some new API you’re introducing, but that crate happens to be named foo. You can’t rename your own module without breaking semver, so you need the ability to shadow, and then you also need the ability to refer to the external crate.

So, we need the escape hatch. But I don’t think we’ll need to use it often.

Which is why we always have to have the escape hatch to allow you to resolve the ambiguity. If we made it a hard error to shadow, or didn’t provide a means of handling the corner case at all, that would indeed be a dangerous lack of foresight. But what we’re proposing here is to simplify the common case while still supporting the corner case.

You can always rename the dependency foo rather than shadowing.

Fair enough; I hadn’t considered that possibility. Still, I don’t think we want to go there any time soon. For now, for compatibility, I think we should stick with a lint, similar to the existing lint we have for shadowing an imported name (complete with the exception we have for ::* imports to ensure that introducing a new name in a module doesn’t break semver).

Hmm, thinking more on this, I think I might find it surprising that this is limited to modules. Perhaps I am too stuck on the current thinking, but consider a case like this:

enum Foo { Variant1 }
use Foo::Variant1;

might you be a bit surprised that this doesn’t work, given that it would work for a local module?
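For comparison, a minimal sketch of the spelling that stays fully qualified, using a hypothetical `Direction` enum: with `self::` in front, the path names the module that declares the enum, so it does not matter whether the first segment is a module or some other kind of item.

```rust
#[derive(Debug, PartialEq)]
pub enum Direction {
    North,
    South,
}

// Fully qualified spelling: `self::` names the module that declares the enum.
use self::Direction::{North, South};

// The imported variant names can be used directly in patterns and expressions.
pub fn opposite(d: Direction) -> Direction {
    match d {
        North => South,
        South => North,
    }
}
```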

I suppose one would gain an intuition for “use statements are for pulling things from other modules”.

One thing worth noting though -- the original motivation (years ago) for use self::foo was glob imports. But this proposal I guess sidesteps that, since I don't think we expect this sort of thing to work, do we?

use std::ops;
use ops::Range;

(This in turn means that use std::* needn't block resolving use ops::Range)

Instead of renaming the external crate, the obvious solution is to do what you would do in any other scope/shadowing situation: rename the local name if you need access to the shadowed name.

Since you’re in control of your own local modules, if you find yourself having created a std.rs local module and want the original std, no problem! Just rename std.rs, a local transformation.

This is a little different for public modules (renaming becomes a breaking change) but you can still work around the problem by re-exporting the external crate in a crate-local location (making it eligible for crate:: prefixing) if you find yourself completely unable to do the local rename.
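A rough sketch of that workaround, under stated assumptions: the standard library's `collections` module stands in for the clashing external crate (a real dependency cannot be pulled into a self-contained snippet), and the names `deps`, `external_collections`, `describe`, and `demo` are hypothetical.

```rust
// Public module whose name is part of the crate's API and cannot change.
pub mod collections {
    pub fn describe() -> &'static str {
        "this crate's own collections module"
    }
}

// Crate-local home for the dependency's items. In this sketch the standard
// library's `collections` module stands in for the clashing external crate.
mod deps {
    pub(crate) use std::collections as external_collections;
}

// Both names are now reachable through `crate::` paths that cannot clash.
use crate::collections::describe;
use crate::deps::external_collections::VecDeque;

pub fn demo() -> (&'static str, usize) {
    let mut queue: VecDeque<u8> = VecDeque::new();
    queue.push_back(1);
    queue.push_back(2);
    (describe(), queue.len())
}
```

The re-export is `pub(crate)` here so that the alias stays an internal detail and does not widen the public API.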