Relative paths in Rust 2018

Would this break my crates (hexchat-plugin, mpv-audio, etc.)?

Having started using 2018's module system in real code, I'm a fan of this. Perhaps it's not Best Style, but I personally like to use Enum::* before a match because I find that, when the enum name is long enough and/or there are enough variants, the stutter of Enum::Foo => foo, Enum::Bar => bar, etc. gets to me. Currently, this requires use self::Enum::*, which feels quite weird.
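
A minimal sketch of that pattern, assuming a hypothetical ConnectionState enum (none of these names come from the thread). With relative paths allowed in use, the import no longer needs the self:: prefix:

```rust
/// Hypothetical enum, used only for illustration.
pub enum ConnectionState {
    Connecting,
    Connected,
    Closed,
}

pub fn describe(state: &ConnectionState) -> &'static str {
    // With relative paths in `use`, this is accepted as written.
    // The spelling complained about above is `use self::ConnectionState::*;`.
    use ConnectionState::*;
    match state {
        Connecting => "connecting",
        Connected => "connected",
        Closed => "closed",
    }
}
```

Keeping the glob inside the function body limits the variant names to the one place where the stutter hurts.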

I'm happy with a hard error for ambiguity in use statements for now, since it's the conservative option and, for type names, style dictates that they start with a capital letter so that they won't conflict with any crate names. I'd prefer full unification, but that could be done later when it was definitely safe.

I also just filed https://github.com/rust-lang/rust/issues/52209, which is not super relevant here but probably worth linking in terms of ambiguity hygiene.

Hmm, I think I've preferred super over using absolute paths in my usage (and I know I introduced a bunch of these in cargo) for things that actually live in the parent module (or in a sibling module, but are exported by the parent module). I think my intuition is that using the relative super is better in most cases than fixating the paths outside the current module by using the absolute path to something in the super module. This keeps things nimble if I want to rename/refactor later.

On the other hand my only uses of self have been for using extern crates outside the crate root (to hide a dependency) and to import local enum variants, and I think 2018 will make both of these go away.

I am so excited about this feature. Both this slightly modified proposal and the previous one are good enough. I only have one small piece of feedback.

It would be nice to treat all files in a directory as belonging to that directory's module instead of to submodules. For example, given

collections/
    |- vector.rs
    |- hash_set.rs
    |- hash_map.rs
    |- index_map.rs

It would be intuitive to use collections::{Vec, HashSet, HashMap} instead of collections::vector::Vector etc. Not sure if this was considered/discussed.

I find that I almost never use super::, with the sole exception of mod test { use super::*; ... } at the bottom of a module to wrap all its test code.
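
For reference, a minimal sketch of that test-module pattern. The functions double and quadruple are hypothetical and only there to give the tests something to reach:

```rust
// Private helper: not visible outside this module.
fn double(x: i32) -> i32 {
    x * 2
}

pub fn quadruple(x: i32) -> i32 {
    double(double(x))
}

#[cfg(test)]
mod test {
    // The glob pulls in everything the parent module can see,
    // including the private `double`.
    use super::*;

    #[test]
    fn doubles_and_quadruples() {
        assert_eq!(double(21), 42);
        assert_eq!(quadruple(2), 8);
    }
}
```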

This was discussed earlier on, but rejected because it makes it harder to find things. Some people value having a single mod.rs that reexports things from elsewhere in that directory.

In addition, there was also discussion around various ways to make that reexporting easier, but that was also left behind because we couldn't agree on a nice way to do it.

(This new proposal that allows relative paths in use does simplify things a little bit though, from pub use self::vector::*; mod vector; to pub use vector::*; mod vector;.)
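
A rough sketch of that explicit flattening, written with an inline module so it fits in one file. The module stands in for a vector.rs file, and the Vector type and sample function are made up for illustration:

```rust
// Stands in for a separate vector.rs file; inline here to keep the sketch in one file.
mod vector {
    /// Hypothetical type, for illustration only.
    pub struct Vector {
        pub items: Vec<i32>,
    }

    impl Vector {
        pub fn len(&self) -> usize {
            self.items.len()
        }
    }
}

// Explicit flattening. The older spelling is `pub use self::vector::*;`.
pub use vector::*;

/// Callers name `Vector` directly, without the `vector::` segment.
pub fn sample() -> Vector {
    Vector { items: vec![1, 2, 3] }
}
```

The module itself stays private, so only what the glob reexports is reachable from outside.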

Thanks @rpjohnst

Do you know why this would be harder?

Though verbose, this is an improvement.

To summarize the entirety of the modules discussion on this issue: much of the Rust userbase considers having mod xxx; declarations a good, illustrative verbosity. It lays out the structure of the code in the code, rather than inferring it from the filesystem. Another often-cited utility of the mod statement is the ability to quickly and temporarily remove a module by commenting out the mod declaration.

I saw an idea for pub use mod xxx; or reexport mod xxx; or other ideas for flattening external file modules. Perhaps once the main group of module changes land, you can repropose that small sugar to make this pattern much easier.

Implicit flattening in this manner also contradicts the non-mod.rs modules feature added recently. Each .rs file is its own module, and thus a privacy barrier, and it's up to your code to reexport it if desired.

If that were the case, we would not have considered non-mod.rs modules, would we?

We won't have that flexibility with non-mod.rs modules unless we introduce a mod.rs file, am I correct? In practice, not having this has never bothered me in other langs.

I don't see how this impacts the privacy barrier.

What I see here is configuration over convention. There is not enough magic involved to justify preferring explicitness.

If you have the following layout:

src
\ main.rs
\ one.rs
\ two.rs

This represents the module hierarchy:

crate
\ one
\ two

Which is evidenced by your main.rs containing mod one; mod two;.

What the non-mod.rs modules feature gives you is that you can do this:

src
\ lib.rs
\ one.rs
\ one
| \ sub.rs
\ two.rs

instead of

src
\ lib.rs
\ one
| \ mod.rs
| \ sub.rs
\ two.rs

for the following module hierarchy:

crate
\ one
| \ sub
\ two

All of the mod xxx; declarations remain. The only difference is that xxx/mod.rs is allowed to be called xxx.rs.
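
To make that concrete, here is a sketch of the same hierarchy written with inline modules in a single file. The path functions are hypothetical helpers that just report where they live. The tree comes from the mod declarations, and the file layout only decides where each module's body is stored:

```rust
pub mod one {
    pub mod sub {
        /// Returns this module's path, ending in `one::sub`.
        pub fn path() -> &'static str {
            module_path!()
        }
    }

    /// Returns this module's path, ending in `one`.
    pub fn path() -> &'static str {
        module_path!()
    }
}

pub mod two {
    /// Returns this module's path, ending in `two`.
    pub fn path() -> &'static str {
        module_path!()
    }
}
```

Moving the body of one into one.rs or one/mod.rs, and replacing the braces with mod one;, leaves these paths unchanged.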

TBH, I think the feature was poorly named. You already have modules from .rs files that are not named mod.rs. It's just that those modules were previously required to be leaf modules, rather than being allowed to have submodules.

The privacy barrier is the module. You're suggesting that the contents of src/module/source.rs be in the module crate::module instead of crate::module::source. Thus, it changes privacy.
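
A small sketch of what that barrier means in practice, using inline modules and hypothetical function names. If the contents of source were implicitly merged into module, the private helper would become visible to everything else in module:

```rust
pub mod module {
    pub mod source {
        // Private: only `source` and its descendants can call this.
        fn helper() -> u32 {
            41
        }

        pub fn answer() -> u32 {
            helper() + 1
        }
    }

    pub fn via_parent() -> u32 {
        // Calling `source::helper()` here would be rejected, because
        // `helper` is private to `source`. Only the public item is reachable.
        source::answer()
    }
}
```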

ahh I missed this. I was thinking that you don't need those mod statements. Thank you. So it is more verbose than what I was thinking :slight_smile:

I was suggesting that the contents of src/module/source.rs would be in crate::module if source is not explicitly declared in mod.rs, as if all the contents of source.rs were written directly in mod.rs.

All of these posts from people saying they never use self or super have me worried that somebody at some point is going to propose getting rid of them. So, as a defense mechanism, I feel the need to share my pet pattern:


Most of my modules begin as inline modules, created ad-hoc for the purposes of namespacing and/or reasoning about privacy and invariants.

For instance, say I have this helper type Thing, and I want to implement IntoIterator for it. Often I'll wrap a mod thing around Thing so that I can use a standard name like Iter:

// a newborn baby module, defined inline (there is no thing.rs)

pub(crate) use self::thing::Thing;
pub(crate) mod thing {
    use super::*;
    
    pub(crate) struct Thing { ... }
    pub(crate) type Iter<'a> = ...;

    impl<'a> IntoIterator for &'a Thing { ... }
}

Here you see:

  • A facade pattern reexport of Thing, using a self path.
    • done here because Thing existed at this location before mod thing
    • I always put the reexport right above the module, like an annotation of sorts. To me, the mod-plus-reexport are a single unit.
  • use super::*;, for precisely the same reasons that many people use it in a tests module.
    • It makes the module cheap to create; nothing is "too small" to deserve one.
    • My imports are still easily found, at the top of the file.
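
For anyone who wants to try the pattern, here is one way the elided parts could be filled in so that it compiles. Everything not in the snippet above is an assumption made for illustration: the items field, the new constructor, the total function, and the choice of slice::Iter as the iterator. Visibility is widened from pub(crate) to pub only to keep the sketch easy to call from outside.

```rust
use std::slice;

// Facade reexport, kept directly above the module it belongs to.
pub use self::thing::Thing;
pub mod thing {
    // Inherit the parent's imports (here, `slice`) instead of repeating them.
    use super::*;

    pub struct Thing {
        items: Vec<u32>, // hypothetical field; the original elides the body
    }

    // Hypothetical choice of iterator type; the original elides it.
    pub type Iter<'a> = slice::Iter<'a, u32>;

    impl Thing {
        pub fn new(items: Vec<u32>) -> Self {
            Thing { items }
        }
    }

    impl<'a> IntoIterator for &'a Thing {
        type Item = &'a u32;
        type IntoIter = Iter<'a>;

        fn into_iter(self) -> Iter<'a> {
            self.items.iter()
        }
    }
}

/// Sums the elements through the `IntoIterator` impl for `&Thing`.
pub fn total(thing: &Thing) -> u32 {
    thing.into_iter().sum()
}
```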

Later, when mod thing has gotten a fair bit meatier and is beginning to take up a lot of space, I upgrade it to its own file:

  • The super::* import is replaced with a full import list (all absolute paths).
  • Facade pattern re-exports remain near mod thing; and continue to use self:: paths.
  • Private imports from self::thing, if any (this is rare), are changed to absolute paths and put in the parent's import list like any other.

As you can see from the first and third bullets, even I agree that relative paths are not nice for most purposes. For private imports I prefer use path::to::this_mod::thing::foo over use self::thing::foo, because, as I often see it, I am more likely to move thing.rs (to e.g. somewhere nearer the root) than I am to move the entire directory for this_mod.


tl;dr:

  • self:: paths are useful for the facade pattern.

  • self:: paths and super::* are both useful for inline modules.

My initial response to this proposal was somewhat negative, along the same lines as @steveklabnik's post. In particular, it felt to me like too big of a change to consider so late in the game. But after talking to @aturon, I realized the change was not so big after all. Since I think other people might have similar feelings, I thought I'd write my own explanation of the change, and why it's not so big.

The problem being addressed here, one of the central problems the module changes are trying to solve, is that paths are resolved differently in use statements and in local paths. In Rust 2015, it works like this:

  • Use statements are resolved from the crate root.
  • Local paths are resolved from the local module.

Despite being simple to explain, this has caused lots of confusion. So the 2018 edition's current approach looked like this:

  • Use statements are resolved from the "extern crate" scope.
  • Local paths are resolved from the local module, falling back to the extern crate scope.

The implication of course is that in 2018 local paths can't be used in use statements, practically necessitating self. In other words, the difference between use and local paths is that paths that would start from the local scope are considered errors in use statements.

The proposal in this thread is to make this change:

  • Use statements are resolved from the local module, falling back to the extern crate scope, with an error when you use a name defined in both contexts.
  • Local paths are resolved from the local module, falling back to the extern crate scope.

This brings us a lot closer to 1path by just reducing the error cases in use statements to the actual potential ambiguities, rather than disallowing all local paths. If this doesn't introduce a lot of implementation challenges, this does seem worthwhile to consider.
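
The proposed rule for the first segment of a use path can be written down as a toy model. This is only a sketch of the rule as described in this post, not how rustc implements name resolution, and every name in it (Resolution, ResolveError, resolve_use_root) is invented for illustration:

```rust
use std::collections::HashSet;

/// Where the first segment of a `use` path was found.
#[derive(Debug, PartialEq)]
pub enum Resolution {
    Local,
    ExternCrate,
}

#[derive(Debug, PartialEq)]
pub enum ResolveError {
    /// The name exists both locally and as an extern crate.
    Ambiguous,
    NotFound,
}

/// Toy model: look in the local module, fall back to the extern crate
/// scope, and report an error when the name is defined in both.
pub fn resolve_use_root(
    name: &str,
    local_names: &HashSet<&str>,
    extern_crates: &HashSet<&str>,
) -> Result<Resolution, ResolveError> {
    match (local_names.contains(name), extern_crates.contains(name)) {
        (true, true) => Err(ResolveError::Ambiguous),
        (true, false) => Ok(Resolution::Local),
        (false, true) => Ok(Resolution::ExternCrate),
        (false, false) => Err(ResolveError::NotFound),
    }
}
```

In this model the only new error case compared with local paths is the first match arm, which is the "actual potential ambiguity" mentioned above.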

Josh, using std as your example is misleading, since that's a widely accepted crate name. The problem is that having any local declaration named the same as any crate introduces ambiguity under this proposal. An obvious solution is to provide a dedicated and explicit syntax for absolute paths involving extern crates, e.g. extern::rand::random.

Requiring all absolute paths except extern crates to have a leading identifier is probably the worst of all worlds, since that's the one case where you really want to be explicit. If you don't want to have extern:: sprinkled throughout the module, then you could place a use extern::rand at the top of the module, and directly reference rand::random from then on.

Edit: s/explicit::/extern::

Fallbacks? You mean, magic?

In this context "magic" is a negatively charged term, and isn't even specific enough about what's wrong with the fallback to respond to such criticism.

True, it is meant to be negative, and I think it is right to reject the suggestion firmly and outright. I strongly believe it is generally bad to resolve paths with an algorithm that can suddenly go in an entirely different direction depending on some fact you do not know immediately as a programmer. The outcome would be hard to debug too, esp. if you do not have the source of extern crates on hand. Keep the rules as simple as possible, please. If that results in backward incompatibility or a challenging implementation, then weigh that against keeping things as they are. I strongly want to prevent us from ending up in a worse place than before by making the language even more complex, for the sake of less important values that I see stressed too often, such as:

  • brevity;
  • elegance;
  • backward compatibility;
  • easy implementation.

Something I noted on Reddit:

(Modulo glob imports,) in the 2018 modules model, a leaf .rs file always has enough information locally to determine whether an import is from a crate or a module, with or without allowing local paths in use. This is not true in the current 2015 model.

A 2015 path starts with (a keyword or) a name in scope at the root of your crate. This could be a module or an external crate; you can't know without checking lib.rs.

In contrast, a 2018 path starts with (a keyword,) a crate name (where crate stands in for the local crate's name) or, in this extension, a locally defined symbol. This means that (modulo glob imports) your non-root .rs contains enough information to determine if an import is from an external crate, a module, or a local symbol, because the path cannot start with a root module and local symbols are local.

Glob imports break this a little, as you can't know every symbol that is glob imported without knowing the glob import externally, but I'd argue that any glob import importing a snake_case name is questionable style other than a few very specific, well-known identifiers.

That said, what's complicated about "a path starts with a crate name or a local name, and is an error if that's ambiguous; here's how to disambiguate"?

(Note: I am overall neutral on this proposal.)

That is not what is proposed. Ambiguous paths in use are forbidden, so if you get a path wrong, it won't compile, and the compiler will tell you why. I presume it'd be something like:

use foo;
    ^^^ error: you have both module `foo` and crate `foo`, 
        use either `crate::foo` for the module in this crate, 
        or `::foo` for the other crate.

And note that all of these names depend on your code: you control the names of your modules and of the crates you import (and it's possible to import under an alias).

I don't expect the ambiguity to be a problem in practice, because it's in programmers' self-interest not to give same names to different things :slight_smile:

Even in Servo's case of cookie crate and cookie module, I think that's just unnecessarily shooting oneself in the foot. These could be imported as (I'm guessing here) cookie_storage, cookie_parser, cookie_whatever to avoid ambiguity. Even when the compiler can properly namespace a bunch of things named cookie, it's still needless complication for humans reading the code and having to disambiguate namespaces in their heads.
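
A sketch of the two disambiguated spellings, assuming a 2018-or-later edition. Since an external cookie crate can't be pulled into a one-file example, it uses a local module that deliberately shares its name with the std crate. The module contents and function names are made up for illustration:

```rust
// A local module that deliberately shares its name with the `std` crate.
mod std {
    pub mod fs {
        pub fn describe() -> &'static str {
            "local std::fs module"
        }
    }
}

// `use std::fs;` would be rejected here as ambiguous.
use crate::std::fs as local_fs; // the module declared above
use ::std::mem as std_mem; // the real `std` crate

pub fn local_description() -> &'static str {
    local_fs::describe()
}

pub fn u32_size() -> usize {
    std_mem::size_of::<u32>()
}
```

Renaming the local module, as suggested above, makes both of the disambiguating prefixes unnecessary.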
