Revisiting modules, take 3

Another thought just occurred to me: if export also has all the effects of use, maybe it should be export use or so.

Thanks for writing up the next iteration. See my reply to the last iteration. The ratio of parts I agree with to parts I disagree with has mostly stayed the same.

I'm still strongly against deprecating explicit mod. That's a bad idea and it means you need to open the file view (which I hate opening and which is highly inconvenient to use in larger projects) and do all kinds of hacks like defaulting publicity to pub(crate) (which makes stuff harder for people who want to do stricter crate privacy).

Second, I want to repeat that I'm a great fan of the concept of separating use from the local crate, and separating use from external third party crates. The concept of from use is really awesome! I'm open to any other syntax that has been proposed (including the @ notation and your proposal), as long as it keeps the separation.

I'm not really a fan of:

use std::io;
mod bar {
    use io::{Read, Write}; // NOTE: No `std::`
}

It's definitely not nice (let's call it the "bar example" below), and I think it should be treated like a bug. But I do like:

use std::io;

fn foo() -> io::Result<()> {
    Ok(())
}

Because it allows some shallow level of scoping without importing everything from io or having to provide an explicit list. You can easily ctrl-f for io:: and you find everything you need!

And I also like the "everything is an item" rule. It keeps concepts consistent. In fact, recent movement has been to add consistency here, by adding macros to the traditional namespacing system, not to remove it. So this change would be against the general trend.

Regarding the "bar example": you have taken it as example of how "everything is an item" is bad. I don't think this is actually due to that rule, but more because use is not consistent with other items. Because for other items, the same setup fails:

fn baz() {
}

mod bar {
    fn foo() {
        baz(); // error: cannot find function `baz` in this scope
    }
}
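For comparison, a minimal sketch of how the same setup compiles once the parent module is named explicitly. It assumes the snippet sits at the top level of the crate; the return value and the second function `foo_imported` are illustrative additions, not part of the example above:

```rust
// A private function at the top level of the file.
fn baz() -> u32 {
    42
}

mod bar {
    // Name the parent module directly in the call path ...
    pub fn foo() -> u32 {
        super::baz()
    }

    // ... or import from the parent first, then use the bare name.
    pub fn foo_imported() -> u32 {
        use super::baz;
        baz()
    }
}
```

Calling `bar::foo()` or `bar::foo_imported()` from the top level reaches `baz` either way; only the unqualified `baz()` inside `bar` is rejected.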

I do think we should keep the "everything is an item" rule, and address your issues in other ways instead:

  1. Deprecate extern crate and let users do from crate use or use @<crate>::<path> instead.
  2. I think this can be fixed by better teaching! Just expand documentation. The separation of use and mod is really great and allows you to have public modules and private ones.
  3. To make the "bar example" an error which it deserves to be, you should just make use look into the super module for any uses. If you instead had to write use super::io::{Read, Write}; I think this would be very obvious.
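As a rough sketch of what point 3 would look like in practice, the explicit `super::` form already compiles today. This assumes the snippet is at the crate root; `first_bytes` and its use of `Cursor` are made up for illustration:

```rust
// Top-level import: `io` becomes a name in the root module.
use std::io;

mod bar {
    // Reach the parent's `io` explicitly instead of relying on it
    // being visible under a bare `io::` path.
    use super::io::{Cursor, Read};

    // Hypothetical helper: read exactly `n` bytes from the front of `data`.
    pub fn first_bytes(data: &[u8], n: usize) -> super::io::Result<Vec<u8>> {
        let mut buf = vec![0u8; n];
        Cursor::new(data).read_exact(&mut buf)?;
        Ok(buf)
    }
}
```

The `super::` prefix makes it visible at the import site that `io` comes from the enclosing module rather than from a dependency.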

As I like the "everything is an item" rule to be kept, I don't think we need to replace pub(use) with pub extern.

I disagree that multi-crate projects abide by the rule. In fact, multiple high quality projects don't abide by the rule:

  • main rust compiler
  • cargo
  • servo

And there is an actual reason why they do it: the src directory is only overhead for them. The src directory makes total sense if you have a single crate project directly in the main directory of your git repo. Here you have lots of files like README or .travis.yml or LICENSE-* that are clearly not Rust source code and where a separation is a good idea. But if you have a multi crate project, those files are already separated from the source code by the crate directory layer. There remains one sole file which gets separated: the Cargo.toml file. And adding this extra layer for one file is not really useful IMO.

The module system is not a hard to understand part of the language and having patterns absolutely 100% consistent over all projects shouldn't be so important that you are okay with making coding harder for the contributors of multi crate projects. I guess you'd argue that this is in order to make it easier for new contributors to join, well I don't think that understanding that there is no src directory is really a big leap.

I'm a big fan of having a consistent build system, especially considering what an inconsistent mess C/C++ build systems are, but this doesn't mean that there should be no way in cargo to adjust the patterns a little bit so that they suit a project better.

@withoutboats I’m not a fan of your proposal.

For one thing, I already thought use and pub use today work exactly like use and pub extern in the proposal. There is nothing in any documentation I read that would suggest that regular use is an item. It’s counterintuitive. With that in mind, “fixing” the issue by eliminating pub use and adding yet another keyword seems like a step backwards. We already learn pub use as “works like use, but also reexports the name”.
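A minimal sketch of that reading of use and pub use in today's syntax. `geometry`, `square`, and the functions are made-up names for illustration only:

```rust
mod geometry {
    // Private submodule: callers outside `geometry` cannot name it.
    mod square {
        pub fn area(side: u32) -> u32 {
            side * side
        }

        pub fn perimeter(side: u32) -> u32 {
            4 * side
        }
    }

    // Plain `use`: brings the name into scope here without exposing
    // it to callers outside `geometry`.
    use self::square::perimeter;

    // `pub use`: same as above, and also re-exports the name so that
    // `geometry::area` works from outside.
    pub use self::square::area;

    pub fn describe(side: u32) -> (u32, u32) {
        (area(side), perimeter(side))
    }
}
```

From outside, `geometry::area(3)` resolves through the re-export, while `geometry::perimeter` and `geometry::square` are not reachable.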

The second issue is that it exacerbates the path confusion, making the default into something even less conventional. Not only would relative paths still mean something different in use than in regular code, they would mean something different than absolute paths as well.

Third and last, your motivation for from being syntax sugar is confusing me. Syntax sugar is supposed to be more ergonomic, yet you yourself admit that people would not use it, and “solve” that issue by actually prohibiting the non-sugar variant. If you have to force people into using the “syntax sugar” by preventing the baseline variant, you are doing something horribly, horribly wrong. Also, you haven’t convinced me you’d gain anything by it except complicating the module situation even further.

Neutral opinion on the rest.

I really like this part of the proposal. I feel like it provides a good universal naming scheme that can work everywhere.

Which is why...

...this absolutely baffles me.

Let's please introduce a new universally clear, nicely separated form, that puts external and internal items in clearly separated namespaces, and then use that exact form in the use statement, please. It's incredibly intuitive that use uses the same paths that work everywhere else; please don't break that in the process of improving it.


The only way to do that is to completely throw backwards compatibility out the window, which seems a far bigger problem to me. Perhaps allowing the new universal paths in use could be made to work, but forcing a new prefix onto every intra-crate use path is a terrible idea.

To restate these objections as I understand them:

  • It's not great that use uses different paths than those inside the module.
  • Separating use and export into two different keywords is worse than just having pub use work differently than use.
  • Imports/exports should just use the absolute path syntax.

I think all of these objections are reasonable. I chose pretty opinionated default settings, and I didn't clearly articulate the motivation for them.

I've experimented with applying this and the previous proposal to existing crates (mainly futures and chalk) and made these observations:

  • Absolute paths are rare inside modules; inside modules you almost always want self:: paths.
  • Import statements (use) are mostly from the crate root (crate::); many are also from extern crates (extern::); a small number are from this module or the parent (self:: and super::).
  • Export statements (pub use), in contrast, are mostly from this module (self::).

Based on this learning, I made these conclusions:

  • Import statements should have the crate root as the default, and it should be easy (and hopefully self-explanatory) to import from other crates as well. It should be possible to import from anywhere.
  • Export statements should have this module as the default. It should be possible to export from anywhere.
  • It should be possible to use a path from anywhere inside a module, but it isn't super important that it be the most abbreviated form.

And from these conclusions I think the design decisions follow:

The notion of path prefixes

Absolute paths have a prefix, which is one of:

  • self, referring to this module
  • super, referring to the parent module
  • crate, referring to the crate root
  • extern::<crate>, referring to a dependency

An absolute path, outside of an import/export statement, uses the syntax <prefix>::<path>.
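As a rough illustration using paths that already compile, assuming the snippet sits at the crate root: `self`, `super`, and `crate` are written as in the list, while `::std` stands in for the proposed `extern::std`, which is not existing syntax. All module and function names here are hypothetical:

```rust
pub fn root_value() -> u32 {
    1
}

pub mod outer {
    pub fn outer_value() -> u32 {
        10
    }

    pub mod inner {
        fn local_value() -> u32 {
            100
        }

        // One term per prefix, each written as `<prefix>::<path>`.
        pub fn sum() -> u32 {
            self::local_value()             // this module
                + super::outer_value()      // the parent module
                + crate::root_value()       // the crate root
                + ::std::cmp::max(1000, 0)  // a dependency (stand-in for `extern::std`)
        }
    }
}
```

Each call site states where its target lives, which is the property the prefix list is meant to provide.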

use and export vs use and pub use

There are two motivations here:

  • I wanted the default prefix for import to be different than the default prefix for export
  • I wanted use to not introduce items visible outside of this module, even in submodules.

Changing what it means to use foo::Bar by adding a visibility modifier seems very surprising to me. That use foo::Bar means to search from the crate root but pub use foo::Bar means to search from self is troubling.

Of course this is an argument against having different defaults here. Ralf made the case that all paths should have the same default prefix, whereas Josh made the case that import/export statement paths should not have a prefix specified at all. But I think, with two different keywords, having different default settings will be reasonable to learn.

The from syntax

There's also objection to the from syntax as the way to change the prefix of a path in import/export statements. I want to point out that it's actually shorter in the current form of this proposal for most cases. That is:

from std use io;
// vs
use extern::std::io;

In addition to being shorter, I think the first one is more intuitive.

Because absolute paths inside of modules are rare, I think users will learn the extern::std::io syntax later on, and will at first be using the from std. I would liken this to type ascription and turbofish. At first, I know that I can do:

let x: Vec<_> = iterator.collect();

But then I have a situation like this, where I don't want to create a temporary:

s1.parse()? + s2.parse()?

At this point, I learn about the turbofish syntax:

s1.parse::<u32>()? + s2.parse::<u32>()?

I would even say that they're conceptually analogous - creating a temporary is like performing an import, whereas using an absolute path inside your code is like needing to inject the type into the middle of an expression with turbofish.
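To make the two forms concrete, here is a small self-contained sketch. The function names and the choice of `u32` with `ParseIntError` are assumptions made for the example:

```rust
use std::num::ParseIntError;

// Type named on a temporary binding, like naming an import up front.
fn sum_with_temporaries(s1: &str, s2: &str) -> Result<u32, ParseIntError> {
    let a: u32 = s1.parse()?;
    let b: u32 = s2.parse()?;
    Ok(a + b)
}

// Type injected mid-expression with turbofish, like writing an
// absolute path at the point of use.
fn sum_with_turbofish(s1: &str, s2: &str) -> Result<u32, ParseIntError> {
    Ok(s1.parse::<u32>()? + s2.parse::<u32>()?)
}
```

Both functions behave identically; the difference is only where the type information is written.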

I might prefer supporting from as the only syntax for absolute paths also, though I don't know how I (or anyone else) would feel about that:

let x: from std vec::Vec<i32> = vec![];

I want to contrast these two statements to try to explain why I think "Use is an item" is a hard idea, and "crate means this crate" is an easy idea. The difference between them is what I'm going to call "generative implications."

By generative implications, I mean you have this simple idea, but it has implications that ripple throughout the system leading to complex and non-obvious consequences. Computer scientists love generative implications, and I think PL people even more so - look, with simple ideas A, B, and C we have Turing completeness! And generative implications in semantics are what give programming a lot of its richness: you always feel like you're learning more things about how to use the language you thought you already knew.

"Use is an item" has generative implications; things like how if you import a name into the root module, you can import it anywhere without any prefixes are not obvious, but once explained clearly follow from that simple concept.

"Crate means this crate" does not have generative implications. It's a piece of syntax which may have two meanings that are obvious to someone, and so they ask "does crate::foo mean 'foo' from this crate or the crate foo?" and they get the answer - "it's the first one, the crate foo would be written extern::foo". There are no deeper insights to learn later about what this concept implies.

My claim is that in nonsemantic systems of the language, like name resolution, we should be wary of concepts with generative implications, because they are surprising to users and can, in the worst case, lead users to develop inaccurate mental models because they have misinterpreted some of those implications.


Ignoring the from here, the rest of this version of the proposal looks perfect to me. And thank you for taking feedback into account.

I’m encouraged by the shift towards increased consistency.


Not intrinsically, but backwards compatibility prevents us from simply removing the old syntax. pub use will stay in the language whatever the conclusion, and adding an almost identical construct that works from a different root sounds like more trouble than it's worth, learnability-wise.

Something about that syntax doesn't feel right to me. Local crate is its own root, but external crates all share a single root. I can't quite rationalize it, but if I had OCD, it would be in overdrive here. Plus it's awkward to write.

How would you feel about figuring out something where the external crate is syntactically the first component of the path? @my_crate::path has been suggested, but might not fit well into the parser. Possibly it could be [my_crate]::path? Can anyone think of other possible variants that parse and don't look like roadkill?

I think your basic premise here is that the current "path confusion", which has been discussed as a significant problem, is actually not a problem at all. Which is a perfectly valid opinion, but the fact that you don't even acknowledge path confusion as an established phenomenon leaves me wary of your conclusions.

My version is even better:

 use [std]::io;

from is decent as a workaround for the awkwardness of the proposed syntax, but I think that with properly designed base form, there shouldn't be any incentive to invent sugar for it.

I don't think "path confusion" is as simple as it is made out to be. That is, I think the underlying problem is not that import statements take an absolute path, and local statements take a relative path. In the previous thread, @theduke listed several mainstream languages which use absolute paths for their import statements. I would add Ruby to this as well, though its UX is a bit worse because it's require-based.

Our problem is a bit different, and I think it's got more to do with the shift from a single-file project to a multi-file project. I think in a single-file project, you get bad hints (from the everything is an item situation) that lead you down the wrong road.

For example, you might think that once you've added a dependency, you can use it everywhere. But the reason you can use it in your main.rs file is that extern crate foo; is an item, which adds "foo" to the namespace of the root. That's not an obvious fact to many users. Solution: we make all imports from external crates act the same way.

Another problem is that you write code like this once you add a submodule to your project:

mod foo;
use foo::Bar;

Now you have both foo and Bar in scope. You might think that because you declared mod foo;, you can now use from foo. But this is only true in the root module; in other modules you have to say use self::foo::.
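A small sketch of the non-root case, with the submodule written inline rather than as a separate file so the snippet is self-contained. `outer` and `make` are made-up names, and the comment describes use paths as they worked at the time of this discussion:

```rust
mod outer {
    // Declared here, one level below the crate root.
    pub mod foo {
        pub struct Bar(pub u32);
    }

    // Away from the root, the import has to say where `foo` lives.
    // A bare `use foo::Bar;` would be looked up from the crate root
    // under the rules discussed here, and `foo` is not there.
    use self::foo::Bar;

    pub fn make() -> Bar {
        Bar(7)
    }
}
```

The same two lines (`mod foo` followed by an import from it) therefore read differently depending on how deep in the hierarchy they appear.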

A way to mitigate this is to only have imports and not mod statements. If you don't say mod foo, maybe you don't build a mental model that you'll be able to say use foo because you have that special "mod import" that you have to do when you add files.

I don't think this problem is entirely mitigated, and I would like for use foo, where foo is a submodule, to work in every module. I've had trouble finding a solution though, because of the problems that fallback creates with re-exports that pcwalton mentioned in the earlier thread.


However, I think path confusion is also mitigated to some extent by the from syntax. That is, let's say your imports look like this:

from std use collections::HashMap;
from std use fmt;
from std use path::{PathBuf, Path};

from semver use Version;
from serde  use ser;

use core::{Dependency, PackageId, Summary, SourceId, PackageIdSpec};
use core::WorkspaceConfig;

(These are the imports in cargo's core/manifest.rs module.)

My belief is that by making from <crate> a separate syntactic unit, users are hinted that the use core statements have had a from component elided. This makes it easier to internalize that bare use statements have the from crate statement elided on them.

Similarly, I think that the from syntax also makes it easier to understand that use and export have different elided prefixes. "use = from crate, export = from self."


As expressed in another thread, I would not agree with this. Having use and pub use be fundamentally different seems very confusing to me.

I agree that having different interpretations for paths in use and elsewhere probably leads to slightly shorter code. However, I think the confusion and mental complexity introduced by having to remember, when reading a path, how exactly this depends on the context is way worse than having to write a few more letters. So, even though use is usually going to need an absolute path, I feel strongly that it should (when no explicit root is given) still be relative, because consistency with the rest of the file outweighs character count.

And I don't think this is speculation. We have such a system right now, and experience shows people are confused by it. @aturon mentioned that in his 2nd proposal. @le-jzr already expressed his surprise that you did not even acknowledge the problem of "path confusion".

You later mention the problem of people building the wrong mental model. I think the "path confusion" is one of these points where people build the wrong mental model. When they start experimenting a little, they will notice that this actually works:

struct Type;
mod foo {
  use Type;

  fn bar(t: Type) { ... }
}

The reason this will work is that they are in a small crate, and this is happening at the crate root. Some time later, they try doing the same thing deeper in the module hierarchy, and they will be extremely surprised by this suddenly not working any more.

So, I still strongly object to the idea that paths in use (or export) should start anywhere but in the local module. Saving some characters is not good enough an argument.


Re: "generative implications": I am not strongly attached to "use is an item". I think it is beautiful, and it is one of the things in Rust that struck me as a nice little bonus compared to other languages -- no more distinction between items declared locally and items imported elsewhere when doing lookup, what a great idea! But I can accept people telling me that being more like other languages here makes Rust easier to learn and is thus overall advantageous. The objections I raised against export are not motivated by the desire to re-unify use and export, but rather by the desire to hopefully make this feature better understood. :)


I attempted to articulate (though I think in parallel to you drafting your response) that "path confusion" isn't about imports working differently from local paths - you sort of hit on the same thing, that the problem is how they work the same as local paths in the crate root. I think fixing this for external dependencies significantly mitigates the problem, though I'm unhappy it still exists to some extent with modules.

In general, I am sympathetic to "being easier to understand is worth being longer," but here are some very rough numbers, generated using grep:

  • chalk - 122 non-public use statements, of which 4 are self:: prefixed
  • cargo - 658 non-public use statements, 18 self:: prefixed
  • diesel/diesel: 669 / 37
  • futures - 356 / 4
  • hyper - 508 / 14

These numbers seem just overwhelming to me that self:: would be the wrong default prefix for import statements.
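To put those counts on a common scale, a quick sketch that turns each pair into a percentage. The table simply restates the figures from the list; `self_share_percent` and `report` are hypothetical helper names:

```rust
/// (crate, non-public `use` statements, of which `self::`-prefixed)
const COUNTS: [(&str, u32, u32); 5] = [
    ("chalk", 122, 4),
    ("cargo", 658, 18),
    ("diesel", 669, 37),
    ("futures", 356, 4),
    ("hyper", 508, 14),
];

/// Share of `self::`-prefixed imports, in percent.
fn self_share_percent(total: u32, self_prefixed: u32) -> f64 {
    100.0 * f64::from(self_prefixed) / f64::from(total)
}

/// One formatted line per crate, rounded to one decimal place.
fn report() -> Vec<String> {
    COUNTS
        .iter()
        .map(|&(name, total, self_prefixed)| {
            format!("{}: {:.1}%", name, self_share_percent(total, self_prefixed))
        })
        .collect()
}
```

With the figures above, every share comes out below 6%.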


Yeah sorry, I didn't see that post while writing mine.

Fallback to the parent module is not currently happening, is it? So wouldn't an alternative be to keep not having fallback? (I wasn't even aware your proposal involves such fallback; must have missed this when you wrote it.)

Another option may be to require things to be more sorted: When there's an inline module (I assume fallback doesn't work for file modules either way, that would just be scary), then you cannot use things inside that module before the module. That should avoid the problematic cycles, shouldn't it?

Having use import stuff for the entire rest of the file, not just for this module, sounds like a nice idea indeed. However, even then you can still create cycles with from self use ..., right?

That is an interesting take on this. I have not thought of this before -- rather than making from use sugar for use, you are making use sugar for from use. This is certainly a much better explanation for "why do things stop working when I move away from the crate root" than "well, paths in use are different". I will have to think about this.

I think that's still rather subtle.

If we get better grouping of imports (use extern::std::{ /* multiple lines of imports, with nested curly braces, in here */};), that shouldn't end up being much more to type than the from ... use you propose.
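For a feel of what grouped imports look like, here is a sketch using nested braces as they exist in current Rust, written with `use std::{...}` because the `extern::` prefix is only the spelling proposed in this thread. The `Manifest` type and its methods are invented purely so that every import is used:

```rust
use std::{
    collections::HashMap,
    fmt,
    path::{Path, PathBuf},
};

/// Hypothetical type, present only to exercise the imports above.
pub struct Manifest {
    files: HashMap<String, PathBuf>,
}

impl Manifest {
    pub fn new() -> Self {
        Manifest { files: HashMap::new() }
    }

    pub fn add(&mut self, name: &str, path: impl AsRef<Path>) {
        self.files.insert(name.to_string(), path.as_ref().to_path_buf());
    }

    pub fn get(&self, name: &str) -> Option<&Path> {
        self.files.get(name).map(|p| p.as_path())
    }
}

impl fmt::Display for Manifest {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{} file(s)", self.files.len())
    }
}
```

The crate name is written once at the top of the group, which is where most of the typing savings come from.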

I don't even strongly mind the from sugar; it should be sufficiently clear that this changes the root with respect to which the immediately following use is interpreted. I'd be fine with linting people to use from std use rather than use extern::std:: -- using a style lint to provide better ecosystem consistency, just like we do for snake_case vs. CamelCase naming.

I think the only point we really disagree on is how to interpret use without from. So this is about use foo vs. from crate use foo -- we are talking about 11 characters per file (assuming we can do from crate use foo, bar, baz):

from std use
  collections::HashMap,
  fmt,
  path::{PathBuf, Path};

from semver use Version;
from serde  use ser;

from crate use // or maybe "from this use"? "from this crate use"? ;)
  core::{Dependency, PackageId, Summary, SourceId, PackageIdSpec},
  core::WorkspaceConfig;

This syntax is growing on me the more I stare at it.


I find it gets even more appealing if you compare it with crate-local absolute paths:

use [std]::collections::HashMap;
use [std]::fmt;
use [std]::path::{PathBuf, Path};

use [semver]::Version;
use [serde]::ser;

use ::core::{Dependency, PackageId, Summary, SourceId, PackageIdSpec};
use ::core::WorkspaceConfig;

Or with full use of nested curly braces:

use [std]::{
  collections::HashMap,
  fmt,
  path::{PathBuf, Path}
};

use [semver]::Version;
use [serde]::ser;

use ::core::{Dependency, PackageId, Summary, SourceId, PackageIdSpec, WorkspaceConfig};

Yeah, I think I do prefer this over extern:: and crate::, and it also seems (at least) on par with from, while also being shorter and not providing two different ways to express the same thing.


That syntax seems to have very little self-explanatory power, which is one of the chief advantages of from in my opinion. I have to admit I am very unenthusiastic about it.

Just like "Which crate does crate::foo refer to", I think you can learn this once and be done. Admittedly, the syntax here is impossible to google. OTOH, pretty much every example will have to import something from std, so this will be explained very early on everywhere.

But yes, I admit from std use foo is much more self-explaining than use [std]::foo.


from doesn’t work consistently for absolute paths elsewhere, though, and extern:: is really verbose there. [cratename]::module::name is a syntax that could work absolutely everywhere, including both use statements and code.


Turns out this differs across crates:

  • In the rustc repo, there are 10545 uses, of them 500 (4.7%) are module-relative, at least 6459 (61.3%) are cross-crate, and less than 3848 (34.0%) are crate-relative.
  • In "small scripts", it's even worse - consider that 546 of Cargo's 762 (71.5%) uses are cross-crate.

This means that users of these sorts of crates have differing experiences on the effects of making imports relative. futures and chalk aren't exemplary of all crates.

Statistics were gathered using this command and hand-inspecting the results:

find src -name \*.rs -print0 | xargs -0 cat | sed -rn 's@^use ([^:;]+).*$@\1@p'
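For anyone who would rather not decode the sed expression, here is a rough Rust equivalent of the extraction step. It is a sketch only: `first_segment` and `count_roots` are hypothetical names, and the tallying in `count_roots` is an addition, since the command itself only prints the first path segment. Like the regex, it only sees lines that start with `use ` at column zero, so `pub use` and indented imports are skipped. Sorting the results into module-relative, cross-crate, and crate-relative still has to be done by hand.

```rust
use std::collections::BTreeMap;

/// Mirrors `^use ([^:;]+).*$`: for a line starting with `use `, return
/// everything up to the first `:` or `;`.
fn first_segment(line: &str) -> Option<&str> {
    let rest = line.strip_prefix("use ")?;
    let end = rest
        .find(|c: char| c == ':' || c == ';')
        .unwrap_or(rest.len());
    let segment = &rest[..end];
    if segment.is_empty() {
        None
    } else {
        Some(segment)
    }
}

/// Tally how often each first segment occurs in a blob of source text.
fn count_roots(source: &str) -> BTreeMap<&str, usize> {
    let mut counts = BTreeMap::new();
    for line in source.lines() {
        if let Some(segment) = first_segment(line) {
            *counts.entry(segment).or_insert(0) += 1;
        }
    }
    counts
}
```

Feeding it the concatenated contents of a crate's `.rs` files gives a per-root tally that can then be classified manually.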

I think use [std]::fmt; is pretty self-explanatory. It's also:

  • reasonably concise
  • not ugly
  • can be used in item_path_str paths (in error messages) - [std]::rc::Rc<&mut [std]::boxed::Box<&i32>>.

Users can quickly see that "a crate you are importing from must appear between brackets". It also meshes well with absolute path syntax - ::rc::Rc in std is [std]::rc::Rc out of it.
