Revisiting Rust’s modules, part 2


#1

It’s been a week since my last post on Rust’s module system. Unsurprisingly, the strawman proposal in that post garnered a lot of commentary–174 comments in one week!–with sentiments ranging from

Now this is a proposal I can get behind

to

I’ve rarely hated anything as much as I hate the module system proposal

and everything in between :slight_smile:

The discussion has raised a number of very interesting points; thanks to everyone who has participated so far! I won’t try to give a comprehensive summary here. What I want to do instead is focus on one particular critique of the earlier proposal, and present a quite different strawman design that embraces a different set of priorities.

For ease of discussion:

  • I’ll call the strawman in my last post the “directories-as-modules” proposal.
  • I’ll call the strawman in this post the “use-universally” proposal.

http://aturon.github.io/blog/2017/08/02/modules-part-2/


#2

I’m encouraged by this approach, as (to my eyes) it feels like one that will appeal to folks coming from other languages as well as reducing some of the concept count.

One clarifying question: can

from petgraph use prelude::*;

use ::errors::Result;
use ::ir::{Program, ItemId};

pub use solve::Solver; // note use of relative path

Be written as:

from petgraph use prelude::*;
from errors use Result;
from ir use {Program, ItemId};

pub use solve::Solver; // note use of relative path

(Apologies for typos, I’m on my phone)


#3

No – ::errors means the errors module at the top of the current crate. Whereas from errors means “from the errors crate”.
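
For comparison, here is a minimal sketch of the same split in the Rust that later shipped, where the crate root is spelled `crate::` instead of a leading `::`. This is not the `from ... use` syntax from the blog post. The `errors` module, the `Error` type, and `parse_even` are hypothetical names made up for illustration, and `std` stands in for the external crate.

```rust
// A local module at the top of the current crate (hypothetical names).
mod errors {
    #[derive(Debug, PartialEq)]
    pub struct Error(pub String);

    // Crate-local alias, similar in spirit to the `Result` imported in the question.
    pub type Result<T> = ::std::result::Result<T, Error>;
}

// Path rooted in the current crate: the local `errors` module.
use crate::errors::{Error, Result};
// Path rooted in an external crate: `std`.
use std::str::FromStr;

pub fn parse_even(input: &str) -> Result<i32> {
    // Convert the std parse error into the crate-local error type.
    let n = i32::from_str(input).map_err(|e| Error(e.to_string()))?;
    if n % 2 == 0 {
        Ok(n)
    } else {
        Err(Error(format!("{} is odd", n)))
    }
}
```

The two `use` lines look alike, but the first can only ever name something defined in this crate, while the second names another crate.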


#4

The new proposal seems less radical than the one from the previous blog post, which is good. I also find more in it than I agree with, unlike last time where I didn’t agree with almost anything.

Let me start with the stuff I like: The from <crate_name> use <path>; syntax is really really great! I’m a big fan! This is actually an improvement over the status quo, which doesn’t allow you to know whether something is taken from a module, or from a foreign crate. If this syntax gets adopted, I’m even okay with extern crate being deprecated! Maybe we could make it part of RFC 2088, to make usages without extern crate only possible through from use. Bikeshedding syntax, I’d like to suggest an alternative: crate use <crate_name>::path;. EDIT: another alternative: extern use <crate_name>::path;.

Regarding making use <foo> relative, I’m mostly neutral. It has its merits, but generally every change has its costs, so this has advantages and disadvantages.

I still dislike the removal of mod syntax. With it removed you can’t extend them via implicit mod foo any more, which would be sad. Underscores are ugly in my eyes.

I also dislike treating modules as pub(crate) per default without giving a way to do private modules. You should be able to have finer grained privacy than private, pub(crate) and world-public. Especially, I disagree with this statement:

As has been argued on thread, the vast majority of the time you only need visibility at one of three levels: the current module, the crate, or the world.

This is maybe the case for smaller crates, but bigger crates with more complexity do need finer structuring. There is no way to enforce that your submodules are private with the proposal; anything pub(crate) inside them may be exposed. My major pain point is not even that they are pub(crate) by default, but that there is no option to make it less private. Previously, we didn’t need syntax to make stuff less private, because everything was private by default. But if we change the default, we need to introduce syntax for enforcing privacy.

One thing I do like though, which is that modules need an explicit pub use in order to appear in the public API of a crate.

Regarding the “use-ing submodules” knob, I’d prefer that submodules not be in scope of their parents automatically; you should need to use them. Anything that’s implicitly in scope is not really nice IMO, and the infer extern crate RFC also got modified in that regard.


#5

I had to reread your comment a few times to understand what you were saying. It feels subtle. While I would like to see how other parts of the proposal shake out, I’d think we may want a more readable thing that says “use submodule” explicitly rather than :: (if use becomes über alles)


#6

One way to think about it: it works just like the file system from a shell. Relative paths by default, and leading :: is like leading / which takes you to the root of your crate. In this analogy, external crates are like volumes on Windows: from futures use Future is kinda like futures:\Future :wink:
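
A rough sketch of the shell analogy, written in today's Rust rather than the proposal's syntax: `crate::` plays the role of the leading `/`, `super::` is like `..`, and `self::` is like `./`. The module tree (`config`, `server`, `http`) is hypothetical, and the sketch assumes it sits at the crate root.

```rust
// Hypothetical module tree, for illustration only.
pub mod config {
    pub fn default_port() -> u16 {
        8080
    }
}

pub mod server {
    pub mod http {
        // Absolute path from the crate root, like a leading `/` in a shell.
        use crate::config::default_port;

        pub fn address() -> String {
            // `super::` climbs one level, like `..`.
            format!("{}:{}", super::host(), default_port())
        }
    }

    pub fn host() -> &'static str {
        "localhost"
    }

    pub fn describe() -> String {
        // `self::` is relative to the current module, like `./`.
        self::http::address()
    }
}
```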


#7

I like this proposal much better than the previous one! Two comments from my side:

Please, please require use to include an .rs file, so we prevent “stale file” horror stories like the one I’ve described before in the Revisiting Rust’s modules thread. Files that wouldn’t be naturally referenced are a really rare occurrence, and having to use them explicitly would be a small price to pay for confidence that Rust source code compiles reliably, no matter where it is built or how broken the top-level build system is. Unit tests, or basically any form of test or usage, would catch such cases immediately during development, and lints and warnings can always help.

A much more minor comment (since it’s about syntax only): I’d say that use X from Y; or from Y use X; is functionally exactly the same as the use extern Y::X; that I’ve proposed. So I’m happy to see it in this proposal, but IMO use extern (or some variation on it), as opposed to a two-keyword syntax, has many benefits:

  • everything module-related starts with use/pub use and ends with the path, so it looks more uniform
  • IDEs have the easiest time completing that
  • we reuse an existing extern keyword
  • all existing use syntaxes would be applicable, and intuitive (pub use extern bzip2::Compression;, use extern bzip2::read::BzDecoder as Decoder; or use extern bzip2::*;)
  • we have only one main keyword: use with natural modifiers for everything

I just can’t see any disadvantages, so please consider it, or let me know which disadvantages I’m overlooking.


#8

Personally, I very much prefer RFC 2088. A few things in this proposal seem really problematic to me:

Introducing from/use. This form provides a much more clear distinction between imports from external crates and those from the local crate

That’s a bug, not a feature. One major reason to get rid of extern crate is precisely to get rid of this distinction, and just use use for everything. So let’s actually use use for everything, rather than introducing another artificial distinction.

So, for instance, instead of writing from petgraph use prelude::*;, let’s just write use petgraph::prelude::*;, without having to write extern crate petgraph first.

TL;DR: writing pub on an item means pub(crate) unless (re)exported in a public module (which itself is done via re-exporting).

This effectively enshrines something that feels a lot like the facade pattern, as the standard means of exporting an API. I’d rather write pub(crate) if I mean pub(crate), and pub if I mean pub. I do want to get rid of the “complex nest of re-exports and module visibilities”, but in the opposite direction, not like this.
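
For readers unfamiliar with the facade pattern mentioned here, this is a minimal sketch of how it looks in today's Rust: items are written `pub` inside a private module, and the crate root re-exports only what should be public. `solve` and `Solver` echo names used earlier in the thread; the bodies, `internal_helper`, and `helper_label` are hypothetical.

```rust
// Hypothetical crate layout.
mod solve {
    // `pub` here makes `Solver` nameable only inside this crate,
    // because the `solve` module itself is private.
    pub struct Solver {
        steps: u32,
    }

    impl Solver {
        pub fn new() -> Solver {
            Solver { steps: 0 }
        }

        pub fn step(&mut self) -> u32 {
            self.steps += 1;
            self.steps
        }
    }

    // Declared `pub` but never re-exported, so in practice it stays crate-internal.
    pub fn internal_helper() -> &'static str {
        "crate-internal"
    }
}

// The facade: the crate root decides what the outside world sees.
pub use solve::Solver;

pub fn helper_label() -> &'static str {
    solve::internal_helper()
}
```

Under the proposal this arrangement would become the default reading of `pub`, which is the part being objected to.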

Changing use to take paths relative to the current module. There are two main reasons to do this.

In the absence of the above, I think these reasons go away.


#9

I am a big fan of the multi-line imports you are proposing “on the side” here; I have wanted nested curly braces multiple times already. :wink: Also, making paths mean the same thing in use and everywhere else is a big win. That is something that had me confused initially, and that I still keep forgetting about.
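
As a small illustration of nested curly braces in imports, here is a sketch using the form that current Rust accepts. The `summarize` function is a hypothetical example whose only purpose is to exercise all three imported names.

```rust
// One `use` with nested braces, replacing three separate `use std::...` lines.
use std::{
    collections::{HashMap, HashSet},
    fmt::Write,
};

// Counts how often each word occurs and renders the distinct words in sorted order.
pub fn summarize(text: &str) -> String {
    let mut counts: HashMap<&str, usize> = HashMap::new();
    let mut seen: HashSet<&str> = HashSet::new();
    for word in text.split_whitespace() {
        *counts.entry(word).or_insert(0) += 1;
        seen.insert(word);
    }
    let mut words: Vec<&str> = seen.into_iter().collect();
    words.sort();

    let mut out = String::new();
    for word in words {
        // Writing to a `String` cannot fail, so the result is ignored.
        let _ = write!(out, "{}={} ", word, counts[word]);
    }
    out.trim_end().to_string()
}
```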

I will join the chorus of those that argue against automatic imports based on the file system. One issue I am particularly worried about here is case-insensitive file systems. I’ve had some “fun” in the past with files having the wrong case, and this can get even more “fun” as things get more implicit.

Given that crates no longer are part of the path, what is the plan with this proposal to render e.g. a type name in an error message in a way that makes it clear which type it refers to? Also, in a similar vein, currently I can e.g. use std; if I use a bunch of random things but nothing more than once; with this proposal, how do I get the same effect? I suppose one could do from std use self as std;, but that’s not exactly pretty.


#10

We could be arguing about it forever - there are a lot of people very opposed to removing that distinction. Please at least acknowledge the fact that existing Rust has that distinction, and many people will not be willing to let it go. Wouldn’t a healthy compromise be to make this distinction explicit, but as convenient as possible? Similar to the try! macro, which generated a lot of controversy: we settled on the short ?, which is a balance between explicit vs implicit, automatic vs manual.


#11

I think there’s more nuance here. In this proposal, you’re still using use, you just have a way of requesting an external crate. There’s very real confusion around the fact that external crates are currently “mounted” alongside local modules. I think people have a good intuition that things they define in their crate, and things from other crates, are distinct and may be addressed separately.

Put differently, I still see this as “using use for everything”, just with a way to more clearly specify what the path is.

Perhaps a different syntactic choice, like a variant of @dpc’s, would help:

use extern::std::vec::Vec;

though that’s comparatively verbose.

I don’t quite follow – can you expand on this point?


#12

One more follow-up on this: do you agree that today, the fact that external crates are put in the same namespace as your top level modules is confusing? (There’s certainly plenty of anecdotal evidence that it is for some folks). If so, do you see a different way to address that issue?


#13

Maybe you can elaborate a bit more here – I don’t quite understand what “went wrong” in your scenario. Somebody left stray files checked in and pushed them to the server? It seems like this is a general expansion of “somebody pushed changes without checking if they build”, which is a common source of headaches, but not necessarily linked to the module system.

In general, I have been pretty strongly opposed to the idea of using use statements to decide what gets compiled. I find that to be a surprising overload of the use statement, which to my mind is more about creating links between things that otherwise exist, and not about specifying what exists.

(The extern crate discussion has been going another way, which I think is suboptimal, but perhaps ok since typically whether or not something is linked has no real effect anyhow.)

I usually raise the examples of tests and impls; consider that unless we specialize the unused imports lint, if you have a test module, you will need something like this:

#[allow(unused_imports)] #[cfg(test)] use test;

I feel like the unused imports lint here is very natural, though, because imports are not typically “significant” in this way.
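
For contrast, this is a minimal sketch of how the same situation looks in today's Rust, where a `mod` item (not a `use`) is what brings the test module into existence. The `double` function and the `double_tests` module name are hypothetical.

```rust
pub fn double(x: i32) -> i32 {
    x * 2
}

// The `mod` item is what makes the test module exist;
// it is compiled only when building tests.
#[cfg(test)]
mod double_tests {
    // `use` only links to names that already exist;
    // it does not decide what gets compiled.
    use super::*;

    #[test]
    fn doubles() {
        assert_eq!(double(21), 42);
    }
}
```

Nothing outside `double_tests` ever refers to it, so under a `use`-driven scheme it would need an otherwise unused import like the one above.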

I also had the opposite experience from the one you describe: that is, I’ve been annoyed at finding it hard to figure out what exactly is getting compiled. This is less true in Rust, since mod foo kind of makes it clear, though I do forget to add them on occasion.

I definitely remember (e.g.) in Java that ensuring that all my .java files got compiled could be annoying. Often things would get compiled, because of use statements and interlinking, but without setting up some kind of wildcard, it was hard to keep everything in sync. Switching to Eclipse, where it just used the file system (admittedly, you have to press F5 to rescan, which actually I also found annoying), was an eye-opening experience.


#14

I really like the proposal! Since the current system is already really, really nice this is like the cherry on top to me :slight_smile:. Your previous proposal seemed to me a bit too different from the current system.


#15

I had never really thought about this (and the question was not directed at me), but I have to agree that this is indeed unexpected. However, from is not a great solution as it only works for imports, not for all places where a path occurs. It would be nice to have some syntax to distinguish “paths rooted in this crate” from “paths referring to other crates” – a bit like drive letters on Windows (heh, who would have thought that I’d ever suggest that these could be a good idea :wink: ), so use \A\B\C would be in your crate but use crate:\A\B\C would be referring to another crate. (I am not actually suggesting to use the \, I am just drawing an analogy.)

If the syntax wasn’t already used, ::crate::path could be an option?


#16

I’ll respond to your other points later, but wanted to clarify this one – what you’re saying is accurate, but my point is precisely that you have to mark them pub(crate) for that to happen. If they’re private, or pub(super), or whatever, then they have smaller visibility. This is exactly what I mean when I say that it’s item visibility, rather than module name visibility, that matters above all.
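
To make the point about item visibility concrete, here is a small sketch in today's Rust showing the levels mentioned above on individual items. The module tree (`engine`, `parts`) and all function names are hypothetical.

```rust
// Hypothetical module tree showing visibility set per item.
pub mod engine {
    pub mod parts {
        // Private: visible only inside `parts` (and any modules nested in it).
        fn raw_torque() -> u32 {
            40
        }

        // Visible in the parent module `engine`, but no further.
        pub(super) fn tuned_torque() -> u32 {
            raw_torque() + 2
        }

        // Visible anywhere in this crate, but not to other crates.
        pub(crate) fn part_count() -> u32 {
            3
        }
    }

    // World-public wrapper around a `pub(super)` item.
    pub fn torque() -> u32 {
        parts::tuned_torque()
    }
}

pub fn summary() -> (u32, u32) {
    // `engine::parts::tuned_torque()` would not compile here: it is `pub(super)`.
    (engine::torque(), engine::parts::part_count())
}
```

Even though `parts` is itself reachable by name, what can actually be used from it is governed by the visibility of each item.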


#17

Yes, I thought this would be a good idea also (ahead of any other changes).

Underscores have some advantages setting aside the aesthetics, but even without mod a #! attribute is an alternative syntax. (The most important advantage to me is that if I am in foo/_bar/baz.rs, I can see that bar is “inline”, whereas I would have to open foo/mod.rs or foo/bar/mod.rs for either of the other syntaxes.)

Aaron and I talked about this possibility today a little bit. The thing is that they aren’t so rare because of how impl blocks work - the real module that Aaron based his example on (the coherence module), only contains impl blocks for types defined elsewhere, and doesn’t actually export any types or functions of its own.
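
A minimal sketch of that kind of module, in today's Rust: it contains only an impl block for a type defined elsewhere and exports no names of its own. This is not the actual coherence module; `types`, `Meters`, `display_impls`, and `label` are hypothetical, and the modules are written inline here only to keep the example in one file.

```rust
// Hypothetical type, defined in one module...
pub mod types {
    pub struct Meters(pub f64);
}

// ...and a module that contains nothing but an impl block for it.
// It exports no names, so no other module has a reason to `use` it;
// only the `mod` item causes it to be compiled.
mod display_impls {
    use super::types::Meters;
    use std::fmt;

    impl fmt::Display for Meters {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            write!(f, "{} m", self.0)
        }
    }
}

pub fn label(m: &types::Meters) -> String {
    // Relies on the `Display` impl from `display_impls` without naming that module.
    m.to_string()
}
```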

I don’t understand the connection you’re making to build systems. It seems like you’re imagining an event in which a build system generates a junk file, with a .rs file extension, inside the source directory of a crate. Why would that ever happen?

I understand some of the concrete concerns about workflows - like git stash leaving your crate dirty if you don’t pass the -u flag, but from your comments I haven’t gotten a firm understanding of the concern.


#18

There is another link for you here: https://www.reddit.com/r/rust/comments/62rqws/dealing_with_extern_crate_vs_mod/


#19

We already have that: use std::foo requests foo from the std crate, for instance.

That’s at least an improvement over the from ... use ... syntax, which feels excessively verbose. (I think use crate::somecrate::module::name would be even better.) But it still seems like what I’ve seen described as “syntactic salt”: extra syntax that’s unneeded by the compiler and only exists to make humans type more.

I’ve seen extensive confusion over why you have to write mod foo; to make the module foo exist, even after creating foo.rs. And I’ve seen other confusion about the module system, as well. But reading through the list of module system complaints you link to, I don’t feel like “extern crates should be distinct from local modules” feeds into the confusion. Right now, extern crates are already distinct from local modules, in that they need special syntax. And some of the complaints are precisely that they need special syntax. Changing it to different special syntax doesn’t really address that.

Many other languages seem to use the same syntax for both, without any distinction.

This means that if I want to export an API, I have to go to the top-level module and write pub use thatmodule;, or pub use thatmodule::some_name;. I can’t just write pub in the module itself to export an API. So it encourages the “first write things in a module then go write something in another module to actually export them” pattern.

This introduces a confusing inconsistency: in a module, pub means pub(crate), except in the top-level module, where it means pub.


#20

The first iteration on this from syntax used from crate for absolute paths instead of ::, and had ::-prefixed paths work like ::std::iter::Iterator and ::crate::local_module::Item (essentially the root of your crate is inside a “module” called crate). We moved to the proposal in the blog post because we found applying from crate use to everything you took from another module was pretty painful, but I don’t think we squared away an alternative for absolute paths like this. :-\

I don’t know if Aaron had an idea he didn’t mention to me, but I’m not sure how you’d write ::std::iter::Iterator under this proposal.