Curly brace support for mod

And I'd like to keep this syntax available for anonymous inline modules, because these can be an amazing power-up for macros.

To make auto modules public, make their file names start with an upper case letter :wink:

That is just a bit cursed LOL

Or just have private modules file names start with _?

Do not forget about pub(crate), pub(super), and pub(in path).

From where I'm sitting, all the proposals to add more implicitness to where Rust looks for modules are proposals to add more possibilities for what foo::bar can mean, and are therefore proposals to make the language more confusing and harder to teach. This is why I do not like them. The possibility that the rules might be different from edition to edition just makes things even worse.

The counterproposal I keep bringing up, mandatory manual specification of filenames for all out-of-line modules,

mod foo = "filename_is_whatever.rs";

is a proposal to minimize the possibilities for what foo::bar can mean. You look for the mod line for foo, and then you look at the file it points you at. If there is no mod line, then go look in the parent; if you get all the way up to the crate root and you still can't find it, it must be an external crate. Can't get any simpler than that.

Yeah, we'd be stuck with the old confusing rules in old editions, but you'd only have to worry about them if you were looking at code for an old edition, and therefore you could get away with not teaching them in the first few weeks of the "intro programming with Rust" class, unlike how things are now.

I do acknowledge that my counterproposal would make it hard to know what module foo.rs contains without looking at the contents of other files. But that is already true sometimes for the existing language (use of #[path], inline modules, anything that is understood as a crate root) and it's only going to be a problem if people go ham on naming files strangely, which they can already do with #[path].

99+% of all module declarations would look like mod foo = "foo.rs";. In other words, the latter half would almost never contain any useful information and would be plain visual noise and annoying repetition.

Meanwhile, the auto mod proposals aim to leverage the fact that most crates mirror the directory structure in their mod declarations. In other words, the proposals aim to reduce the amount of unnecessarily duplicated information. In the same way, I wish that we did not have to duplicate trait bounds on impl blocks which are already present in type declarations.

As someone who teaches Rust, I think it's easier to introduce modules as a simple mirror of the directory structure and explain mod, together with the more flexible ways of handling modules, much later as an "advanced" topic. I do not quite understand why, but some students have a hard time understanding mod declarations, so auto mod could help them.

I'm still in favor of

  • #![public]
  • #![public(crate)]
  • #![public(super)]
  • #![public(in = "foo/bar")]

We have a keyword for that. If we want modules to declare this, they could just say pub; or pub(crate); at the top. (Though I see the appeal of incorporating the #! inner-attribute signifier.)

That said, personally my favorite option is to say "modules are exactly as public as the most public thing in them". In other words, if you want a module to be public, you put a pub item in it, and if you want a module to be pub(crate), you put a pub(crate) item in it.

That would be a notable change, given the current ability to put pub things in a module but then hide the module, but I think it'd be a good change that simplifies the model. (It would also eliminate the "pub-in-private-module" approach of making pseudo-sealed traits, but we're working on an actual mechanism for declaring traits explicitly sealed, which would work well as a replacement for that.)
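For readers who have not met it, the "pub-in-private-module" approach mentioned above looks roughly like the sketch below. All names (private, Sealed, Shape, Square) are hypothetical and chosen only for illustration. It relies on exactly the behavior this post proposes to change: a pub item inside a module that is not itself public.

```rust
// The module is private, so code outside this crate cannot name
// `private::Sealed`, even though the trait itself is declared `pub`.
mod private {
    pub trait Sealed {}
}

// A public trait with the unnameable supertrait: other crates can call
// its methods but cannot implement it for their own types.
pub trait Shape: private::Sealed {
    fn area(&self) -> f64;
}

// Hypothetical implementor living in the same crate as the trait.
pub struct Square(pub f64);

impl private::Sealed for Square {}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

fn main() {
    println!("{}", Square(3.0).area());
}
```

Under a rule where a module is as public as its most public item, `private` would become nameable from outside, which is why a dedicated sealing mechanism would be needed as the replacement.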

When I first started to program in Rust, I had this exact difficulty. I will try to explain my thought process at that time.

So, I created one of my first Rust projects. Then I wanted to move some code into a.rs. Since this code is used in main.rs, I wanted to #include it from main.rs, and learned that apparently it is spelled mod a; in Rust. Fine. Then I created a new file b.rs that contains code that is used in a.rs. So "of course" I added the line mod b; in a.rs. When I read that to use a function in b.rs from a.rs I had to modify a completely different file, main.rs, I was utterly confused.

Since then I have never had to use any of the advanced use cases that are apparently possible with the mod keyword, so I always do exactly what I expect auto mod would do, and write it out explicitly, without any understanding of why I need to do that.

Yes. (Or perhaps mod foo = "parent/foo.rs".)

No.

Having that information there, in all cases, even though it would usually be redundant, is in fact the thing I want. It would not be annoying noise to me. It would be consistency and reduced cognitive load. It would be collapsing the special case and the normal case together, making there no longer be any special cases.

It might make more sense to you if I turn it on its head: I believe that Rust would be easier to learn and easier to work with if we take away all the existing default associations between module names and file names, and place the correspondence between the two entirely in each programmer's hands.


The core of the trap you discovered is that C-style #include "blah.rs" is not spelled mod blah; in Rust. Rather, it is spelled include!("blah.rs"). If you had written include!("a.rs") in your main.rs, and include!("b.rs") in a.rs, it would have worked as you expected, except that all three files would have been smushed together to make the contents of the top-level module (aka "crate root"); fn do_the_thing in b.rs would be called do_the_thing(...) from a.rs, not b::do_the_thing(...).

What mod foo; currently does is more like mod foo { include!("foo.rs"); }. If you want to reason by analogy to the C family, it's a combination of #include and namespace. But, if you had written mod b { include!("b.rs"); } in your a.rs, instead of just mod b;, I think that would have done what you expected it to do (well, main.rs would have had to refer to items in b as a::b::whatever instead of just b::whatever, but probably that wouldn't have been so surprising to you). [NB I have not actually tested this, I could be forgetting some details.]
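A minimal single-file sketch of the nesting described above, using inline modules in place of separate files. The name do_the_thing comes from the post; its body, its return value, and the call_from_a helper are made up for illustration. For naming purposes, writing mod b; inside a.rs behaves like the inline form here.

```rust
// Stand-in for the contents of a.rs, declared from the crate root.
mod a {
    // Stand-in for `mod b;` written inside a.rs: b becomes a child of a.
    pub mod b {
        // Hypothetical body; only the function name comes from the discussion.
        pub fn do_the_thing() -> &'static str {
            "done"
        }
    }

    // Inside a, the item is reachable through the relative path b::...
    pub fn call_from_a() -> &'static str {
        b::do_the_thing()
    }
}

fn main() {
    // From the crate root, the path has to go through a.
    println!("{}", a::b::do_the_thing());
    println!("{}", a::call_from_a());
}
```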

The reason mod b; in a.rs currently doesn't find a file named b.rs in the same directory as a.rs, is that Rust currently has fairly restrictive rules for where that implicit include!() will look for the file containing the module's items: it has to be in a subdirectory of the directory containing a.rs, either a/b.rs or a/b/mod.rs. Except that when the file containing the mod directive is "the crate root" -- usually, but not always, main.rs or lib.rs -- then it does look for the module contents in the same directory as the crate-root file, which is why mod a; and mod b; in main.rs do find the files you expect.
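The lookup rule in the paragraph above can be written down as a small function. This is only a sketch of the rule as described here, not the compiler's implementation: the function name candidate_paths and the is_crate_root flag are assumptions made for the example, the mod.rs branch is an addition (a mod.rs file looks next to itself, like a crate root), and #[path] and inline modules are ignored entirely.

```rust
use std::path::{Path, PathBuf};

/// Candidate files for `mod <name>;` written in `declaring_file`.
/// `is_crate_root` is passed in by the caller because the file name alone
/// cannot tell us (the root is usually, but not always, main.rs or lib.rs).
fn candidate_paths(declaring_file: &Path, is_crate_root: bool, name: &str) -> Vec<PathBuf> {
    let dir = declaring_file.parent().unwrap_or_else(|| Path::new(""));
    let stem = declaring_file
        .file_stem()
        .and_then(|s| s.to_str())
        .unwrap_or("");

    // Crate roots and mod.rs files look in their own directory;
    // any other file looks in a subdirectory named after itself.
    let base = if is_crate_root || stem == "mod" {
        dir.to_path_buf()
    } else {
        dir.join(stem)
    };

    vec![
        base.join(format!("{}.rs", name)),
        base.join(name).join("mod.rs"),
    ]
}

fn main() {
    // `mod a;` in the crate root: looks beside main.rs.
    println!("{:?}", candidate_paths(Path::new("src/main.rs"), true, "a"));
    // `mod b;` in a.rs: looks under src/a/, not beside a.rs.
    println!("{:?}", candidate_paths(Path::new("src/a.rs"), false, "b"));
}
```

The second call is the case that surprised the earlier poster: src/b.rs is never among the candidates for a mod b; written in src/a.rs.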

As I see it, the problem with mod is not that one must write mod at all, but rather, the restrictive and confusing rules for what file mod x; refers to. Therefore, my proposal is to eliminate those rules and replace them with full manual control. Under my proposal, you would have written mod a = "a.rs" in your main.rs, and mod b = "b.rs" in your a.rs, and both of those pathnames would have been interpreted relative to the directory containing the file containing the mod directive, so you wouldn't have gotten a missing file error. The b module would be nested inside the a module, but if that's not what you wanted, I expect that it would make sense that the fix for that is to move the mod b directive to main.rs.

We trade a modest amount of extra typing for an absence of surprises.

Don't modules frequently want to present a public interface to their surroundings without being public themselves?

Normally when I write a module, I write it in terms of what makes sense to be visible outside the module, and mark those things pub. Then I decide later on, and separately, whether it makes sense for the module to be public outside its crate, and the two decisions don't clearly seem connected. (For example, if a module contains unsafe code, it still wants to present a safe interface to its surroundings, so anything that could break its invariants must be module-private – but anything else that might be generally useful can usually safely be marked pub so that a reader knows that it can safely be used in arbitrary ways. That's despite the fact that quite a lot of this probably won't be exported from the crate using pub mod or pub use – the fact that it could soundly be exported is separate from whether you actually do want to export it, and it makes sense to separate the decisions so that you can change them independently.)

If having anything public in a module makes the module public, then you have to rewrite all your pub to pub(crate) or pub(in super) whenever you change the visibility of the module – doing that would make changing module visibility very difficult, when it is in practice one of the things I frequently change most towards the end of writing a crate (because I generally decide how a crate should work internally before I decide what its public API should be – and I generally decide what items a module can safely make public to the crate separately from deciding what items a crate can safely make public to its downstream dependencies, especially because the former decision is usually easier).
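A small sketch of the workflow described in this post, with hypothetical names (buffer, Buffer). The module marks its safe interface pub and keeps its state private; whether the module or its items leave the crate is decided separately, in the parent.

```rust
// Private module: nothing here is exported from the crate yet.
mod buffer {
    // The field is module-private, so code outside `buffer` cannot
    // put the value into a state the module did not intend.
    pub struct Buffer {
        bytes: Vec<u8>,
    }

    // Everything that is safe to use in arbitrary ways is marked `pub`,
    // independent of whether the module itself is ever exported.
    impl Buffer {
        pub fn new() -> Self {
            Buffer { bytes: Vec::new() }
        }

        pub fn push(&mut self, byte: u8) {
            self.bytes.push(byte);
        }

        pub fn len(&self) -> usize {
            self.bytes.len()
        }
    }
}

// The export decision lives here and can change on its own, for example
// by turning this line into `pub use buffer::Buffer;` later.
use buffer::Buffer;

fn main() {
    let mut b = Buffer::new();
    b.push(1);
    b.push(2);
    println!("{}", b.len());
}
```

With today's rules, changing that last decision touches one line in the parent. Under a rule where any pub item makes the module public, each pub inside buffer would have to be rewritten instead.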

I can't speak for josh, but for myself? My problem isn't the typing. I just don't like how Rust lets you define a module hierarchy that doesn't match the filesystem. It makes it hard to find stuff when code authors do that.

Allowing arbitrary paths grants a modest amount of power to the author, but the cost is that it takes an extremely powerful assumption away from anyone trying to analyze the code. If the filesystem and the module tree mapped 1/1, I could use file-centric tools to navigate Rust code much more easily. In fact, I usually assume that a codebase has such a 1/1 mapping, even though the language doesn't actually guarantee it, because people violate the rule so rarely and the assumption is so useful.

I'm "meh" about true automod. But deprecating #[path=""] would, IMO, be a dramatic improvement.

File name matching module name is the default. When something else happens it stands out. It is a flag that says "hey, something unusual is going on here, pay attention". If you always listed it, a typo could easily slip in and not be noticed when you have a long list of modules.

Agreed. For the unusual cases you can declare an inline mod and then use include!. The only legitimate cases I can think of are generated code from OUT_DIR and the occasional platform-specific module.
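A sketch of the platform-specific case using inline modules selected by cfg. The names imp and platform_name are hypothetical. To keep the example in one file, the module bodies are written out directly; in a real crate each body could instead be an include! of a per-platform file, as suggested above.

```rust
// Exactly one of these two modules exists in any given build.
#[cfg(unix)]
mod imp {
    // Hypothetical per-platform item.
    pub fn platform_name() -> &'static str {
        "unix"
    }
}

#[cfg(not(unix))]
mod imp {
    pub fn platform_name() -> &'static str {
        "other"
    }
}

// One import, regardless of which `imp` was compiled in.
use imp::platform_name;

fn main() {
    println!("{}", platform_name());
}
```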

I generally find it confusing that within a module, pub may actually mean pub(crate) or pub depending on whether the module happens to be public.

I would much rather people spelled "pub in the module but not exported outside the crate" explicitly as pub(crate) (which it sounds like is really the default meaning you use pub for), rather than pub having different meanings in different contexts depending on visibility defined in other files.

THIS SO MUCH.

I really wish that it was explicit and more granular.

:man_shrugging: This is not something that bothers me. I find that, no matter what the filesystem and/or module layout is, I have to do grep -wr <symbol> <top of source tree> all the time anyway.

(If LSPs worked for me I would just use jump-to-definition and list-all-uses all the time, but they don't. They kill my laptop's battery and make the editor lag behind my typing.)

A machine can follow the paths no matter how they are specified, it's a simple matter of programming. An experienced human, well, see above. And a novice human is, I think, going to appreciate the explicitness.

deprecating #[path=""] would, IMO, be a dramatic improvement.

So IF Rust didn't already have #[path], AND there was only one possible location for the file for each out-of-line module, AND the rule for what that location is, was the same for crate roots as for submodules, AND we had an alternative mechanism for swapping out module implementations based on cfg patterns, THEN I could get behind not adding #[path]. But that is not the world we live in. In the world we live in, there are too many uses for #[path] for this to be viable.

... I should probably clarify that "every module has a #[path] attached" is not actually my goal. My goal is "eliminate new-Rust-programmer confusion about how modules correspond to the file system", and "every module has the equivalent of a #[path] attached" is the only viable way I see to get there. We already know that "deprecate #[path] and make everyone use a/mod.rs instead of a.rs (or vice versa) and do something about the inconsistency for crate roots" is not happening, because we've already had that flamewar. Repeatedly.

I'm prepared to trade that for the reduced new-user confusion. Most typos in this context will cause a compile failure anyway.

How would you propose handling a situation like this? I find that usage to be idiomatic, as it avoids needlessly and conditionally declaring all modules and then conditionally re-exporting when the sole difference is the OS. I suppose you could throw it all in one giant function, but that becomes harder to read and still clutters up imports.

Enable the unreachable_pub lint: https://doc.rust-lang.org/rustc/lints/listing/allowed-by-default.html#unreachable-pub.
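A sketch of what turning the lint on might look like for a single module. The module and function names are hypothetical. The lint is allow-by-default, so it has to be raised explicitly; here that is done with an attribute on the module rather than for the whole crate. With it enabled, rustc should warn on the plain pub item, since it cannot be reached from outside the crate, and leave the pub(crate) item alone.

```rust
// Raise the lint for this module only.
#[warn(unreachable_pub)]
mod internal {
    // Plain `pub` inside a private module: in practice only this crate can
    // reach it, which is the situation the lint is meant to point out.
    pub fn implicit_crate_visible() -> u32 {
        1
    }

    // The same reach, spelled out explicitly.
    pub(crate) fn explicit_crate_visible() -> u32 {
        2
    }
}

fn main() {
    let total = internal::implicit_crate_visible() + internal::explicit_crate_visible();
    println!("{}", total);
}
```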
