Module, SubModule, subdirs, etc

Hi, thanks for Rust.

I want to like Rust... in general it's fine, I can understand and produce code, no problems until now... But the module, submodule, subdir system is just awful, unnecessary and restrictive. Please recognize this, as I also see a lot of people having problems with it... In this instance I dare say the current concepts are the wrong path to take. To control access to published modules there are many possibilities, some a lot more expressive (we can talk about them). So please allow something like Java's system or a variation of it; it's just ridiculous the complexity, time and restrictiveness of the module system... On all the rest, please continue the good work.

thanks

The mod-statement-based mounting system for modules is a desirable feature for many people writing lower-level code, because it allows developers to write #[cfg] mod and similar, conditionally including (or excluding) entire file/module trees in the compilation.
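A minimal sketch of what that conditional mounting can look like. It uses inline modules so it fits in one file; the same #[cfg] attributes can sit on outline `mod name;` statements to gate whole files. The module name `platform` and both function names are made up for illustration.

```rust
// Exactly one of these three `platform` modules survives cfg-stripping;
// the other two are removed before the rest of the crate is compiled.
#[cfg(unix)]
mod platform {
    pub fn name() -> &'static str {
        "unix"
    }
}

#[cfg(windows)]
mod platform {
    pub fn name() -> &'static str {
        "windows"
    }
}

#[cfg(not(any(unix, windows)))]
mod platform {
    pub fn name() -> &'static str {
        "other"
    }
}

/// Callers never need to know which variant was compiled in.
pub fn platform_name() -> &'static str {
    platform::name()
}
```

The rest of the crate only ever refers to `platform::name`, so the target-specific code stays confined to whichever module was mounted.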

Similarly, the capability to define a type in a different namespace from where it's exported to the outside world is an important option in Rust, because of how privacy is used to encapsulate unsafety. Without the ability to introduce an extra API-invisible privacy barrier, my structure's private state is going to be accessible by more code, and I have to trust significantly more code to maintain my invariants.

And even if there's no unsafe anywhere in the crate, safe invariants benefit just the same, and being able to split up file hierarchy based on implementation rather than exported interface, having actual private namespaces, is extremely useful. Yes, it can sometimes be a bit nonobvious where some type/function is defined, but 99% of the time you can go from the export position (which is filesystem-tied) and trace reëxports back to the definition site; glob reëxports are generally discouraged outside of hyper-specific cases (e.g. bevy's prelude-of-preludes).
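As a rough illustration of the last two paragraphs, here is a sketch of a type defined in a private module and exported from somewhere else. The names `imp`, `NonEmpty` and `first_or_zero` are hypothetical; the point is only that the field is private to `imp`, so the invariant has to be trusted to that one module and nothing else.

```rust
// Private module: code outside this crate cannot name `imp` at all.
mod imp {
    /// Invariant: the inner Vec is never empty.
    pub struct NonEmpty(Vec<i32>); // the field is private to `imp`

    impl NonEmpty {
        /// The only way to build a `NonEmpty`; rejects empty input.
        pub fn new(items: Vec<i32>) -> Option<NonEmpty> {
            if items.is_empty() {
                None
            } else {
                Some(NonEmpty(items))
            }
        }

        pub fn first(&self) -> i32 {
            // Indexing cannot panic: `new` never accepts an empty Vec.
            self.0[0]
        }
    }
}

// The export position: this is where users of the crate see `NonEmpty`.
pub use imp::NonEmpty;

// Code elsewhere in the same crate can only go through the public API;
// writing `list.0` here would be a compile error.
pub fn first_or_zero(items: Vec<i32>) -> i32 {
    NonEmpty::new(items).map_or(0, |list| list.first())
}
```

Tracing the reexport back from `pub use imp::NonEmpty;` leads straight to the definition site, which is the workflow described above.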

Java's organizational system is entirely conventional, by the way. There's no actual requirement that .java source files match their package location on the filesystem; the package declaration at the start of the file is all that matters. (Most tooling assumes this layout, but none of the core tooling requires it, IIRC. .class files and the classpath lookup are filesystem-dependent, and a .jar is just a fancy .zip.)

There's also the fact that for Java, the unit of compilation is each individual .java file, and the build system is in charge of compiling each .java and packing together all of the individual .class files created by that compilation. In Rust, however, the unit of compilation is the entire crate; the compiler rustc gets the path to lib.rs and discovers the rest of the crate via the mod statement mounting points. The cargo build system does very little for the local crate; its real work is in managing dependency crates.

The module/path system already underwent a significant migration from 2015-era Rust to edition 2018 and beyond: mod mounting statements are now allowed outside an index module (lib.rs/mod.rs/main.rs), and the extern crate namespace ::lib became distinct from crate::, which names items at the root of the current crate.

Rust is kind of unique in that it has two ways of interacting with the module system: mod to mount modules, and use to bring names into scope from mounted modules. In scripting languages with similar relative-path-based import, mounting and using a file are the same operation, and importing the same file more than once typically doesn't redefine its symbols[1]. Rust, on the other hand, is perfectly happy to let you mount the same file more than once with mod[2]. (If we don't already have default diagnostics for when people do this, we absolutely should.) Private module paths are also a concept somewhat unique to Rust; I haven't seen them elsewhere.
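To make the mod/use split concrete, here is a small sketch with a made-up `geometry` module. It is written inline so it fits in one file; the outline form `mod geometry;` would mount the same contents from a file instead.

```rust
// `mod` mounts a module into the crate's tree: this is what makes the
// code exist at the path `geometry::area`.
mod geometry {
    pub fn area(width: u32, height: u32) -> u32 {
        width * height
    }
}

// `use` mounts nothing. It only gives an already-mounted item a shorter
// name in this scope.
use geometry::area;

pub fn areas() -> (u32, u32) {
    // Both calls reach the same function: once through the full path,
    // once through the name brought in by `use`.
    (geometry::area(2, 3), area(2, 3))
}
```

Deleting the `use` line only breaks the short name; deleting the `mod` item removes the code from the crate entirely.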

But Rust is also surprisingly good at noticing when you've made a mistake (e.g. used a path through a private module, used an unmounted module) and making suggestions as to what you wanted to write. Any specific cases where the compiler could reasonably do a better job of inferring intent absolutely should be tracked as issues; making the compiler more helpful on invalid code is one of the best superpowers of the Rust culture.

Would I make different decisions were I designing a new language from scratch today? Probably![3] But Rust's module system is established enough that it's not going to undergo any drastic adjustments. Especially not via some vague "do it better" ultimatum; at a minimum you would need to provide a draft of what you would want the new system to look like, what benefits it would bring, and what the migration path looks like. Even with that, though, the chance of making large changes (e.g. removing the need for mod statements) is quite unlikely.

To control access to published modules there are many possibilities [...] it's just ridiculous the [...] restrictiveness of the module system...

This seems contradictory. Either there are a lot of (potentially redundant style-only) options, or the system is restrictive. You can't really claim that the module system gives you both too many and too little choice in how to express your API and implementation structure.

For what it's worth, the Rust system is the most expressive module/namespace system I've used. No other language has both proper module/namespacing-by-default support and allows you to export an API item independent from where it's defined. I don't feel any sort of restriction in what the language's namespace system[4] allows me to express.
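One concrete instance of that freedom, related to the footnote on orphan rules: inside a single crate, an impl can live in a module that defines neither the trait nor the type. The module and item names below are invented for the sketch.

```rust
mod shapes {
    pub struct Circle {
        pub radius: f64,
    }
}

mod traits {
    pub trait Describe {
        fn describe(&self) -> String;
    }
}

// A third module that owns neither the trait nor the type. If both came
// from other crates this impl would be rejected; within one crate it is
// accepted no matter which module it sits in.
mod glue {
    use super::shapes::Circle;
    use super::traits::Describe;

    impl Describe for Circle {
        fn describe(&self) -> String {
            format!("circle with radius {}", self.radius)
        }
    }
}

pub fn describe_sample() -> String {
    use traits::Describe; // the trait must be in scope to call its method
    shapes::Circle { radius: 2.5 }.describe()
}
```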


  1. ... I think. Tbqh I have very little clarity on how the namespacing system actually works in scripting languages, and whether names can just get redefined; such scripting languages are typically loosely typed enough that you couldn't tell the difference anyway unless there were top-level statement side effects to observe. ↩︎

  2. In this way, mod is actually kinda eerily close to just being textual include!sion. And in fact, for legacy-scoped #[macro_use], textual order of mod statements is semantic! ↩︎

  3. Link is not me, but I share many of the same opinions. ↩︎

  4. The orphan rules are a property of the crate system, and apply over crate boundaries. There are no namespace-based orphan restrictions within a single crate. ↩︎

Amazing answer, just wanted to follow up with a question. I saw somewhere that a

mod my_module;

is just syntactic sugar for

mod my_module {
  include!("my_module.rs");
}

is that true?

It's not exactly that, because it may need to look in parent_module/, and from there it can also be my_module/mod.rs. But once the path is determined, yes it does act like that include!. You can even use cargo-expand to get a flat view of your whole crate!

This is still slightly wrong. The edge case is that we also allow putting mod foo; inside inline modules (and have allowed it since basically forever[1]). If you use inline modules, an outline module within them is still expected to be located in the directories that those inline modules would have created.

There's also the fact that mod x { include!("x.rs") } doesn't allow inner attributes (e.g. //! doc comments), whereas mod x; does.

Complaining about perceived inconsistencies

This is my one gripe with the new/2018 module system: in hindsight, I think it might've been better to make mod x; always be #[path = "./x.rs"] or #[path = "./x/mod.rs"] relative to whatever file it appears in. What we have today is some weird hybrid where mod x; always looks in the expected crate-root-relative location for that module, unless you use #[path] or include!, in which case you have a new anchor directory for any mods within that code's mount tree. (And yes, this means that mod x; and #[path = "x.rs"] mod x; behave differently even when loading the same file.) mod x; looks like it could/should be a context-independent thing, but it isn't.

I'll admit that this makes Rust basically do what's expected when using #[path] or include! to include more than one file from a different directory (e.g. a multi-file buildscript-generated source tree in $OUT_DIR), while allowing for a meaningful-by-default file directory layout in the main crate. Making mod x; always be ./x[/mod].rs would maintain the former, but break the latter, resulting in a completely flat module layout when not using mod.rs. If I were to propose a change today to clean up the behavior a little, it would be to deprecate, and in a later edition forbid, the use of non-#[path]-annotated outline mod statements in inline modules or behind a non-standard-path #[path] or include!. That would leave mod x; with only one edge case under a file-relative interpretation of module paths (the root lib.rs acting like a mod.rs), while staying globally consistent with a crate-root-relative mounting interpretation.


  1. I verified rustc 1.10.0 (cfcb716cf 2016-07-03) allows it. ↩︎

Indeed. I implemented some support for this in gccrs and there are definitely interesting corner cases.

FWIW, I believe that C++'s modules cover the "export separately from definition" via partitions. Though C++ almost certainly lacks some of the "by default" behaviors. Fortran may also have some magic available (some of the features of Fortran are certainly impressive these days), but I'm not familiar enough to say with confidence either way.
