Side-discussion (comparing `foo.rs` vs `foo/mod.rs` & more) from “Deprecating having both `src/lib.rs` and `src/main.rs` in the same package”

Replies starting from a reply to

in the topic Deprecating having both `src/lib.rs` and `src/main.rs` in the same package have been moved here to split the parallel discussion threads there.

1 Like

This would promote the inferior (in my opinion) mod.rs scheme for nested modules. I prefer my IDE tab names to be meaningful, and not overly long showing paths.

2 Likes

foo.rs vs foo/mod.rs is not something that's important to me. However, I think this hypothetical change -- i.e. lib/mod.rs as the crate root for library crates -- could, if we wanted, be used as a lever to discourage use of the mod.rs scheme for any other purpose.

Tangential, but I yearn for supporting /src/my_module/my_module.rs in place of /src/my_module/mod.rs. Best of both worlds IMO: better tab names without separating the file from its folder in the file browser or project tree.

5 Likes

I'd say that's an IDE issue, which IDEs can fix fairly easily. For instance, in VSCodium this is fixed by the following setting:

    "workbench.editor.customLabels.patterns": {
        "**/mod.rs": "${dirname}/mod.rs"
    },
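
For readers who don't use that editor, here is a rough Rust sketch of the relabeling that pattern describes. `tab_label` is a hypothetical helper, not an editor API, and it assumes `${dirname}` means the name of the file's immediate parent directory.

```rust
use std::path::Path;

/// Hypothetical helper: the label a tab might show for `path`.
/// Files named `mod.rs` get their parent directory prepended;
/// everything else keeps its bare file name.
fn tab_label(path: &Path) -> String {
    let file = path.file_name().and_then(|n| n.to_str()).unwrap_or("");
    if file == "mod.rs" {
        let parent = path
            .parent()
            .and_then(|p| p.file_name())
            .and_then(|n| n.to_str());
        if let Some(dir) = parent {
            return format!("{dir}/mod.rs");
        }
    }
    file.to_string()
}

fn main() {
    for p in ["src/foo/mod.rs", "src/foo.rs", "mod.rs"] {
        println!("{p} -> {}", tab_label(Path::new(p)));
    }
}
```

With this rule, two open `mod.rs` files from different modules get distinct labels, while `foo.rs`-style files are unaffected.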

What makes the mod.rs style (IMO) superior is that it puts all the parts of a module in one place in the file system structure, which means everything from git status to grep to the github diff view will group things properly. The module_name.rs style can lead to that file being quite far away from the rest of the contents of the module, which I regularly find to add plenty of unnecessary friction.

13 Likes

That is an interesting trick I didn't know. There is another advantage still to avoiding mod.rs though: when you add submodules to an existing module, you now need to rename the file in git. If you also split the existing content out into the sub modules in the same commit, git won't properly track the rename in history. You would have to remember to do it as two separate commits.

As for "quite far away", I don't know what you mean: foo.rs is right next to foo/ unless you sort directories and files separately, which I would hope your IDE has a setting to not do.

I would need to find such a setting for literally every program that displays the directory structure, from git status to my file manager to the github diff view. I don't think they all have that setting. Though github does indeed seem to mix files and folders (which regularly confuses me because most everything I use puts folders first -- for general file management, that's just the much better option IMO). And even with that, the module.rs file sits awkwardly separate next to the rest of the module.
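
The disagreement about "far away" comes down to sort order, which a small Rust sketch can make concrete. The directory listing in `sample` is made up for illustration, and the comparison is plain byte-wise ordering of names; real tools may sort case-insensitively or by locale.

```rust
/// A directory entry: its name and whether it is a directory.
struct Entry {
    name: &'static str,
    is_dir: bool,
}

/// Hypothetical contents of a `src/` directory.
fn sample() -> Vec<Entry> {
    vec![
        Entry { name: "lib.rs", is_dir: false },
        Entry { name: "foo.rs", is_dir: false },
        Entry { name: "util", is_dir: true },
        Entry { name: "bar.rs", is_dir: false },
        Entry { name: "foo", is_dir: true },
    ]
}

/// Sort purely by name, mixing files and directories.
fn sort_mixed(mut entries: Vec<Entry>) -> Vec<&'static str> {
    entries.sort_by(|a, b| a.name.cmp(b.name));
    entries.into_iter().map(|e| e.name).collect()
}

/// Sort directories first, then by name within each group.
fn sort_dirs_first(mut entries: Vec<Entry>) -> Vec<&'static str> {
    entries.sort_by(|a, b| b.is_dir.cmp(&a.is_dir).then(a.name.cmp(b.name)));
    entries.into_iter().map(|e| e.name).collect()
}

fn main() {
    println!("mixed:      {:?}", sort_mixed(sample()));
    println!("dirs first: {:?}", sort_dirs_first(sample()));
}
```

In the mixed ordering `foo` and `foo.rs` end up adjacent; with directories first, every other directory and every file that sorts before `foo.rs` sits between them.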

That's a one-time cost, compared to the permanent issues caused by having the file system structure not match the module structure. I will gladly take that one-time cost any time of the year.

This is, unfortunately, a difference between Windows and many other systems. Having “grown up” on macOS and Linux systems, I am used to directories and files being homogeneous and the windows scheme (which other applications and Linux desktop environments have adopted, occasionally) of sorting folders first constantly breaks my brain. :sob:

2 Likes

Thing is, you could make a similar argument about the "IDE issue" from before. My particular editing setup uses at least two programs where mod.rs is more awkward, and one of them doesn't have any setting to fix it.

The first program is an obscure vim plugin called LustyJuggler which I use to jump between files. It's a fuzzy search, sort of like fzf, but only for currently-open files. However, it typically only searches the filename. If I open src/foo/mod.rs and try to jump to "foo", I get nothing; I have to type "mod" to get to it. Only if I have two files open with the same name does it start adding directory components. I just tried to find a setting to change this behavior, but it turns out there isn't one.

The second program is fd, which I often use on the command line to find files. By default fd only searches filenames, so fd foo.rs works but fd foo/mod.rs does not. There is a flag to change this though (-p).
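
The difference between the two search modes can be sketched in a few lines of Rust. This is a simplification under stated assumptions: it uses plain substring matching rather than the patterns fd actually supports, and both function names are made up for illustration.

```rust
use std::path::Path;

/// Match `needle` against the final path component only.
fn matches_file_name(path: &Path, needle: &str) -> bool {
    path.file_name()
        .and_then(|n| n.to_str())
        .map_or(false, |n| n.contains(needle))
}

/// Match `needle` against the whole path.
fn matches_full_path(path: &Path, needle: &str) -> bool {
    path.to_str().map_or(false, |p| p.contains(needle))
}

fn main() {
    let flat = Path::new("src/foo.rs");
    let nested = Path::new("src/foo/mod.rs");

    // Searching file names finds `foo` only in the flat layout.
    println!("name, flat:   {}", matches_file_name(flat, "foo"));
    println!("name, nested: {}", matches_file_name(nested, "foo"));

    // Searching the full path finds the nested module as well.
    println!("path, nested: {}", matches_full_path(nested, "foo/mod.rs"));
}
```

Any tool that only looks at the last path component will see every `mod.rs` as the same name, which is the papercut described above.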

Neither of these is the end of the world. But it goes to show that UX papercuts exist on both sides.

2 Likes

Clearly, people have strong preferences for each scheme, and neither one will go away anytime soon.

I don't see how pros/cons of either path scheme solve the problem of this topic. No matter how the modules are named, they can still overlap between lib.rs and main.rs. Non-nested modules can overlap too, so the problem exists regardless of the nesting naming scheme.

6 Likes

Both options are equally bad - it's just a stylistic choice of which poison is preferable.

An actual improvement would be to separate the notion of "mounting modules" from the code since it's really more of a build concern. This notion was a failed experiment from the get-go.

I think that there's no need for it at all for most use cases. The compiler should consider all available source files in the directory by default, and users ought to use their FS and Git to properly manage that. For advanced use cases such as conditional compilation and remapped paths, this can go into a separate section of the Cargo.toml file for the crate or maybe into a dedicated "project" file.

Another use case for mounting modules is that of generated code. E.g. protobuf/grpc bindings from prost/tonic.

A complete backwards-incompatible overhaul of the module system is overkill, and it's not helping the issue in this thread.

These arguments have been discussed to death before the 2018 edition. There's no point relitigating them here.

1 Like

Generated code is an artefact that belongs to the cargo target folder. Current practice is to then use the include macro. This reinforces that mounting is unnecessary and that build configuration belongs with cargo rather than in your source code.

I think this was meant for me.
I haven't argued for a complete overhaul as you falsely claim. I merely advocate for addressing design flaws and the subsequent removal of the current limitations on project structure. I'm not advocating to resyntax module use statements as was the case in 2018.

More importantly, the notion that we can never revisit past decisions is ludicrous. Circumstances and requirements do change over time. This is the sunk cost fallacy.

The original debate in 2018 tried to do too many changes at once (as I say, wanting to change the syntax) and we've reached a bad compromise without getting a proper resolution. We have both approaches still prevalent in the ecosystem to this day.

Seven years have passed. There have been personnel changes at the Rust project and the tensions around this topic have subsided. Moreover, circumstances have indeed changed: cargo has matured greatly and has become ubiquitous, and rustc itself has adopted cargo conventions since then. The other components of this have been progressing nicely: the cargo namespaces RFC has been accepted and the idea of relaxing coherence to the workspace level has been raised by core rust devs themselves (Niko I think mentioned this).

All of which is to say we are in a much better position to address the current design flaws and shortcomings of the module system.

Lastly, this can and should be deployed gradually.
A good first step here is an opt-in in the cargo project file to use "auto-discovery" mode. This would eliminate 95% of all mod statements by relying on the physical file structure.

We should not do that or anything else along those lines. The current behavior, where the compiler reads only .rs files explicitly mentioned in mod statements, is a desirable feature of the language. I refer you to my previous remarks on the subject; unless you have an actual counterargument, backed by evidence, for anything I said there, please drop this terrible idea.

6 Likes

I did counter your arguments there, in the comments below. You seem to have forgotten that.

I could get behind an opt-in/opt-out file-tree-based approach though. E.g. mod *; as well as a recursive variant of that.
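
To make the proposal concrete: `mod *;` is not valid Rust today, so the following is only a sketch of the declarations such a glob might stand for. `expand_mod_glob` is a hypothetical helper, and it assumes the glob would cover sibling `.rs` files only, skipping `mod.rs`, `lib.rs`, and `main.rs` and ignoring subdirectories.

```rust
/// Hypothetical expansion of `mod *;`: one `mod name;` line per
/// sibling `.rs` file, skipping the files that act as module roots.
fn expand_mod_glob(file_names: &[&str]) -> Vec<String> {
    let mut decls: Vec<String> = file_names
        .iter()
        .filter_map(|name| name.strip_suffix(".rs"))
        .filter(|stem| !matches!(*stem, "mod" | "lib" | "main"))
        .map(|stem| format!("mod {stem};"))
        .collect();
    // Sort so the result does not depend on directory iteration order.
    decls.sort();
    decls
}

fn main() {
    let files = ["mod.rs", "parser.rs", "lexer.rs", "README.md"];
    for decl in expand_mod_glob(&files) {
        println!("{decl}");
    }
}
```

A recursive variant would additionally have to decide how to treat subdirectories, which is where the two layout schemes discussed earlier come back into play.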

1 Like

That was specifically aimed at @yigal100 but I did forget that you responded. Rereading, though, I don't find any of your counterarguments convincing, for reasons that I think I adequately explained in the older thread.

It's going to be really hard to make me budge on this, to be clear. I would need to see evidence on the level of actual controlled experiments on real development projects.

1 Like

And I don't find your replies to my counterarguments convincing. So we are at a standstill.

For compatibility reasons it would have to be opt in. Which means something like mod *; or perhaps mod **; for recursive. Of course this is bikeshedable (and bikeshedding doesn't interest me, I don't care about the specific syntax for this).

Which seems like the best of both worlds.

Having to manually list modules is a papercut, and I have seen this go wrong in C++, with dead code left on disk but removed from the CMakeLists.txt file, and with unit tests that were never actually built or run. So it would be nice to be able to opt into a file-system-based approach. It is generally less error-prone to have one source of truth than two separate ones.
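
The "two sources of truth" failure mode can be checked for mechanically. Here is a rough Rust sketch that compares the `.rs` files in a directory against the `mod` declarations in its root file. Both function names are hypothetical, and the parsing is a line-based heuristic that ignores inline modules, `pub(crate)`, `#[path]` attributes, and macros.

```rust
use std::collections::BTreeSet;

/// Collect names from lines of the form `mod name;` or `pub mod name;`.
fn declared_modules(source: &str) -> BTreeSet<String> {
    source
        .lines()
        .map(str::trim)
        .filter_map(|line| {
            let rest = line.strip_prefix("pub ").unwrap_or(line);
            rest.strip_prefix("mod ")?.strip_suffix(';')
        })
        .map(|name| name.trim().to_string())
        .collect()
}

/// `.rs` files that exist on disk but are not mounted by any declaration.
fn orphaned_files(file_names: &[&str], source: &str) -> Vec<String> {
    let declared = declared_modules(source);
    file_names
        .iter()
        .filter_map(|name| name.strip_suffix(".rs"))
        .filter(|stem| !matches!(*stem, "mod" | "lib" | "main"))
        .filter(|stem| !declared.contains(*stem))
        .map(|stem| format!("{stem}.rs"))
        .collect()
}

fn main() {
    let files = ["lib.rs", "parser.rs", "lexer.rs", "old_tests.rs"];
    let root = "pub mod parser;\nmod lexer;\n";
    println!("never compiled: {:?}", orphaned_files(&files, root));
}
```

A check like this could run in CI for projects that keep explicit `mod` lists but still want to catch files that are silently left out of the build.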

3 Likes

It might help you understand where I'm coming from if I say that I have a lot of bitter experience with gigantic projects -- GCC and Firefox are the biggest two that I've spent substantial time on -- and that I use almost no IDE support because I've never found one that was able to handle projects that size without crashing, slowing the computer down to the point where keyboard input lags, and/or draining my laptop's battery so fast that I couldn't spend a couple hours hacking in a coffee shop without a power outlet. I don't even use a basic "jump to definition" database, because either it gets out of date the moment I start moving code around or it drains the battery, again.

This experience means that I either don't mind, or actively want, language affordances (or dis-affordances) that discourage people from making the program even bigger than it already is, and that I am way at the end of 'explicit is better than implicit' when it comes to mechanisms for controlling what source files go into what build outputs. In particular, when you say

"it would be nice to be able to opt into a file system based approach. It is generally less error prone to have one source of truth"

you have already lost me, because I've seen people try that at the scale of GNU libc and it is -- this is a currently existing problem I'm thinking of, that multiple people have tried and failed to solve -- a trainwreck. (To be fair, part of the problem there is Makefiles.) And GNU libc has a halfway sensible reason to want dropping a file into a directory to change what gets built! (Automatic adaptation of the build to the current compilation target -- basically doing with directory structure alone what library/std/src/os/mod.rs, on master in the rust-lang/rust repository, does with a big pile of #[cfg] directives.)
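
For anyone unfamiliar with the `#[cfg]` pattern mentioned there, here is a single-file Rust sketch of the idea. It is not code from the standard library; the module name `imp` and the function `family` are made up, and real code would usually put each variant in its own file.

```rust
// Exactly one of these three modules is compiled, chosen by the target.
#[cfg(unix)]
mod imp {
    pub fn family() -> &'static str {
        "unix"
    }
}

#[cfg(windows)]
mod imp {
    pub fn family() -> &'static str {
        "windows"
    }
}

#[cfg(not(any(unix, windows)))]
mod imp {
    pub fn family() -> &'static str {
        "other"
    }
}

fn main() {
    println!("built for: {}", imp::family());
}
```

The selection is spelled out in source, next to the `mod` items, instead of being implied by which files happen to exist in a directory.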

And when you say

"something like mod *; or perhaps mod **; ... seems like the best of both worlds"

I respond, "I don't even want that." I could live with mod foo.* appearing specifically in either foo/mod.rs or foo.rs, but not higher up the module tree; recursive globs are Right Out; and this should be documented as a feature for people doing unusual things, not to be used unless you have to.

2 Likes