[lang-team-minutes] the module system and inverting the meaning of public

One more concrete example for directory traversal. Imagine a crate with a build script build.rs and code in the src directory. If build.rs included src implicitly as a module, that would be a pretty large build script! Supposedly, only some modules (like mod.rs or lib.rs) would include other modules implicitly, but I haven’t seen the rules written out.

Definitely some directories should not be included by default. I would think src and src/bin are good examples.

Both of these are unproductive statements to make. People have described the problems with the current system at length, which is certainly not in terms of “character count.” Moreover, your second statement is baldly assuming bad faith. This reaction makes this conversation much more difficult than it ought to be.

I don’t have really strong feelings about the shape of the module system, but commonly write module subsystems that are organised to support development and expose an API that supports usability, whether by other modules in the same crate or external users.

The proposed system seems like it wouldn’t stop you doing that, so I’m happy with it. I wouldn’t mind some strawman examples of how common patterns would be expressed. I can’t be the only one who feels they aren’t fully grokking the implications yet :smile:

I wonder if this is something we share too; when I started reading @nagisa’s post, I thought “well I use UNIX as my IDE too.” But I always stay at the root of the project, and open files from there.

Many of our users that describe difficulty with the module system describe this cost as much higher than two words; there’s a lot more mental overhead with even remembering that you need them at all, and then when it doesn’t work, figuring out how to fix it. Once you learn this stuff, yeah, it’s low overhead, but before you do, it’s much higher.

The people who have trouble consistently describe it as “foreign” or “alien”, not familiar. If it were familiar they wouldn’t have any trouble!

In fact, this has been one of my running theories about people who struggle with our module system: they assume it works just like the one that they use in their current language, and then hit a hard wall when it doesn’t work.

My main critique on the proposal by @aturon is about the idea of “implicit mod” and “implicit crate”.

  • It seems that the idea is to make the language “simpler” and “easier to understand”. I doubt that will happen. These systems have big shortcomings, and apparently the way to circumvent those shortcomings is to introduce complexity that previously wasn’t needed, like @withoutboats’s suggestion to not include some directories, such as src and src/bin, by default. Or just take the Rust compiler itself. Its test suites (src/test/run-pass, etc.) consist of single-file crates that all share the same directory. With the new change one would either have to put each of them into its own directory, or add some special flag to the compiler. Both options are bad: one introduces complexity in the directory layout, and the other introduces complexity in the compiler. This makes the “it will make things simpler” argument moot.

  • Right now you only have to open lib.rs to find a list of all the crates used. I find this useful as I don’t have to open Cargo.toml as often. I use Unix as an IDE and open files via the shell rather than from my text editor’s drop-down file list, so opening additional files incurs an extra cost for me. Also, it gives me a clearer overview. Cargo.toml has different notations for importing features, and scanning the file doesn’t feel as easy as scanning lib.rs/main.rs, especially as it sometimes mentions a crate name multiple times and in multiple places in the file (for a feature name, for the actual dependency, maybe as part of the GitHub URL or path if it’s not from crates.io).

  • It’s not just more useful for reading and understanding a program; you can also quickly comment out a module by just commenting out its mod line. With implicit modules, you’d have to open the actual file.

  • If you previously had a file inside a crate and didn’t include it with mod, or, which is probably more prevalent, if mod sth; is preceded by #[cfg(...)], then including that file unconditionally would constitute a breaking change. If the file contained non-working code, projects that previously worked would now break. Breaking changes should only be made if there is a real and actual benefit to the change, and there is none to any of the proposed ones.

  • The case sensitivity issue pointed out by @nagisa. Both Windows and Mac have case-insensitive but case-preserving file systems. This means that if you store a file as Awesome.rs and include it with mod awesome; then the OS will direct rustc to the correct file. Right now Rust allows non upper case mods to be declared, imported and used. So just assuming that a file named Awesome.rs must represent a mod awesome will mean a breaking change. I have thought about it and I haven’t found a way to make it possible to use non-lowercase filenames together with implicit modules, so apparently everyone will be forced to lowercase their module names if they remove mod or use. Uppercasing file names is not something I prefer doing personally, but I guess some people like uppercasing. Also, you’d have to keep code in the compiler around for legacy projects that include upper case modules via mod (to not break their code).

  • There are proposals out there that suggest replacing mod with use, so that a file only gets parsed/opened if there is a use statement connected to it. I think this will cause ambiguity when reading code. What does use foo; mean? Is it for including a module? Is it for using a crate? I don’t like Python and dynamic languages in general for their ambiguities, and I don’t want those ambiguities to leak into Rust just to make it easier for users of those languages to come to Rust.

Also, Rust 1.0 has been released. Rust is stable now. This means it’s only going to gain features, not lose them. What this proposal is about is a new feature, an additional thing to learn. Will this make the language simpler? No, by the definition of stability, Rust is only getting more complicated.

If simplicity were really a concern, the only thing you should do is make pub(restricted) adopt <= privacy (I think it uses = atm?), to be consistent with the current system, and stabilize it.

I mean otherwise you are headed towards a mess where you have to support two systems for all eternity, unless the 1.0 promise was a lie.

All the books about Rust printed and written have the old system, all the university courses teach the old system, all the code has the old system.

Add that to the fact that the new system won’t be much of an improvement, and you got needless churn.

When Rust 1.0 came out I thought, great, now I can learn something, and apply that knowledge for the time to come, you know, like it used to be and still is with C, and don’t have to adapt to the newest fad. But it seems I was mistaken and I need to change code constantly if I want to keep it idiomatic.

Please do not include statements like this in your post. We all write them when we feel strongly about our position, but we should strive to replace them with more constructive feedback.

When directed toward me, as these are, they knock the wind out of me, and instead of considering & responding to the other comments you’ve made, I want to close the tab and go do something else.

These statements are not directed towards you as a person; they are about the ideas. I understand that there is some personal effort inside that proposal, but this proposal being met with criticism or rejection doesn’t mean I doubt your potential, or would dislike any new ideas from you.

I include these statements to make my opinion on the matter clear and put it beyond doubt. Otherwise I fear being misunderstood as liking the things I point out, when I actually don’t.

It sounds like there are downsides to the “implicit mod” feature, but one thing I’m wondering about is how you handle the other languages which already include source files implicitly (Go, Java, etc.)

The example I’m interested in seeing is where you conditionally include modules. Say, for example, you’re writing a parser and have 2 implementations depending on whether SIMD is available. You might have:

parse::with_simd::Parser

And:

parse::no_simd::Parser

And conditionally export either as parse::Parser so callers outside the parse module don’t need to know whether or not SIMD is available. In this case you don’t want anyone to be able to use the ‘true path’ to Parser, because it’s an implementation detail that depends on configuration.
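For comparison, here is a minimal sketch of how such a facade can be written with today’s explicit mod declarations. It uses inline modules so that it fits in one file, and it assumes target_feature = "sse2" as a stand-in for “SIMD is available”; the backend method is made up purely for illustration.

```rust
mod parse {
    // Only one of these two private modules survives cfg-stripping.
    #[cfg(target_feature = "sse2")]
    mod with_simd {
        pub struct Parser;

        impl Parser {
            // Hypothetical method, only here so the two variants can be told apart.
            pub fn backend(&self) -> &'static str {
                "simd"
            }
        }
    }

    #[cfg(not(target_feature = "sse2"))]
    mod no_simd {
        pub struct Parser;

        impl Parser {
            pub fn backend(&self) -> &'static str {
                "scalar"
            }
        }
    }

    // The facade: callers only ever name `parse::Parser`.
    #[cfg(target_feature = "sse2")]
    pub use self::with_simd::Parser;
    #[cfg(not(target_feature = "sse2"))]
    pub use self::no_simd::Parser;
}

fn main() {
    let parser = parse::Parser;
    println!("backend: {}", parser.backend());
    // `parse::with_simd::Parser` cannot be named here, because `with_simd`
    // is private to `parse`.
}
```

Because with_simd and no_simd are private, code outside parse can only reach the type through parse::Parser, whichever configuration is active.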

I’m curious to see what this looks like with implicit modules.

You tag with_simd and no_simd as #![internal] and have two pub use statements in the parent, each with a cfg. In this case, it is the same number of lines of code as today (because you don’t need the mod declarations), but the fact that this module has been facaded has been moved into the module.

EDIT: Also the cfg attribute is moved into the module as well, since you’ll do #![cfg(..)] whereas today it’s more common to attach it to the mod statement.

This is something I see as a significant advantage of the proposal (and quite contrary to the people who seem to think this proposal favors a sort of sloppiness): it makes it much easier to learn everything you need to know about a module, because it’s all in the same file.

Ah gotcha, and would that mean someone couldn’t come along and use parse::with_simd::Parser as opposed to parse::Parser? ~Or would I need to mark Parser as pub(super)?~

If you don’t tag the module internal, someone with the simd cfg can access the parser at parse::with_simd::Parser. But if you do tag it internal, only your crate can access it at that path.

If you make the type pub(super), you can’t pub use it from the parent module, because it can’t be re-exported beyond super (this is already how pub(restricted) works).
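A small sketch of that restriction, written with the pub(restricted) syntax as it exists today. The backend and backend_name names are invented for the example, and the rejected re-export is left commented out so the file still compiles.

```rust
mod parse {
    mod with_simd {
        // Visible in `with_simd` and in its parent `parse`, nowhere else.
        pub(super) struct Parser;

        impl Parser {
            // Hypothetical helper so the parent has something to call.
            pub(super) fn backend(&self) -> &'static str {
                "simd"
            }
        }
    }

    // Rejected by the compiler if uncommented: `Parser` is only visible
    // within `parse`, so it cannot be re-exported with wider visibility.
    // pub use self::with_simd::Parser;

    // The parent can still use the type internally and expose behaviour.
    pub fn backend_name() -> &'static str {
        with_simd::Parser.backend()
    }
}

fn main() {
    println!("{}", parse::backend_name());
}
```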

Hmm, I think it would be nice if it were possible to remove the possibility of that direct access completely, since it seems reasonable a future maintainer with some IDE support might discover that true path and not realise it isn’t always valid.

I realised pub(super) wouldn’t work and tried to cross it out but markdown isn’t my friend tonight :slight_smile:

You can’t cut it off entirely because the re-exporting module needs to be able to see the proper path in order to re-export it. It’s an interesting argument though, sort of in favor of a more pub(super) meaning to internal.

I’d like to put in my opinion here that I am strongly opposed to implicit modules. One of Rust’s main selling features for me is the explicit nature of things, and making modules implicit would be detrimental to this. How would someone prevent a module from being implicitly created? The current module system allows you to cfg modules and even specify different paths for a module. With implicit modules, the compiler could very easily get confused by, and break, crates that use custom module paths, especially if the custom paths are cfg’d or there are macros at play.

The current private in public errors are quite inaccurate, with both false negatives and false positives. I personally wouldn’t mind seeing them just go away because they don’t actually provide any sort of soundness guarantees due to the false negatives, and are an annoyance due to the false positives. However, if you do want a better system, please create a system without false positives and without false negatives, so that it actually has value and isn’t just an annoyance. I don’t mind having rules against private in public as long as they are accurate, which the current rules are anything but.

A counter attempt to the “confusing to beginners” argument:

  1. Do we have data on “new to programming” vs “knows a few other languages”? Ideally we’d have them grouped by module system design family.

  2. I put my money on “not confusing to someone new to programming”. There are a few rules and they can be followed to produce a desired module tree.

  3. If there’s something confusing, it’s how extern crates are placed at the root specifically, so code in the root appears to behave differently, not the explicit module tree.

  4. It’s far more likely IMO that someone with some experience in a few other languages, particularly if all those languages share design traits, which can make someone treat them as universal fact, will be confused by Rust’s different design.

  5. We need to do a better job at “shocking” newcomers upfront with the fact that Rust may not match their acquired intuition (cc @carols10cents @steveklabnik).

  6. Error messages and/or extended error explanations should push the user to read (and re-read) the relevant chapters of the (new?) book. If they can’t get it working, they should first try to understand how they might go about doing that.

To be completely clear, “confusing to beginners” is not primarily the motivation behind this proposal from my perspective, and I don’t think from @aturon’s or @nikomatsakis’s either. I am not able to contribute more substantively to this conversation at the moment but I’d suggest continuing on that tangent would be unproductive toward reaching a consensus.

@withoutboats Too bad, because most of the momentum this ever received was based on the assumption that the current system is confusing for beginners, you should own up to that.

Also, my position is that this proposal is as needed as being able to write methods inside a struct or enum declaration because “remembering to use impl is confusing for beginners”.

This is how we ended up with jumbled messes of features in languages that claim to have “OOP”, let’s not repeat those mistakes.

While I can understand wanting to flesh out the exact proposal first, that seems to have progressed far enough to start considering backwards compatibility questions (please point out if they have been addressed prior to the most recent iteration, as I’ve only skimmed the discussion up to that point). Existing code and existing directory layouts need to remain supported, and any additional features or hacks that need to be bolted onto this new feature to achieve that count against it. Many a great proposal has been weighed down below the acceptance threshold by such problems, so they need to be considered early. In particular, it must be ensured that rustc doesn’t suddenly start including files that weren’t included before. Some examples:

  • Today, one can remove a module from a crate temporarily by commenting out the mod line. This is useful when first developing the module (start writing functionality in a new module, put it off and try something with the rest of the code, come back to the new module later) and also while removing/renaming modules (can experimentally remove the mod line without having to delete the file and perhaps later look up how to restore it from git). And while it may be good practice to remove unused modules quickly, currently one can put that off indefinitely (in the same way some people keep commented-out code around for a long time).
  • Today, it’s possible to have a bunch of binaries lying around in the same directory, possibly alongside a bunch of utility modules used by some but not all binaries (one example of such a layout is Cargo’s tests/, benches/, and examples/ directories). It would be unfortunate (increased binary size and/or spam from unused code warnings) if the utility modules were included in binaries that don’t use them, and including the binaries in each other’s module trees can trigger compilation errors (as the crate root can contain attributes that non-root modules can’t, and there are some definitions that can only exist once per executable).
  • Today, most existing code relies on files not being touched if the corresponding mod foo; item gets cfg-stripped to make modules conditional. This needs to keep working, even if putting #![cfg(foo)] in the affected file becomes the new idiomatic way to do it.
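
To make the last point concrete, here is a rough single-file sketch of cfg-stripping today. It uses an inline module rather than a separate file, and the feature name "experimental" and the function names are assumptions made up for the example. The module body refers to a function that does not exist, yet the crate builds as long as the cfg is off, because the module is removed before names are resolved.

```rust
// Stripped unless the crate is built with `--cfg 'feature="experimental"'`.
#[cfg(feature = "experimental")]
mod experimental {
    pub fn run() -> u32 {
        // `not_written_yet` is deliberately undefined; this only becomes
        // an error once the cfg above is switched on.
        not_written_yet()
    }
}

mod stable {
    pub fn run() -> u32 {
        42
    }
}

fn main() {
    println!("stable: {}", stable::run());
    println!("experimental enabled: {}", cfg!(feature = "experimental"));
}
```

With a file-backed mod experimental; the same attribute additionally keeps rustc from opening experimental.rs at all, which is the behaviour existing crates rely on.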