[lang-team-minutes] the module system and inverting the meaning of public

Definitely some directories should not be included by default. I would think src and src/bin are good examples.

Both of these are unproductive statements to make. People have described the problems with the current system at length, which is certainly not in terms of "character count." Moreover, your second statement is baldly assuming bad faith. This reaction makes this conversation much more difficult than it ought to be.

I don’t have really strong feelings about the shape of the module system, but commonly write module subsystems that are organised to support development and expose an API that supports usability, whether by other modules in the same crate or external users.

The proposed system seems like it wouldn’t stop you doing that, so I’m happy with it. I wouldn’t mind some strawman examples of how common patterns would be expressed. I can’t be the only one who feels they aren’t fully grokking the implications yet :smile:

I wonder if this is something we share too; when I started reading @nagisa's post, I thought "well I use UNIX as my IDE too." But I always stay at the root of the project, and open files from there.

Many of our users that describe difficulty with the module system describe this cost as much higher than two words; there's a lot more mental overhead with even remembering that you need them at all, and then when it doesn't work, figuring out how to fix it. Once you learn this stuff, yeah, it's low overhead, but before you do, it's much higher.

The people who have trouble consistently describe it as "foreign" or "alien", not familiar. If it were familiar they wouldn't have any trouble!

In fact, this has been one of my running theories about people who struggle with our module system: they assume it works just like the one that they use in their current language, and then hit a hard wall when it doesn't work.

My main critique on the proposal by @aturon is about the idea of “implicit mod” and “implicit crate”.

  • It seems that the idea is to make the language “simpler” and “easier to understand”. Well, I doubt that will happen, as these systems have big shortcomings, and apparently the idea to circumvent those shortcomings is to introduce complexity that previously wasn’t needed. Take @withoutboats’s suggestion to not include some directories by default, like src and src/bin. Or just take the Rust compiler itself. Its test suites (src/test/run-pass, etc.) consist of single-file crates that all share the same directory. With the new change one would either have to put each of them into its own directory, or add some special flag to the compiler. Both options are bad, as one introduces complexity in the directory layout, and the other introduces complexity in the compiler. And this makes the “it will make things simpler” argument moot.

  • Right now you only have to open lib.rs to find a list of all the crates used. I find this useful as I don’t have to open Cargo.toml as often. I use Unix as an IDE and open files via the shell, not from my text editor’s drop-down list of files, so opening additional files incurs an extra cost for me. Also, it gives me a clearer overview. Cargo.toml has different notations for importing features, and scanning the file doesn’t feel as easy as scanning lib.rs/main.rs, especially as it sometimes mentions a crate name multiple times and in multiple places in the file (for a feature name, for the actual dependency, maybe as part of the GitHub URL or path if it’s not from crates.io).

  • It’s not just more useful for reading and understanding a program; you can also quickly comment out a module by just commenting out its mod line. With implicit modules, you’d have to open the actual file.

  • If you previously had a crate with a file inside and didn’t use it with mod, or, which is probably more prevalent, if mod sth; is preceded by #[cfg(...)], then including that file unconditionally would constitute a breaking change. If the module contained non-working code, projects that previously worked would break now. Breaking changes should only be made if there is a real and actual benefit to the change, and there is none to any of the proposed ones.

  • The case sensitivity issue pointed out by @nagisa. Both Windows and Mac have case-insensitive but case-preserving file systems. This means that if you store a file as Awesome.rs and include it with mod awesome; then the OS will direct rustc to the correct file. Right now Rust allows non-lowercase mods to be declared, imported and used. So just assuming that a file named Awesome.rs must represent a mod awesome will mean a breaking change. I have thought about it and I haven’t found a way to make it possible to use non-lowercase filenames together with implicit modules, so apparently everyone will be forced to lowercase their module names if they remove mod or use. Uppercasing file names is not something I prefer doing personally, but I guess some people like uppercasing. Also, you’d have to keep code in the compiler around for legacy projects that include uppercase modules via mod (to not break their code).

  • There are proposals out there that suggest replacing mod with use, so that a file only gets parsed/opened if there is a use statement connected to it. I think this will cause ambiguity when reading code. What does use foo; mean? Is it for including a module? Is it for using a crate? I don’t like Python and dynamic languages in general for their ambiguities, and I don’t want those ambiguities to leak into Rust just to make it easier for their users to come to Rust.

Also, Rust 1.0 has been released. Rust is stable now. This means it’s only going to gain features, not lose them. What this proposal is about is a new feature, an additional thing to learn. Will this make the language simpler? No, by the definition of stability, Rust is only getting more complicated.

If simplicity were really a concern, the only things you do should be to make pub(restricted) adopt <= privacy (I think it uses = atm?), to be consistent with the current system, and stabilize it.

I mean otherwise you are headed towards a mess where you have to support two systems for all eternity, unless the 1.0 promise was a lie.

All the books about Rust printed and written have the old system, all the university courses teach the old system, all the code has the old system.

Add that to the fact that the new system won’t be much of an improvement, and you got needless churn.

When Rust 1.0 came out I thought, great, now I can learn something and apply that knowledge for the time to come, you know, like it used to be and still is with C, and not have to adapt to the newest fad. But it seems I was mistaken, and I need to change code constantly if I want to keep it idiomatic.

Please do not include statements like this in your post. We all write them when we feel strongly about our position, but we should strive to replace them with more constructive feedback.

When directed toward me, as these are, they knock the wind out of me, and instead of considering & responding to the other comments you've made, I want to close the tab and go do something else.

These statements are not directed towards you as a person; they are about the ideas. I understand that there is some personal effort inside that proposal, but this proposal being met with criticism or rejection doesn't mean I doubt your potential, or would dislike any new ideas from you.

I include these statements to make my opinion on the matter clear and put it beyond doubt. Otherwise I fear I'm misunderstood; that I liked things that I point out, while I actually don't.

It sounds like there are downsides to the “implicit mod” feature, but one thing I’m wondering about is how you handle the other languages which already include source files implicitly (Go, Java, etc.)

The example I’m interested in seeing is where you conditionally include modules. Say, for example, you’re writing a parser and have 2 implementations depending on whether SIMD is available. You might have:

parse::with_simd::Parser

And:

parse::no_simd::Parser

And conditionally export either as parse::Parser so callers outside the parse module don’t need to know whether or not SIMD is available. In this case you don’t want anyone to be able to use the ‘true path’ to Parser, because it’s an implementation detail that depends on configuration.
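
For reference, here is a rough sketch of how this facade is usually written under the current (explicit) module system. The choice of `target_feature = "sse2"` as the switch and the `count_commas` method are assumptions made up for illustration, and the two implementations are identical placeholders rather than real SIMD and scalar code. Inline modules are used so that it fits in one file; in a real crate each would be a `mod with_simd;` / `mod no_simd;` line pointing at its own file.

```rust
mod parse {
    // Private modules: code outside `parse` cannot name
    // `parse::with_simd::Parser` or `parse::no_simd::Parser` directly.
    #[cfg(target_feature = "sse2")]
    mod with_simd {
        pub struct Parser;

        impl Parser {
            // Placeholder body; a real version would use SIMD intrinsics.
            pub fn count_commas(&self, input: &str) -> usize {
                input.bytes().filter(|&b| b == b',').count()
            }
        }
    }

    #[cfg(not(target_feature = "sse2"))]
    mod no_simd {
        pub struct Parser;

        impl Parser {
            // Plain scalar fallback.
            pub fn count_commas(&self, input: &str) -> usize {
                input.bytes().filter(|&b| b == b',').count()
            }
        }
    }

    // Exactly one of these re-exports survives cfg-stripping, so callers
    // only ever see `parse::Parser`.
    #[cfg(target_feature = "sse2")]
    pub use self::with_simd::Parser;
    #[cfg(not(target_feature = "sse2"))]
    pub use self::no_simd::Parser;
}

fn main() {
    let parser = parse::Parser;
    println!("{}", parser.count_commas("a,b,c")); // prints 2
}
```

Because the child modules are private, the "true path" is already hidden from other modules and crates here; the open question is how to express the same thing once the `mod` lines go away.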

I’m curious to see what this looks like with implicit modules.

You tag with_simd and no_simd as #![internal] and have two pub use statements in the parent, each with a cfg. In this case, it is the same number of lines of code as today (because you don’t need the mod declarations), but the fact that this module has been facaded has been moved into the module.

EDIT: Also, the cfg attribute is moved into the module as well, since you’ll do #![cfg(..)] whereas today it’s more common to attach it to the mod statement.

This is something I see as a significant advantage of the proposal (and quite contrary to the people who seem to think this proposal favors a sort of sloppiness): it makes it much easier to learn everything you need to know about a module, because it’s all in the same file.

Ah gotcha, and would that mean someone couldn’t come along and use parse::with_simd::Parser as opposed to parse::Parser? ~Or would I need to mark Parser as pub(super)?~

If you don’t tag the module internal, someone with the simd cfg can access the parser at parse::with_simd::Parser. But if you do tag it internal, only your crate can access it at that path.

If you make the type pub(super), you can’t pub use it from the parent module, because it can’t be re-exported beyond super (this is already how pub(restricted) works).
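
A small sketch of that last point, using inline modules and a made-up `backend` method purely for illustration: a `pub(super)` type can be used by the parent, but the parent cannot hand it out any further.

```rust
mod parse {
    mod with_simd {
        // `pub(super)` makes the type visible inside `parse`, and no further.
        pub(super) struct Parser;

        impl Parser {
            pub(super) fn backend(&self) -> &'static str {
                "simd"
            }
        }
    }

    // Rejected by the compiler if uncommented: `Parser` is only visible
    // within `parse`, so it cannot be re-exported to the rest of the crate
    // or to other crates.
    // pub use self::with_simd::Parser;

    // The parent can still use the type internally and expose a wrapper.
    pub fn backend_name() -> &'static str {
        with_simd::Parser.backend()
    }
}

fn main() {
    println!("{}", parse::backend_name()); // prints simd
}
```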

Hmm, I think it would be nice if it were possible to remove the possibility of that direct access completely, since it seems reasonable that a future maintainer with some IDE support might discover that true path and not realise it isn’t always valid.

I realised pub(super) wouldn’t work and tried to cross it out but markdown isn’t my friend tonight :slight_smile:

You can’t cut it off entirely because the re-exporting module needs to be able to see the proper path in order to re-export it. It’s an interesting argument though, sort of in favor of a more pub(super) meaning to internal.

I’d like to put in my opinion here that I am strongly opposed to implicit modules. One of Rust’s main selling points for me is the explicit nature of things, and making modules implicit would be detrimental to this. How would someone prevent a module from being implicitly created? The current module system allows you to cfg modules and even specify different paths for a module. An implicit module system could very easily get confused by, and break, crates that use custom module paths, especially if the custom paths are cfg’d or there are macros at play.

The current private-in-public errors are quite inaccurate, with both false negatives and false positives. I personally wouldn’t mind seeing them just go away, because they don’t actually provide any sort of soundness guarantees due to the false negatives, and are an annoyance due to the false positives. However, if you do want a better system, please create a system without false positives and without false negatives, so that it actually has value and isn’t just an annoyance. I don’t mind having rules against private-in-public as long as they are accurate, which the current rules are anything but.

A counter attempt to the “confusing to beginners” argument:

  1. Do we have data on “new to programming” vs “knows a few other languages”? Ideally we’d have them grouped by module system design family.

  2. I put my money on “not confusing to someone new to programming”. There are a few rules and they can be followed to produce a desired module tree.

  3. If there’s something confusing, it’s how extern crates are placed at the root specifically, so code in the root appears to behave differently, not the explicit module tree.

  4. It’s far more likely IMO that someone with some experience in a few other languages, particularly if all those languages share design traits, which can make someone treat them as universal fact, will be confused by Rust’s different design.

  5. We need to do a better job at “shocking” newcomers upfront with the fact that Rust may not match their acquired intuition (cc @carols10cents @steveklabnik).

  6. Error messages and/or extended error explanations should push the user to read (and re-read) the relevant chapters of the (new?) book. If they can’t get it working, they should first try to understand how they might go about doing that.

To be completely clear, “confusing to beginners” is not primarily the motivation behind this proposal from my perspective, and I don’t think from @aturon or @nikomatsakis 's either. I am not able to contribute more substantively to this conversation at the moment but I’d suggest continuing on that tangent would be unproductive toward reaching a consensus.

@withoutboats Too bad, because most of the momentum this ever received was based on the assumption that the current system is confusing for beginners; you should own up to that.

Also, my position is that this proposal is as needed as being able to write methods inside a struct or enum declaration because “remembering to use impl is confusing for beginners”.

This is how we ended up with jumbled messes of features in languages that claim to have “OOP”, let’s not repeat those mistakes.

While I can understand wanting to flesh out the exact proposal first, that seems to have progressed far enough to start considering backwards compatibility questions (please point out if they have been addressed prior to the most recent iteration, as I’ve only skimmed the discussion up to that point). Existing code and existing directory layouts need to remain supported, and any additional features or hacks that need to be bolted onto this new feature to achieve that count against it. Many a great proposal has been weighed down below the acceptance threshold by such problems, so they need to be considered early. In particular, it must be ensured that rustc doesn’t suddenly start including files that weren’t included before. Some examples:

  • Today, one can remove a module from a crate temporarily by commenting out the mod line. This is useful when first developing the module (start writing functionality in a new module, put it off and try something with the rest of the code, come back to the new module later) and also while removing/renaming modules (can experimentally remove the mod line without having to delete the file and perhaps later look up how to restore it from git). And while it may be good practice to remove unused modules quickly, currently one can put that off indefinitely (in the same way some people keep commented-out code around for a long time).
  • Today, it’s possible to have a bunch of binaries lying around in the same directory, possibly alongside a bunch of utility modules used by some but not all binaries (one example of such a layout is Cargo’s tests/, benches/, and examples/ directories). It would be unfortunate (increased binary size and/or a flood of unused code warnings) if the utility modules were included in binaries that don’t use them, and including the binaries in each other’s module trees can trigger compilation errors (as the crate root can contain attributes that non-root modules can’t, and there are some definitions that can only exist once per executable).
  • Today, most existing code relies on files not being touched if the corresponding mod foo; item gets cfg-stripped to make modules conditional. This needs to keep working, even if putting #![cfg(foo)] in the affected file becomes the new idiomatic way to do it.
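
To make the last point concrete, here is a minimal single-file sketch. The module names are invented for illustration, and `#[cfg(any())]` is just an always-false stand-in for a real feature flag. The cfg-stripped module is dropped before name resolution, so the call to a function that does not exist never causes an error.

```rust
// Stand-in for `#[cfg(feature = "experimental")] mod experimental;`.
// `any()` with no arguments is always false, so this module is stripped.
#[cfg(any())]
mod experimental {
    pub fn run() -> u32 {
        // Deliberately broken: this function is not defined anywhere.
        // It is never resolved because the whole module is removed first.
        this_function_does_not_exist()
    }
}

mod stable {
    pub fn run() -> u32 {
        42
    }
}

fn main() {
    println!("{}", stable::run()); // prints 42
}
```

With an inline module the body still has to parse. With an out-of-line `mod experimental;` the situation is even more lenient today: the file is never opened when the cfg is false, which is exactly the behaviour that would change if files were picked up unconditionally.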

I have to admit that when I first heard criticisms of our module system, I was confused. “It’s so clear,” I thought, “it maps directly to the AST”. But over time I’ve kept those criticisms in the back of my mind, and I’ve come to see that there are some very confusing and suboptimal things about our module system. And also to realize that many users are not, in fact, compiler authors who think in terms of the AST. =)

To clarify, I am very much motivated by the fact that I have frequently heard that our module system is a stumbling block for beginners. But I am not solely motivated by this: I also believe that usability problems persist for experienced users – not as confusion, but as a continued source of papercuts. Some of them are rectified in this proposal, and some of them may indeed be improved.

I will list the things that I personally find confusing or annoying on a regular basis that I think will be improved:

  • Forgetting to add a mod declaration.
    • For example, I commonly make a foo/test.rs file then forget to add #[cfg(test)] mod test in foo/mod.rs. I only notice this when I expect a test to fail and it does not.
  • Forgetting to remove a mod declaration.
    • I commonly rename or delete modules as part of refactoring. I do this in the file-system. I often update all uses even. But I often forget about the mod declaration.
  • The dance of adding an extern crate dependency.
    • When I realize I need something, I find it annoying to have to
      • go to crates.io/crates and find the latest version
      • go to Cargo.toml and add it
      • go to lib.rs and add it there
      • add the use statement in the code that wanted it
    • Admittedly, I could probably use cargo edit, but I have never learned to and keep forgetting it exists.
  • Forgetting the proper path to name something.
    • I wasn’t as aware of it until this discussion, but it’s true that I often find the file where something is, and then find out that this is not the proper name for it, and I find that a bit confusing.
    • This happens sometimes in libstd, but mostly within my own crates, since for other crates I use docs.rs rather than browsing their source directly.
    • Under at least some versions of this proposal, within a crate I can always use the “true path”, though this is the part of the proposal I’m least confident about. At minimum though it ought to make it easier to detect when a module is “internal”, though I’m not sure that would help me, since usually I find things through ctags or ripgrep so I only have the filename where it is found and I haven’t actually bothered to open the file.
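
As an illustration of the first papercut in the list above, this is roughly the declaration that is easy to forget. The sketch uses an inline module and a made-up `double` function so it fits in one file; in the layout described, the body would live in foo/test.rs and only the `#[cfg(test)] mod test;` line would sit in foo/mod.rs.

```rust
pub fn double(x: i32) -> i32 {
    x * 2
}

// If this declaration is missing in a multi-file crate, the test file is
// never compiled: no error, no warning, and the tests silently never run.
#[cfg(test)]
mod test {
    use super::double;

    #[test]
    fn doubles_its_input() {
        assert_eq!(double(21), 42);
    }
}

fn main() {
    println!("{}", double(21)); // prints 42
}
```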

These are papercuts for me, but they annoy me. But I think for many beginners they lead to much bigger frustration. Most people don’t want to sit down and read a long tutorial, they want to get up-and-going hacking, so every bit of ceremony (make a file, now declare the mod, etc) stands in their way. Not being DRY also means that there are more things to copy and more things to get wrong.

There is one additional papercut that I think this proposal does NOT help. But which it may make it easier to explain:

  • Confusion about which paths are absolute and which paths are relative.
    • I regularly try to type an absolute path in an expression (usually as a quick hack to test something) and find it doesn’t work. I experience a moment of confusion (“but I am importing foo::baz above, why can’t I use foo::bar?”). Then I remember I have to type ::foo::bar in expressions – for some reason this usually tops my “too ugly even for a hack” threshold so I run up and add an import.
    • The only way I think this might be easier to explain now is that one can say “uses are about bringing things from other files into your scope”. Ignoring “inline modules” for a moment, modules in this system can be thought of as being synonymous with files, and the idea that use refers to “other files” and hence you have to write self::foo may be helpful as an explanatory tool. Or maybe not, I don’t know.

Of what I’ve seen so far, I think there is one very real technical concern with the proposal (i.e., leaving aside for a moment questions of whether it is desirable):

  • Case-insensitive file-systems.
    • This is a solid point. In these systems, it is a pain to not know the desired case of the file you are looking for, and the way our name resolution works, we really want to know the names of the modules up front (i.e., without considering use declarations). Bears some thought.

There are also some patterns that this proposal makes harder. Here are the ones I recall, did I miss anything?

  • Overlaying multiple crates within one source directory (as we do with libs and binaries).
    • Personally I think this is a very confusing pattern. It often works out ok because main.rs is like a 3-line wrapper though. But it’s the recommended pattern today in many cases.
  • Making temporary, throw-away .rs files in your src directory.
    • I’ll note that this is directly in tension with the usability problems I experience around forgetting to add mod declarations. In fact, I make a point of keeping my src clean precisely so that git status is a useful tool to tell me what (A) I forgot to git add and (B) I forgot to create mod declarations for. In other words, so it can help me deal with the various bits of creating a module that are not DRY.
  • Temporarily commenting out modules by commenting out the mod line.
    • This can also, of course, be done by commenting out the contents.

With respect to @est31’s concern that they will constantly have to learn new things, I am sympathetic. This is a balancing act, and I think it’s worth asking ourselves, for every proposed change, whether it is worthwhile. But clearly we plan to make a number of ergonomic improvements that will involve changes in the recommended style of Rust programming. That is roughly the meaning of “stability without stagnation” to me. Naturally, we want to ensure that old code continues to compile to the best of our ability, even if it is making use of deprecated forms, and that it is easy to adapt in any case; but we can’t let that stop us from addressing known shortcomings. (Just in passing, I’ll note that my experience with C is quite variable here; old code frequently fails to compile, and when it does, it usually comes associated with an avalanche of warnings.)
