[lang-team-minutes] the module system and inverting the meaning of public

This is definitely the reason it confuses me sometimes. It's not always obvious what I'm expected to explicitly declare and what I'm expected to leave up to the filesystem.

There has been some ongoing debate on whether it is useful to have "privacy horizons" within a crate that are more fine-grained than "this module", "this crate", and "world". I find it useful to create submodules with structure that share state but that state is not exposed more broadly, but others disagree, and feel that crate-level is "good enough" for such cases. This may be a function of the size of your crate, but I suspect it's also a matter of your personal style.
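
As a rough illustration of such a finer-grained horizon, here is a minimal sketch using the pub(super) form of restricted visibility. All names (cache, store, stats, shared_hits_demo) are hypothetical and chosen only for this example; it is one possible shape, not a recommended layout.

```rust
// Hypothetical example: two sibling submodules share state through their
// parent module `cache`, while the rest of the crate cannot see that state.
mod cache {
    mod store {
        // Visible inside `cache` (the parent of `store`), but no further out.
        pub(super) struct Store {
            pub(super) hits: u32,
        }

        impl Store {
            pub(super) fn new() -> Store {
                Store { hits: 0 }
            }
        }
    }

    mod stats {
        use super::store::Store;

        // A sibling of `store` can read and update the shared state.
        pub(super) fn record_hit(store: &mut Store) -> u32 {
            store.hits += 1;
            store.hits
        }
    }

    // The only item that code outside `cache` can reach.
    pub fn hits_after_lookups(n: u32) -> u32 {
        let mut store = store::Store::new();
        let mut last = 0;
        for _ in 0..n {
            last = stats::record_hit(&mut store);
        }
        last
    }
}

// Code outside `cache` can call the function, but cannot name `Store`
// or `record_hit`.
pub fn shared_hits_demo() -> u32 {
    cache::hits_after_lookups(3)
}
```

Marking Store as pub(crate) instead would widen the horizon to the whole crate, which is the "good enough" position described above.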

I'd like to strongly agree with your stance that having privacy horizons within a crate is useful. One use case that I've found particularly interesting is keeping the use of some dependency hidden from the rest of the code, so that it might be swapped out later (for example, I actually did this at some point, replacing the use of openssl with ring).
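
A minimal sketch of that pattern, assuming a private submodule wraps the dependency. The names (hashing, backend, checksum, digest_of) are hypothetical, and the toy checksum merely stands in for a real library such as openssl or ring so that the example is self-contained.

```rust
// Hypothetical sketch: `digest` is the only entry point the rest of the crate
// sees; which backend computes it is a private detail of `hashing`.
mod hashing {
    // Stand-in for a third-party crate. In real code this private module
    // would be the only place that mentions the dependency.
    mod backend {
        // Toy checksum, NOT a cryptographic hash; for illustration only.
        pub fn checksum(data: &[u8]) -> u32 {
            data.iter()
                .fold(0u32, |acc, &b| acc.wrapping_mul(31).wrapping_add(b as u32))
        }
    }

    // Swapping the backend means editing this module and nothing else.
    pub fn digest(data: &[u8]) -> u32 {
        backend::checksum(data)
    }
}

// Callers elsewhere in the crate never name `backend`.
pub fn digest_of(text: &str) -> u32 {
    hashing::digest(text.as_bytes())
}
```

Because backend is private to hashing, any attempt to call hashing::backend::checksum from elsewhere in the crate is rejected by the compiler, which is what keeps the dependency swappable.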

For this reason also, the way that extern crate statements are currently set up can be a bit weird. This is another thing I floated in the Reddit discussion on withoutboats' blog post. It would be nice if they were more like use statements.

Your Rust samples contain a significant error. Every use statement in your post needs to be use self::foo instead of use foo etc, because use statements are tracked from the crate root. (I would love to figure out how to solve this but haven't found a backwards compatible solution that isn't even more confusing).

In the language team meeting several of us mentioned that even though we understand this, we get it wrong often & have to go fix our imports when we compile.

This is a big part of the reasoning behind the idea of inherited privacy - pub use self::foo; is in itself a confusing statement that we'd like to avoid.

No, it's not an error. My examples were in the magical context of the root of the crate :slight_smile:

But even in deeply nested modules, I think that would still be a great improvement, because you only have to learn use, rather than have to learn use, and learn mod and understand the inconsistencies and weird relationship between them.

When I was learning Rust, I was utterly confused about the relationship between mod and use. I did not realize that one is relative and the other is absolute, so I was completely dumbfounded why sometimes mod foo works, sometimes use foo works, why both together don't work, and why Rust is so picky about where I should use one or the other.

And pub use self:: is never needed. You can use the full path. Users of the crate will have to use the full path too, so if the path is too hard to type for the author, it'd be too hard for users too :slight_smile:

To be clear, I'm not arguing that the current module system is the best. I'm just resisting any change, whether it is good or bad, unless it is tremendously better.

If Rust wants to be treated as a serious language it must not change a big aspect of the language just because it can be a bit confusing or could be a bit more ergonomic.

EDIT: I know that some languages are making changes for exactly these reasons, but I'm not taking these languages seriously, and C/C++ people in general do not take these languages seriously partly because of that.

I think whatever changes happen to the module system will not break existing code, so no need to worry. If they did, that would be a Rust 2.0 sort of thing, and I don't imagine this is a major enough issue to go that far.

That said, I wish the core team would start planning for 2.0. We should not get stuck with stuff we don't want, just because we want certain people to take us seriously. With continued use, we will discover lots of things that are not ideal. How many collective hours do people spend dealing with not-ideal parts of the language, and working around them in order to preserve backcompat? We also have tools that can detect if things are deprecated and can provide suggestions, and that's a hint at automated conversions. This should make things a lot easier than Python 3 (as an example).

I hope Rust does not become a victim of its own success... "But, but what about all these users? For one, there's so much Rust code in Firefox. Think of Dropbox as well. If we break, they won't take us seriously."

As a bit of general perspective, Rust is what, seven years old, counting from Graydon's announcement at the Mozilla all-hands in Whistler, which I think was 2010? The 1989 C standard was nearly thirty years ago, the language itself is nearly fifty years old, and C89 was a huge breaking change which some people are still carrying around compatibility #ifdefs for.

I do think C89 levels of stability are a good long-term goal, but by that metric we have another 13 years to get there.

(I haven't been paying any attention to what's going on with C++ these days so someone else can make that comparison.)

This seems internally inconsistent, in that it defines the various types of visibility as private type in module X and then switches to public to a module M. Being public to module M isn't defined, that I can tell? That and various other minor inconsistencies make it hard for me to understand the proposal completely.

At a high level, I have a hard time understanding why people have a difficult time with the module system and whether it's the system that's too complex. A position I take that probably makes me sound elitist is that some things are complex and you just have to try harder to understand them. For example, I first read the recursive page table mapping in Redirect when it was first written over a year ago and I didn't understand it then. I eventually reread it and understood it a few weeks ago. Complaining that paging is too hard wouldn't make it less hard at a fundamental level. Anyways, that's just anecdotal.

In the current case, explicitly declaring a module using mod foo seems to represent a fundamental aspect of the module system. In terms of system programming languages with module systems (at least proposed ones), I only know of Rust and C++. In both cases, you have to declare them explicitly, and I think the Rust system is easier to understand. I also happen to prefer explicit over implicit.

At a high level, every language that wants to see wide adoption has to stop changing things at some point and after that point, bugs become features. I would agree that having names in use statements be based on the crate root while other names are based on the current scope is weird and confusing, but it's nowhere near as bad as other languages. And trying to fix it can also end badly, in my opinion. Anecdotally, C++ had the 'most vexing parse' which uniform initialization was meant to fix. But it amusingly created another parsing ambiguity that's even more obscure with enums while making it backwards incompatible to add an initializer_list constructor for a class after it's already being used, in the general case.

Sometimes when you try to fix things, you just make them worse. Hopefully we can avoid that.

No, use of modules isn't an inherently complex problem. Rust just has overlapping functionality of extern crate, mod and use, a mix of absolute and relative paths using confusingly similar syntax (which also happens to have a special case in the crate root), which all complicates things for reasons that are minor and/or arbitrary, and not inherent to modularity.

In CommonJS there's a single require that combines the roles of both mod and use, it is explicit, and it is easy to understand.

I never said that modules were inherently complex. I said that declaring a module to exist is a fundamental aspect of the module system. I personally think that the module system as it exists today is fine, ignoring the <=, >= and == stuff that is not stabilized yet. The use keyword, which is technically not required (reexporting can be done in other ways for types and functions, though not easily for modules), has weird semantics, because it does name lookup differently than everything else in the same file (except for when writing the crate root). The mod statement however, is simple and does what someone would expect after reading the 'book'. And the extern crate statement is even more niche, but also simple and required.

It probably seems like I'm being dismissive of people who have a hard time learning the module system. However, my point is that there is a point at which things become difficult for everyone to understand. If that's the module system in Rust for some people, that doesn't suggest that the module system needs to be changed. And changing it to make things implicit seems like the wrong answer. Almost everything that's hard for me to learn and reason about in languages I've used is due to the implicit features of the language. Perhaps having more explicit things increases the initial investment and learning curve for things that need to be typed (or perhaps you just copy them from somewhere else until it compiles), but ideally, this additional knowledge helps with learning the language long term.

The #1 goal for Rust (edit: this year) is to lower the learning curve. If people hit modules on their learning curve, then that is the reason to make modules simpler to use.

While mod foo {} in itself sounds easy, it doesn't exist in a vacuum. It exists alongside use, and they are not orthogonal features, which makes it confusing, e.g.

  • in the root of the crate you write mod foo;, and can't write use foo;,
  • but elsewhere you write use foo;, and can't write mod foo;.

This is specific to Rust, and it looks weird and arbitrary from the perspective of JS, Python or Java, which don't have this distinction.

I stumbled upon this, and it just added to my frustration with Rust. I did read the book, I did ask on IRC. Eventually I got it, but it was a waste of time for everybody.

Rust can't exist in practice without use, but mod can easily be removed from the list of things a novice has to know in order to use the language.

One thing I've been wondering is whether explicitly telling the compiler which source files to include is really worth all the misery that it causes. Go just includes all source files in a given directory, and it seems to do just fine.

The pub(module) distinction this relies on is very confusing and uses ugly syntax.

I do understand the sentiment that having more refined horizons inside the crate might help refactoring but I get this gut feeling that this is the wrong tool for the job. This is better served by other means such as IDEs, using smaller crates, making concise and small modules and files, etc.

This smells to me exactly like friend classes in C++ which is better avoided in practice.

I think most popular languages stick with a mostly three-part system and I think we could adopt a similar approach:

  1. private - implementation details inside this module
  2. a. something akin to Java's default "package", which is accessible by the parent module and siblings
     b. something akin to C# internal, which is assembly scoped, or crate scoped in Rust land
  3. public - world accessible.

I'm not sure which of the #2 options is better, or maybe we need both. Perhaps we can say that "public" is crate local and the user can "export" APIs with pub use.

I'm a great fan of pub(restricted) where you can make things public on a crate level. I disagree though with the proposal to make pub sth inside a module public to the world. This would be a breaking change. How long ago was Rust 1.0? Not long. I don't want to rewrite my program, especially not for such a minor change that isn't even very useful (in fact I don't like it at all).

Also I don't like the argument "it's confusing for beginners", as for me it wasn't confusing.

In fact, it would be now far more confusing, as the proposed system would introduce a big inconsistency:

mod foo {
    pub struct Thing {} // This is world readable
}

fn foo() {
    pub struct Thing {} // This is not world readable!
}

The rust-lang.org page says: Rust is a systems programming language that runs blazingly fast, prevents segfaults, and guarantees thread safety.

That is basically: safety, performance, concurrency; pick three.

That's the number one goal of Rust, though at this point, it's just splitting hairs. The #1 focus of Rust development in 2017 may be improving the learning curve, but that goal isn't at the expense of everything else. If you want a language that's easy to learn and sacrifices any of the three things above, I'd recommend Lua. :slight_smile:

At a high level:

  1. I don't personally think the Rust learning curve is that bad, and the places where I think it happens to be bad are the places where people made changes to the language to make the learning curve better, like autoderef, as an example.

  2. Even if I did buy into the idea that the learning curve is bad, I wouldn't focus on changes to the language or the module system. That would primarily result in instability and confusion. I'd focus on documentation and other educational initiatives. And where I did make language changes in this vein, I'd have to prove to myself that making the change will improve the experience for the vast majority of Rust users, at all skill levels.

  3. Even if I did believe that changing the language was appropriate and modules were a big problem, I don't see how either of these changes makes it easier for Rust users at all skill levels to learn Rust in the long term.

About mod specifically:

You can write mod foo; in a non-root module; it just refers to src/**/foo/mod.rs or equivalent instead of src/foo.rs or equivalent. You can't write use <identifier>; in the crate root at all, because any individual identifier you can put there is already imported. But that issue is one of name lookup and has nothing to do with mod. mod doesn't do name lookup, it introduces a name.

Rust has 2 core concepts and 1 auxiliary concept:

  1. Declaring names in the current scope: mod foo, fn foo, struct foo, enum foo, etc.
  2. Using names from the current scope: foo::bar, foo(), foo {}, foo::variant, etc.
  3. Using names from the current scope and declaring them as different names in the current scope: use foo::bar, extern crate foo as bar
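
The three cases above can be sketched in one file. This is only an illustration written as if at the crate root; the names shapes, Square, area, square_area and names_demo are hypothetical.

```rust
// 1. Declaring names in the current scope: `mod shapes` introduces `shapes`,
//    and inside it `struct Square` and `fn area` introduce further names.
mod shapes {
    pub struct Square {
        pub side: u32,
    }

    pub fn area(s: &Square) -> u32 {
        s.side * s.side
    }
}

// 3. Taking an existing name and declaring it under a different name here.
use shapes::area as square_area;

pub fn names_demo() -> (u32, u32) {
    // 2. Using names through their path from the current scope.
    let sq = shapes::Square { side: 4 };
    let via_path = shapes::area(&sq);
    // The alias introduced by `use` refers to the same function.
    let via_alias = square_area(&sq);
    (via_path, via_alias)
}
```

Deleting the use line and calling shapes::area everywhere leaves the program valid, which is the sense in which use is a convenience here.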

As far as I can tell, use statements are for convenience. Perhaps they are required in corner cases, but if you're willing to type the entire name every time, you don't need them. Reexporting names using pub use isn't a beginner feature anyways.

If someone doesn't understand the distinction between the three scenarios above, then it doesn't matter what the syntax is, they'll have issues with Rust. And hiding mod from them isn't likely to help or even delay the learning curve in a useful way.

And as for other languages:

JavaScript seems to require module.exports to declare a module. Python seems to require an __init__.py file in a directory and Java requires package statements. They are all different ways of indicating that the module system is in use and they are different from the way Rust does the same. I don't see how they impact this discussion at all, other than to suggest that there are huge inconsistencies and misconceptions between languages and their module systems.

As for whether a novice needs to know mod, it depends on when you want to teach them code organization. When would someone in JS learn about module.exports? Or a Python programmer learn about __init__.py? Or package in Java? None of those things are part of tutorials because code organization isn't usually part of a language tutorial. I don't need mod in Rust to write a single file program as part of a tutorial, so why is it part of the critical path?

And just to be clearer about comparisons to other languages, every language I know that has a module system requires something explicit at the declaration of the module which would not otherwise be required by the language to be valid code. Sometimes this is additional text in the file declaring the code for the module (JS, Java and Lua). Sometimes it's an extra file (Python). Sometimes it's additional text in the file which represents the 'parent' of the module. But in all cases, you can't take a file with normal code and turn it into a module without doing something explicit.

Changing an error unresolved import 'foo' to find the module foo instead does not compromise Rust's safety, performance or concurrency. It does not break backwards compatibility. But it does help users avoid an error and progress with use of Rust further with one concept less to learn.

No, they're essential for traits. Rust relies on them for basic things like file I/O. Rust without use statements is impractical, so teaching of use is unfortunately unavoidable.

However, Rust could be made usable, and taught to beginners, without mod. If use implied mod, then novice users would not need to learn about mod to progress from single-file programs to multi-file programs.

The problem is that other languages have merged/simplified these concepts for modules. You can do const foo = require('./foo') without declaring existence of foo first.

You mention syntax like module.exports and package. These declarations are different from Rust's way of thinking, because they are in the module file. This is another way Rust's module system can be misunderstood. One could put mod foo in foo.rs mistakenly thinking that mod foo marks the file containing it as the module called foo.

mod.rs is analogous to __init__.py and index.js, so that aspect of modules is actually similar to what other languages do. I do not suggest changing that part.

Saying it's not bad for you does not help users for whom it is bad. But making a concept unnecessary does help users who don't understand/misunderstand it.

That's what's called a quality of implementation issue. There could be a warning that notices a foo.rs next to lib.rs and indicates that lib.rs should have mod foo; in it somewhere. And similarly, if lib.rs contains use foo; while there exists a foo.rs with no mod foo; in lib.rs, tell the user that use foo; should be mod foo;. If those warnings don't exist and they aren't in the process of being implemented, then I consider complaining about the usability of the module system to be almost a self-fulfilling prophecy. Obviously if you don't have the best possible warnings and errors guiding the user, it's going to be hard to use/learn. I don't see why we would change the language before addressing the quality of implementation of the current language spec and seeing if that improves usability enough to move on with other things.

UFCS doesn't require use statements. I'm not suggesting that UFCS should be taught to beginners, but my statement is accurate. It's also the case that the most common traits being in the prelude means that at least the initial programs being taught and explained don't require use or UFCS at all.
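
A minimal sketch of that point, using the standard library's std::io::Write trait, which is implemented for Vec<u8>. The function name write_greeting is hypothetical.

```rust
// Note: there is no `use std::io::Write;` anywhere in this file.
pub fn write_greeting() -> Vec<u8> {
    let mut buf: Vec<u8> = Vec::new();
    // Fully qualified call: the trait method is named by its path instead of
    // being brought into scope with `use`.
    std::io::Write::write_all(&mut buf, b"hello")
        .expect("writing to an in-memory Vec should not fail");
    buf
}
```

The method-call form buf.write_all(b"hello") would not compile here without importing the trait, which is the everyday ergonomic cost the other side of this exchange is pointing at.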

If use is taught before multi-file programming is introduced and mod is avoided before that, then multi-file programming can be introduced along with mod while introducing no other new concepts. That seems like the ideal scenario for adding mod to a user's knowledge base with the greatest chance of them not being confused.

While I'm not a JS programmer, my understanding is that in order to do const foo = require('./foo') someone else is required to have typed module.exports = .... The appropriate analogy of that in Rust would be extern crate foo;, which would result in the user being able to use foo, just like in JavaScript. You seem to be trying to compare a JS user who uses another person's module with a Rust user writing their own module. Obviously, those are significantly different user experiences and it is not fair to compare them.

You are correct, it is different than other languages. But then again, Java, Python, Lua and JavaScript all have different models for modules and knowing any given module system prepares one to learn any of the others. Rust isn't special here, so I don't see why comparing it to another language is interesting after stabilization. The best case we could achieve is copying a system from an existing language and making Rust as close as possible, but that would only benefit the learning curve of users with a single language as a background, which is obviously not an interesting goal at this point in the language's evolution.

As a Rust user, I get a 'vote' and an 'opinion' in whatever process is used for community governance, much like you get an opinion and a vote. If people are allowed to say that they find the system complex and hard to learn, I'm allowed to indicate that my experience was the opposite. That's all.

Rust is special in the sense that its module system is unlike any other language, and not in a good way.

Compared to Python, Java, Go, etc., Rust is the only language that requires separate statements to declare the existence of a file and to import it. The closest analogue I can think of is C++, but in C++ #include and namespaces are completely orthogonal, and I don't think anyone would praise C++'s import system anyway.

Anyway, I'm not sure to what extent it can be changed backwards compatibly, but removing the requirement for redundant mod statements should be backwards compatible and is an obvious ergonomics and beginner friendliness win.
