Data point about the new module system learnability and musings about language stability

matklad · April 5, 2019, 11:38am

I am teaching a Rust Course at the moment, and we’ve just conducted a mid-course survey, which included “what is the most difficult topic” question. The lecture about modules is the unfortunate leader:

Which of the topics are unclear?

Week 4, Modules — 4/7 [57%]
Week 1, Introduction — 0/7 [0%]
Week 2: Lifetimes, ADT — 0/7 [0%]
Week 3: Traits — 0/7 [0%]
Week 5, Functions and Iterators — 0/7 [0%]
Week 6, Error management — 0/7 [0%]

There’s also at least one student who spent hours trying to split code into two files before asking a question.

Note that I explicitly covered only 2018 flavor of modules: no extern crate, no ::abs::paths, no mod.rs.

This is a small amount of data, of course (there are about 50 students on the course, about 30 took the survey, and 7 answered this particular question), but it seems trustworthy to me.

The immediate conclusion is that modules are still hard to learn.

However, the thing that worries me is that module system changes produced the most churn when transitioning to 2018, and the changes didn’t seem to make the situation drastically better (here I can be proven wrong by providing more data). See also this reddit thread that shows another aspect of this “change, but not for the better”: https://www.reddit.com/r/rust/comments/b65ecn/modrs_in_rust_2018_yes_or_no/.

I wonder if we could design the language-design process better to avoid similar situations?

This is an extremely sensitive topic, but I personally feel that the current language-design process is relatively biased towards changing the language, rather than towards rejecting features and keeping the language stable (we do reject more RFCs than we accept, so “unbiased” in this case is not 50/50). So I see this particular data point as a weak confirmation of my feelings.

repax · April 5, 2019, 12:05pm

I don’t know. 4 people out of 30 or even 50 cared enough to state that the topic was hard to follow? Perhaps you could try to interview them?

Centril · April 5, 2019, 12:21pm

Well, at least there's one positive data point there... Everything else was clear. (..or you are an excellent teacher, that could also be the reason...)

What were the reasons this student failed during these hours? Did they not ask questions about other topics as well?

It is hard to draw any such conclusions from your data point. For example:

- Would the number have been higher than 4 with the old module system?
  - There were many data points that the old system was confusing, so the relative change is what is interesting to the question "was it a success?", not absolute numbers.
- Were there particular parts that were more unclear? E.g. was it mod.rs that was unclear? uniform_paths? The lack of extern crate? Or something else?

As the reason for your post is a data point, I think in similar situations of "how can we make X easier to learn?" the answer is to gather more statistically significant data in the language design process.

vorner · April 5, 2019, 12:35pm

I don’t know about learnability. But after switching the mindset to the new one, I find it more comfortable to use and reason about.

Additionally, I’m working on a work project with another person. This project is his first bigger Rust code base (like, just after these small exercises while learning it, so actually the first real Rust code). When transitioning to 2018 epoch he expressed that he likes the new one much more than the old and that it is much more predictable for him.

So while I’m very conservative when it comes to language changes, I think this one actually turned out quite nice.

I believe module systems have the problem that people come with a preconceived idea of how they should look from other languages (unfortunately, each one with a different idea from a different language) and are confused because Rust does it differently, but not that much differently. Lifetimes are hard, but there’s nothing to unlearn about them. Also, because it was a single lesson, this could be influenced by other factors (e.g. the coffee machine not working that day).

dhm · April 5, 2019, 11:29pm

Imho we should incentivise and teach using crate paths with the leading :: (even when no ambiguity is involved), it is far clearer to distinguish crates from modules this way.

sanxiyn · April 6, 2019, 6:55am

My hypothesis is that neither the Rust 2015 nor the Rust 2018 module system is hard to learn, but people don’t expect to spend any time greater than zero learning a module system. They think, of course, it should work just like Python, Java, Go, fill in the blank here. That’s why they spend hours before asking anything: they don’t think there’s anything to learn, so they don’t ask.

In particular, in my experience, many people never even imagined the possibility of a module system not being tied 1:1 to the file system. It just never occurred to them. In my opinion, given the extremely low expectation of time investment by students, the only way to really fix learnability here is to give up and use a 1:1 filesystem model. I am extremely against such a change, so I don’t think learnability is fixable, except by correcting the low expectation.

gbutler · April 6, 2019, 1:52pm

The really odd thing here is that neither Java nor C# are one-to-one with packages/modules and the file-system but almost everyone thinks it is. Rust can be one-to-one if you want, just like Java/C#, but, doesn’t have to be, just like Java/C#. I’m amazed how many people think Java/C# has one-to-one modules with the filesystem and think it is bad that Rust isn’t one-to-one. It kinda just makes me shake my head.

samsieber · April 6, 2019, 1:56pm

I’d wonder if part of the issue is that they googled for things and found things referencing the old module system. In other words, perhaps they were actually confused about more than modules, but were able to resolve those concerns more easily because there was less churn in the other areas than in modules.

I’d be really interested in finding those that have issues with modules, help them learn it, and then clearly identify what made it click. Hopefully, the reasons would fall into clear categories, or maybe even just a single cause.

kornel · April 7, 2019, 12:09am

One thing that is definitely drastically better is the ability to refer to std::foo inside modules.

In my limited experience teaching Rust I've also seen that the mod items in particular are confusing. If I explain to people that it's the same as struct/enum/fn, it seems to help. But my impression is that people expect modules to just exist without any declaration.
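
A minimal sketch of that framing, in case it helps (Point, origin, geometry and manhattan are made-up names for illustration): the module is declared in the same way as the struct and the function next to it, and nothing exists until it is declared.

```rust
// `struct`, `fn`, and `mod` are all items: each one declares a name
// inside the enclosing module.

struct Point {
    x: i32,
    y: i32,
}

fn origin() -> Point {
    Point { x: 0, y: 0 }
}

// An inline module, declared just like the items above.
// Writing `mod geometry;` instead would declare the same module,
// with its body taken from geometry.rs (or geometry/mod.rs).
mod geometry {
    pub fn manhattan(x: i32, y: i32) -> i32 {
        x.abs() + y.abs()
    }
}

fn main() {
    let p = origin();
    // Moves the origin to (3, -4) and prints its Manhattan length: 7
    println!("{}", geometry::manhattan(p.x + 3, p.y - 4));
}
```

Deleting the `mod geometry` item makes `geometry::manhattan` disappear in exactly the way deleting `struct Point` makes `Point` disappear.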

mark-i-m · April 7, 2019, 2:01am

Meaning that if we removed the need to write mod foo;, people would struggle less?

sanxiyn · April 7, 2019, 3:54am

Yes, I believe people will struggle less if mod foo; is directly inferred from the filesystem. I am strongly against it: that way lies madness.

rpjohnst · April 7, 2019, 6:02pm

The alternative that actually matches other languages is to move mod foo into the build system, instead of interspersing it throughout the program source.

People coming from compiled languages are already familiar with “adding source files to the project” or similar, and people coming from interpreted languages often have to do something similar (e.g. Python __init__.py).

bill_myers · April 7, 2019, 10:40pm

There’s also the fact that in several other languages (e.g. Java, C#, C++ convention), directories create modules and files just add their contents to the directory’s module.

I think this might be better for Rust too, and also solves the “mod.rs” problem since it doesn’t matter what filename you use for it.

This change, plus automatically including any “.rs” files recursively as already pointed out in the previous comments, might make the module system much more approachable and reduce the need for boilerplate (namely “mod xxx;” and “pub use filename::” entries).

CAD97 · April 7, 2019, 11:15pm

The problem with any solution that gets rid of mod statements is that you no longer can have a difference between pub mod and pub(self) mod (or more commonly just mod).

I agree that a solution that “just” made any pub symbol in a folder public to the world at that folder path would make the module system easier to learn. Or rather, to not learn. But it also loses what makes the module system powerful.

The power of Rust’s module system comes from being able to separate where code is from where code is exposed. It comes from the emergent complexity of declaring modules as “first class” constructs in the language rather than just “where code is”, and re-exporting to fit the desired API.
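
A rough sketch of what I mean by that separation (internal, parsing, api and parse_number are hypothetical names, and inline modules stand in for separate files):

```rust
// Where the code lives: a private module tree, nested however is
// convenient for the implementation.
mod internal {
    pub mod parsing {
        pub fn parse_number(s: &str) -> Option<i64> {
            s.trim().parse().ok()
        }
    }
}

// Where the code is exposed: one flat, public-facing module.
pub mod api {
    // Re-export the item under the path callers should see.
    pub use crate::internal::parsing::parse_number;
}

fn main() {
    // Callers use the short path and never name `internal::parsing`.
    println!("{:?}", api::parse_number(" 42 "));
    println!("{:?}", api::parse_number("nope"));
}
```

The implementation can be moved around inside `internal` without changing the `api::parse_number` path, which is the part that would be lost if file location alone decided what is exposed.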

Incremental changes that could improve things:

- Warn on .rs files that aren’t in the compilation (maybe even go ahead and check them based on if they were included “simply”)
- Tooling to separate “where the file is” from “where the symbol is exposed” entirely
- Auto use as seamless as IntelliJ IDEA’s for Kotlin/JVM

Changes that remove power of the system today in favor of ease of use:

- Make files automatically pub mod exposed
- Make files automatically mod exposed but allow pub mod
- Make paths automatically pub mod included iff they have pub definitions

I fully back the idea that the module system is “too complicated” because it exists. We don’t have 1:1 exposed symbol location <-> code location. Being able to organize code how you want is good.

Users go into the module system expecting it to work however their last language worked. If they only know one language, if it’s different (it is), they’re going to have issues because it’s not expected to be a pain point. If they’ve used multiple languages with different ways of including code, they’re more likely to adapt to how Rust does things, but the stumbling point is that there’s anything to learn at all, and given the number of existing different ways to do it, there’s no way to make it intuitive to everyone.

comex · April 8, 2019, 8:25am

matklad · April 8, 2019, 8:37am

Note that rustc can’t do that, b/c it doesn’t know what are the compilations, it works with a single crate at a time. RLS maybe can do that, by looking at the depinfo files rustc/cargo produces.

IntelliJ does warn about unused files, and it also automatically creates a foo.rs file in the appropriate location for a mod foo; declaration. rust-analyzer does the latter as well, but it doesn’t issue warnings yet (might be a good-first-issue, if the LSP supports file-level warnings).

In general, I fully support the meta-point that good tooling might drastically change the calculus around what is hard and what is easy in the language. I bet a significant chunk of Kotlin programmers don’t know what the syntax for importing stuff is, or what the rules around the visibility of extension functions are, simply because imports are fully managed by the IDE and are always folded by default.

Nemo157 · April 8, 2019, 8:57am

This could cause issues with crates that conditionally include different modules for platform support, e.g. https://github.com/carllerche/iovec/blob/master/src/sys/mod.rs, unless rustc was smart enough to look through disabled cfg attributes for defined mod's and ignored them automatically (although that seems impossible in general as the code behind the cfg attribute might not compile enough to find the mod statements).
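
For anyone unfamiliar with the pattern, here is a minimal sketch (not taken from iovec; sys and platform are placeholder names, and inline modules stand in for the per-platform files):

```rust
// Only one of these two declarations survives cfg-stripping, so the
// crate always ends up with exactly one module named `sys`.
// In a real crate these would usually be out-of-line declarations
// pointing at separate per-platform files.
#[cfg(unix)]
mod sys {
    pub fn platform() -> &'static str {
        "unix"
    }
}

#[cfg(not(unix))]
mod sys {
    pub fn platform() -> &'static str {
        "not unix"
    }
}

fn main() {
    println!("compiled the {} backend", sys::platform());
}
```

A check that flags every .rs file missing from the current compilation would report the file behind whichever declaration is disabled on the host platform.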

orthoxerox · April 8, 2019, 10:31am

I can confirm that the module system is the most confusing feature of Rust. Lifetimes and borrowing are complex, but not confusing, as you have no existing knowledge to confuse you. You can spend hours bashing your head against a compiler error, but when you finally fix it, the fix makes total sense.

But modules are something that exists in every language, so you have to fight your intuition. If you treat mod foo; as C’s #include "foo.h" you are in a much better starting position than someone who has C# or Java experience. Some of the most confusing aspects I’ve fought with are listed below, in the order of decreasing annoyance:

- I can use any visible item of an external crate anywhere in my code as soon as I reference it in Cargo.toml (thank you, Rust 2018). I cannot use any visible item of the current crate without importing its specific module.
- There’s no way to split a file in two without creating a new module. Java enforces a very rigid structure in practice (directory=package≈Rust module, file=class≈Rust struct and its impls), which is suboptimal if your classes are tiny, but Rust’s file=module approach is suboptimal when your impls are large. C# is extremely flexible and can’t be compared with Rust at all.
- mod foo {...} defines a new module, mod foo; references an external one. use imports a name into your module, pub use exports it as well.
- You have to explicitly mod your modules if you want them to be included in the compilation. Again, if you treat mod as #include, it kinda makes sense; if you come from a language where the compiler just globs a directory for files, it doesn’t.
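
To make the third point concrete, here is a small sketch of the four forms side by side (text, greet, shout, whisper and hello are made-up names):

```rust
// `mod text { ... }` defines a module inline; `mod text;` would declare
// the same module but take its body from text.rs.
mod text {
    pub fn shout(s: &str) -> String {
        s.to_uppercase()
    }

    pub fn whisper(s: &str) -> String {
        s.to_lowercase()
    }
}

mod greet {
    // Plain `use`: `shout` is in scope here, but it is not part of
    // `greet`'s interface.
    use crate::text::shout;
    // `pub use`: `whisper` is in scope here and is also re-exported,
    // so other modules can write `greet::whisper`.
    pub use crate::text::whisper;

    pub fn hello(name: &str) -> String {
        shout(&format!("hello, {}", name))
    }
}

fn main() {
    println!("{}", greet::hello("rust")); // HELLO, RUST
    println!("{}", greet::whisper("QUIET")); // quiet
    // greet::shout("x"); // does not compile: that import is private to `greet`
}
```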

zrk · April 8, 2019, 12:30pm

As another datapoint, I gave a few crash courses on rust at my company (C++ and python developers), and the modules were one of the things on which people stumbled at first.

I took an approach where I explained "how compilation works" so to speak, with the idea that a crate (~compilation unit) has a root src file that can define inline modules or declare modules defined in other files. A key point to mention is that modules are the privacy boundary in rust, and the fact that this more or less aligns with files is intuitive.

It helped to compare with what python does, and to start directly by explaining the 2018 module system.

BTW, I'm convinced (and so are my colleagues once they finally grasped the concepts behind modules) that the module system of rust is extremely well done and one of the best points of the language (especially when coming from C++). Having private reexport by default, access to parent modules' private stuff from child modules, and foo.rs + foo directory for a foo module with submodules are particularly good points.

IMO when teaching modules we must emphasize that it is an important part of rust (by coupling it with their role as privacy boundary), that other languages all have different module systems, and that it is a difficult problem to solve "optimally" (at least IMO it is).

I'm not sure I understand? Seems to me that you can?

//! any_module.rs
use crate::foo::bar::Baz; // assuming that Baz is visible to the current module
fn bleh() -> Baz {
    Baz::new()
}

The above should work. Of course, you have to declare your module foo in the root source file of your crate, and the submodule bar in the foo module, but you don't have to "import" foo and bar into any_module to use them?
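
Here is a single-file version of the same thing that can be pasted into a playground. The inline modules stand in for foo.rs, foo/bar.rs and any_module.rs, and the id field is only there so that there is something to print:

```rust
// Stand-ins for foo.rs and foo/bar.rs: the crate root declares `foo`,
// and `foo` declares `bar`.
mod foo {
    pub mod bar {
        pub struct Baz {
            pub id: u32,
        }

        impl Baz {
            pub fn new() -> Baz {
                Baz { id: 1 }
            }
        }
    }
}

// Stand-in for any_module.rs: a sibling of `foo` that was never handed
// `foo` or `bar`, yet can name `Baz` through its crate path.
mod any_module {
    use crate::foo::bar::Baz;

    pub fn bleh() -> Baz {
        Baz::new()
    }
}

fn main() {
    println!("{}", any_module::bleh().id);
}
```

The `use` line is only a shorthand: writing `crate::foo::bar::Baz::new()` directly in `bleh` should work as well.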

I don't really understand why. Since submodules have access to the private parts of their parent modules, it is never a problem for me to create submodules of a module to split the implementation. The privacy system is then very flexible to export the items of the submodules to pretty much any visibility (private=implementation detail, public to parent module=internal module interface, public to crate=internal library interface, public to the world=external library interface)
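
A sketch of those visibility levels in one place (shape, circle and the function names are invented for the example):

```rust
mod shape {
    // Private: an implementation detail of `shape`, but still
    // reachable from its submodules.
    fn square(x: f64) -> f64 {
        x * x
    }

    mod circle {
        // Public to the parent module only: internal module interface.
        pub(super) fn area(r: f64) -> f64 {
            // A child may call the parent's private helper.
            std::f64::consts::PI * super::square(r)
        }
    }

    // Public to the whole crate: internal library interface.
    pub(crate) fn circle_area(r: f64) -> f64 {
        circle::area(r)
    }

    // Public to the world, provided `shape` itself is reachable:
    // external library interface.
    pub fn unit_circle_area() -> f64 {
        circle_area(1.0)
    }
}

fn main() {
    println!("{}", shape::circle_area(2.0));
    println!("{}", shape::unit_circle_area());
}
```

The implementation of `shape` is split across a submodule here without anything extra leaking out of it.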

This explicitness seems good to me, as it allows for more control (like was reported in this thread). An example in this thread was to conditionally include modules depending on the configuration. Another one was to be able to set the visibility of the imported module. Python does the whole "implicit import" thing, and it forces us to prepend our module names with _ to signal that they are private. It makes more sense to me to have nothing imported by default, import private with mod foo; and import with another visibility with the pub keyword.

orthoxerox · April 8, 2019, 1:20pm

The need to declare the modules is probably what tripped me up. I expected the parent module to pick up its submodules automatically etc.

Let's say you have structs Foo and Bar that refer to each other, because they form a FooBar graph. Both implement a whole bunch of traits. For me the natural impulse is to create foo.rs and bar.rs, but refer to them as a single module. Maybe split out debug.rs or a similar file for trait impls that have no important business logic.

I am not saying that Rust's module system is bad or that it is worse than others. It is, however, confusingly different. People come to Rust expecting the compiler to swear at them about their poor memory management skills. No one expects module management to be a stumbling block.

If the current system is deemed good enough and proposals like aturon's are off the table, it really needs some documentation love. That first quote of yours in my reply is really a thing the docs must highlight.
