Data point about the new module system learnability and musings about language stability

That is ingenious! Could you open an issue about this?

8 Likes

Related to the stats: the sample size is really too small to draw any firm conclusions, and the tests were not independent - they were in the same course! In cases like this I recommend valuing qualitative evidence, like any shared narrative from the students, over stats.

2 Likes

In the latest revision of Chapter 7 (still only on nightly), we've done exactly that-- we now have a subchapter named "Separating modules into different files" as a result of the feedback in this issue, which is similar to the feedback in this thread.

The latest revision of Chapter 7 does this as well-- when defining modules, we start out by saying they're for organization and for privacy. Your data points validate a lot of the changes we made in the latest revision!

We'll be thinking about the feedback in this thread, thank you all! We did explain modules and files in the way we did deliberately, however, and not because it's "only convenient for the teacher". I really feel like the way the Rust module system works, you should think about the modules first, and then think about the files (by defining modules inline first and then extracting them to files named with the modules). It seems like folks have the most trouble when they put code in files first, and then try to shoehorn their files into the module system.
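A minimal sketch of the "modules first, files second" workflow, not the book's actual listing. The names `instrument` and `clarinet` are only illustrative, and the file names in the comments assume the default Cargo layout:

```rust
// Step 1: define the module inline, e.g. in src/main.rs.
mod instrument {
    pub fn clarinet() -> &'static str {
        "clarinet"
    }
}

// Step 2 (later): move the body between the braces, unchanged, into
// src/instrument.rs and leave only `mod instrument;` in this file.
// The path `instrument::clarinet` stays exactly the same.

fn main() {
    println!("{}", instrument::clarinet());
}
```

The point of the sketch is that callers do not change at all when the module body moves to its own file.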

18 Likes

I think this gets to the heart of the problem: people naturally view files as a separating point, because they provide separation at the filesystem level, and also separation within their editor.

And since other languages rely on the natural intuition of file separation, of course people will carry that intuition over to Rust.

But that intuition of file=module would still exist even if Rust was their first programming language, because that intuition is based upon the coding environment (the filesystem and editor).

In my own personal Rust projects, I use the file=module approach most of the time; it's only in a few select cases that I actually use inline modules.

That's not because I don't understand inline modules; it's just because it's easier and more natural to use files for separation (remember, editors are specifically designed to organize based upon multiple files and folders!).

I think we should do some polls for experienced Rust programmers to see how often they use inline modules (and why they use inline modules). That will help guide how much the book should focus on inline modules vs how much it should focus on files.

8 Likes

I use inline modules for macros. It is very useful that a macro can generate namespaces purely in code, without touching the filesystem in any way.
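A small sketch of what that can look like. The macro `unit_module!` and the module names `kilo` and `mega` are hypothetical, made up for illustration:

```rust
// Hypothetical macro: each invocation expands to an inline module named
// `$name`. No file on disk is involved.
macro_rules! unit_module {
    ($name:ident, $factor:expr) => {
        pub mod $name {
            /// Number of base units in one of this unit.
            pub const FACTOR: u64 = $factor;

            /// Converts a count of this unit into base units.
            pub fn to_base(count: u64) -> u64 {
                count * FACTOR
            }
        }
    };
}

unit_module!(kilo, 1_000);
unit_module!(mega, 1_000_000);

fn main() {
    println!("{}", kilo::to_base(3)); // 3000
    println!("{}", mega::to_base(2)); // 2000000
}
```

With `mod name;` the macro would need a matching file for every generated module, which is why inline modules are the natural fit here.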

2 Likes

It’s probably not about how often inline modules vs separate files are used, but about the mental model.

I remember how I learned Rust modules having prior C++ experience.

  • mod m { ... } is namespace m { ... }, but not as transparent (the basic concept).
  • mod m; is namespace m { #include "m.rs" } (a sugar for out-lining module bodies into separate files).

It worked pretty well.
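A rough illustration of the "not as transparent" part, with hypothetical names (`m`, `visible`, `helper`): unlike a C++ namespace, a Rust module hides its items unless they are marked pub.

```rust
mod m {
    // Reachable from outside as `m::visible`.
    pub fn visible() -> u32 {
        helper() + 1
    }

    // Private by default: callable inside `m`, but `m::helper()` from
    // outside the module is a compile error.
    fn helper() -> u32 {
        41
    }
}

fn main() {
    println!("{}", m::visible()); // 42
}
```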

3 Likes

Oh, no doubt inline modules are useful. But the question is about how we teach modules to beginners, so that naturally excludes macros (which should be taught later, once the student has reached the intermediate level).

Basically, inline modules need to be taught, there's no denying that, but the question is whether files should be taught first (and inline modules later), or the other way around. In other words, how much emphasis should be put on them.

3 Likes

To me, files are a stronger and more fundamental unit than modules, and they definitely were when I was learning modules (failing to get them, being extremely frustrated that they broke as soon as I used multiple files, while the book only showed the inline case, which is too easy and has no practical use for me).

As I've explained in the issue, the current chapter doesn't explicitly state the crucial difference between how modules are split in Rust and most other languages:

If you take an inline module and split it into files (- lines), in Rust it's:

+// Outside module
+pub mod instrument {
-    // Inside module
-    pub fn clarinet() {
-        
-    }
+}

but in C++/PHP/Go, if mod was equivalent to namespace or package, it'd be:

+// Outside module
-pub mod instrument {
-    // Inside module
-    pub fn clarinet() {
-        
-    }
-}

mod m; also maps more or less to JS const m = require("m.rs"), but that construct doesn't have an inline analog.

To me this was problematic, because mod is like namespace in inline examples, but the inline examples don't apply to multi-file case. When files are involved suddenly the analogy is wrong, and it's not 1:1 mapped to namespace, but becomes a different syntactical construct with the namespace being outside of the file, rather than inside.

2 Likes

FWIW, my lightning-fast introduction to modules went something like this:

  • In Rust, functions, structs, traits, and enums are items accessible through a path (e.g. std::mem::replace, String::new(), String)

  • Each item has a visibility that is controlled by the keyword pub (world-visible) or the qualified keyword pub(crate) (crate-visible), which can appear in front of the item (by default an item is private). To use an item through a path, the item must be visible in the current context. (Cue example.)

  • A module is also an item, that contains other items (functions, structs, traits, enums, other modules).

  • The module name appears in the path of the inner items.

  • To use an item contained in a module, both the module and the item must be visible in the current context. So a module lets you restrict the visibility of its inner items to the module’s visibility.

  • To instantiate modules, you can either define them inline, or declare them explicitly, and then define them implicitly in a file with an expected name:

    • Explicit definition:
    // in src/lib.rs
    mod foo { 
        fn bar() {} // path: crate::foo::bar
    }
    
    • Explicit declaration, implicit definition as a file (“implicit” in that the filename is the module name and doesn’t need to be repeated in the file itself)
    // in src/lib.rs
    mod foo; // path: crate::foo
    
    // in src/foo.rs
    fn bar() {} // path: crate::foo::bar
    
    • Explicit declaration, implicit definition as a directory
    // in src/lib.rs
    mod foo; // path: crate::foo
    
    // in src/foo/mod.rs or src/foo.rs
    fn bar() {} // path: crate::foo::bar
    mod sub; // path: crate::foo::sub
    
    // in src/foo/sub.rs
    fn baz() {} // path: crate::foo::sub::baz
    
  • The explicit declaration is needed because the compiler starts from the root source file and adds modules (defined inline or as files) as it discovers them through mod declarations. Not adding files automatically has use cases (such as conditional compilation).

  • Items defined in parent modules are visible to the current module and its child modules. By default, items defined in child modules are not visible to parent modules. Use the pub keyword to give an item a different visibility (still restricted by its module’s visibility).

  • Things to keep in mind about modules:

    • A module cannot be defined in multiple files, but it can have submodules (visibility rules allow for a rough equivalent of modules split over several files)
    • A module can only restrict the visibility of its inner items, not extend it. For an item to be world visible, its entire path must be pub, and the item must be pub.
    • A pub item in a module will only be visible to those who have access to the module.
    • Structs are not the privacy boundary in Rust; modules are. This is especially useful for, e.g., builder patterns and unit testing
    • Modules aren’t compilation units, crates are (well, something something incremental compilation)
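As a rough sketch of a few of these rules in one place (all names here, `shop`, `ledger`, `Order`, `OrderBuilder`, are hypothetical and only for illustration):

```rust
mod shop {
    // Private module: nothing inside it is reachable from outside `shop`,
    // even though `total` itself is marked `pub`.
    mod ledger {
        pub fn total(prices: &[u32]) -> u32 {
            prices.iter().sum()
        }
    }

    pub struct Order {
        prices: Vec<u32>, // private field
    }

    impl Order {
        pub fn total(&self) -> u32 {
            ledger::total(&self.prices)
        }
    }

    pub struct OrderBuilder {
        prices: Vec<u32>,
    }

    impl OrderBuilder {
        pub fn new() -> Self {
            OrderBuilder { prices: Vec::new() }
        }

        pub fn item(mut self, price: u32) -> Self {
            self.prices.push(price);
            self
        }

        // The builder is a different type, but it lives in the same module,
        // so it may fill in Order's private field: the module, not the
        // struct, is the privacy boundary.
        pub fn build(self) -> Order {
            Order { prices: self.prices }
        }
    }
}

fn main() {
    let order = shop::OrderBuilder::new().item(3).item(4).build();
    println!("{}", order.total()); // 7
    // shop::ledger::total(&[1]); // error: module `ledger` is private
    // order.prices;              // error: field `prices` is private
}
```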

Granted, this is more “cheatsheet-level” material, but I find the distinction between declaration and definition useful, and it works by analogy to how items can be declared and defined in different places in C++. Nowhere do I draw the parallel to C++ namespaces, because I think they are an ill fit for Rust modules. Namespaces don’t control visibility and can be defined part by part (in multiple files or even within the same file). They are more a kind of name-mangling sugar, IMO.

9 Likes

I just joined this community and would love to join this tutorial class. I don’t know how far you have gotten or what the chances are of me joining.

@antzshrek my lectures are available on YouTube (https://www.youtube.com/playlist?list=PLlb7e2G7aSpTfhiECYNI2EZ1uAluUqE_e), but they are in Russian. If you want to learn Rust, the best way is probably to follow the Rust Book.

Hmm, from reading the whole thread at once, one of the main issues seems to be the following: splitting code across multiple files implies more namespacing / longer item paths.

Maybe the following “trick” should be noted in the documentation:

  • the single file path/to/list/mod.rs

    //! path/to/list/mod.rs
    
    pub
    struct List {
        // ...
    }
    
    impl<'list> IntoIterator for &'list List {
        type IntoIter = ListIter<'list>;
        // ...
    }
    
    pub
    struct ListIter<'list> {
        list: &'list List,
        // ...
    }
    
    impl<'list> Iterator for ListIter<'list> {
        // ...
    }
    
    impl<'list> ExactSizeIterator for ListIter<'list> {
        // ...
    }
    
  • can be split into two files without hurting ergonomics:

    //! path/to/list/mod.rs
    
    pub use self::iter::*;
    mod iter;
    
    pub
    struct List {
        // ...
    }
    
    impl<'list> IntoIterator for &'list List {
        type IntoIter = ListIter<'list>;
        // ...
    }
    
    //! path/to/list/iter.rs
    use super::*;
    
    pub
    struct ListIter<'list> {
        list: &'list List,
        // ...
    }
    
    impl<'list> Iterator for ListIter<'list> {
        // ...
    }
    
    impl<'list> ExactSizeIterator for ListIter<'list> {
        // ...
    }
    

This way, item paths don’t “grow” when splitting code into multiple files.
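For anyone who wants to try this without creating files, here is a single-file sketch of the same re-export pattern, with an inline `mod iter` standing in for iter.rs. The `items` and `pos` fields, the `i32` element type, and the two `new` constructors are assumptions added only to make the sketch compile:

```rust
mod list {
    // Re-export everything from the private submodule, so callers write
    // `list::ListIter` rather than `list::iter::ListIter`.
    pub use self::iter::*;

    pub struct List {
        items: Vec<i32>,
    }

    impl List {
        pub fn new(items: Vec<i32>) -> Self {
            List { items }
        }
    }

    impl<'list> IntoIterator for &'list List {
        type Item = &'list i32;
        type IntoIter = ListIter<'list>;

        fn into_iter(self) -> ListIter<'list> {
            ListIter::new(self)
        }
    }

    // Stand-in for path/to/list/iter.rs.
    mod iter {
        use super::*;

        pub struct ListIter<'list> {
            list: &'list List,
            pos: usize,
        }

        impl<'list> ListIter<'list> {
            pub(super) fn new(list: &'list List) -> Self {
                ListIter { list, pos: 0 }
            }
        }

        impl<'list> Iterator for ListIter<'list> {
            type Item = &'list i32;

            fn next(&mut self) -> Option<Self::Item> {
                // A child module may read the parent's private field.
                let item = self.list.items.get(self.pos)?;
                self.pos += 1;
                Some(item)
            }
        }
    }
}

fn main() {
    let numbers = list::List::new(vec![1, 2, 3]);
    // The iterator type is named through the short, re-exported path.
    let it: list::ListIter<'_> = (&numbers).into_iter();
    let sum: i32 = it.sum();
    println!("{}", sum); // 6
}
```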

3 Likes

Do you have any plans to translate your lectures into English? I watched some of them and really liked them. I think a lot of Rust beginners would appreciate them.

I do have vague plans for this, but no promises!

3 Likes

I’ve talked a bit with students, and it looks like the main hurdle is lib.rs/main.rs confusion. That is, in the simple cargo layout, bin and lib crates live in the same directory, so it’s unclear when one should mod and when one should use.

I wonder if this particular problem is exacerbated in the 2018 edition by the fact that there’s no extern crate anymore, and main depends on lib implicitly (that is, it’s not specified in Cargo.toml).

The tooling fix here would be to tweak Cargo’s default layout and conventions to make sure that different crates live in disjoint directories.

4 Likes

It might also make sense to have bins and libs in different directories somehow.

src/bin/foo.rs works already, and it’ll pick modules from a different directory.

So for a crate with both bin and lib, that seems like a sensible layout:

src/lib.rs
src/modules_for_lib.rs
src/bin/main.rs
src/bin/modules_for_bin.rs

However, it’d be a bit odd to have cargo new always create src/bin/main.rs by default. For a bin-only project that seems like an unnecessary directory nesting.

I have personally always found that layout confusing (though I do use it) because it is weird that bin needs to import the crate that is in the parent directory.

I would rather something like this:

src/lib.rs                  // or maybe lib/lib.rs?
src/modules_for_lib.rs
bin/foo.rs
bin/modules_for_foo_bin.rs

If mod.rs were still common, I’d suggest lib/mod.rs and bin/mod.rs.

As is, this is a mostly pointless post.

I tend to specify all my bin paths explicitly personally anyway, unless it’s just a bin crate.
