[lang-team-minutes] the module system and inverting the meaning of public

So @withoutboats and I had a pretty interesting chat today about the module system and privacy, trying to dig down to first principles. I’d like to summarize that here.

A rational reconstruction of the status quo

First, the module system currently couples together three concerns:

  • Namespacing
  • Privacy scopes
  • Interaction with the file system

While in principle you could separate these concerns, there are a lot of advantages to tying them together. (And of course, there’s some flexibility here, given that inline mod declarations are a thing.)

One consequence of this coupling, though, is that when there’s a mismatch between these dimensions, you have to fight the module system a bit. For example, it’s pretty common to have a submodule in a file which defines a single public type, along with a bunch of private code – where the module itself isn’t exported, and instead the public type is reexported at a different path. In that case, there’s a mismatch between what you want for privacy/file system – i.e., a separate file defining its own scope of privacy – and what you want for naming, which is to export the name at some other path.

For these kinds of mismatches we tend to use “facades” and other patterns. Generally these all involve making a submodule private, while some of its contents are public and re-exported. This is basically a “design pattern”, a way of using the tools of our module system to achieve a certain goal.
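To make the facade pattern concrete, here is a minimal sketch in today's Rust. The names circle and Circle are made up for illustration, and an inline module stands in for what would normally be a separate file.

```rust
// Private submodule: gives `Circle` its own privacy scope. In a real crate
// this would typically live in its own file (say, `circle.rs`).
mod circle {
    pub struct Circle {
        radius: f64, // private field: only code inside `circle` can touch it
    }

    impl Circle {
        pub fn new(radius: f64) -> Self {
            Circle { radius }
        }

        pub fn area(&self) -> f64 {
            square(self.radius) * std::f64::consts::PI
        }
    }

    // Private helper, invisible outside `circle`.
    fn square(x: f64) -> f64 {
        x * x
    }
}

// The facade: the module itself stays private, but the type is re-exported
// here, so it is named via the parent rather than as `circle::Circle`.
pub use circle::Circle;

fn main() {
    let c = Circle::new(2.0);
    println!("area = {}", c.area());
}
```

Note that nothing inside the circle module tells you where Circle ends up being visible; you have to find the pub use in the parent to know.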

One consequence of this design pattern is that pub has two distinct meanings:

  • The item is world-visible
  • The item is defined “in the wrong place” and re-exported elsewhere, but you need to trace the re-exports to discover its visibility.

The pub(restricted) model makes the most sense with the first meaning of pub, since the (restricted) part is supposed to decrease the level of publicity. But because of re-exports, it doesn’t work out perfectly.
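As a rough illustration of how the restricted forms narrow visibility while a plain pub can still be moved around by re-exports, consider the sketch below. The module and function names (engine, parts, torque, rpm, label) are hypothetical.

```rust
mod engine {
    pub mod parts {
        // Visible only in the parent module `engine` and its descendants.
        pub(super) fn torque() -> u32 {
            40
        }

        // Visible anywhere in this crate, but never to other crates.
        pub(crate) fn rpm() -> u32 {
            3000
        }

        // Plain `pub`: how far this reaches depends on the enclosing
        // modules and on any re-exports.
        pub fn label() -> &'static str {
            "v8"
        }
    }

    pub fn power() -> u32 {
        // OK: we are inside `engine`, so `torque` is in reach.
        parts::torque() * parts::rpm()
    }
}

// `engine` is private, yet `label` becomes nameable from the crate root
// through this re-export.
pub use engine::parts::label;

// A re-export cannot widen a restricted item. Uncommenting the next line
// is a compile error, because `torque` is only visible within `engine`:
// pub use engine::parts::torque;

fn main() {
    println!("{} {} {}", label(), engine::parts::rpm(), engine::power());
}
```

So the restricted items behave as an upper bound, whereas the reach of label is only known once you have traced the re-exports.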

What this means for module system improvements

There are a fair number of ideas in flight for how to simplify and streamline the module system. I think that what @withoutboats has been working toward is basically doubling down on the three-way coupling mentioned above.

Here’s a strawman proposal bringing these pieces together (which came out of discussion with @withoutboats):

  • Introduce implicit modules, as per the original post. In particular, discover modules via the filesystem organization, implicitly introducing appropriate mod declarations. (A file ty.rs leads to an implicit mod ty; declaration, modulo privacy, discussed next.)

  • Default the visibility of implicit modules to be the maximal visibility of the items they transitively contain. Again, this is as per the original post.

  • Keep the semantics of pub(restricted) as they are today, in particular the <= interpretation of privacy. At this point there’s widespread agreement that <= privacy is very important to provide, and that pub(restricted) gives a good model for specifying it.

  • Introduce #![internal] as a module-level attribute that can be applied to implicit modules. It has the effect of setting the visibility of that module as pub(crate), regardless of the items it contains.
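Here is a sketch of how I would expect the strawman to read if you wrote out the implied declarations by hand in today's Rust. Keep in mind that #![internal] and implicit modules are only proposals, so nothing below uses them; the file names util.rs and raw.rs, and all the item names, are invented for illustration, with inline modules standing in for files.

```rust
// Hypothetical file `ty.rs` containing a `pub` item. Under the strawman the
// implicit module takes the maximal visibility of its items, i.e. `pub`.
pub mod ty {
    pub struct Ty(pub u32);
}

// Hypothetical file `util.rs` whose most visible item is `pub(crate)`, so
// the implicit module would come out as `pub(crate)` as well.
pub(crate) mod util {
    pub(crate) fn double(x: u32) -> u32 {
        x * 2
    }
}

// Hypothetical file `raw.rs` beginning with the proposed `#![internal]`:
// the module is pinned to `pub(crate)` even though it contains a `pub` item.
pub(crate) mod raw {
    pub struct Handle(pub u32);
}

// The facade is then completed by an explicit re-export in the parent.
pub use raw::Handle;

fn main() {
    let t = ty::Ty(util::double(21));
    let h = Handle(t.0);
    println!("{}", h.0);
}
```

In the first two cases the visibility written on each item is exactly the visibility it ends up with. Only the third module needs a re-export, and under the proposal the attribute at the top of the file would be the signal to go looking for it.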

What are the implications of this design?

  • You get = privacy by default. In particular, if you don’t use #![internal], the stated visibility of an item is its precise visibility.

  • You always get <= privacy, as was the intent with pub(restricted) and our overall privacy system.

  • You can express the facade pattern, by using #![internal]. However, the fact that you write this attribute helps mitigate some of the downsides of the facade pattern: a reader of the code gets a visible, local heads-up that the actual visibility of the items is going to be determined by a super module that re-exports them (though bounded by their local restriction). In other words, we’ve made the design pattern a bit more first-class, and given you a way to write down your intent. But the attribute doesn’t introduce anything fundamentally new; it’s just a way of specifying pub(crate) for the module.

  • Increases the coupling between files and namespaces, by discovering modules directly from the filesystem. (It almost always works that way in practice today, but you have to explicitly set it up. We can make it so much simpler and smoother.)

  • Increases the coupling between privacy and the rest of the system, by automatically setting the visibility of implicit modules based on their items.

Where this all heads, in my mind, is an approach where the module system just sort of disappears into the background. You don’t really think about modules at all; you just think about paths, files, and items contained in them. Things like the facade pattern, while still expressible, are a clearly delineated and more “advanced” feature, so you can start with a simpler mental model (where you get = privacy everywhere) and later learn how to tweak it with #![internal]. I think there’s a real chance of going from “Rust’s module system is complex and hard to learn” to “What module system? I just write code where it belongs and get the namespacing and visibility I would expect”.

To the extent that things are more “implicit” here, I would claim that we are just using already-explicit information that would otherwise be redundant. In the vast majority of cases, the module hierarchy exactly mirrors the filesystem hierarchy; why force you to repeat that structure? It can’t be for the sake of explicitness – it’s already explicit, just represented in a different way. Likewise with privacy: the appropriate privacy of a module is usually implied by the privacy of its items. And when that’s not right, we actually give you greater explicitness by allowing you to express your intent via #![internal].

This also doesn’t change any of the scoping rules; you still have to use an item to gain access to it. So all bindings for a module are discoverable within the file defining it.
