[lang-team-minutes] the module system and inverting the meaning of public

I really like the idea of having mod be redundant by making use more versatile … much better than implicit modules, which I think would introduce even more complications (already discussed in this thread) than they try to solve. Implicit modules might make the learning curve for beginners slightly better, but learning all the details of the module system would naturally become more difficult the more exceptions and special cases there are (which would probably be required because of backwards compatibility).

However, there are some problems and unresolved questions I see with this, and I’m not sure how to solve them:

  • Since use requires absolute paths (or self) you have to write use self::foo; to import foo.rs unless you are in the crate root. Since that would be the replacement of what mod foo; does today, it is unfortunate that it becomes longer. Also, does this implicit importing also happen when the use path includes super (files at arbitrary locations could be referenced that way)?
  • The behavior around glob imports has to be defined somehow. Probably use self::foo::* would mean that foo.rs or foo/mod.rs is imported, but potential submodules (foo/*.rs) are not imported, unless they are explicitly used elsewhere. Alternatively this could enable implicit loading/discovery of submodules, but that’s probably a bad idea.
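To make the glob question concrete, here is a minimal sketch of how glob imports resolve names today. Inline modules stand in for foo.rs and foo/bar.rs (the names `top`, `nested`, and `consumer` are made up for illustration), so this only shows name resolution, not the file discovery that the proposal would add.

```rust
// Inline modules stand in for foo.rs and foo/bar.rs.
mod foo {
    pub fn top() -> &'static str {
        "foo::top"
    }

    pub mod bar {
        pub fn nested() -> &'static str {
            "foo::bar::nested"
        }
    }
}

mod consumer {
    // The glob brings in foo's direct public items: `top` and the name `bar`.
    use super::foo::*;

    pub fn call_both() -> (&'static str, &'static str) {
        // `nested` is not in scope by itself; it has to be reached through `bar`.
        (top(), bar::nested())
    }
}
```

Under the first reading in the bullet above, a glob would behave the same way with files: the glob would load foo itself, while the contents of foo/bar.rs would only matter once something names bar.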

As far as pub & friends are concerned, I really like the way it currently is, namely that pub means: make public to the parent and let the parent decide. It took me a while to understand this, coming from C#, but it fits the Rust philosophy of enabling “local reasoning”: I can look at a module to see what (public) child modules and items it exposes. I can look at the root module to see (roughly) what the public API will look like.
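A small sketch of the reading I have in mind, using hypothetical names (`shapes`, `circle`, `area`, `helper`): the child marks items pub, and the parent picks which of them go any further.

```rust
mod shapes {
    // Private child module: `pub` inside it only offers items to `shapes`.
    mod circle {
        pub fn area(r: f64) -> f64 {
            std::f64::consts::PI * r * r
        }

        pub fn helper() -> &'static str {
            "not re-exported"
        }
    }

    // The parent decides: `area` is re-exported, `helper` stays internal.
    pub use self::circle::area;

    pub fn describe() -> &'static str {
        circle::helper()
    }
}
```

From outside, `shapes::area` is reachable but `shapes::circle::helper` is not, even though both are declared pub.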


But that's not what it means currently.

I don't like this idea. Can't tell you how many times I created a file called email.py and got confused why from email.mime.text import MIMEText suddenly stopped working in another file.


I feel that this backwards compatibility approach is problematic (but it seems like it’s the best we can do given stability, which is a point against the proposal, even though I do like parts of it in isolation). With this:

  • We end up with two incompatible module systems (if they can be made compatible or mostly compatible, my concern here evaporates)
  • We have to teach both (the old one can move to an Advanced section, I guess, but then newbies will have trouble understanding their dependencies’ source unless/until the authors convert them)
  • Cargo decides to switch between them by a magic heuristic

The last one worries me the most. I don’t want to create a file in the wrong place, and suddenly Cargo decides my crate is using the old module system and everything breaks. Doubly so if I learned Rust after this switch and I don’t even know there was an old module system.

This seems a bit like the dyn Trait proposal, where we can technically keep things compatible by switching based on a heuristic, but really it is not in the spirit of stability for Rust 1.0.

Another thing that occurs to me is that this could largely be prototyped with a program that converts an entire src/ tree from new module system to current module system, by generating mod statements, munging use and removing #![internal]. This might help to see how ergonomic it is.
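As a rough sketch of the first step of such a converter, the function below takes file paths relative to src/ and works out which mod statements each parent module would need. The names `module_segments` and `mod_declarations` are invented here, and the sketch assumes the usual foo.rs / foo/mod.rs layout; munging use and handling #![internal] are left out entirely.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Module path segments for a file relative to `src/`.
/// The crate root (`lib.rs` or `main.rs`) maps to an empty list.
fn module_segments(rel_path: &str) -> Vec<String> {
    let stem = rel_path.strip_suffix(".rs").unwrap_or(rel_path);
    let mut segs: Vec<String> = stem.split('/').map(String::from).collect();
    let is_root_file = segs.len() == 1 && (segs[0] == "lib" || segs[0] == "main");
    let is_mod_rs = segs.len() > 1 && segs[segs.len() - 1] == "mod";
    if is_root_file || is_mod_rs {
        segs.pop();
    }
    segs
}

/// For each parent module, the set of `mod` lines the converter would emit.
fn mod_declarations(files: &[&str]) -> BTreeMap<String, BTreeSet<String>> {
    let mut decls: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    for file in files {
        let segs = module_segments(file);
        // Every prefix of the module path needs a `mod` line in its parent.
        for depth in 0..segs.len() {
            let parent = if depth == 0 {
                "crate".to_string()
            } else {
                format!("crate::{}", segs[..depth].join("::"))
            };
            decls
                .entry(parent)
                .or_default()
                .insert(format!("mod {};", segs[depth]));
        }
    }
    decls
}
```

Calling `mod_declarations(&["lib.rs", "foo.rs", "foo/bar.rs"])` yields entries for `crate` and `crate::foo`, which a real tool would then write into the corresponding files.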


It’s not a hard break though, in that the old system will still work in crates that have been passed the flag to activate the new system (in the sense that you can still have mod statements and so forth). I’m not sure what you mean by compatible; we can’t really make a system change more compatible than that.

Well, the only way to make a change more compatible is to not make a change that’s so disruptive that it requires a flag to activate. That’s my worry.


OK, now I see what you're saying: the new module system doesn't take anything away, it only adds functionality, so that old code would continue to work?

But that contradicts this, from above:

About half of those followed a pattern that, in my opinion, we could continue to support, but about half of them simply wouldn't work with this proposal without restructuring the crates.

so maybe I've misunderstood the proposal? As I understand it, you want to change the meaning of pub, so old code would parse but it would mean something different.

It depends, I think, on your perspective. If you just deleted all the mod declarations from your project, without indicating where facades were in use, then indeed some of your pub items that may have previously been hidden behind a facade would now be visible to the outside world.

The way I see it, the meaning of pub has not changed in this proposal, but the way you are expected to use it has. Today we frequently use facades that contain pub items that we do not intend to expose outside of the current crate (or outside of some module X in the current crate). When using the proposed mod-less system, that would still be possible, but would be discouraged (and perhaps linted against). The preferred way to do that would be to declare such items as pub(crate) (or pub(in X)).
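For illustration, a short sketch of what those precise declarations look like. The module and function names are hypothetical; `pub(in super)` is used as the simplest form of `pub(in X)`, naming the enclosing `engine` module.

```rust
mod engine {
    pub mod parts {
        /// Visible anywhere the path `engine::parts` can be named.
        pub fn public_api() -> u32 {
            crate_only() + engine_only()
        }

        /// Visible throughout the crate, but never exported from it.
        pub(crate) fn crate_only() -> u32 {
            1
        }

        /// Visible only inside `engine` and its descendants.
        pub(in super) fn engine_only() -> u32 {
            2
        }
    }

    pub fn total() -> u32 {
        // Both restricted items are usable here, inside `engine`.
        parts::engine_only() + parts::crate_only()
    }
}
```

With these annotations the intent is readable at the declaration, without having to check whether some facade further up hides the item.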

We could do this same sort of linting today, under the current system. That is, we could lint if you have something declared as pub which is not reachable from the crate root, and thus encourage you to use precise publicity declarations. The only difference then between today's system and the one that @withoutboats proposed is that, in the proposal, the system would automatically create a module for each file based on the publicity of the items they contain and the presence of a #![internal] attribute (or some other indicator of a facade).

One thing I want to note. You can understand the system this way, but there are times where this meaning doesn't work. For example, fields and methods are both attached to types. In that case, if you declare them as public, the "parent module" can't control who uses them.
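A minimal sketch of that situation, with made-up names (`api`, `inner`, `Counter`): the parent re-exports only the type, yet the type's public field and methods come along with it.

```rust
mod api {
    mod inner {
        pub struct Counter {
            pub count: u32, // public field: usable wherever `Counter` is
            step: u32,      // private field: only `inner` can touch it
        }

        impl Counter {
            pub fn new(step: u32) -> Counter {
                Counter { count: 0, step }
            }

            pub fn bump(&mut self) {
                self.count += self.step;
            }
        }
    }

    // The parent chooses to expose the type and nothing else by name...
    pub use self::inner::Counter;
}
```

Anyone who can name `api::Counter` can also call `new` and `bump` and read `count`; the `api` module has no way to re-export the type while withholding those.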

Even apart from this limitation, though, I've also found that "let the parent decide" sounds very elegant, but can be insufficient in practice. You wind up with a mix of things, some of which are intended for internal use within the crate, and some of which are part of its public, stable API, and it's hard to tell the difference at a glance. Witness e.g. this comment in Rayon's source, indicating a pub struct that is not "fully public". It's important to know of course what is fully public and what is not, since changes to public things aren't semver compatible.


The more I write Rust, the more modules I have, and the more I think that the implicit modules proposal here would be really nice [*].

Still, my learning curve for modules was steep, because they were already too implicit (when I say mod x, how does the compiler know where to look?). I think implicit modules would have made it even harder (sometimes I need to write mod, sometimes I don’t, plus the magic of the compiler knowing where to look).

So while I think this is a nice productivity improvement for those that do know the language, I would like this to be coupled with better error messages for those who are learning it. Implicit is awesome as long as it does what you want/expect. When it doesn’t, there is no code anywhere that you can “debug” to find out what’s going wrong “because that logic is implicit”.

My point being, I would like to hear what we can do to improve the experience when things do not work as a user expects (e.g. a typo in the name of a module file without a mod declaration anywhere in the program).

[*] My only concern about making the module system more implicit is that we are kind of making the filesystem (or some meta file system) part of Rust the language. If we are ever going to produce a spec, this might result in a lot of “implementation defined”-speak to avoid having to define what a filesystem, paths, … actually are (there are a lot of weird platforms out there). We are probably implicit enough already for this to be unavoidable, but maybe it is worth thinking about how one would specify this in a filesystem-agnostic way.

The compiler could detect this ambiguity, i.e. use foo when foo.rs or foo/mod.rs exists but --extern foo was passed by Cargo, and emit a warning or error.

This would make things more difficult if you actually want to have a module under the root with the same name as a crate dependency, but that's already unidiomatic; you're already forced to use an alias in the extern crate declaration. Having the answer under the new system be "just don't do that" doesn't seem like a big deal. (Rust 2.0 isn't coming anytime soon, so the old syntax will remain available if you really really want to do that.)

That said, spitball alternative possibility: use extern::foo; / use extern::foo::bar;

i.e. loading of extern crates would still be implicit when useing them or anything under them, but they would have their own namespace rather than being under the crate root.
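The ambiguity check from the start of this post could be sketched roughly like this. It is only an illustration of the idea: `ambiguous_names` is an invented helper that takes the root-level module names and the crate names passed via --extern as plain lists, which is not how the compiler actually tracks them.

```rust
use std::collections::BTreeSet;

/// Names that are both a root-level module and an extern crate.
fn ambiguous_names(root_modules: &[&str], extern_crates: &[&str]) -> Vec<String> {
    let externs: BTreeSet<&str> = extern_crates.iter().copied().collect();
    let mut clashes: Vec<String> = root_modules
        .iter()
        .filter(|m| externs.contains(*m))
        .map(|m| (*m).to_string())
        .collect();
    // Report each clashing name once, in a stable order.
    clashes.sort();
    clashes.dedup();
    clashes
}
```

With a separate extern:: namespace, a check like this would no longer be needed, since `use foo` and `use extern::foo` could not refer to the same thing.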


Makes sense…

I wonder if a system designed to handle this might want to take on some of the functionality of stability attributes, currently reserved for the standard library.

Like, you could have regular pub be equivalent to #[stable], something like pub(unstable) for permanently unstable code, but also some concept of feature gated items that client crates could opt into.

Probably a bad idea, but I thought I'd mention it.

I'd say yes, but only if the resulting path is within the crate root (compiler could easily check).

Just to be clear, you wrote "sometimes I need to write mod", but I think that you would simply never write mod, under this proposal, unless you were declaring a module "in-line".

Definitely -- it'd be interesting to game out what would be the most common errors people would encounter and ensure that the error messages guide them down the right path. We've tried to do this with the current module system, though I'm not sure how successful we've been.

I think the way we would phrase it is that, when the rust compiler starts, it uses some implementation-defined method to identify the set of input files and what modules they correspond to. As @anon32976453 wrote earlier, you could think of this as rustc being supplied with a list of paths (and probably it would have such a mode, though I imagine by default it would gather them itself).
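One way to picture that boundary is a sketch like the following, where everything is hypothetical: a `ModuleSource` trait stands for the implementation-defined part, and `ExplicitList` plays the role of the "list of paths" mode. The path handling is deliberately simplified (no special cases for lib.rs or mod.rs).

```rust
use std::collections::BTreeMap;

/// Hypothetical boundary: the language only sees module paths and source text.
trait ModuleSource {
    /// Returns a map from module path to source text.
    fn modules(&self) -> BTreeMap<String, String>;
}

/// One possible implementation: an explicit list of (path, contents) pairs,
/// as if the paths had been handed to the compiler directly.
struct ExplicitList {
    files: Vec<(String, String)>,
}

impl ModuleSource for ExplicitList {
    fn modules(&self) -> BTreeMap<String, String> {
        self.files
            .iter()
            .filter_map(|(path, text)| {
                // Anything that is not a `.rs` file is ignored.
                let stem = path.strip_suffix(".rs")?;
                let name = stem.replace('/', "::");
                Some((format!("crate::{}", name), text.clone()))
            })
            .collect()
    }
}
```

A directory-walking implementation would be another `ModuleSource`; the rest of the language description would not need to know which one was used.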

Yes, I was talking about declaring modules "in-line". Beginners will write inline modules all the time (e.g. for tests), so they need to understand both explicit and implicit modules.

If it can be put this succinctly, then I am all right with it. IIRC what C++ Modules TS does is to just say that the implementation must provide an implementation-defined way of mapping sources to modules, and not much more. Whether this implementation-defined way is passing flags to the compiler, writing some module map file, or just structuring the sources in a directory structure is up to each implementation to decide.

[quote="nikomatsakis, post:156, topic:4804"] As @yigal100 wrote earlier, you could think of this as rustc being supplied with a list of paths (and probably it would have such a mode, though I imagine by default it would gather them itself). [/quote]

I don't want to derail the discussion with this, but you just used the word path, which I think gives a taste of why a spec should completely avoid the details. Otherwise one needs to define what a path is. Then somebody asks what happens on platforms where there are no paths, or on platforms with different kinds of paths (e.g. soft/hard links on Linux), and so on.

I'm wondering why this is the case. I rarely write inline modules, and never for tests. I guess it's a matter of taste -- I prefer to have my test code in another file.

In any case, I don't think "explicit and implicit" is the right terminology here. I would say "they need to understand inline and file-based modules". And I agree, but that is true regardless of what system we adopt.


It might have been discussed already and I might have missed it, but one area where I’ve used inline modules in the past quite often is when they were generated by macros.
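For example, something along these lines, where the macro and module names are made up for illustration. Each invocation expands to an inline module, so there is no file for an implicit system to discover.

```rust
// Generates one inline module per unit, each with the same small API.
macro_rules! unit_module {
    ($name:ident, $factor:expr) => {
        pub mod $name {
            pub const FACTOR: f64 = $factor;

            /// Converts a value in this unit to the base unit.
            pub fn to_base(value: f64) -> f64 {
                value * FACTOR
            }
        }
    };
}

unit_module!(kilometers, 1000.0);
unit_module!(centimeters, 0.01);
```

After expansion, `kilometers::to_base` and `centimeters::to_base` are ordinary paths, so whatever replaces mod would still need to allow inline modules for this pattern to keep working.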


That's already true today with the current system, so allowing more cases of implicit modules does not introduce a new kind of difficulty.

My concern is precisely the opposite: each system in itself makes sense but what happens when a user needs to combine both systems in the same crate? E.g. the user prefers implicit modules except for that one module where they need to specify a custom #[path] attribute?

I’d prefer a different trade-off:

Physical layout

rustc foo.rs

Currently, this compiles foo.rs and all transitive files linked via mod bar; declarations.

This should become a warning to inform the user they are using the legacy system and advise them how to upgrade. Once upgraded, the same algorithm remains the default but now it’ll resolve to just foo.rs.

Rustc would then learn to accept a set of sources, which could also be specified in Cargo.toml per artifact and passed along to rustc. This can be made expressive enough to allow directories and globs in addition to plain files, and Cargo would help manage it, as it already manages other aspects of build configuration such as optimization level. IDEs such as IntelliJ can easily keep the Cargo.toml updated for the user and provide the magical “it just works” experience.

Logical layout

Now that rustc has already obtained an explicit set of sources, we can define implicit modules without stepping on someone’s temporary “junk” files.

// with_implicit.rs
fn foo() {}
mod abc::def {
  fn goo() {}
}

This defines everything “outside” any module to be “internal” implementation details of the current file. This should be equivalent to an anonymous namespace in C++, and we should allow an anonymous inline module as a parallel. In other words:

mod foo {
    mod bar {
        mod {
            fn f() {}
        }
        f(); // ok, parent module has access
    }
    f(); // not ok, outside of parent module
}
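For comparison, the closest approximation I can think of in today's Rust is sketched below. It is not the proposed syntax: the module needs a name (`hidden` here, chosen arbitrarily), and `f` has to be marked `pub(super)` and called through that name.

```rust
mod foo {
    pub mod bar {
        // Stand-in for the proposed anonymous `mod { ... }`.
        mod hidden {
            // `pub(super)` exposes `f` to `bar` and nothing wider.
            pub(super) fn f() -> u32 {
                7
            }
        }

        pub fn call_f() -> u32 {
            hidden::f() // ok: `bar` is the parent of `hidden`
        }
    }

    pub fn via_bar() -> u32 {
        // `bar::hidden::f()` would be rejected here: `hidden` is private to `bar`.
        bar::call_f()
    }
}
```

The anonymous form would give the same visibility without the extra name or the explicit path.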

Back to our previous example, two things of note:

  1. It is very important to allow nested module declarations in order to prevent rightwards-drift. e.g. mod abc::def {}
  2. As it is today, the path is relative to where the module is “mounted”. Without any mod name; statements (which we deprecated above), this reduces to being relative to the crate root. Thus this subsumes all the separate “advanced” usages into a single system.

Accessibility

  1. We already define internal modules without any additional syntax.
  2. pub can now become truly public (world-accessible) and the path is the original module path, as long as there are no anonymous modules in the way.
  3. crate can be added to mean crate-wide visible and the default “nothing” remains as private.

Note, we don’t need any accessibility modifiers on the modules themselves anymore and we can just ignore them or issue deprecation warnings. This accomplishes more with less syntax variation and a simplified and unified model.
