[lang-team-minutes] the module system and inverting the meaning of public

nikomatsakis · February 25, 2017, 11:51am

I agree, it's worth thinking about.

I'm confused a bit by this paragraph. I believe that #![cfg(foo)] would be the way to do it; are you saying that this is insufficient?

I could see such an argument: I'm not sure of our current behavior, actually, but at some point we debated making e.g. #[cfg(nightly)] mod bar ensure that the file bar.rs is not actually parsed unless nightly is set, so that it could employ new syntax that stable compilers can't parse. It seems like we could support the same thing around #![cfg(nightly)], but it's a bit more intrusive -- that said, I think I would want to support it anyhow, since I'd want to be able to do #[cfg(nightly)] fn foo() { /* use nightly stuff here */ }, so this is perhaps "a feature not a bug". (We've also talked about having the ability to do a "rough parse" that would be used for attribute macros, where we parse enough to find the end of an item, but we don't require that it be fully valid Rust code. Seems like the same thing could be used here, since #[cfg] can be considered an attribute macro that is built into the compiler.)
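For readers less familiar with cfg gating, here is a minimal sketch of the item-level form mentioned above. The `nightly` flag is an assumption for illustration: it is a custom cfg that is only set when the compiler is invoked with `--cfg nightly`, not something rustc defines on its own. Note that in current Rust the body of a stripped item still has to parse, which is exactly the limitation being discussed.

```rust
// `nightly` is a hypothetical custom cfg flag, set only via `--cfg nightly`.
#[cfg(nightly)]
fn backend() -> &'static str {
    // This item is removed from the crate when `nightly` is not set.
    "nightly"
}

#[cfg(not(nightly))]
fn backend() -> &'static str {
    // Fallback that is compiled when `nightly` is not set.
    "stable"
}

fn main() {
    println!("compiled for the {} backend", backend());
}
```

Exactly one of the two `backend` definitions survives cfg stripping, so the call in `main` always resolves.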

eddyb · February 25, 2017, 11:53am

FWIW I'm perfectly content with implying some form of extern crate when --extern provided them, i.e. what Cargo already does. Cargo.toml is often redundant with src/lib.rs, yes.

Other redundancies are only apparent IMO, and have the potential to create confusion both in people who take Rust on its own merit and in people who are hung up on other systems.

nikomatsakis · February 25, 2017, 11:59am

Can you clarify what you mean by "only apparent"?

petrochenkov · February 25, 2017, 12:02pm

This one annoys me too! The solution that I would like is the opposite of implicit extern crates though. I think we need to employ Cargo's configuration-by-convention once more. extern crate items are declared in the crate root in most cases, so Cargo may skim through the root module and collect extern crates (possibly with version attributes), ignoring everything else. Anything more complex (extern crates in inner modules or even macros) has to be specified in Cargo.toml manually, but this should be rare.

EDIT: One extra benefit of this system is that rustc's unused_extern_crate lint would simultaneously help to remove unused build system dependencies as well. Today, unused crates often still reside in Cargo.toml files as dead weight. EDIT2: Comparing the set of used crates with the crates supplied with --extern would be more robust though.
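The "skim the root module" idea and the EDIT2 comparison can be sketched roughly as follows. This is not how Cargo works; both function names are hypothetical, and the scan is a naive line-based match that would miss forms such as `#[macro_use] extern crate foo;` on one line or `pub extern crate foo;`. A real tool would need a proper parser.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper: collect crate names from `extern crate` lines in
/// the root module source. Naive on purpose; not a real parser.
fn collect_extern_crates(root_src: &str) -> BTreeSet<String> {
    root_src
        .lines()
        .filter_map(|line| {
            let rest = line.trim().strip_prefix("extern crate ")?;
            // Take the name up to `;` or whitespace, dropping any `as` rename.
            let name = rest
                .split(|c: char| c == ';' || c.is_whitespace())
                .next()?;
            if name.is_empty() {
                None
            } else {
                Some(name.to_string())
            }
        })
        .collect()
}

/// Hypothetical helper: crates supplied to the compiler (as with `--extern`)
/// that the root module never declares.
fn unused_dependencies(
    supplied: &BTreeSet<String>,
    declared: &BTreeSet<String>,
) -> Vec<String> {
    supplied.difference(declared).cloned().collect()
}

fn main() {
    let src = "extern crate serde;\nextern crate log as logging;\nmod foo;\n";
    let declared = collect_extern_crates(src);
    let supplied: BTreeSet<String> =
        ["serde", "log", "rand"].iter().map(|s| s.to_string()).collect();
    println!("unused: {:?}", unused_dependencies(&supplied, &declared));
}
```

In the example, `rand` is supplied but never declared, so it is the only entry reported as unused.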

withoutboats · February 25, 2017, 12:03pm

I think the extern crate stuff should be held off for now; I think that’s an orthogonal change to the system from what we’re talking about here.

nikomatsakis · February 25, 2017, 12:04pm

Ah, not exactly a backwards compatibility concern, but I wanted to raise that one other thing I think we should try to address is the "private-in-public" situation. I think we are pretty close to a plan there, actually, but there were some unknowns (primarily around what rules to use for impls, I think), and I think it's worth trying to consider said rules together with this proposal. That said, I have to go do some non-Rust related things right now... so I don't have time to do it now! Just wanted to note it down while I was thinking of it.

eddyb · February 25, 2017, 12:07pm

I can at least try!

IMO there are three necessary components for what may seem like a def-use relationship in these cases:

existence (outside the "universe", which is a crate in this case)
import definition (bringing it into the "universe")
and finally, use (referring to a definition)

In the case of crate dependencies, they are:

existence in crates.io, github, etc. or in rlib form
import definition in Cargo.toml that ends up as a "universe" input (through --extern)
(currently) a second import definition as extern crate
uses in paths

In the case of modules, they are:

existence on the filesystem (or nested in the source)
import definition through mod
uses in paths

hanna-kruppe · February 25, 2017, 12:10pm

I am saying that #[cfg(foo)] mod foo; works today (i.e., excludes foo.rs from the module tree, even if it may still be parsed; I hadn't actually considered that nuance) and is the prevalent style today, so it needs to still prevent the file foo.rs from being included as a module under the new rules (anything else is a breaking change). And like all other use cases, I have no doubt that this can be achieved somehow. The only question is how much complexity, special cases, compatibility flags, or other hideousness this requires.

I fully agree with the arguments you gave above that this new module system alleviates common annoyances for all users, even advanced ones. If designed in a clean room for a new language, I would take it over the existing system in a heartbeat. Forgetting a mod line in particular happens to me almost every time I add a new module.

I am just very worried that, after all backwards compatibility issues (and other constraints like private-in-public rules) have been accounted for, the resulting system may be smooth sailing if you "stay in your lane" and use it as intended, but winds up much more complex and error-prone as a whole.

glaebhoerl · February 25, 2017, 12:12pm

How big of a problem is the case-insensitivity thing? As far as I’m aware these systems are still case-preserving, which seems like the more important property for our purposes?

(It might be weird on some level for Rust to impose a form of case-sensitivity on top of a case-insensitive file system, by requiring you to use the same case in Rust code as in the filename, while the OS itself imposes no such requirement, but it does seem like it might be possible?)

withoutboats · February 25, 2017, 12:13pm

I think I understand you, but to clarify I believe you mean that you're concerned foo would be included regardless of that cfg, because once the cfg strips the mod foo it would be an "implicit module"? I agree that that mustn't happen. I think that can be solved without exposing additional complexity to the user though.
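One way to state the requirement is as a small model: a file is picked up implicitly only if no `mod` declaration mentions it at all, so a declaration whose cfg evaluated to false still "claims" the file and keeps it out. The `ModDecl` type and `module_tree` function below are hypothetical, written only to pin down that rule; nothing here reflects an actual compiler design.

```rust
/// A `mod name;` declaration in the parent module, plus whether its
/// `#[cfg(...)]` evaluated to true for this build (hypothetical model type).
struct ModDecl {
    name: &'static str,
    cfg_active: bool,
}

/// Returns the module names that end up in the module tree under the
/// sketched rule.
fn module_tree<'a>(decls: &[ModDecl], files: &[&'a str]) -> Vec<&'a str> {
    let mut included = Vec::new();
    for file in files {
        let stem = match file.strip_suffix(".rs") {
            Some(stem) => stem,
            None => continue, // not a Rust source file
        };
        match decls.iter().find(|d| d.name == stem) {
            // Explicitly declared: follow the cfg result, as today.
            Some(decl) => {
                if decl.cfg_active {
                    included.push(stem);
                }
            }
            // Never mentioned by any declaration: implicit module.
            None => included.push(stem),
        }
    }
    included
}

fn main() {
    let decls = [ModDecl { name: "foo", cfg_active: false }];
    let files = ["foo.rs", "bar.rs"];
    println!("{:?}", module_tree(&decls, &files));
}
```

Here `foo` stays out of the tree because its declaration was cfg-stripped, while `bar` comes in implicitly.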

I have sort of the opposite papercut to forgetting a mod statement. I add the mod statement before creating the file, save the supermodule, and then get an error message from syntastic that the module I'm about to create doesn't exist but I have a mod statement for it.

hanna-kruppe · February 25, 2017, 12:17pm

Yes, this is correct.

I don't know enough about how cfg stripping works to be sure, but I am inclined to say it's the easiest to solve of the cases I enumerated (as the source code already contains an indicator that the programmer has thought about the module). I would be much more interested in proposed solutions to the other issues.

est31 · February 25, 2017, 12:40pm

It's a backwards compat problem. E.g. in lib.rs you have mod foo; and on the disk you have Foo.rs. Thanks to case insensitivity it works today (I think it does?). But when implicit mod gets added, you will get a new mod Foo as well.

glaebhoerl · February 25, 2017, 1:02pm

I'm not sure it is. If you have an explicit mod foo; declaration, it asks the OS for a file named foo.rs and gets back Foo.rs. If there is a *.rs file which was not included by any explicit mod declaration, it gets implicitly included with the same name and capitalization as it has on the filesystem (so Foo in this case, if the mod foo; weren't also present).
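The resolution order described above can be modelled in a few lines. This is a sketch under stated assumptions: `resolve` is a hypothetical function, and the case-insensitive filesystem is approximated with an ASCII-only comparison, which is cruder than what real filesystems do.

```rust
/// Hypothetical model: explicit `mod` declarations claim files first,
/// matching names case-insensitively as a case-insensitive filesystem
/// would. Unclaimed `.rs` files become implicit modules that keep the
/// capitalization they have on disk.
fn resolve(explicit: &[&str], files_on_disk: &[&str]) -> Vec<String> {
    let mut modules: Vec<String> =
        explicit.iter().map(|m| m.to_string()).collect();
    for file in files_on_disk {
        if let Some(stem) = file.strip_suffix(".rs") {
            let claimed = explicit.iter().any(|m| m.eq_ignore_ascii_case(stem));
            if !claimed {
                modules.push(stem.to_string());
            }
        }
    }
    modules
}

fn main() {
    // `mod foo;` claims Foo.rs, so no second module `Foo` appears.
    println!("{:?}", resolve(&["foo"], &["Foo.rs"]));
    // Without the declaration, the file is included under its own name.
    println!("{:?}", resolve(&[], &["Foo.rs"]));
}
```

Under this model the duplicate `Foo` module only appears if the implicit pass compares names case-sensitively, which is the scenario the previous post worries about.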

(It does seem finicky and the sort of thing that has the potential to cause headaches with tooling.)

yigal100 · February 25, 2017, 8:06pm

I find @withoutboats’ analysis of the pain points to be accurate. There are two orthogonal layers for code:

Physical - the item fn bar() is located in (physical) file foo.rs
Logical - the item is inside logical namespace (“module”) foo.

Rust’s current system is indeed a leaky abstraction and Rust should choose whether the physical layout is identical to the logical or not. Both options are equally valid and internally consistent; we just need to choose a single design.

Java, for example, chose the former whereas C# chose the latter. As a C++ dev who switched to Java and then to C#, I never had problems with either.

Regarding case insensitivity: if a windows dev declares a module “foo” but uses file “Foo.rs” that is surely a bug. That means that the code will not compile on case sensitive OSes (all the Unices). If at all possible that should be a compilation warning.
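The check behind such a warning is simple to state. A minimal sketch, assuming a hypothetical `case_mismatch` helper and ASCII-only comparison:

```rust
/// Hypothetical lint condition: the declared module name and the file stem
/// match only when case is ignored, so the build would break on a
/// case-sensitive filesystem.
fn case_mismatch(declared: &str, file_stem: &str) -> bool {
    declared != file_stem && declared.eq_ignore_ascii_case(file_stem)
}

fn main() {
    if case_mismatch("foo", "Foo") {
        println!("warning: module `foo` is backed by `Foo.rs`");
    }
}
```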

kornel · February 25, 2017, 11:33pm

Don’t implicitly include all files. Include only used module files.

I see most responses assume that implicit modules would be like in Go and include everything from the filesystem. I think doing this is unnecessary and problematic.

Implicit modules should be included from the filesystem only if there is any reference to them, i.e. use modulename or modulename::item() somewhere in the code.

If someone wants to compile a module that isn’t otherwise used, they can use mod foo of course.

This way you get best of both worlds: mod is unnecessary in almost all cases, and junk files can remain on disk without breaking compilation.

kornel · February 25, 2017, 11:45pm

By not using it!

I'm strongly suggesting that only use of the module should make it exist implicitly. The mere presence of an unused, unreferenced file in the filesystem must not add it to the project; that'd be a mess!

So implicit modules should always require two things: the file & a reference to the module from the code being compiled.

That just works with mixed bin & lib projects. That just works with commenting out code. That just works with temp and unused .rs files.
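The two-condition rule can be written down as a small model. The function name `implicit_modules` and its inputs are hypothetical; in particular, how the compiler would gather the list of referenced names is left open here.

```rust
use std::collections::BTreeSet;

/// Hypothetical model of the rule: a module exists implicitly only when
/// there is both a `.rs` file for it and a reference to it from the code
/// being compiled.
fn implicit_modules(files: &[&str], referenced: &[&str]) -> BTreeSet<String> {
    let mut modules = BTreeSet::new();
    for file in files {
        if let Some(stem) = file.strip_suffix(".rs") {
            if referenced.iter().any(|name| *name == stem) {
                modules.insert(stem.to_string());
            }
        }
    }
    modules
}

fn main() {
    let files = ["parser.rs", "scratch.rs"];
    let referenced = ["parser"];
    // `scratch.rs` is on disk but unreferenced, so it is left out.
    println!("{:?}", implicit_modules(&files, &referenced));
}
```

A referenced name with no file behind it is also left out of the result; in a real compiler that case would presumably surface as an unresolved path error.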

kornel · February 25, 2017, 11:53pm

If I understand correctly that makes modules pub mod by default (and since implicit modules are more convenient than explicit, that means all modules will be public!?)

I never use public modules. So for me #![internal] would be like "use strict", <!DOCTYPE html>, and <?php: boilerplate that is mandatory in every file just to fix the default mode.

Please make implicit modules private (or pub(crate)) by default, and require pub mod or pub use or #![pub] for the very rare cases where modules are intentionally part of a crate's designed public API. A library's exposed public API can't be implicit!

BTW: #![pub(crate)] and #![pub(super)] seem more consistent with the rest of the syntax than #![internal].

withoutboats · February 25, 2017, 11:56pm

How could this work for a library which may never use any type from a module above that module?

kornel · February 25, 2017, 11:56pm

You’d use pub mod foo or pub use foo::Type then.

The way I imagine this:

implicit modules are crate-private by default (i.e. usable from anywhere in the crate, but not outside of the crate)
unused files are not compiled.
The two rules above mean you can use whatever you need, wherever you need it, without having to declare it, and without having to worry about it leaking to the public API or leaving unused junk in your app.
For designing your public API you use pub use to export things from small libraries, pub mod to export whole modules in large frameworks (public API should always be designed, not implied by whatever files were lying around).
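In today's Rust the same shape can be approximated with an explicit private module plus a re-export. The sketch below uses an inline `mod` to stay self-contained, and the names `shapes`, `Circle` and `unit` are made up for illustration. Under the proposal in this post, the `mod shapes` line would be implied by the file's existence and use.

```rust
// A private module: nothing in it is exported unless re-exported below.
mod shapes {
    /// Exported only because of the `pub use` at the crate root.
    pub struct Circle {
        pub radius: f64,
    }

    /// Usable anywhere in this crate, but never part of the public API.
    pub(crate) fn unit() -> Circle {
        Circle { radius: 1.0 }
    }
}

// The public API is chosen deliberately, one re-export at a time.
pub use shapes::Circle;

fn main() {
    let c: Circle = shapes::unit();
    println!("radius = {}", c.radius);
}
```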

withoutboats · February 26, 2017, 12:06am

Many of the responses in this discussion are hard to square with my understanding of what the proposal is. It seems possible that a difference in our understandings is the role of pub(restricted) in making this system work.

As Aaron mentioned yesterday, pub currently operates with a dual meaning - it could mean either “public to the world” as opposed to a restricted visibility, or it could mean “public to my parent” - my parent might re-export it, but it also might not. Based on that first understanding, Niko had previously proposed linting against someone declaring an item pub if it isn’t actually visible in your API. Because pub(restricted) is unstable, I imagine a huge portion of the items in Rust crates would fall afoul of that lint today.

One of the major advantages of this proposal is that pub occupies a single meaning, instead of this dual meaning that it currently inhabits. If you don’t want an item to be a part of your public API, you don’t mark it pub, because that’s what pub means. You mark it pub(crate) or pub(super) or some other restriction.
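A short sketch of the restricted forms named above, in case the distinction is unfamiliar. The module and function names are invented for the example; the visibility syntax itself is the pub(restricted) feature under discussion.

```rust
mod outer {
    pub mod inner {
        /// Plain `pub`: intended as part of the public API.
        pub fn api() -> u32 {
            helper() + crate_wide()
        }

        /// Visible only within the parent module `outer`.
        pub(super) fn helper() -> u32 {
            1
        }

        /// Visible anywhere in this crate, but not outside it.
        pub(crate) fn crate_wide() -> u32 {
            2
        }
    }

    /// `outer` is the parent of `inner`, so it may call `helper`.
    pub fn via_parent() -> u32 {
        inner::helper()
    }
}

fn main() {
    println!("{}", outer::inner::api());
    println!("{}", outer::inner::crate_wide());
    println!("{}", outer::via_parent());
    // `outer::inner::helper()` would not compile here, because the crate
    // root is outside `outer`.
}
```

Each item states its own reach at its definition site, which is the locality the proposal is after.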

I’ve gotten the distinct sense that people think this proposal is apathetic to the idea of a carefully crafted public API, but it is entirely the opposite. The goal is to make it easier to control what is public and what isn’t, by making it a decision local to the item definition.

There are a few cases which don’t fit into this, in which you for various reasons want an item to be public, but not at its definition site. The #[internal] attribute is intended for resolving that narrower use case.
