pre-RFC: inline mod

  • Feature Name: inline_mod
  • Start Date: 2017-08-04
  • RFC PR: (leave this empty)
  • Rust Issue: (leave this empty)

Summary

One very common piece of boilerplate in Rust is the facade pattern, where additional submodules are created to separate code at the file level but are presented as a single module in the external interface. The futures crate is a great example of this. This RFC would introduce the inline prefix to the existing mod statement, which would make that particular module anonymous, thereby replacing lots of boilerplate with a single additional word.

Motivation

I’m going to paraphrase Aaron Turon’s recent modules post:

Let’s take a concrete example from the futures crate. Futures, like iterators, have a large number of methods that produce “adapters”, i.e. concrete types that are also futures:

trait Future {
    type Item;
    type Error;
    fn poll(&mut self) -> Poll<Self::Item, Self::Error>;

    fn map<F, U>(self, f: F) -> Map<Self, F>;
    fn then<F, B>(self, f: F) -> Then<Self, B, F>;
    // etc
}

Each of these concrete types (Map, Then and so on) involve a page or so of code, often with some helper functions. Thus, there was a strong desire to define each in a separate file, with the helper functions private to that file.

However, in Rust each file is a distinct module, and it was not desirable to have a large number of submodules each defining a single type. So, instead, the future module has code like this:

mod and_then;
mod flatten;
mod flatten_stream;
mod fuse;
mod into_stream;
mod join;
mod map;
mod map_err;
mod from_err;
mod or_else;
mod select;
mod select2;
mod then;
mod either;

pub use self::and_then::AndThen;
pub use self::flatten::Flatten;
pub use self::flatten_stream::FlattenStream;
pub use self::fuse::Fuse;
pub use self::into_stream::IntoStream;
pub use self::join::{Join, Join3, Join4, Join5};
pub use self::map::Map;
pub use self::map_err::MapErr;
pub use self::from_err::FromErr;
pub use self::or_else::OrElse;
pub use self::select::{Select, SelectNext};
pub use self::select2::Select2;
pub use self::then::Then;
pub use self::either::Either;

This kind of setup is known generally as the facade pattern, and it’s pretty ubiquitous in Rust code.

The facade boilerplate is needed to deal with a misalignment: each adapter is defined in its own file with its own privacy boundary, but we don’t actually want that to entail a distinct module for each (in the internal or external namespace hierarchy). That means we have to do two things:

  • Make the modules private, despite that they contain public items
  • Manually re-export each of the public items at a higher level

When first trying to navigate the futures codebase, you have to read the future module to understand how its submodules are being used, due to these re-exports. For the futures crate, this is a relatively small annoyance. But it can be a real source of confusion for crates that have more of a mixture of submodules, some of which are significant for the namespace hierarchy, others of which are hidden away.

However, this proposal is not trying to solve the problem of finding the location of an item, given its external module-based path. The facade pattern is premised on hiding this location, and any feature which facilitates this pattern fundamentally cannot help with this problem.

Guide-level explanation

In the case of the futures crate, the code would change to the following with inline mod:

inline mod and_then;
inline mod flatten;
inline mod flatten_stream;
inline mod fuse;
inline mod into_stream;
inline mod join;
inline mod map;
inline mod map_err;
inline mod from_err;
inline mod or_else;
inline mod select;
inline mod select2;
inline mod then;
inline mod either;

The addition of inline to mod would desugar into the following:

mod my_mod;
pub use my_mod::*;

Items within the module are private by default, unless marked with some version of pub.
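
A minimal sketch of the desugared form, assuming an in-file module body in place of a separate file so that it compiles on its own. The names `Adapter`, `adapt`, and `helper` are made up for illustration:

```rust
// The module itself stays private; only its `pub` items surface via the glob.
mod my_mod {
    pub struct Adapter(pub u32);

    // Private helper: visible only inside `my_mod`.
    fn helper(x: u32) -> u32 {
        x + 1
    }

    pub fn adapt(x: u32) -> Adapter {
        Adapter(helper(x))
    }
}
pub use self::my_mod::*;

fn main() {
    // `adapt` and `Adapter` are reachable without naming `my_mod`.
    let Adapter(value) = adapt(41);
    println!("{}", value);
    // Calling `helper(41)` here would not compile: it is private to `my_mod`.
}
```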

Reference-level explanation

inline mod my_mod;

is desugared exactly into

mod my_mod;
pub use my_mod::*;

Visibilities on inline modules are prohibited as confusing. Due to the filtering properties of glob imports, pub use my_mod::*; keeps the visibilities of items defined in my_mod intact when reexporting, i.e. pub items will be reexported as pub, pub(crate) items will be reexported as pub(crate), etc.

The inline modifier should be implementable as a contextual keyword, which makes it backwards compatible (along with the fact that the behavior of existing code does not change).

Migrating crates from using the current facade pattern to this feature should also be backwards compatible, assuming the relevant changes are made to various privacy modifiers.

Drawbacks

  • Perhaps people prefer the current pattern because it is more flexible.
  • This proposal adds a small amount of complexity and the benefit may not outweigh the costs.

Rationale and Alternatives

  • This design is focused on removing boilerplate through the addition of a targeted feature, which is backwards compatible. Most other designs which accomplish the same thing, are not backwards compatible or propose additional functionality which is not targeted at reducing boilerplate.
  • Alternate designs:
  • The procedural macro system of Rust could be extended to allow an attribute on a mod statement like #[facade] mod foo; which would create the facade pattern automatically. This would result in foo being declared as an item in scope, but other than that would likely have the same effect. However, this solution would require a completely different RFC, with implications for macros 2.0, I imagine. Using #[inline] would be confused with the existing attribute of that name.
  • The impact of not doing this is that the boilerplate would remain when implementing the facade pattern.

Unresolved questions

  • None remaining.

I really like this proposal. It’s a good approach to the facade pattern, as it proves that you don’t need to get rid of the mod keyword, which would be a sad loss in my opinion.

Seems like this would also be nicely compatible with glob specifications for those who don’t want to specify all their modules:

inline mod *;

I won’t propose that as part of this RFC, but I agree that it’s nicely orthogonal.

I agree that simplifying the module facade pattern would be a useful feature. However, I would argue against introducing an inline keyword. The syntax of Rust is already rather complex, especially for beginners. I would only add new syntactic elements if the problem could not be solved otherwise and/or if the use is so prevalent that other ways of doing it should be discouraged to get a more consistent feel across crates.

So as an alternative I propose that the inline_mod feature should be implemented as a proc_macro_attribute in an external library, e.g.

#[facade] mod foo;

Currently what is missing for this to work in nightly’s proc_macros is a way to determine in the handler where the proc_macro has been invoked, e.g., module path and/or file system path. Such a feature would also be handy for other proc_macros. The proc_macro would then parse the submodule on its own to determine the items and reexport them as wished at the invocation site.

That way we could also easily allow other implementations or extensions such as @phaylon proposed without going through the whole RFC process again. It would just be another proc_macro library.
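
As a rough illustration of the library-based direction, here is a sketch that swaps in a declarative `macro_rules!` macro rather than the `proc_macro_attribute` described above. It assumes module bodies are written inline so that the example fits in one file; the macro name `facade!` and the items inside it are hypothetical:

```rust
// Declarative stand-in for `#[facade] mod foo;`: each module is declared
// privately and its public items are glob-reexported next to it.
macro_rules! facade {
    ($(mod $name:ident { $($body:tt)* })*) => {
        $(
            mod $name { $($body)* }
            pub use self::$name::*;
        )*
    };
}

facade! {
    mod map {
        pub struct Map(pub i32);

        // Private to `map`, not reexported by the glob.
        fn helper(x: i32) -> i32 {
            x * 2
        }

        pub fn doubled(m: &Map) -> i32 {
            helper(m.0)
        }
    }
    mod then {
        pub struct Then(pub i32);
    }
}

fn main() {
    let m = Map(21);
    let t = Then(1);
    println!("{} {}", doubled(&m), t.0);
}
```

Unlike the attribute proposed here, this sketch cannot inspect a module that lives in another file and filter its items; it only automates the mod-plus-glob pair.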

Interesting idea.

It is somewhat different from the proposal above, in that this would introduce foo as a module and the macro would do the reexport automatically, rather than never creating foo as an item to begin with. Though, I suppose that distinction is academic, perhaps?

I’ll add it to the list of alternatives.

Ah sorry, too fast: you could not reexport the elements without the mod declaration. I leave the rest below for the sake of completeness. So yes, it would introduce a foo item.

Not necessarily. The proc_macro_attribute has full control over its output (which is not the case for custom derives). So the library could strip away the mod foo declaration. You could even support both with something like:

#[facade(hide)] mod foo;

Ok, edited the post to include that. Let me know if I missed something.

I’m completely unfamiliar with proc macros and their stability story, but using them here seems like it may have the right mix of cost/benefit over a language based solution.

I wouldn’t use #[inline] as the attribute is already used by the compiler with a different meaning, otherwise it is fine.

Touché… completely forgot that. Hilarious.

I like this.

Bikeshedding time: Instead of inline mod, how about we just collapse the existing keywords?

 pub use mod and_then;

I’m suggesting it because this is a thing I actually tried, and was disappointed it didn’t work.

Facade pattern as described in the futures example is equivalent to

mod my_mod;
pub use my_mod::*;

, so inline mod can desugar exactly into this construction.

It would also naturally solve the question about items private to an inline module: they will stay private and won’t be accessible from the parent unless marked with pub(super).
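
A small sketch of that behaviour, assuming an in-file module and hypothetical item names (`public_api`, `parent_only`, `private_helper`):

```rust
mod parent {
    mod my_mod {
        pub fn public_api() -> &'static str {
            private_helper()
        }

        // Visible to `parent`, but no further.
        pub(super) fn parent_only() -> u32 {
            7
        }

        // Visible only inside `my_mod`.
        fn private_helper() -> &'static str {
            "public"
        }
    }
    pub use self::my_mod::*;

    // The glob brought `parent_only` into `parent`, so it can be used here.
    pub fn via_parent() -> u32 {
        parent_only()
    }
}

fn main() {
    println!("{}", parent::public_api());
    println!("{}", parent::via_parent());
    // `parent::parent_only()` would not compile here: the glob reexport
    // does not widen its visibility beyond `parent`.
    // `parent::private_helper()` would not compile either.
}
```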

Reference-level explanation

$vis inline mod my_mod;

is desugared exactly into

$vis mod my_mod;
pub use my_mod::*;

Reference-level explanation (variation 1)

inline mod my_mod;

is desugared exactly into

mod my_mod;
pub use my_mod::*;

Visibilities on inline modules are prohibited as confusing.

Note

Due to filtering properties of glob imports pub use my_mod::* keeps visibilities of items defined in my_mod intact when reexporting, i.e. pub items will be reexported as pub, pub(crate) items will be reexported as pub(crate) etc.

I think having them private is needed to make this useful. In fact, the futures motivation you borrowed explicitly states that it wants helper functions private to that file.

Doesn’t this specification result in my_mod being introduced as an item? One of the things I liked about the initial specification is that the module ends up being completely transparent, which prevents code from accidentally referring to the reexported items via multiple paths. Does that seem like a desired property?

Probably not.

If you want a "public facade" (like std::collections), you can write pub inline mod my_mod; and it will need to create a proper item named my_mod.
If the facade is an implementation detail, you can write inline mod my_mod; and just not refer to my_mod in the current module; others won't be able to refer to it anyway because it's private.

This also may be useful for disambiguation.

inline mod collection1;
inline mod collection2;

let it: Iter; // ambiguity on use
let it: collection1::Iter; // OK
let it: collection2::Iter; // OK
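
The same idea as a compilable sketch. It uses in-file modules and plain `use` globs in place of the proposed `inline mod` (the name-resolution rule for conflicting globs is the same for `pub use`), and the constructor functions `first` and `second` are made up for the example:

```rust
mod collection1 {
    pub struct Iter(pub u32);

    pub fn first() -> Iter {
        Iter(1)
    }
}

mod collection2 {
    pub struct Iter(pub &'static str);

    pub fn second() -> Iter {
        Iter("two")
    }
}

// Both globs bring in a name `Iter`. That alone is not an error.
use self::collection1::*;
use self::collection2::*;

fn main() {
    // let it: Iter = first(); // error: `Iter` is ambiguous
    let a: collection1::Iter = first(); // OK: qualified path
    let b: collection2::Iter = second(); // OK: qualified path
    println!("{} {}", a.0, b.0);
}
```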

I really like the idea of files being anonymous modules which still have privacy from each other, and only folders being “actual” modules. I think that should be possible, or even the default, in whatever we end up with.

That said, with data showing that the difference between “use” and “mod” is very confusing for newcomers, I am not sure if still having “mod” at all is a good way forward. Indeed this concept is fairly unique to Rust it seems (not entirely unique though, e.g. the Coq proof assistant has a very similar scheme).

Incidentally, not having mod and having files be anonymous could even mean that mod.rs is in no way special any more, again making teaching simpler.

Where can I look at this data?

I personally find this hardly believable, partially because I still remember how I learned the language.

"Module ≈ namespace in C++, mod m; is a namespace put into a separate file to avoid huge files."
That's all, that was one of the simplest things in the language, and nothing seemed related to use at all.

What I found non-trivial is where exactly mod m; finds the file with code for namespace m, especially the m/mod.rs case, I had to carefully repeat the examples from rust-by-example before it clicked.

One more thing that I found surprising (but simple) is that names from outer modules are not in scope in inner modules. This is different from C++, but it's one of the cornerstones of Rust module system.
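
A short sketch of that scoping rule; the function and module names are hypothetical:

```rust
fn outer_helper() -> u32 {
    10
}

mod inner {
    // Unlike a nested C++ namespace, `outer_helper` is not in scope here
    // until it is imported or named through `super`.
    use super::outer_helper;

    pub fn call() -> u32 {
        outer_helper() + 1
    }

    pub fn call_qualified() -> u32 {
        // Equivalent access without the `use`, via an explicit path.
        super::outer_helper() + 2
    }
}

fn main() {
    println!("{} {}", inner::call(), inner::call_qualified());
}
```

Removing the `use super::outer_helper;` line makes `call` fail to compile, while `call_qualified` keeps working.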

Still all these details were completely dwarfed (by orders of magnitude) by efforts spent on learning libraries, so I honestly don't understand many attempts to make things that are already simple "less confusing to newcomers", it's a drop in the ocean.

I don’t like this proposal.

I find the futures example flawed, as futures-rs does indeed not re-export all items from these modules into the parent module, only the main struct, but for example not the construction function new, which is exported as and_then::new.

The described example would fail to compile, because most of these modules define a function called new.

The whole setup of the RFC sounds a lot like C-style textual replacement, which brings its own share of issues: it’s all or nothing and is prone to collisions, which a lot of similarly structured modules bring with them. Also, such a system is prone to accidentally introducing names into the namespace.

Reusing terminology like inline makes it hard to separate function inlining from the proposal’s usage of the word and will confuse beginners.

The usage of the facade pattern in futures-rs points to where the existing way of working is indeed good: every module implements a richer interface than the exported one, which is subsequently made thinner through the re-exports.
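
A simplified sketch of that selective style, loosely modeled on the layout described in this post rather than copied from futures-rs. The struct fields and the `make_map`/`make_then` wrappers are assumptions for illustration:

```rust
mod future {
    mod map {
        pub struct Map {
            pub value: i32,
        }

        // Each adapter module has its own `new`.
        pub fn new(value: i32) -> Map {
            Map { value }
        }
    }

    mod then {
        pub struct Then {
            pub value: i32,
        }

        pub fn new(value: i32) -> Then {
            Then { value }
        }
    }

    // Only the main types are reexported; the two `new` functions stay
    // behind their module paths, so they never collide.
    pub use self::map::Map;
    pub use self::then::Then;

    pub fn make_map(value: i32) -> Map {
        map::new(value)
    }

    pub fn make_then(value: i32) -> Then {
        then::new(value * 2)
    }
}

fn main() {
    let m: future::Map = future::make_map(3);
    let t: future::Then = future::make_then(3);
    println!("{} {}", m.value, t.value);
}
```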

I agree with the rest of your post, but a considerable number of newcomers to Rust are not from a C++ background.