Idea: façade crates

We have a number of crates which define an interface. This interface gets used by other library crates, but its implementation gets defined only in a root binary crate (or as part of dev dependencies). Examples include:

They follow the same general pattern, but use different ad hoc solutions, which often also incur runtime overhead because they rely on dynamic dispatch.

It could be beneficial to introduce a common way for implementing this pattern.
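For context, the kind of ad hoc, dynamically dispatched solution described above can be sketched roughly as follows. This is only an illustration, modeled loosely on how a global logger gets installed at runtime; the `Logger` trait, `set_logger`, `log`, and `StderrLogger` are made-up names for this sketch and not the actual API of the log crate.

```rust
use std::sync::OnceLock;

/// Interface that a backend must implement (hypothetical, for illustration only).
pub trait Logger: Sync + Send {
    fn enabled(&self, level: u8) -> bool;
    fn log(&self, level: u8, message: &str);
}

// Global slot that the root crate fills in at runtime.
static LOGGER: OnceLock<&'static dyn Logger> = OnceLock::new();

/// Installs the global logger. Returns `false` if one was already installed.
pub fn set_logger(logger: &'static dyn Logger) -> bool {
    LOGGER.set(logger).is_ok()
}

/// Called by library crates. Every call goes through the trait object's vtable.
/// Returns `true` if the message was handed to a logger.
pub fn log(level: u8, message: &str) -> bool {
    match LOGGER.get() {
        Some(logger) if logger.enabled(level) => {
            logger.log(level, message);
            true
        }
        _ => false,
    }
}

/// Example backend: prints messages with level <= 3 to stderr.
pub struct StderrLogger;

impl Logger for StderrLogger {
    fn enabled(&self, level: u8) -> bool {
        level <= 3
    }
    fn log(&self, level: u8, message: &str) {
        eprintln!("[{}] {}", level, message);
    }
}

pub static STDERR_LOGGER: StderrLogger = StderrLogger;
```

The runtime check in `log` and the indirect call through `dyn Logger` are the overhead that the pattern pays even though a typical program installs exactly one logger.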

Proposal

Introduce new "facade" and "facade implementation" crate types. A crate cannot be both a facade and a facade implementation at the same time.

A facade crate is defined using the following declaration in its Cargo.toml:

[package]
name = "foo"
facade = true

# optional section, which MUST contain one dependency
[default-facade-impl]
some_crate = { version = "1",  features = ["foo", "bar"] }

This allows us to use the #[facade] attribute in such a crate. The first place where it can be used is on functions:

// getrandom
#[facade]
pub fn getrandom(dest: &mut [u8]) -> Result<(), Error>;

// log
pub mod logger {
    #[facade]
    pub fn enabled(metadata: &Metadata<'_>) -> bool;
    #[facade]
    pub fn log(record: &Record<'_>);
    #[facade]
    pub fn flush();
}

An item marked with #[facade] MUST be public.

Note that in the log case we remove the Log trait and replace it with a number of functions which describe the required logger functionality. This is because applications overwhelmingly use a global logger.

But in the case of alloc, libraries may need to be generic over the allocator while using the global allocator by default. In such cases facade crates should define a facade type:

#[facade]
type GlobalAlloc: Allocator;
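To make the "generic, but global by default" requirement concrete, here is a small sketch in today's Rust. Everything in it is an assumption for illustration: the `Allocator` trait is a toy stand-in (not the unstable `std::alloc::Allocator`), `GlobalAlloc` is an ordinary struct standing in for what the facade type would name, and `Buffer` and `ArenaAlloc` are hypothetical.

```rust
/// Toy stand-in for an allocator interface (hypothetical).
pub trait Allocator {
    fn name(&self) -> &'static str;
}

/// Ordinary struct playing the role that the facade type `GlobalAlloc`
/// would play; under the proposal the concrete type would come from the
/// facade implementation crate instead.
pub struct GlobalAlloc;

impl Allocator for GlobalAlloc {
    fn name(&self) -> &'static str {
        "global"
    }
}

/// Some other allocator a user may opt into explicitly.
pub struct ArenaAlloc;

impl Allocator for ArenaAlloc {
    fn name(&self) -> &'static str {
        "arena"
    }
}

/// A library container that is generic over the allocator but defaults
/// to the global one, so most users never name the parameter.
pub struct Buffer<A: Allocator = GlobalAlloc> {
    data: Vec<u8>,
    alloc: A,
}

impl Buffer<GlobalAlloc> {
    pub fn new() -> Self {
        Buffer { data: Vec::new(), alloc: GlobalAlloc }
    }
}

impl<A: Allocator> Buffer<A> {
    pub fn new_in(alloc: A) -> Self {
        Buffer { data: Vec::new(), alloc }
    }

    pub fn push(&mut self, byte: u8) {
        self.data.push(byte);
    }

    pub fn len(&self) -> usize {
        self.data.len()
    }

    pub fn allocator_name(&self) -> &'static str {
        self.alloc.name()
    }
}
```

Code that writes `Buffer` without a type argument gets the global allocator, while `Buffer::new_in(ArenaAlloc)` opts into a specific one.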

Facade implementation crates are defined in Cargo.toml like this:

[package]
name = "my-logger"
# The specified crate MUST be present in the dependencies section
facade-impl = "log"

[dependencies]
log = "1"

Facade implementation crates can use the #[facade_impl = "..."] attribute:

//getrandom_impl

// `getrandom::getrandom` MUST be a facade function with exactly the same
// signature. The impl item does not need to be public.
#[facade_impl = "getrandom::getrandom"]
fn getrandom(dest: &mut [u8]) -> Result<(), getrandom::Error> {
    // ...
}

// my_logger
#[facade_impl="log::logger::enabled"]
fn enabled(metadata: &Metadata<'_>) -> bool { ... }
#[facade_impl="log::logger::log"]
fn log(record: &Record<'_>) { ... }
#[facade_impl="log::logger::flush"]
fn flush() { ... }

// my_allocator
#[facade_impl="alloc::GlobalAlloc"]
struct MyAlloc { ... }

// Without this trait impl the facade_impl on type will result
// in a compilation error
impl alloc::Allocator for MyAlloc { ... }

A facade implementation crate MUST cover all facade items from its facade crate.

Crates which depend on a facade crate can use facade items as simple "concrete" items. When a library that depends on a facade crate gets compiled, the facade may not have a known implementation (e.g. when we run cargo check on a library crate). In such cases facade functions will be equivalent to extern "Rust" fn functions, while facade types will be equivalent to existential types.

Facade implementation crates get used by binary crates simply by pulling them as dependencies:

[package]
name = "my_bin_crate"

[dependencies]
my_getrandom = "1"
my_alloc = "1"
my_logger = "1"

# This crate can depend on `getrandom`, `alloc`, and `log`.
# It will use the facade implementations under the hood.
my_lib = "1"

Library crates can use facade implementation crates only as dev dependencies:

[package]
name = "my_lib"

[dependencies]
getrandom = "1"
alloc = "1"
log = "1"

[dev-dependencies]
my_getrandom = "1"
my_alloc = "1"
my_logger = "1"

In other words, a library crate cannot define things like the global allocator outside of tests and benchmarks.

Root crates can use items from facade implementation crates, e.g. for configuration or setting up resources.

If a crate specifies two conflicting facade implementations, it immediately results in a compilation error. If a facade crate specifies a default facade implementation and the root crate specifies its own implementation of this facade, then the default implementation gets replaced. If a facade crate does not specify a default facade implementation and the root crate does not specify a facade implementation for it, then it results in a compilation error.
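The three rules above can be written down as a small decision function. This is only a sketch of the rules as stated, not of how Cargo would implement them; `resolve_facade_impl` and `ResolveError` are hypothetical names, and crates are represented by plain strings.

```rust
#[derive(Debug, PartialEq)]
pub enum ResolveError {
    /// The root crate pulls in more than one implementation of the facade.
    Conflict(Vec<String>),
    /// Neither a default nor a root-provided implementation exists.
    Missing,
}

/// Picks the implementation crate for one facade.
///
/// `default_impl` is the crate named in the facade's `[default-facade-impl]`
/// section, if any. `root_impls` lists the facade implementation crates for
/// this facade that the root crate depends on.
pub fn resolve_facade_impl(
    default_impl: Option<&str>,
    root_impls: &[&str],
) -> Result<String, ResolveError> {
    match root_impls {
        // No explicit choice: fall back to the default, or fail.
        [] => default_impl.map(str::to_owned).ok_or(ResolveError::Missing),
        // Exactly one explicit choice: it replaces any default.
        [only] => Ok((*only).to_owned()),
        // Two or more explicit choices conflict.
        many => Err(ResolveError::Conflict(
            many.iter().map(|name| name.to_string()).collect(),
        )),
    }
}
```

For example, a facade with default `system_alloc` and a root crate that depends on `my_alloc` resolves to `my_alloc`.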

Note that facade implementations are specific to the facade crate version. In other words, a project may pull in two loggers with different facade implementations:

# root crate
[dependencies]
# facade impl for log v1
my_logger = "1"
# facade impl for log v0.5
other_logger = "0.1"
foo = "1"
bar = "1"

# foo
[dependencies]
log = "1"

# bar
[dependencies]
log = "0.5"

In this case foo will use the implementation from my_logger, while bar will use the implementation from other_logger.

When cargo builds a dependency tree which includes facades and their impls, it performs the following dependency tree transformation:

// Naive dependency tree
my_bin_crate
  |- my_logger
     |- log
  |- my_getrandom
     |- getrandom
  |- foo
     |- log
     |- getrandom
  |- bar
     |- log

// Transformed dependency tree
my_bin_crate
  |- my_logger
  |- my_getrandom
  |- foo
     |- log
       |- my_logger
     |- getrandom
       |- my_getrandom
  |- bar
     |- log
       |- my_logger

In other words, facade implementation crates become dependencies of their facades, and the facade items get replaced by the respective facade implementation items. This allows the compiler to perform various optimizations, which may significantly improve performance in some cases.
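The edge reversal described above can be sketched on a toy dependency graph. This is an illustration of the transformation only, under the assumption that a graph is a map from crate name to its direct dependencies; `Graph`, `add_crate`, and `transform` are hypothetical names and have nothing to do with Cargo's real resolver.

```rust
use std::collections::BTreeMap;

/// Crate name -> names of its direct dependencies.
pub type Graph = BTreeMap<String, Vec<String>>;

/// Convenience helper for building a graph.
pub fn add_crate(graph: &mut Graph, name: &str, deps: &[&str]) {
    graph.insert(name.to_string(), deps.iter().map(|d| d.to_string()).collect());
}

/// Reverses the edge between each facade and its chosen implementation.
/// `impls` holds `(facade, implementation)` pairs.
pub fn transform(naive: &Graph, impls: &[(&str, &str)]) -> Graph {
    let mut out = naive.clone();
    for &(facade, imp) in impls {
        // The implementation no longer appears as a dependent of the facade...
        if let Some(deps) = out.get_mut(imp) {
            deps.retain(|d| d != facade);
        }
        // ...instead, the facade depends on the implementation.
        let deps = out.entry(facade.to_string()).or_default();
        if !deps.iter().any(|d| d == imp) {
            deps.push(imp.to_string());
        }
    }
    out
}
```

Note that this sketch simply drops the edge from the implementation to its facade, which sidesteps rather than answers the question of how an implementation keeps access to non-facade items of the facade crate.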

Unresolved questions

  • Is it possible to migrate to facade crates in a backwards compatible way for alloc and std stuff?
  • Should we require that facade types must provide a dummy type (i.e. type GlobalAlloc: Allocator = DummyAlloc;), which will be used as a placeholder during compilation of library crates?
  • Should we allow generic facade functions? Such functions may require a dummy implementation as well.
  • Should we support facade implementation of several facade crates in one crate?
  • How do we prevent a circular dependency after the dependency tree transformation? In other words, log depends on my_logger for facade impls, while my_logger depends on log and may use non-facade items from it (e.g. traits which must be implemented by a facade item).

Prior art

The main difference from previous proposals seems to be that Cargo now knows about facade crates; that's probably a useful addition.

Yes, it allows us to completely remove the complexity of the linked RFC associated with "implicit generics". Also it works with plain functions, so you don't have to introduce a trait in cases when you don't want to be generic over facade functionality.

I don't see how this proposal makes that part any easier for the compiler.

Actually, I think traits are the proper and standard “Rust way” to define interfaces that another crate must adhere to.

During compilation of a binary crate the compiler knows the concrete items, so there is absolutely no need to use generics. In other words, before compiling log it will compile its facade implementation first, and all dependents of log will use a concrete implementation of the log facade. After the facade impls get "pasted" into log, everything else is handled by the compiler as usual, i.e. log becomes a simple crate and is handled accordingly.

With library crates (outside of tests and benchmarks) it's a bit more difficult since facade implementations are not known. In the case of facade functions it's easy: as mentioned in the OP, the compiler can simply treat them as extern "Rust" fn. There could be issues with generic facade functions, but I think we can start by simply forbidding them. With facade types we could require facade crates to introduce "dummy" types, which will be used as placeholders (see the "unresolved questions" section) during library crate compilation.

In some cases, traits simply do not make much sense and only bring unneeded complexity to the table. Why would we allow generality over a hypothetical getrandom trait? Arguably, it does not make much sense to introduce a logger trait either. Currently we have to use the Log trait because log has to rely on dynamic dispatch bounded by it.

How do you add a new optional free function to a facade crate? With traits this can be done by having a default impl.

One way is to mark functions with a body with a separate attribute:

#[optional_facade]
pub fn foo() {
    // default impl
}

Facade implementation crates may skip the implementation of such functions, in which case the default impl will be used. But I think the need for it will be extremely rare.
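For comparison, this is how the trait-based approach mentioned in the question handles the same situation in today's Rust: a method added later gets a default body, so existing implementations keep compiling and may override it if they wish. The `Backend` trait and the two implementing types are hypothetical and exist only for this sketch.

```rust
pub trait Backend {
    /// Required method that every implementation must provide.
    fn fill_bytes(&self, dest: &mut [u8]);

    /// Method added in a later version. The default body means that
    /// implementations written before it existed still compile.
    fn next_u32(&self) -> u32 {
        let mut buf = [0u8; 4];
        self.fill_bytes(&mut buf);
        u32::from_le_bytes(buf)
    }
}

/// Implementation that relies on the default `next_u32`.
pub struct Fixed(pub u8);

impl Backend for Fixed {
    fn fill_bytes(&self, dest: &mut [u8]) {
        for byte in dest.iter_mut() {
            *byte = self.0;
        }
    }
}

/// Implementation that overrides the optional method.
pub struct Custom;

impl Backend for Custom {
    fn fill_bytes(&self, dest: &mut [u8]) {
        for byte in dest.iter_mut() {
            *byte = 0;
        }
    }

    fn next_u32(&self) -> u32 {
        42
    }
}
```

An #[optional_facade] function would play the same role as `next_u32` here, just without the surrounding trait.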

I love this proposal. This is very much like the approach we've talked about using for allowing this kind of interface/multiple-implementations split.

A few details I'd want to tweak:

  • For simplicity, I think we could just use existential types that implement a trait, and not have functions as well. In particular, because you can always provide global wrapper functions around a global instance; those functions don't need to be facade functions.
  • Library crates should be able to set default facade-impls, just like facade crates. They just can't force the use of their preferred facade-impl. The rule should be:
    • A facade-impl overrides the facade-impl from any dependency, and the root crate overrides everything.
    • If multiple of your dependencies declare different facade-impls for the same facade, and you don't declare a facade-impl overriding them all, you get a compilation error.
  • Facades should be able to have features. The union of all the features applied to a facade is passed to the facade-impl (namespaced in some way, to separate them from the features of the facade-impl).
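The override rule suggested in the list above can be sketched as a recursive walk over the dependency tree for a single facade. This is one possible reading of the rule, not a specification; `Crate`, `effective_impl`, and the string-based representation are hypothetical.

```rust
/// A crate in the dependency tree, seen from the point of view of one facade.
pub struct Crate {
    pub name: &'static str,
    /// The facade-impl this crate declares for the facade, if any.
    pub declared: Option<&'static str>,
    pub deps: Vec<Crate>,
}

impl Crate {
    pub fn new(name: &'static str, declared: Option<&'static str>, deps: Vec<Crate>) -> Self {
        Crate { name, declared, deps }
    }
}

/// Returns the facade-impl in effect at `krate`, or an error message if its
/// dependencies disagree and `krate` does not override them.
pub fn effective_impl(krate: &Crate) -> Result<Option<&'static str>, String> {
    // A crate's own declaration overrides whatever its dependencies chose,
    // so the root crate overrides everything.
    if let Some(own) = krate.declared {
        return Ok(Some(own));
    }
    let mut found: Option<&'static str> = None;
    for dep in &krate.deps {
        if let Some(choice) = effective_impl(dep)? {
            match found {
                Some(prev) if prev != choice => {
                    return Err(format!(
                        "{}: conflicting facade-impls `{}` and `{}`",
                        krate.name, prev, choice
                    ));
                }
                _ => found = Some(choice),
            }
        }
    }
    Ok(found)
}
```

In this reading, a conflict below a crate that declares its own facade-impl is silently resolved by that declaration, which matches "overriding them all".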

(Also, from the bikeshedding department: I know that "facade" is a common description of this design pattern, but I think I'd sooner use a more straightforward term like "interface".)

Niko M’s “crate-level where-clauses” seems related to this idea, but not quite close enough for me to put them together… Still, an interesting related problem of “I want to use X but also have some additional assurances about X that its home crate may not provide”.

There's prior art in Haskell's Backpack. Unfortunately, Backpack adoption was stymied by a lack of support in Stack, but it could still inform the design of Rust's version of this feature.

And yes, I also think that facade is a bad name, because it means something else as an OO design pattern. Maybe call them interface crates, and have some concrete crate implement one or more interfaces.

Like, you could have the interface crate async-executor and have tokio and async-std implement async-executor

Pondering: In a way, we have this behaviour crate-locally with type Foo = impl Bar;. So maybe a syntax aping that could make sense? With the specific type coming from the implementing other crate instead of the uses in the declaring crate. (EDIT: Oh, Nemo points out below this might be just me remembering the other RFC draft.)

I agree with Josh that the "what's the interface you can use and implement" should go through traits for this, to avoid needing to re-address a bunch of the same questions. For example, I'd expect RFC 3245 (refined impls) to apply to these.

This is exactly what the linked RFC2492 was designed around (extending the previous syntax existential type Foo: Bar;).

I am not a compiler developer, but I think (non-generic) facade functions should be significantly simpler than facade types with trait implementations. As mentioned in the proposal, the compiler can simply view them as extern "Rust" fn, while facade types would require either complications around implicit generality or the provision of dummy placeholder types in facade crates.

As I wrote earlier, I think that in many cases it does not make much sense to introduce traits, since they get implemented only once per project and there is no code generic over them. So it could even be worth first introducing only facade functions, while facade types would be a future extension.

I don't think it's a good idea. Your second rule will mean that an added dependency somewhere deep in a project's dependency tree (e.g. on cargo update) can break the build of the project. You could say that adding a facade impl dependency is a breaking change, but it will be quite surprising and brittle. In other words, your proposal is somewhat similar to a hypothetical removal of the orphan rule.

How do you see a use-case for facade impls set by library crates?

I agree. It could be a useful extension.

I am not too attached to the name, so it could be changed in future refinements of the proposal.

The traditional approach to these use cases is external linkage since generating the final binary always requires one final linkage stage.

I see no tangible benefits here to going with a novel approach: the traditional way is a tried and tested solution, and we don't really gain anything here by adding more complexity to the type system. As already mentioned above, overriding the panic handler or changing the global allocator is really something that happens once for the binary and not something that needs genericity in a particular program.

So all we need is really just native support for extern "rust".

Even the suggested cargo integration here seems overly complicated. I'm all for optional integration with cargo for ease of use and ergonomics purposes, but it shouldn't have to be a requirement for this feature, nor do I see the need for specific crate types here. It should be possible to simply provide additional crate deps for the final binary. This could also be used in conjunction with other build tools such as Bazel.

Just like #[global_allocator] works today, it's convenient to be able to just include a crate and have it hooked up automatically.

I agree that arbitrary libraries providing implementations of some facade that can conflict is iffy at best.

That doesn't cover the definition crate providing a default, though. Using #[global_allocator] as the example here, if you don't use #[global_allocator], std provides a default fallback which uses std::alloc::System as the global allocator. Preventing 3rd party global resource interfaces from replicating this behavior seems undesirable. In a perfect future, you could imagine defining #[global_allocator] in terms of this general functionality[1].

There's one key benefit you're ignoring: behavior in the failure mode when the implementation doesn't link up to the facade. If you're just using #[no_mangle] linkage, then the link error is a link error, complete with an entirely inscrutable message from outside the rustc sphere of control. If cargo/rustc understand whatever global resource definition, then we can provide useful diagnostics explaining what's actually missing and how to address the issue.

Cargo can't address the case where no implementation exists, but does actually have a feature to best-effort improve redefinition of the same symbol — the package.links key. If two crates use the same string for this key, cargo will give a somewhat reasonable error for the duplication. It's a bit of an abuse semantically — the key claims to be about declaring that you are the provider for the definitions for a native library by that name — but it functionally fits; any time you're defining #[no_mangle] symbols, you're "providing" a "native" library that can't be duplicated without causing linker conflicts.

Additionally, #[no_mangle] is fundamentally unsafe, because rustc cannot check that the signatures match between the definition and usage sites. Using a different system allows it to be typechecked and guaranteed safe rather than just held together by vibes.

Using a system that cargo/rustc are aware of also means you don't have to come up with globally unique unmangled names to use, either, because the build system can handle hooking up mangled names for you.

It's already completely defined to use extern "Rust" (but of course completely unsafe) and link to a #[no_mangle] function defined elsewhere in the same binary. What's not allowed is doing so to link together multiple Rust binaries (e.g. dynamic linking), but doing so within a single built binary is perfectly allowed.
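A minimal sketch of the linkage pattern described above, with both halves placed in one file for brevity. In practice the declaration would live in the interface crate and the definition in some other crate of the same binary. The symbol name `facade_demo_fill` and the module names are made up for this sketch. It uses the `unsafe extern` and `#[unsafe(no_mangle)]` spellings accepted by recent toolchains; older toolchains spell these `extern "Rust"` and `#[no_mangle]`.

```rust
pub mod facade {
    // Declaration only. The compiler cannot check that the definition
    // elsewhere in the binary really has this signature.
    unsafe extern "Rust" {
        fn facade_demo_fill(dest: &mut [u8]) -> Result<(), u32>;
    }

    /// Safe wrapper that dependent crates would call.
    pub fn fill(dest: &mut [u8]) -> Result<(), u32> {
        // SAFETY: sound only as long as the `#[no_mangle]` definition
        // has exactly the signature declared above.
        unsafe { facade_demo_fill(dest) }
    }
}

pub mod backend {
    /// The definition, hooked up purely by its unmangled symbol name.
    #[unsafe(no_mangle)]
    pub fn facade_demo_fill(dest: &mut [u8]) -> Result<(), u32> {
        for (i, byte) in dest.iter_mut().enumerate() {
            *byte = i as u8;
        }
        Ok(())
    }
}
```

If the definition were missing, the failure would surface as a linker error about an undefined symbol; if its signature differed from the declaration, nothing would be reported at all. Those are the two problems a compiler-aware mechanism is meant to address.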

The purpose of using a trait isn't to add new powers beyond "typechecked extern linkage;" it's to have an actual definition of what the API hole is without introducing entirely new concepts into the language. #[global_allocator] hooks up via trait GlobalAlloc. Behind the scenes that's defining and using magic known symbols __rust_alloc, __rust_dealloc, and __rust_realloc[2], but a trait is used to describe the set of functions that must be provided as a set.

Global resource interfaces fundamentally are a global dyn Trait object (plus vtable). Modelling it as such only makes sense.


  1. This probably won't actually happen because the Global allocator has extra special semantics probably not present when using the allocator implementation directly (namely, the allocations being considered trivially replaceable/removable for optimizations), but it could still be done (if Global is a magic wrapper around the actual resource plumbing which adds the magic semantics). ↩︎

  2. (names subject to being entirely unstable implementation details) ↩︎

Are you talking about a default facade implementation? Note that the proposal includes this capability via the default-facade-impl section. If a binary crate does not define its own facade impl, then the default crate will be used, i.e. in the case of alloc it would define a hypothetical system_alloc crate as its default.

I wouldn't say fundamentally. It's just the easiest way to implement them currently.

With this proposal, crates which implement global resources move to the bottom of a project's dependency tree, so the compiler will be able to work with their code as with simple functions and types. For example, we can imagine a hypothetical Logger trait with an associated LOG_LEVEL constant. Such a trait is not object safe, so it can't be used as dyn Trait, but it can potentially work with this proposal.
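A sketch of what such a trait could look like in today's Rust. The `Logger` trait, its `LOG_LEVEL` constant, and the `Quiet` and `Verbose` types are hypothetical. Because of the associated constant, `&dyn Logger` would be rejected by the compiler, so the trait can only be used through static dispatch, as in the generic `log` function below.

```rust
pub trait Logger {
    /// Highest level this logger records. Known at compile time.
    const LOG_LEVEL: u8;

    fn format(&self, message: &str) -> String;
}

pub struct Quiet;

impl Logger for Quiet {
    const LOG_LEVEL: u8 = 1;

    fn format(&self, message: &str) -> String {
        format!("[quiet] {}", message)
    }
}

pub struct Verbose;

impl Logger for Verbose {
    const LOG_LEVEL: u8 = 5;

    fn format(&self, message: &str) -> String {
        format!("[verbose] {}", message)
    }
}

/// Statically dispatched. For each concrete `L` the comparison against
/// `L::LOG_LEVEL` is a comparison against a constant.
pub fn log<L: Logger>(logger: &L, level: u8, message: &str) -> Option<String> {
    if level <= L::LOG_LEVEL {
        Some(logger.format(message))
    } else {
        None
    }
}
```

With a single implementation chosen per binary, a call site such as `log(&Quiet, 3, "...")` gives the optimizer the chance to remove filtered-out calls entirely, which a `dyn`-based global logger cannot offer as directly.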

Well, Rust already "supports" it and getrandom even uses it on unsupported platforms. But as explained by @CAD97, it has various ergonomic and safety issues. This is why, for example, log does not use it and instead relies on dynamic dispatch.

Well, that's precisely what I was getting at. We need full, proper support. Using no_mangle is basically going to C level imo. (Yes, I realise the ambiguity, since system level equates to C level on the popular platforms for historical reasons.)

There is some level of integration and cooperation between rustc and the linker afaik, and I'm happy to see that used here. I agree with the point that otherwise the linker errors would be less human friendly. But I'd prefer if we could avoid requiring further and deeper integration into cargo that requires new specialised crate types.

What I'm after, from a user's perspective, is the ability to provide rustc with regular crates as link-time dependencies that work in concert with true Rust-level external symbols.

Also, the current trait for the global allocator is an implementation detail of the library author that I should not care about, and there is little benefit to exposing it rather than its functions for improved error handling. If I see an error that there was no implementation found for allocate, then it is already plenty clear what's missing from the rustc command line.

So in summary, the simplest and smallest feature here imo is:

extern "rust" fn alloc(...) 

I can see in the source code and docs that it is clearly marked as an external item, which could potentially have a default implementation, but which I could override by using, say, a regular jemalloc crate that defines this function.

If it is missing I expect the error to point to this specific item. No magic symbols or unmangled / untyped C names so it is clearly understood within the confines of the Rust language conceptual model.

And yeah, we could later add support for fully extern traits if there's sufficient demand. I am just sceptical that this would be worth the hassle.

If the compiler has everything all at once it can optimize the memory layout and avoid a heap allocation (that's, like, converting the trait object into an enum). Since façade crates / interface crates would presumably be resolved by Cargo before passing anything to rustc, the compiler could just receive this information as input.

This may make no_std users more likely to use this mechanism too.

There’s no heap allocation necessary even without Cargo helping; &'static dyn Foo works fine. But if it’s provided as a symbol (or through some other compile-time mechanism) then compiling with LTO could devirtualize those calls (and possibly without LTO), and that is beneficial.
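A small sketch of the point about `&'static dyn Foo`: the implementation lives in static memory and the global is just a fat pointer to it, so no `Box` or allocator is involved. `Foo`, `Impl`, `GLOBAL`, and `call_global` are placeholder names for this illustration.

```rust
/// `Sync` is required so that a reference to the trait object can be
/// stored in a `static`.
pub trait Foo: Sync {
    fn id(&self) -> u32;
}

struct Impl;

impl Foo for Impl {
    fn id(&self) -> u32 {
        7
    }
}

// The instance lives in static memory. No heap allocation takes place.
static INSTANCE: Impl = Impl;

/// The global "resource": a data pointer plus a vtable pointer.
pub static GLOBAL: &dyn Foo = &INSTANCE;

/// Calls through the vtable unless the optimizer can see the concrete type.
pub fn call_global() -> u32 {
    GLOBAL.id()
}
```

Whether the call in `call_global` gets devirtualized depends on what the optimizer can see at that point, which is where a compile-time mechanism or LTO comes in.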

I don't think that is possible. The implementation crate depends on the interface crate, which means that the interface crate has to be compiled first and thus the information necessary can't be provided when compiling the interface crate.

What I think would work is to mangle the interface symbols as if they are in the interface crate and then directly use those symbol names in the implementation crate as the names of the symbol definitions. This avoids all indirection.