[Pre-RFC] Forward impls

This is a rough idea I’ve had for a while for solving the coherence issues I tend to come across. I’m skeptical of the potential/necessity of this RFC, but I decided to just post it anyway and see what others think.


  • Feature Name: forward_impl
  • Start Date: 2017-01-18
  • RFC PR: (leave this empty)
  • Rust Issue: (leave this empty)

Summary

Add the ability to “forward-declare” that an impl exists, to be used to sidestep the orphan rules in multi-crate scenarios.

Motivation

Currently we have a set of “orphan rules” that ensure that you can never come across a situation where two crates provide conflicting impls for the same type-trait pair.

Part of the reason the rules are strict is that they try to ensure the property that your program will not break simply by virtue of including another crate. If this were not the case, we could simply allow any impl, do a global search during compilation, and error out when we find conflicts.

This is a good property to have. The programmer should not have to deal with errors that are fundamentally not their problem; errors that originate from other crates.

However, sometimes the programmer is responsible for these “other crates”. Often, crates get split up for better modularity, better compile times, or simply the ability to pull in only a part of the functionality offered by a crate. In these cases, the crates are maintained by the same person and are logically a single unit.

Examples of this pattern include the myriad of stdlib crates shipped with rustc, or the components/ crates in Servo.

For example, let’s say we want to implement Add<str> for str where the output is a String. This can’t currently be done, since implementations on str can only exist in libcore, but the output type is defined outside it. We’ve had similar issues crop up in Servo, which have usually been hacked around.
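For context, here is a minimal sketch of how this concatenation is usually spelled today, since `a + b` with two `&str` operands is rejected by the compiler. The helper name `concat_str` is made up for illustration and is not part of the proposal.

```rust
// Hypothetical helper (the name is an assumption, for illustration only).
// `a + b` does not compile when both operands are `&str`, so the left
// operand is first turned into an owned `String`, for which `Add<&str>`
// is implemented.
fn concat_str(a: &str, b: &str) -> String {
    a.to_owned() + b
}
```

Calling `concat_str("foo", "bar")` yields the `String` that `"foo" + "bar"` would produce if the impl described above could be written.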

It would be nice to have an escape hatch for punching through the coherence rules in such cases.

Detailed design

Introduce a new item type, “forward impl”. This is basically a forward-declaration, and lets an impl exist split across two crates.

// libcore

forward impl StringAddition<S> {
    impl Add<str> for str {
        type Output = S;
        fn add(&self, other: &Self) -> S;
    }
}

forward impl StringToOwned<S> {
    impl str {
        fn to_owned(&self) -> S;
    }
}

// libstd

fill impl core::StringAddition<String> {
    impl Add<core::str> for core::str {
        type Output = String;
        fn add(&self, other: &Self) -> String {
            // ...
        }
    }
}

fill impl core::StringToOwned<String> {
    impl core::str {
        fn to_owned(&self) -> String {
            // ...
        }
    }
}

Here, libcore declares that a pair of impls may exist. Other crates must operate on the assumption that the impl exists as far as coherence is concerned. However, from the point of view of typechecking and code generation, this implementation does not exist unless libstd has been pulled in explicitly. The generics on the impl represent types which the downstream crate must fill in.

The fill impls work like lang items: only one crate may define them in the whole dep graph. The intention of the system is not to ever have conflicts here. If I, the owner of crate A, add a forward impl in A, I should fill it in in crate B from the same “package” (e.g. “Servo components”, “Rust std distribution”, etc). Users are allowed to include A but not B, but they are not supposed to fill in the forward impl unless it’s an exceptional circumstance (e.g. implementing your own libstd).

In case one of the unbound types on the forward impl is in the inner impl’s trait or target, from the point of view of coherence from other crates this is to be treated as a blanket impl. So, if we have:

// this is a simplistic example; such a forward impl isn't necessary since
// you can already impl foreign traits on foreign types if you substitute local
// types in the parameters of the foreign type. But in more complex situations
// (e.g. more generics) this would be useful.

// crate A
struct MyStruct<T>;
forward impl SomeImplName<S> {
    impl SomeTrait for MyStruct<S> {
        ...
    }
}

// crate B
fill impl SomeImplName<SomeType> {
    impl SomeTrait for MyStruct<SomeType> {
        ...
    }
}

then from the point of view of crates including A but not B, this acts, as far as coherence is concerned, as if there were an impl<T> SomeTrait for MyStruct<T> in A. Ordinarily, things like impl a::SomeTrait for a::MyStruct<LocalType> are allowed in foreign crates; however, this will make them impossible, since there’s no guarantee as to what the filling impl will substitute (we assume that the filling impl may substitute any type, and so treat it as a blanket impl).

From the point of view of crate A, this impl simply doesn’t exist. Crate A is free to write its own impls of SomeTrait for MyStruct<SomeOtherLocalType>. From the POV of B and crates including B, this impl does exist, so they too may write more such impls.

How We Teach This

Dum de dum do this later.

Drawbacks

None whatsoever. My rfcs are perfect.

But seriously, this feels a bit like a hack (like #[fundamental]), and it’s unclear how useful it will be.

It’s another niche/confusing feature, like #[fundamental]. On top of that, it makes it possible for your crate to break by the mere existence of another crate in the dep graph. This is already true because of lang items, but lang items don’t get used so this hasn’t been a problem, whereas I would like to see this feature get used. It should be fine as long as people use this feature with discipline – not defining fill impls when they weren’t the ones who defined the forward impls.

Alternatives

Just don’t do it. It’s not that pressing a need. I’ve felt it often, but usually it can be awkwardly worked around.

Unresolved questions

Do we even allow inherent impls to be forward declared? What’s wrong with traits?

What should the syntax be? My original proposal was:

#[forward_declare(StringAddition)]
impl Add<str> for str {
    type Output = _;
    fn add(&self, other: &str) -> Self::Output;
}

and

#[fills_declaration(core::StringAddition)]
impl Add<str> for str {
    type Output = String;
    fn add(&self, other: &str) -> String { /* body */ }
}

In general the forward impl proposal is introducing a tricky context-sensitive keyword that I’d rather avoid.

impl forward might be better. Or impl(forward). Idk.

This doesn’t solve the problem when the types involved and the trait being implemented are in three different crates.

I am strongly inclined against this feature.

This would break the guarantee orphan rules exist to uphold because it could make it possible for two crates to define the ‘fills_declaration’ for the same trait, making them impossible to build together. Though this requires a serious misuse of the feature by the shared parent crate, it is still something that could happen and lead to an Irreconcilable Ecosystem Split Disaster.

It is a lot of infrastructure & likely very confusing for many users. The gain does not seem worth this downside, setting aside the previous objection.

I think that between specialization and mutual exclusion we can still do a lot to solve this by enabling a greater variety of blanket impls. Until those designs are fully implemented and we know their success or failure at combating orphan issues, I am reluctant to explore new ways to violate coherence.

I agree that there is a real problem here, but I am not sure that this is the best solution. Still, this is an interesting wrinkle on the problem I had not considered before. BTW, I just opened up https://github.com/rust-lang/rfcs/issues/1856 in an effort to catalog the shortcomings we would like to address, since I did not find an existing issue.

This seems similar to the idea (which has come up in one of these long coherence/orphans threads before) to allow an upstream crate to delegate the right to declare particular impls to particular downstream crates. E.g. core could say “I promise not to implement this myself, and if std wants to do so instead, please let it”. The advantage of this is that since the crate which gets to declare the impl is still unambiguous, there is no potential for conflicts. (The impl-declaring capability has been “moved” and not “copied”.) That might also be the downside: is it possible for an upstream crate to refer to a downstream crate at all?


So, crate names alone are not guaranteed to be unique in a particular program; but crate names + some metadata are required to be unique (e.g., crates.io creates metadata derived from the version number, but other people might use a distinct scheme). So the upstream crate could just authorize a "crate name" to supply impls, but that alone would not guarantee uniqueness.

In other words, you might have a crate X that authorizes a (downstream) crate Y to add an impl for X::Type. But then you have two copies of Y (say, versions 0.1 and 0.2), both of which use X 0.1. In that case, they are both trying to implement the trait for X::Type, possibly in different ways.
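The scenario can be modelled with a toy check, sketched below. Everything in it is an assumption made for illustration: `CrateId`, `conflicting_fills`, and the idea of identifying a forward declaration by a string are hypothetical and say nothing about how rustc or Cargo represent crates.

```rust
use std::collections::BTreeMap;

/// A crate instance in the dependency graph, identified by name plus version
/// (a simplified stand-in for "crate name + some metadata").
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct CrateId {
    name: &'static str,
    version: &'static str,
}

/// Given `(crate, declaration it fills)` pairs, returns every declaration
/// that is filled by more than one crate instance, with those instances.
fn conflicting_fills(
    fills: &[(CrateId, &'static str)],
) -> BTreeMap<&'static str, Vec<CrateId>> {
    let mut by_decl: BTreeMap<&'static str, Vec<CrateId>> = BTreeMap::new();
    for (krate, decl) in fills {
        by_decl.entry(*decl).or_default().push(krate.clone());
    }
    // Keep only the declarations with two or more fillers.
    by_decl.retain(|_, fillers| fillers.len() > 1);
    by_decl
}
```

With Y 0.1 and Y 0.2 both filling the same declaration from X, the returned map has one entry listing both copies of Y. Authorizing the name "Y" alone does not rule this out.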


This smells vaguely like it might be a job for parameterised crates. After all, if there’s no issue with having both X:0.1 and X:0.2 in the graph, surely there should be no problem with having both X:0.1<Y:0.1> and X:0.1<Y:0.2> in the graph… Then, the default import for core would be extern crate core<self>; in std… or something. If someone wants to use core<totally_not_std>, well, that’s their business.


Last year, I posted a similar proposition for forward declaring impls (“requiring” was the term that I used, but forward declaring is actually better).

There was one difference, though: the crate that fills the declaration should provide nothing other than the impl. This is to disentangle dependencies; depending on a crate that provides a particular impl filling the forward declaration for functionality unrelated to the forward declaration becomes unlikely. I think that this would prevent ecosystem splits, but of course, it might make the forward declaring functionality a bit less useful. Anyway, it’s one more idea for preventing splits.

Edit: Oops, the proposals are different in other aspects too. Well, whatever; the idea that the crate providing an impl isn’t allowed to provide anything else stands on its own.


This is fundamentally an extension of the trait system to support nullary type classes. Maybe recognizing that will give some context for whether it’s a good or bad idea?

The ecosystem split isn't when you want to directly depend on two different crates which provide this impl, but when diesel depends on a crate which provides the impl, and rocket depends on a different crate that provides the same impl. Now you can't use diesel and rocket together.

I don’t see how this is nullary type classes, can you explain?

(My understanding is that Haskell does not have the same orphan rules as Rust does, and so the examples that motivate this are just allowed inherently in Haskell.)

One potential alternative option would be to allow users to tag the implementation of the trait with a name, and then require users to explicitly import the implementation of the trait in the module where they want to use the implementation. Using the string addition example in the RFC, libstd would add impl blocks to str like so:

// In `libstd`

impl Add<core::str> as StringAddition for core::str {
    //              ^^ ^^^^^^^^^^^^^^
    // This implementation of `Add` is given the name `StringAddition`, which must be explicitly
    // imported by the user in order to actually work. `as` is being used here to associate a name
    // with an implementation.
    type Output = String;
    fn add(&self, other: &Self) -> String {
        // ...
    }
}

impl core::str as StringToOwned {
    //         ^^ ^^^^^^^^^^^^^
    // Here, the `impl` block itself is being namespaced as `StringToOwned`. Again, `as` is being
    // used to associate the name and the user would need to explicitly import this implementation.
    fn to_owned(&self) -> String {
        // ...
    }
}


// In user code

// Without this import, the implementations of `Add` and `to_owned` for `str` in `libstd`
// effectively don't exist. Attempting to use them would result in an error.
use std::{StringAddition, StringToOwned};

Doing that would let the user sidestep the orphan rules, without risking code breaking because another, seemingly unrelated crate was included.


This is an interesting idea but it pushes the problem into the namespace system; when two libraries expect you to import their impls (which aren’t coherent), you now can’t compose their behaviors inside the same module.


All lang items are conceptually nullary type classes underneath, even if they’re not implemented that way. The compiler declares the classes and their signatures (which may include both functions and data types), and the whole lang-item system is dedicated to maintaining coherence in the face of another crate providing a “blanket” implementation (which is the only kind of implementation, as far as NTCs are concerned). This “forward-declaration” idea is the same thing, but exposed to the user.

This is akin to Idris’s “named implementations;” maybe looking at that can give some ideas?

That’s an interesting perspective. To make sure I understand this, you’re saying that the downstream client is providing impl StringAddition, and the upstream client is essentially saying impl Add<str> for str where StringAddition.

But it seems the significant thing is that these ‘nullary impls’ wouldn’t be required to obey the orphan rules (under the orphan rules, you couldn’t impl another crate’s nullary trait), which isn’t inherent in the idea of nullary impls. Perhaps nullary impls provides a less exotic syntax at least.


What is the advantage of these "named impl blocks" over the current semi-standard workaround of a "newtype wrapper"? Both disambiguate by creating a new name that the user has to explicitly import.
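For reference, a minimal sketch of the newtype workaround applied to the string addition example. The wrapper name `Text` is hypothetical; the point is only that a local type makes the impl legal under the orphan rules.

```rust
use std::ops::Add;

/// Hypothetical local wrapper around a borrowed string slice.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Text<'a>(&'a str);

// Allowed by the orphan rules because `Text` is local to this crate,
// even though `Add`, `str`, and `String` are all foreign.
impl<'a, 'b> Add<Text<'b>> for Text<'a> {
    type Output = String;

    fn add(self, other: Text<'b>) -> String {
        let mut joined = String::with_capacity(self.0.len() + other.0.len());
        joined.push_str(self.0);
        joined.push_str(other.0);
        joined
    }
}
```

The user has to write `Text("foo") + Text("bar")` and import `Text`, which is the explicit opt-in that the named impl blocks would also require.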


More generally, I very strongly agree with this point:

Forward impls are one of many interesting ideas to mitigate orphan rule frustration, but I think we need to finish the existing work on specialization before we can tell which of those ideas still fill a gap that needs filling.

Yes, that’s exactly what I’m saying. Of course, the syntax could be wildly different (bikeshedding is the soul of feature design, after all :sweat_smile:), but the semantics should be exactly that of NTCs.

Hmm, true. A workaround for that would be to require the universal function call syntax to specify the desired implementation if you wanted to use a trait with two conflicting implementations imported to the same module, but that could get pretty ugly. Another potential issue would be deciding which implementation to use if a user passed the implemented type to a generic function that’s unaware that there are two conflicting implementations - one option would be to use as $impl to specify the desired implementation, but that’s also pretty wordy.
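As a rough analogy in today's Rust, fully qualified syntax already disambiguates between two different traits that provide a method with the same name. It cannot choose between two impls of the same trait, which is what named impls would need, so this only hints at how wordy the call sites would become. The traits `PlainJoin` and `SpaceJoin` and the function `join_both` are hypothetical.

```rust
/// Hypothetical trait: joins two strings with nothing in between.
trait PlainJoin {
    fn join_with(&self, other: &str) -> String;
}

/// Hypothetical trait: joins two strings with a single space in between.
trait SpaceJoin {
    fn join_with(&self, other: &str) -> String;
}

impl PlainJoin for str {
    fn join_with(&self, other: &str) -> String {
        format!("{}{}", self, other)
    }
}

impl SpaceJoin for str {
    fn join_with(&self, other: &str) -> String {
        format!("{} {}", self, other)
    }
}

fn join_both(a: &str, b: &str) -> (String, String) {
    // `a.join_with(b)` would be ambiguous here because both traits are in
    // scope, so each call names the trait it wants explicitly.
    (
        <str as PlainJoin>::join_with(a, b),
        <str as SpaceJoin>::join_with(a, b),
    )
}
```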

Although, a wordy solution is still better than code unexpectedly breaking upon including a new crate.

In any case, all such NTCs can be emulated with existing traits by providing and relying on impls for (e.g.) () and nothing else.
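A minimal single-crate sketch of that emulation, under some assumptions: `StringAddition` is modelled as an ordinary trait that is only ever implemented for `()`, and a hypothetical local wrapper `Text` stands in for `str` so that the example compiles on its own. Splitting the two halves across crates as the proposal intends would still be subject to the usual orphan checks.

```rust
use std::ops::Add;

/// Stand-in for the "nullary" class: it is only ever implemented for `()`.
trait StringAddition {
    type Out;
    fn concat(a: &str, b: &str) -> Self::Out;
}

/// Hypothetical local stand-in for `str`, so the sketch fits in one crate.
struct Text<'a>(&'a str);

// "Upstream" half: the impl is conditional on `(): StringAddition` and
// takes both its output type and its behaviour from that impl.
impl<'a> Add for Text<'a>
where
    (): StringAddition,
{
    type Output = <() as StringAddition>::Out;

    fn add(self, other: Text<'a>) -> Self::Output {
        <() as StringAddition>::concat(self.0, other.0)
    }
}

// "Downstream" half: the single impl for `()` plays the role of the fill.
impl StringAddition for () {
    type Out = String;

    fn concat(a: &str, b: &str) -> String {
        [a, b].concat()
    }
}
```

With both halves present, `Text("foo") + Text("bar")` evaluates to a `String`. Removing the impl for `()` makes the `where` clause unsatisfiable, which the compiler reports as an error.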

Except for the orphan rule aspect, which seems like the most important part.