Orphan rules

@ ezyang

Also, your proposed rule at the end doesn't work for reasons explained in Niko's post.

It doesn't work on its own, but it does work when combined with the compiler explicitly enforcing non-conflicting impl's where possible. That was the point I was trying to get across. Most of the discussion here seems to be focusing on making everything work exclusively using orphan rules, but so far all of the orphan-rule-only solutions seem to make some unfortunate trade-offs (too liberal, too conservative, unintuitive and surprising restrictions...). I'm suggesting that a hybrid approach may do a better job of getting us the behavior we want.

I'm thinking I may need to make some diagrams to better illustrate what I'm getting at. But in a nut-shell, the compiler can quite feasibly enforce a lot regarding impl's and coherence with direct checks:

  • It can enforce that there are no conflicting impl's within a crate, and...
  • It can enforce that a crate doesn't create impl's that conflict with impl's in external crates it depends on.

That takes care of a huge amount of the problem right there. As far as I can tell, the only case not handled by it (at least, in a non-problematic way) is when there is a diamond in the crate dependency DAG. I gave an example of that earlier:

struct CrateABigInt { ... } // in crate A
impl<T> Show for (T, CrateABigInt) { ... } // in crate B
impl<T> Show for (T, CrateABigInt) { ... } // in crate C
// Then some crate D tries to use both B and C

But we can use an orphan rule to prevent such a case. And it can be an orphan rule that would normally be too liberal, because the direct checking that the compiler does will handle the cases that would normally be problematic for the more liberal orphan rule.

Hmm... I hope I'm making sense here.

On Thu, Jan 15, 2015 at 05:54:33PM +0000, Edward Z. Yang wrote:

Apparently I will never learn to only top-post when using discuss. Here is my reply to ezyang:

This proposal seems to be basically what I sketched out in that e-mail I sent out a long time ago (this is the same message that was linked from the original blog post):

https://mail.mozilla.org/pipermail/rust-dev/2011-December/001036.html

When I say “introduce incoherent traits”, this is basically what I am thinking of. Introducing the ability for traits to be declared where coherence doesn’t apply, but at the cost that the user must manage the impls more explicitly in the case of overlap.

As you say, though, I think this ship has sailed for the moment. We can come back and revisit this question later but right now is not the time to go mucking with the core trait system!

This is an interesting idea, but the rules as you wrote them are not how I would interpret “covered self || covered”. I think what you meant is perhaps something like this.

Given impl<T...> RemoteTrait<B...> for A.

We say that a type A or B... that appears in the impl declaration is “covered” if:

  • the type contains a local type constructor.
  • All impl type parameters T... that appear within the type are covered by a local type constructor.

An impl is legal if any of the following hold:

  1. A is covered

  2. A is external and contains no type parameters and at least one of B... is covered.

I believe this is sound, the argument runs roughly like this. Given two cousin impls:

  1. If A1 and A2 are both local types in their respective crates, then we know that A1 and A2 cannot be equal due to the basic coverage proof. After all, you could make another trait RemoteTraitX that has no type parameters and two impls impl<...> RemoteTraitX for A1 and impl<..> RemoteTraitX for A2. These two impls meet the “covered” criteria and hence if arielb1’s proof is correct, A1 and A2 cannot be equated.

  2. If A1 is local and A2 is not (or vice versa), because A2 contains no type parameters, it cannot possibly instantiate to A1, since A1 is defined in crate 1 and not visible to crate 2 (since the crates are cousins).

  3. If both A1 and A2 are foreign, then we know that at least one type in B1... and B2... is covered. These two must be disjoint due to the proof by arielb2. Put another way, if both A1 and A2 are foreign, then both impls meet the simple “covered” rules, and those were proven sound.

In terms of my tables this rule behaves basically the same as “covered everywhere” (which was the variation where you allow arbitrary type parameters after the first covered type). This makes sense because it is fairly similar.

+----------------------------------------------------------+---+---+---+---+---+---+
| Impl Header                                              | O | C | S | F | E | | |
+----------------------------------------------------------+---+---+---|---|---+---+
| impl Add<i32> for MyBigInt                               | X | X | X | X | X | X |
| impl Add<MyBigInt> for i32                               | X | X |   | X | X | X |
| impl<T> Add<T> for MyBigInt                              | X |   | X | X |   | X |
| impl<U> Add<MyBigInt> for U                              |   |   |   |   |   |   |
| impl Modifier<MyType> for Vec<u8>                        | X | X |   | X | X | X |
| impl<T> Modifier<MyType> for Vec<T>                      |   |   |   |   |   |   |
| impl<'a, T> FromIterator<T> for Cow<'a, MyVec<T>, [T]>   | X | X | X | X | X | X |
| impl<'a, T> FromIterator<T> for Cow<'a, [T], MyVec<T>>   |   | X | X | X | X | X |
| impl<T> BorrowFrom<Rc<T>> for T                          |   | X |   |   | X |   |
| impl<T> Borrow<T> for Rc<T>                              | X | X | X | X | X | X |
| impl<H> Hash<H> for MyStruct                             | X |   | X | X | X | X |
| impl<I:Int,T> Index<I> for MyVec<T>                      | X |   | X | X | X | X |
+----------------------------------------------------------+---+---+---+---+---+---+
O - Ordered / C - Covered / S - Covered Self / F - Covered First
E - Explicit Declarations / | - Covered Self || Covered.

I think what this rule gains is that it is less order dependent – it only makes Self quite special. What is looses is that you can’t have a case where you have a trait like Add that has two "self-like" traits PLUS auxiliary type parameters.

So for example given a trait like this:

trait AddHash<R,H> {
   // Goal:
   // - L,R should be roughly self-like
   // - H should be a "hasher"
   ...
}

I cannot write an impl like this:

impl<H> AddHash<BigInt,H> for i32 { ... }

I’m not sure if this is a big deal, but it wouldn’t work.

I just realized that I’ve been totally misunderstanding the problem. Apologies for the noise, please ignore my previous posts.

If I read this right (it’s quite likely I’m not), I’m not sure this enforces coherence? Consider:

// trait crate
trait Foo<A, B> { ... }
// crate 1
struct Crate1Type;
impl<T> Foo<Crate1Type, T> for i32 { ... }
// crate 2
struct Crate2Type;
impl<T> Foo<T, Crate2Type> for i32 { ... }

If I’m not mistaken, "at least one of B..." is covered in both crate 1 and crate 2, but the two impls are incoherent since the compiler cannot resolve the implementation for Foo<Crate1Type, Crate2Type>. (This is basically another version of the tuple argument on your blog.)

I think you probably need "all of B..." to be covered, rather than “at least one”.

OK, cool (and I assume you’ll use applicative semantics when trait composition is involved.) One thing that I still worry about, you essentially won’t be able to change whether or not a trait is coherent or incoherent for any existing traits, so adding this functionality in a future version of Rust won’t help out people who need something like orphans for core library traits.

I don’t have any amazing ideas, only some food for thought (mostly-disjoint courses, per paragraph, in a random order):

It’s worth remarking upon the fact that GHC fails to ensure coherence even in the absence of orphan instances. The problematic scenario is the same as the one we’re trying to solve here!

A way I like to think about type classes and coherence is that a type class Foo a is the combination of a dictionary type (Foo) and an open function (a: Type) -> Foo a which maps types to their dictionaries. Being an open function, its definitions (“arms” or whatever they’re called) do not need to be in one place, but may be interspersed throughout the program. (Correspondingly, Type is an "open enum" with “variants” interspersed throughout the program.) From this perspective, what we refer to as “coherence” boils down to ensuring that this function actually be a function - that it maps each input to no more than one output. (However, it will nearly always be partial, i.e. will lack an output for some inputs - when there is no impl.)

It’s not obvious to me that impl UnaryTrait for (A, B) is really “the same thing” as impl BinaryTrait<B> for A. It’s true that they are structurally similar from the perspective of coherence checking. But otherwise I think that this interpretation is just an accident of syntax and language. Syntax: Rust provides nice, privileged syntax for the built-in product types (including pairs). But product types have a particular meaning: what makes impl for Product<A, B> special here, as opposed to for Sum<A, B>, or for AnythingElse<A, B>? Language: we might want to read the first impl as "implement UnaryTrait for the pair of types A and B". But this is not actually right: "implement UnaryTrait for the pair type of A and B" is more accurate.

In the case of the MyVec and IntoIterator example from @nikomatsakis’s blog post, and all analogous cases, where you want both Add<MyVec<T>> for I as well as Add<I> for MyVec<T>, leading to overlap in the case of Add<MyVec<T>> for MyVec<T> (which impl should we choose?), what you really want to say is “it doesn’t matter, they have the same semantics”. But it’s not clear if there’s a good way to say this.

Along similar lines as the idea mentioned by @nikomatsakis, that orphan impls could be allowed in the “final” crate, which is the executable itself, an idea that has been bandied about a bit in the Haskell community to solve “the impl bson::ToBson for mysql::Value problem” is the ability to “forward declare” impls - to delegate the right to create the given impl, which would otherwise be considered an orphan, to a particular external crate. As the right is delegated only to a single, well-identified external crate, coherence is maintained. (This can be seen as a generalization of the “orphan impls in the executable crate” idea - where the right to create an orphan impl is delegated to “the main crate” by default, but can be overridden to specify a different crate where desired.) The big problem with this idea in the context of Rust, as far as I see it, is how in the world can a crate refer to the traits and types in the impl being forward declared, as well as the crate being delegated to, without depending on them? Haskell’s global module namespace otherwise feels like a misfeature, but it comes in handy here.

The long-time germ of an idea I’ve had myself for solving the same problem would be to allow “weak dependencies” between packages, along with “optionally present” items. If a package bee has a weak dependency on aye, then having aye is not a requirement to install bee, but certain (“optionally present”) items in bee (i.e. the ones which actually depend on aye) only become available when aye is installed. In practice, for example, bson would declare a weak dependency on mysql via Cargo, and have #[cfg(is_present(mysql))] impl ToBson for mysql::Value - or something like that. This raises various “interesting” engineering challenges, such as requiring bson to be recompiled if mysql is installed later on, which may or may not be surmountable. Notably this resolves “the triangle problem” at the technical level, but leaves it intact at the social level - the compatibility shim between mysql and bson must be provided and maintained by either mysql or bson, and the authors still need to come to an agreement about which it will be. That seems like a desirable thing, however. (Better than “disconnected” third-party compatibility shim packages from third-party authors floating around, I think.)

3 Likes

impl UnaryTrait for MyPair<A,B> and impl UnaryTrait for Result<A,B> are the same thing with respect to selection as impl UnaryTrait for (A,B) – selection doesn’t care about a type’s semantics, I used the 2-tuple only because it was the simplest binary type constructor I could find.

Yes, this. If part of the desire is to make rules that are "intuitive", a reader of the syntax will read an impl declaration like:

impl LHS for RHS

The LHS and the RHS will always be conceptually different to the user, and the coherency checking logic should, IMO, reflect that. This may have already been said already (I have to admit finding a lot of the notation used in this thread confusing, it's likely I'm just repeating @kennytm), but I'd think a rule that works is: "at least one of LHS or RHS is covered", using Niko's definition of "covered". So we'd get:

impl Add<i32> for MyBigInt : YES
yes: impl Add<MyBigInt> for i32 : YES
yes: impl<T> Add<T> for MyBigInt : YES
impl<U> Add<MyBigInt> for U : YES
impl Modifier<MyType> for Vec<u8> : YES
impl<T> Modifier<MyType> for Vec<T> : YES
impl<'a, T> FromIterator<T> for Cow<'a, MyVec<T>, [T]> : NO
impl<'a, T> FromIterator<T> for Cow<'a, [T], MyVec<T>> : NO
impl<T> BorrowFrom<Rc<T>> for T : NO
impl<T> Borrow<T> for Rc<T> : NO
impl<H> Hash<H> for MyStruct : YES
impl<I:Int,T> Index<I> for MyVec<T> : YES

There's no order dependency, but it would break Cow. On the other hand, the Cow definition:

pub enum Cow<'a, T, B: ?Sized + 'a> where B: ToOwned<T>

enforces that there's a type-relationship between T and B, so that it could be argued that covered(T) => covered(B) (though I don't know how complicated it would be for the compiler to represent that relationship for the purposes of this check). If this restriction weren't enforced, it strikes me as likely that the Cow type could violate coherence for the same reasons already described:

// crate 1
struct MyType;
impl<'a, T> FromIterator<T> for Cow<'a, MyType, T>;
// crate 2
struct YourType;
impl<'a T> FromIterator<T> for Cow<'a, T, YourType>;
// crate 3
// I know the syntax isn't correct here, but maybe the idea comes across.
let x: Cow<'a, MyType, YourType> = ...;
// which FromIterator<> implementation should be chosen for x??

This is not true. If you have MyVec and YourVec, and both of them can be added to anything that supports IntoIterator, and I do MyVec + YourVec, what type do I get in return?

So crates in Rust are actually a global namespace, or pretty close. In any case, this (and the idea of weak dependencies) is food for thought.

Sorry, to clarify: Your example concerned only two impls on MyVec, and in that case, it is true that it doesn't matter (semantically, at least) which one we pick. But (as my example is intended to demonstrate) it does matter as soon as you permit those same two impls to be used from more than one crate.

Yes.

Yes, though perhaps we can find a solution that is backwards compatible (such as the delegation that @glaebhoerl was suggesting). (Or some way to phrase the feature such that downstream crates must also opt-in to the "orphan-y" impls.) I think we can cross this bridge when we get there though it's good to keep our eye on it.

It's worth keeping in perspective, though: I consider Java to be one of the most, if not THE most, successful languages out there in terms of growing up a huge library of interoperable packages and components, and they managed to do so while having a much stronger form of coherence than we do.

Yeah, good point. You get this problem as soon as the impl for one of the two types dispatches on the LHS, and the other on the RHS. So either they need to both dispatch on the LHS, or both on the RHS, and the ordering rule ensures that they do so. This makes sense.

My gut response would be MyVec (the one on the left), which I guess supports choosing an order-based approach.

During a discussion with some Rusters, I discovered that you have a Deref trait which allows implicit coercion when the outer type doesn’t implement a function. It seems like this functionality could interact poorly with any future proposal to allow (limited) orphans into the language, since it would not, in general, be possible to tell locally if a trait is implemented.

That's not quite right. The coercion triggers if a type &U is expected and you are providing &T, and T derefs to U. It is not currently based on any analysis of bounds on T or U. You can read more in the RFC.

It also doesn't have any impact on what traits are considered to be implemented (unless those trait impls explicitly mention the Deref trait, which works like any other trait). I believe the coercion is entirely orthogonal to coherence.

I hope it’s ok to resurrect this old thread. I’ve been reading Niko’s blog post and this and some other discussions and I’m hoping that maybe someone could comment on the following questions:

When you define an orphan impl of a trait, is it actually reasonable to depend on this particular implementation in the rest of your code? Assuming that the trait properly documents all expected invariants, pre- and post-conditions, shouldn’t it normally be enough to know that there exist implementations for the types you care about and that the whole program uses a unique trait implementation for any particular combination of type arguments?

If this is the case, wouldn’t it be enough to have rules for selecting a unique impl when the final executable is compiled and linked together? All trait impls would be automatically exported into a global registry, i.e. they wouldn’t have a separate visibility, and the compiler/linker would be responsible for selecting a unique impl for any particular type parameter combination of the trait. At the crate level one could allow or even require developers to specify rules for the resolution and disambiguation of orphan impls.

Has this approach been considered? One drawback of this approach is that it would reduce the scope for separate compilation or reduce optimization opportunities in separately compiled code. Another drawback is that it would make the applied disambiguation rules, which may not be very visible, part of the effective semantic version of a binary. The severity of both issues depends a lot on the exact implementation.

Update: When I originally wrote this reply, I didn’t fully appreciate the problems this approach would create for separate compilation. If one allows orphan impls and requires a globally unique resolution of impls, the compiler can not be sure that a trait is not implemented for some type unless it has seen the whole program. So apart from the issue that an orphan trait impl may be replaced during compilation to enforce coherence, there is also the issue that any kind of overload or specialization resolution that hinges on the absence of a trait implementation for a type may be changed if a crate is linked together with a crate that has an orphan impl.

Traits: Defining Shared Behavior - The Rust Programming Language says:

This restriction is part of what’s called the orphan rule, which you can look up if you’re interested in type theory.

When I searched for "orphan rule programming type theory" I got the Rust lang book article that I was reading as the first result and this page as the second. While I read all of the blog post, this topic is categorised as a deprecated idea, and the issue that it links to several times, https://github.com/rust-lang/rust/issues/19470, is now closed.

Just wondering whether that Rust lang article should be updated?

Sounds like. We should update the Rust reference, really, and link to that. That said, the term "orphan instance" is used by Haskell, which I guess is where the book came from.

1 Like