Revisit Orphan Rules

I want to revive a discussion and an idea of @nikomatsakis related to orphan rules for traits. Currently, external traits cannot be implemented on external types. As a new Rust user, when reading this in the book I thought, “Oh, that’s very restrictive. Ah well, I will accept that it is necessary and see how it goes. I’m sure it rarely poses a problem!” Well, I ran into it almost immediately.

The most common reason cited for the orphan rule is the hashtable problem, however no one seems to mention why the solution proposed therein is undesireable. It looks like it was briefly mentioned in a previous discussion related to trait specificity over parametrized types, but tabled for the moment as too large a change when trying to get Rust to 1.0.

Current Situation
I would like to open discussion on this again. Let me first present the situation as it stands:

  • Crate A defines a public trait T.
  • Crate B defines a public struct S.
  • Crate C wants to implement T for S.

Current orphan rules prevent this, even though it would be completely coherent.

Current Workaround
Various proposals have been suggested for this problem:

  1. Fork crate A or B and add the trait implementation.
  2. Contact the owners of either crate with a pull-request to implement the trait T on S.
  3. Create a newtype, wrap and unwrap struct S with the newtype whenever you need this trait.
  4. In certain cases, crates may have implemented workarounds.

Problems With Current Solutions

  1. Fractures the ecosystem and adds the overhead of keeping your module up to date with the main crate. In addition, it duplicates effort among developers who may also need this functionality.
  2. Requires crate A and B to be aware of one another, and increases the requirements to keep their module up to date with some external crates. As the number of crates in the ecosystem grows this is totally unsustainable.
  3. Depending on the amount of of custom types within S, this may be highly nontrivial. In addition, it means that one ends up reimplementing a large amount of functionality (for example, a simple derive would not be possible on the newtype, which means a custom implementation would be necessary).
  4. This essentially is a macro for the newtype idiom, and requires duplicating the struct within crate C. It saves you from having to reimplement the trait, but it requires more work on the part of crate A to provide such a macro and still carries with it limitations which prevent interoperability with other crates.

Future Implementations
Part of the rationale for the orphan rules has gotten conflated with an argument about breaking changes (A or B providing an implementation would not be considered a breaking change, however it would conflict with crate C’s implementation). The solutions below will address this in such a way that fixing the breaking change would be as simple as handling a naming conflict, which already is a breaking change for crates which is not tracked by semver presumably because the fix is very simple (i.e. namespacing). Thus, for the time being, let us ignore this argument.

If we ignore the possibility of breaking changes due to future implementations in crate A or B, then the above can be allowed to compile without issue. There is only one impl T for S defined, if it had been copied into crate A or B the program would have compiled without issue, so the fact that it happens to be external to the trait and struct definitions is irrelevant. There is no ambiguity for the compiler to deal with.

Simple Solution for Conflicting Traits
How should we handle something like this:

  • Crate A defines impl T for S
  • Crate B defines impl T for S
  • Crate C wants to use both A and B

(Here trait T and struct S can be defined in some other external crate, it’s rather irrelevant. For a concrete example, refer to the hashtable problem, where A and B would be Module1 and Module2.)

As a simple starting point: compilation would return an error due to the ambiguous impls, and recommend the user specify which impl should be used in Crate C with some syntax such as

prefer impl T for S from A

The simplest effect of which would be to make this equivalent to commenting out the impl T for S in crate B entirely, forcing coherence by making this the only implementation. It’s basically been reduced to the situation above.

Objection: what if B depended on the specific functionality in impl T for S?
For example, what if, in the hashtable example, B tried to access elements of the has table not by using get, but instead decided to directly manipulate the underlying data since it was assuming the impl it had defined was the only way it would have added the data? The answer, in this case, is one of code design. It’s basically the same argument for why getters/setters are preferred to direct access. A trait should fully capture all aspects necessary for the activity it is abstracting, and access to the data should always go through the trait.

Advanced Solution for Conflicting Traits: impl per instance
Still, for various reasons, it might be useful to allow for two implementations without having to also clone the type. If I understand it correctly, this is the solution proposed by @nikomatsakis in the hashtable problem. In short, my understanding is that it would work like this:

  1. If an impl can be unambiguously determined (by whatever set of specificity rules are in place) then no additional syntax would be necessary. In particular, this means that implementing this change would be backwards-compatible with existing code.
  2. If multiple impls are present, and it is ambiguous which one applies, then the compiler would return with an error advising you to specify it. For example S:A::T would specify a type which uses the impl from crate A. This might seem noisy, but type aliases could be used to quiet it down.

For full details on the proposal, please read the excellent writeup here!

I would advocate that having a way to set the default for a given type would be handy. For example, the prefer impl T for S from A as suggested in the “simple solution” above.

Backwards Compatibility
Implementing this change is backwards compatible with existing rules. In particular, the set of source code which compiles corresponding to the orphan rules is a subset of that which compiles with the simple solution, which is a subset of that which would compile with the advanced solution.

Reasons for Implementation

  1. As mentioned, this seemingly arbitrary restriction was very offputting when initially learning Rust, and based on various blog posts explaining why it is necessary I would guess it is a stumbling block for others as well.
  2. Existing orphan rules still do not protect against external crates implementing external parametric traits, and this has happened in practice.
  3. The exact specificity rules which determine which particular impl applies are rather complicated. Implementing this is orthogonal to any specificity rules, as they would only effect the choice of “default”, and therefore it could be seen as a way to punt the decision of which impl to use to the user, postponing the need to upgrade the rust compiler’s ability to automatically choose a reasonable default.
  4. The list of recommended traits that modules should derive on public structs is already quite long. As new crates are added this will just get longer, and making recommendations (many of which are largely ignored) cannot be seen as a long-term solution. The whole issue of recommended traits would no longer be necessary.
  5. As mentioned above, the current orphan system requires crate A to be aware of crate B. As the number of crates increase, I do not see this being a sustainable model for a growing ecosystem. Even if they are aware of each other, it requires one of them to keep things up-to-date with the other just in case someone wants to combine the two as opposed to simply putting that responsibility on crates which do combine the two. With derive methods, this is often very straightforward. For more complicated implementations it facilitates someone from creating a shim crate, so that users who depend on interoperability between the crates do not have to duplicate effort, and this shim can be kept up-to-date as a shared responsibility between those who need it as opposed to putting this requirement on the original crate maintainers.

Addendum: pub impl
If I was desigining this from scratch, I would say that the syntax which aligns best with the language would be to require implementations to be pub if an external module wanted to use them. So, for example, if you just had one crate A and wrote impl T for S then within crate A it would be able to use trait T, but if some other crate B imported S from A the implementation would not also be imported. If, on the other hand, crate A had pub impl T for S then B could import trait T with some syntax like use impl T for S from A. This seems to fit best with the rest of the language’s pub features, but it would be a breaking change and makes types more complicated. I’d rather not let this pub aspect hold up the solutions discussed above.

4 Likes

I might qualify this further with "C is an application," or perhaps "the application D depends on C, and does not depend on anything else that implements T for S." (because otherwise, there is the potential for incoherence, which has more or less come to be equated with actual incoherence in the community)

Yikes! I didn't realize on URLO that the intention was to allow such situations.

For me, taking such a point of view is a non-starter. There is plenty of existing unsafe code whose correctness is predicated on knowing that specific, locally-defined impls will be used for some traits.

For instance, an almost unavoidable problem in unsafe code is that arbitrary implementations of common traits can panic (even something as simple as Clone), potentially leading to double-drops of data if this occurs around raw pointer access.

4 Likes

As mentioned above, the solution proposed by @nikomatsakis would allow for such situations. Crate B can use annotated types indicating that they must use the implementation provided in crate B. In addition, the default rules would automatically defer to the implementation within crate B, so it is backwards-compatible with crate B as it exists now.

The type-specification comes in if some crate C tried to override that implementation. The whole point of this would be that this would be allowed. The compiler would kick back an error in the cases of ambiguous implementations and they would be disambuguated just like namespacing overlapping function definitions are now. I don’t see why impls should be treated differently than overlapping function definitions, which are currently allowed and disambiguated using namespaces.

This allows, for example, some crate A to do some unsafe stuff depending on a locally-defined impl, crate B to also do some unsafe stuff depending on a locally-defined impl, and then crate C to come along and use both by saying something like this:

let a: S:A::T = {...} //uses impl T for S from A
let b: S:B::T = {...} //uses impl T for S from B

An instance where this might result in trouble is if you pass a to some function in crate B, such as

fn do_unsafe_stuff_assuming_local_impl(s: S) {
   //---snip---
}

This would have to be changed to

fn do_unsafe_stuff_assuming_local_impl(s: S:self::T) {
   //---snip---
}

(or something similar) in which case passing a would now fail.

Again - this would not possibly impact any existing crates, because such issues only kick in when ambiguity arises, which (due to the orphan rule) does not currently occur. Situations like this are easy to detect, and we could implement a compiler warning.

Overall, the characterization of the problem feels incomplete and in some places inaccurate to me, and the proposed changes seem like complete non-starters. In fact, they seem like obviously non-starters to me, so I'm wondering if I got something wrong or if this is just a really hard problem that's easy to misunderstand.

This is a pretty vague claim, and it's arguably true of every proposed solution to the coherence/orphan rule dilemma, including all of yours.

I would personally not describe today's orphan rules this way. Today, crate A knows exactly what all the code in crate A will do, no matter who is depending on it, and what's going on in crate B simply can't affect that (unless B passes things into A, but they'd have to be things which satisfy interfaces that A specified). The worst that can happen today is crate A failing to provide some B-related impls that users of A and B really want to use. That is a very real problem, but "requiring crate A to be aware of crate B" is overstating its severity and misleadingly implies that the status quo creates some kind of composition hazard.

In contrast, allowing B to effectively "comment out" part of A's code means that A no longer has no way of knowing whether any of its own code will work properly when it's used by... pretty much anyone, really. And yes, that's basically the same objection you bring up halfway through your post, so let's talk about that.

I suspect this rebuttal paragraph is the really key one, because to me this is a complete non-sequiter that doesn't rebut the objection at all.

It seems like you're assuming that for any impl Trait for Type { ... } block, the ... code is perfectly interchangeable with any other code that correctly upholds the contracts of Trait and Type,

Even in an ideal world where contracts can be perfectly specified and understood and you can trust all other crate authors to never get them wrong or maliciously exploit this feature (again, this is already a non-starter just because we can't assume all of that), this also implicitly assumes there is such a thing as the contracts of Trait and Type. What if in the context of crate A, impl T for S should do one thing, but in the context of crate B, that "one thing" is blatantly wrong and the impl needs to behave differently?

I think if this feature was actually implemented, you'd end up with everyone having to create "defensive newtypes" for any impl with an external trait or type just so they can be sure their own code is correct without auditing every other crate in the world for overriding impls.

The bigger problem is that this is global reasoning, which is not good for all the same reasons "globals are evil" is such a popular meme. There's simply no way to reliably compose multiple crates that use a global feature like this. What happens when both A and B declare their impl to be the preferred one? Who wins? If A depends on B, maybe A should win, but what if neither depends on the other and they're on opposite sides of a diamond? What if A and B are deep enough in a complex dependency tree that the exact shape of that tree potentially changes from build to build as new versions of intermediate crates are released with slightly different requirements?


Also, it seems like serde's Serialize trait is usually the motivation for starting an orphan rule discussion, but serialization is a special case with an extra special requirement that makes the entire coherence/orphan rule dilemma a giant red herring.

It's generally assumed that the serialized form of a type needs to be deserializable into any future version of that type. There's basically no way to guarantee this, unless there is a single owner of all past, present and future Serialize impls for that type. That is the real reason why the API guidelines recommend every crate implement Serialize themselves. It simply cannot be done correctly externally, even though it seems like it can. So even if the orphan rules were simply deleted tomorrow, that wouldn't help the Serialize case; it would only provide the illusion of having helped and encourage writing a lot of forward-incompatible serialization code.

Since Serialize seems to be the primary cause of orphan rule pain, but changing the orphan rules can't actually solve it, I am skeptical that we would actually benefit from any significant changes to the current system. Except for Automatic features. by withoutboats · Pull Request #1787 · rust-lang/rfcs · GitHub, I would really like to see that happen someday.

5 Likes

It can, but it presently doesn't. As is the case for many crates which assume such trivial things as "&slice[..] does not panic."


Here's a question. Let's say I have a function:

fn opaque(vec: Vec<f64>) -> Vec<f64> {
    vec
}

and somebody in another crate gives it a Vec<f64>:my::Hash. They then use the output somewhere that requires Vec<f64>:bob::Hash. Where is the type error generated? What if the function looked like this instead:

fn opaque(vec: Vec<f64>) -> Vec<f64> {
    vec.into_iter().collect()
}

does the program now typecheck? (If so, this introduces a degree of global analysis into the type system that is highly uncharacteristic of rust)

You are correct, when I say “crate A has to be aware of crate B” I’m not saying that in order to exist at all it needs to be aware of it. I’m saying that if someone wants to compose elements of A with B, they need one of either A and B to be aware of the other. The rest of my points about this issue stand with the added clause that some crate C is trying to functionality from A and B.

For your second point, see my reply above.

For the third, prefer impl T for S from A would simply make it so that, within crate C, S == S:A::T. There would be no conflict if other crates also had prefer statements, it’s not setting anything globally.

As for Serde, in my case it could have easily been handled by a derive statement, and it would have easily been handled by changing the orphan rules.

Well, no, crate C can do the newtype workaround without A or B knowing anything about each other.

afaik, the newtype workaround is always possible even when it's inconveient or tedious. Do you know of any cases where it's not even possible? (other than the serialize case I discussed already)

Hm, I feel like this will fall apart once specialization is added. Will have to think more about that.

Huh, I thought the cases we were interested in usually involved passing S into some other crate that then invokes S's impl of T. Because if only code within crate C needs to use the impl, then C can just newtype it and use the newtype directly, right?

(might be worth editing your original post to clarify this; even after rereading it now it still sounds like that post is describing a global thing)

I don’t see how this is any different from a normal

let a :Vec<f64> = //--snip--

Since opaque(a) moves a and then returns it, it would have type Vec<f64>:my::Hash. If you then try to do expecting_bob(opaque(a)), this is where a type error would occur, no different than expecting_bob(a).

In the second case, the .collect() is going to have type Vec<f64> as determined from wherever fn opaque is defined. So:

  1. If there is a way for the compiler to determine a default impl, it does it.
  2. If there isn’t, it asks to disambiguate.

If, within the module where fn opaque is defined, the type Vec<f64> does not have any trait associated with it, it just gets kicked back to the calling module, where it would be treated like any new Vec<f64> instance within that module.

The newtype workaround has lots of disadvantages which I mentioned in my original post aside from being inconvenient or tedious.

Specialization is orthogonal to what I propose here.

You can pass S into some other crate that invokes S’s impl of T, but the preference is not set globally.

Wait, did you say Vec<f64>:my::Hash is a type? Not the type Vec<f64> with some sort of annotation, but an actual separate type?

Making it a separate type appears to solve the immediate question posed by @ExpHP, but that also appears to introduce a brand new dimension of subtyping into Rust, which... I dunno if that's gonna work. And if it's not a separate type, then I think @ExpHP is right that this still requires a form of global reasoning to actually work.

Clearly we'd like them to be orthogonal, but just saying so is not an argument that it is so.

It seems clear that there's a non-trivial interaction at least. If crate A has an impl Trait of T, and crate B specializes that with impl Trait of BType, and my create C does something with BType:A::Trait, what happens?

I have no idea what should happen or whether there's an actual problem with "just use the specialized impl from B", but it seems like something we clearly need to think about more before we can be sure it really does work.

Yes, it would be a type I think, but since this is part of @nikomatsakis’s idea and he mentioned at the end of it that he had some thoughts about the implementation I will defer to him on that part of it.

Interesting. So are you imagining Vec<f64>:my::Hash to be a sort of sugar over the newtype pattern (in which case we’re back to @ExpHP’s post), or an actual subtype with all the complications that a new subtyping dimension brings to the table?

The fact that it is orthogonal follows from the statement that there are two steps:

  1. Without a impl annotation, some form of default impl is chosen. This is where any specialization rules would come into play.
  2. If the compiler cannot disambiguate, manual impl annotation is required.

This is, as I understand it, the core of the proposal.

Right. As I said, that is global analysis, which is highly uncharacteristic of Rust, a language that prides itself in having the signature of a function indicate all that is necessary to decide where it can be used. (which is a useful property that allows for decent error messages. Contrast with C++...)

fn func_1(vec: Vec<f64>) -> Vec<f64> { vec }
fn func_2(vec: Vec<f64>) -> Vec<f64> { vec.into_iter().collect() }

type VecFn = fn(Vec<f64>) -> Vec<f64>;

fn foo() {
    let funcs: Vec<VecFn> = vec![func_1 as _, func_2 as _];

    // hand off to some crate that accepts `&[fn(Vec<f64>) -> Vec<f64>]`
    some_crate::have_some_funcs(&funcs)
}

Now suppose some_crate::have_some_funcs does what I previously mentioned, supplying a Vec<f64>:my::Hash and expecting a Vec<f64>:bob::Hash. Where is the error message now?

2 Likes

I think I presented this poorly. My intention was that the "simple solution" would be easier to swallow, and then bringing up @nikomatsakis's more advanced solution would make sense. I will add additional details to my original post suggesting users read fully @nikomatsakis's proposed solution, which has more details. In particular, he states:

Next, we say that the value for a type variable is not just a type, like uint, but rather a tuple type : impl* of a type and zero or more implementations, one for each interface bound.

So yes, it would be a type.

I think niko’s gist might be a bit too old to use as a reference here. If I’m reading it right, it predates the existence of things like trait bounds on type parameters and trait objects, both of which clearly exist today and thus a huge chunk of the gist actually is part of stable Rust already. In fact, it doesn’t mention anything about multiple crates or breaking changes or orphan rules, so I don’t think it was ever meant to apply to the problems we’re discussing here. What’s suggested there wouldn’t even solve the opaque() case, because @ExpHP’s opaque() function is not a generic function, so there’s no type parameters to carry the impl annotations.

More importantly, that gist seems to only be considering generic code, because it’s all about whether and how generic code should select its trait impls at compile time or runtime (which is basically set in stone nowadays). Since the problems we’re discussing are not specific to generic code, I think that gist simply doesn’t apply.

If the goal is just to extract the “what if types came with a ‘use this impl’ tag?” idea and turn it into a concrete proposal relevant to today’s orphan rules dilemma, then I’m pretty sure we end up with either global reasoning or a new form of subtyping, as discussed above.

1 Like

Unless I've totally misunderstood, it is precisely about overlapping impls:

This looks reasonable. Now we create two modules, Module1 and Module2, which both share a reference to the same hashtable, but which have different implementations of the hash interface in scope:

It is not only focused on generic code:

fn bar(h: @ht::t<uint: Module2::hash, str>) { ... }

Here we are specifying uint : Module2::hash.

That much is true, because "overlapping impls" is a much broader topic (involving specialization, generics, traits, etc) than the orphan rules dilemma I thought this thread was about.

Hm, yeah I overstated that part.

It does contain that example of a function bar with a concrete type-plus-impl-annotation parameter, but it also appears to say that any non-generic function lacking the same exact set of impl annotations would simply not compile if it called or was called by bar, which afaik basically makes it sugar for a newtype. Which I assumed wasn't relevant because you appear to be proposing something more than newtype sugar, right?

I’d prefer this problem to be approached from the other side: allow traits to opt-in into being implemented without orphan rules restriction. Let Serde say that it doesn’t matter who implements Serialize.

1 Like

@kornel You'd still have to deal with issues of multiple implementations. That is a strict subset of the idea discussed here.

--

Edit: I'm out of replies for the day, so I'm posting a response to @Ixrec here as well:

it also appears to say that any non-generic function lacking the same exact set of impl annotations would simply not compile if it called or was called by bar

Yes, that's not what I was suggesting, but I don't see where the gist suggests this is the case.

--

I have to think more about @ExpHP's earlier post. It is a good point, and definitely raises a property of Rust that should be preserved.

--

Edit2: (since I'm out of replies -- sorry, this will make it hard to follow in the future so after this I'll stop)

@Ixrec the problem is that Module2::bar is being called with a hash type whose first parametric type is a uint : Module1::hash when it explicitly required a type uint : Module2::hash. The non-genericity of Module1::foo nor it's absent impl annotations has nothing to do with it, which is why I was confused by your statement. It could be made to compile as long as h had the proper type, for instance

    fn foo() -> str {
        let h : @ht::t<uint: Module2::hash, str> = ht::create();
        ht::put(ht, 3u, "hi"); // 3u.hash() == 3u here
        ret Module2::bar(h);
    }

should compile (using the notation from the gist). I think the confusion was on my part, I thought you were saying that foo had to be generic or specify the same type parameters as bar in order to get it to compile, which isn't the case.