[Pre-RFC] inherent methods from any crate

So, I’d like to propose a feature. Right now, if you define a struct/enum locally, you can define methods on it very easily just be writing impl MyType { ... }. Unfortunately, you can only do this for things defined in the current crate. The reason is coherence – if we allowed other crates to define inherent methods, we couldn’t check for conflicts. So you might have two crates that define a method draw(), and then it’d be unclear which one you are invoking.

I was thinking today that it would be cool to allow you to define inherent methods in other crates too, but only if they are restrcted to the current crate. We didn’t used to have a way to do that, but we do now: pub(crate). So you could imagine allowing inherent methods to be defined on foreign types, as long as they are at most pub(crate).

Some problems I can imagine:

  • What if there is a conflict with the base crate? Unlike with traits, we have no way to disambiguate which one you wanted. Probably this is just an error and you have to rename your method locally.
    • Problem with that is that it would imply that adding a new inherent method is potentially a breaking change for your users, since they might have done the same.
      • So maybe your local methods take precedence?
        • That would create a copy-and-paste hazard, but I’m not sure if that’s an issue.
      • Or else we come up with a way to say which one you want in a UFCS-style; still a breaking change, but closer to overlap with traits.

Anyway, what do people think?

8 Likes

You can kind of emulate this behavior currently with a crate-local trait.

I don’t see this being very useful unless you can also override the field privacy restrictions, see Add an #[allow(ignore_field_privacy)] annotation and Extending existing functionality

The problem is that defining a trait is annoying – you have to duplicate your fn definitions (hello .h files) etc. In practice, I think most people tend to just define free functions instead. I disagree pretty strongly about it not being useful unless you can override field privacy restrictions (and I am pretty strongly opposed to that idea).

Edit: to clarify, I am opposed to it because privacy is so absolutely fundamental in defining safe abstractions and reasoning about code.

3 Likes

This turns adding a method to a public type into a breaking change.

What would normally be a minor point bump (adding a feature) would now require a major point (breaking someone).

In Kotlin, there is a more lightweight syntax to declare a single method extension:

fun Type.methodName() { 
    body
}

So I personally use local (sometimes even function-local) extension functions liberally in Kotlin, while in Rust I usually define a standalone function, even for types from my own crate. OTOH extensions in Rust buy a bit more ergonomics because of automatic derefs.

Since the available methods are affected by which traits are visible, you can already (theoretically) change the meaning of code by carelessly moving it between modules, not to mention between crates. So unless the odds of such collisions become much greater with inherent methods than with traits, I don't care too much.

Perhaps there should be a warning for when a local inherent method collides with an upstream inherent method, if that is even allowed. It's super confusing and unnecessary (since you don't export it, you can always change the name of the extension method).

This is already true! If you add an inherent method which has the same name as some applicable trait method, but a different signature, code stops compiling because inherent methods trump trait methods. For std these issues are discussed and addressed in RFC 1105 (btw, I got that example verbatim from the RFC text). Virtually any change to the API surface can break someone's code, hence the RFC's distinction between minor changes that can break stuff but are trivial (perhaps even automatic) to repair, and serious backwards compatibility problems.

It seems like it should be possible to design the rules such that this reasoning remains sound (upstream adding a method never silently breaks, at most requires disambiguation) but it's not entirely clear to me how this would best be achieved.

I mean, by default a new method wouldn't be public, so it wasn't just pub(crate) that has held this back (e.g. we could have just required that methods not be pub). pub(crate) does make it possible to make this a more useful feature.

In itself this rule seems good, and I think the back compat hazard is not that bad, but I'm a bit worried about opening a Pandora's Box of allowing visibility to impact the coherence rules. This is just one rule, but I can see it leading to more rules that are 'roughly analogous' and 'don't bring in any additional factors,' but that end up compounding into a set of rules for coherence that are a whole rulebook to learn. Orphan rules are already quite complex.

I think it would be a good exercise to instead ask these questions:

  • "If we could make coherence depend on visiblity, what are the cases we could allow?"
  • "Is there a good, abstract rule that can make the visibility-based coherence cases easy for users to grok?"
  • "Do these cases collectively introduce an affordable amount of back compat hazard?"
1 Like

I like it, but just as a potential alternative, what about a shorthand that combines declaring and impl’ing a trait, for traits that have only a single impl? Perhaps trait MyTrait for MyType { ... }

1 Like

*drool*

I’ve lost count of the number of extension traits I’ve defined. It’s such a pain in the backside, especially if you’re working with types that need to be managed behind something like Rc.

Just for the sake of readability, it’d be nice to have something to distinguish these impls from “regular” ones… maybe pub(crate) impl Type { ... } to imply that this impl is only accessible from the current crate?

But yes. Yes, for _ in 0..8 { print!("yes, ") } yes!

I discussed various alternatives in that respect. It doesn't actually have to be that it is a breaking change. But I think you have to give the "local" methods priority for things to work, or else have some way to name a method more canonically.

(Also, it's already true that adding a method can be a breaking change in the sense that it conflicts with an existing trait method and hence requires some rewriting to recover from.)

Amusingly, in the Olden Days of Yore, we used to give impls names, and you could import them. This meant that you could trivially have multiple impls for a type. When we removed that feature in favor of traits, we always assumed we would invent a shorthand for defining a trait and an impl together -- but we never did. Yes, this is another alternative.

Somehow to me it "feels" a bit more complex -- in that it's a new syntactic form to learn -- versus this scheme, but I agree that in another -- perhaps more real -- sense it is a simpler extension. :slight_smile:

Yes, perhaps. Off the top of my head I can't yet see how it would affect the coherence rules for trait impls -- I suppose maybe there might be some logic around the visibility of the type?

I am pretty sure we do want some extensions to the coherence rules in any case, though I think I've been hoping those would take the form of negative impls, to address some of these sorts of problems: Coherence and blanket impls interact in a suboptimal fashion with generic impls · Issue #19032 · rust-lang/rust · GitHub

Implicit in all this is that free functions are like a red-headed stepchild… another path could be to try to fix some of their ergonomic disadvantages relative to inherent methods. Is there any reasonable way to go about that?

1 Like

From my experience with Kotlin, this tends to hurt readability a bit. I don’t remember any specific instances but I do remember being frustrated by looking at other people’s code, seeing bar.foo(), and then trying to call bar.foo() in my code only to have it fail. Extension traits have a similar problem but the boilerplate tends to discourage their use. In Kotlin, it everything was an extension function.

Also, this would make it harder to find out where a method is defined without tooling (like racer). Currently, all methods are defined on some type/trait mentioned in a use statement (possibly hidden behind a glob); this would end that guarantee. One possible fix would be to make extension functions file/module local but that would severely limit their usability.

However, I do remember extension functions being extremely convenient and I have found myself missing them occasionally.

2 Likes

I think the idea is nice in general, however, I am wary of having local methods override upstream ones. This enables a pattern where you can give a type crate-local meaning by overriding its methods. Then when someone comes to maintain the crate, they can’t just check the upstream docs, they must also check that the methods are not overridden locally.

I’m not sure how I feel about the back compat issue of adding a method upstream. I feel like this could be something which, while it is technically breaking, doesn’t count as such for semver, etc. The way I’d phrase this, is that you can have local inherent impls, but you do so at your own risk wrt back compat.

1 Like

I agree, any kind of priority difference would just be another point of complexity, not just for the compiler, but for developers as well, and any sort of language specification.

☆ So maybe your local methods take precedence?

I think that the "global" methods - the methods defined with the type - should have precedence, then the precedence rules would be same as for locally defined taits.

In a way local methods are a lightweight alternative for local traits, so the precedence rules should be the same.

I am totally on board with wanting to avoid subtle priority differences (though I don't think it will make any particular difference on the complexity of a spec or the method-resolution impl) -- but we do I think want to address the backwards compat concerns somehow.

That makes sense to me, except that there exists no light-weight workaround. TBQH, I'd also sort of like a way to give specific traits precedence sometimes. After all, in practice, rewriting from foo.bar(...) to Trait::bar(&mut foo, ...) isn't really something anybody wants to do, even if it is in theory light-weight (and not covered by semver).

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.