Idea: Paths in method names

Once again, we have a case in std of adding methods and people getting broken. I really don’t want to get stuck in the sad world where we can no longer add things without new a new extension trait every time, so I want to keep this allowed breakage, but maybe there’s a way we can make it a bit less annoying.

Right now, the answer to getting the wrong method is UFCS. That certainly works, but there’s a massive gulf here—to control name resolution you need to give up on method syntax entirely, both the chaining form and the autoref behaviour. So a nice consistent method chain like

v.iter()
    .filter(...)
    .flatten()
    .fold(..., ...)

Ends up becoming

Itertools::flatten(
    v.iter().filter(...)
).fold(..., ...)

(Thankfully no extra &s needed in this case)

The idea: What if a method name could be a path instead of just an ident?

That way the change is localized and simple, to just

v.iter()
    .filter(...)
    .Itertools::flatten()
    .fold(..., ...)

Currently code like that gets "error: expected one of ., ;, ?, }, or an operator, found ::", so hopefully that means it’s compatible. And it’s just restricting the search to one trait (or type, I suppose, if this needs to work with inherents) instead of all of them, so with luck it’d be low impact to do in the compiler. (And clippy would presumably be able to give a lint suggestion to remove the excess qualification should it not be needed.)

Thoughts?

22 Likes

This seems like a nice idea! I think we’d probably want to support projections as well, but that should be fine I think? v.foo().<Bar as MyTrait>::buz().

2 Likes

There might be complications with floating point literals. Currently, lexer has a special case for parsing 1.foo as a method call on an integer literal, as opposed to a float literal and a free-standing identifier. This special casing might be expanded to handle 1.< as well, but 1.< 2. is already valid Rust.

3 Likes

To be clear, are you suggesting that paths-in-methods would be used preemptively to avoid getting broken by std, or, uh, postemptively, to unbreak yourself after it has done so?

Back in the semver RFC, the solution to this kind of “minor breaking change” we were hoping to eventually adopt was to generate and store the ‘elaborated source code’ of the project – with everything desugared to use UFCS, all type applications explicit, and so on – and then in the event of such an error, an automatic fixit tool (integrated with cargo itself, probably) could consult the elaborated source stored for the last-successfully-compiled version of the project, and ‘lift’ the specific, correct disambiguation from it that’s needed back into the programmer-written source code.

(Notably, this would work for any such minor-version breakage from dependencies, not just std.)

Successfully bringing that up would be a ton of work, but it seems preferable to me to adding more arcane C++isms at the language level (and a fuller solution also). When an automatic fix actually gets applied you’d get the ugly UFCS syntax as a result, but that doesn’t seem like a significant issue to me – the programmer can go in and clean it up however they like (e.g. renaming something).

7 Likes

There are probably good reasons not to do it, but can’t we swap priorities of imported traits vs built-in methods for method name resolution? So if some trait is explicitly in scope, and it has same method as native one, choose trait’s method.

Pros: backwards compatibility is working by default.

Cons: it’s harder to use new built-in methods if you already have trait in scope, but even if you do need both, this is solvable problem by just moving use statements or falling back to UFCS. Either way, it will be still affecting fewer people than current backward compat breakage IMO.

Ah right, good point.

I think the intention here is the postemptive case.

That breaks backwards compatibility.

To me it seems the main issue is that the standard library is a dependency you can’t explicitly version.

I wonder if this could be solved with specifying a minimum compiler version. For example, have rust_version = "1.28" in your Cargo.toml and you only get APIs that are tagged for 1.28 or lower.

If I understand correctly, the problem is exactly in upper bound of version, not lower bound (if you're getting new version of API that has method m and method m is also provided via some trait you have in scope, your code will get broken).

I’ve actually wanted something like this often:

my_str.<&str as AsRef<Path>>.as_ref()

Right, what I mean is that if your crate declares “I use Rust 1.28”, then even if 1.29 introduces a conflicting method, your code wouldn’t see it. That makes sense to me since actually using the new method would imply that your minimum version isn’t 1.28 but 1.29.

For example:

#[since(1.29)]
pub fn m(&self) {}

Would make the m method unavailable unless you use version 1.29 or higher.

Can you please provide example where this would break? I mean - in theory, sure, but this seems quite unlikely and would unbreak future cases of name clash, so might be worth trying with crater run?

It breaks any instance where there is an inherent method and trait method with the same name. The inherent method is currently called.

That’s why I said “In theory, sure” and that this would need to be carefully tested.

If I understand situation correctly, right now any addition of a new method to the standard library is a potentially breaking change for all code out there that happens to use trait with same-named method on same type.

The proposed change would affect exactly same scope of cases (same-named method on same type in scope), and, while more risky at first, seems like lesser of two evils, as it would make any future additions to stdlib risk-free at the cost of potential one-time upgrade for some existing code.

The conflict used in the first post is a new method defined on the Iterator trait conflicting with a method defined on the Itertools trait defined in a different crate. Preferring trait methods rather then inherent methods isn't going to do anything there AFAIKT.

1 Like

@Centril on IRC proposed another option that I wanted to write down: If there's a conflict, always take the method from the one that was used last. That way the the problem here goes away (since additions to prelude traits are fundamentally imported before anything that conflicts with them), and it only applies in conflicts so it's be trivially-compatible. But it does introduce some spooky action at a distance, as rearranging uses could break your code. (And thus rustfmt probably could no longer sort them, for example, though rls still could by considering it a refactoring.) And a convention of "use things from std first" -- common in other languages -- would mean std additions are always hidden by other traits.

I was thinking postemptively, so the how to fix this comment becomes just "add the type, like .Statistics::min()". It would also become the easy-to-suggest solution to Rust error codes index - Error codes index

This is far more elaborate than I remembered. (And, TBH, kinda scares me.) I remembered it as "store published crates in elaborated form so your deps don't break when you update, but do get errors from your own code if you're using a different compiler", rather like cap-lints. (And that version doesn't have the uses-state-not-in-your-code-repo property.)

1 Like

I'm just going off memory too so you may well be right; I seem to remember both being discussed, don't know if either was more 'official'.

Why scary? If there's some kind of stale state problem or something the most likely outcome seems like it'd be that it fails with an error, with some special set of circumstances needed for it to succeed and do the wrong thing (e.g. resolve ambiguity the wrong way), and even then it's up to the developer to review the change and decide whether to commit it.

I don’t think that storing elaborated source is really practical, but I do think we could perhaps supply some form of annotations in which we ignore methods that “did not exist at the time of publish”. This would be used (I imagine) only for crates.io dependencies, so that they keep compiling. This would be relatively easy for std, where we have correct information about when each method was added. I don’t know think it generalizes well though.

Btw, regarding paths in method names, before UFCS, that was precisely what I wanted. I imagined something like this as the explicit form of UFCS:

foo.(some::Trait<...>::method_name)::<...>(...)

instead of today’s

some::Trait::<...>::method_name::<...>(foo, ...)

Many people felt that parens in paths were too complex.

Another place this is relevant: the estwhile fields in traits RFC, which lacked a UFCS-like form.

To elaborate on what I meant: I imagine this would still break upstream, so we would still want an ergonomic way to be more explicit. It doesn't necessarily obviate the need for this change, in other words, it just offers a fig leaf to allow unmodified source to keep working.

Maybe this is a dumb idea, but: Would rustc happen to keep track of the graph of public dependencies?

My thought is, if there are multiple candidates A, B, and C for method resolution, but one of them is clearly the “most specific” in the public dependency graph (e.g. the crate for C transitively has the crates for both A and B as public dependencies), then that candidate wins. Traits in the current crate always win.

This won’t allow resolution in the case of conflicting methods between two completely unrelated crates; but I think it should solve the common case where one trait (itertools::Itertools) tries to solve the shortcomings of another (std::iter::Iterator). For foundational libraries like std, it should virtually make the addition of a trait method a non-breaking change.

It does seem possible to define some sort of resolution order, but I'm not sure if crate ordering would really suffice. After all, you might easily be using some methods from libstd but others from itertools ...?

I feel like I would want the relative precedence for method resolution to be defined explicitly somewhere, and probably on a method-by-method basis -- that would sort of be a tool of last resort.