Has elision of types when implementing trait functions been discussed previously? It’s something I keep thinking about while writing trait implementations.
That is, being able to leave out the types instead of writing them in full, as in:
impl From<X> for Y {
    fn from(value: X) -> Y { ... }
}
Something I run into a lot when implementing a trait that’s defined elsewhere: I have to go find the docs for that trait just to look up the function signatures every time I implement it for some type, when often I only remember the names of the functions. The example above is simple enough to remember, but for more complex traits with more type parameters or associated types, it can often be impossible without looking it up. It just feels like unnecessary boilerplate when the compiler already knows the trait’s definition.
I know I have wanted this, but I can’t recall whether/where it’s been discussed.
I like the value: _ form, as we already allow _ for type inference in other places. In some instances you might want to give a little more of a local hint, like value: Vec<_>.
I guess things like let value also infer types without any : _ placeholder, so that may be fine in arguments too. I’d really want to require some kind of -> _ for the return value though, because having nothing there is supposed to mean the same as -> ().
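To make the two existing rules concrete, here is a minimal sketch in today's Rust. The function names are made up for illustration; the point is only that `_` already works as a partial hint in `let`, and that a missing `->` already means `()`.

```rust
// `_` as a partial type hint in a `let` binding: only the container is
// spelled out, the element type (`usize`) is inferred.
fn collect_lengths(words: &[&str]) -> Vec<usize> {
    let lengths: Vec<_> = words.iter().map(|w| w.len()).collect();
    lengths
}

// A signature with no `->` returns the unit type...
fn log_len(words: &[&str]) {
    let _ = words.len();
}

// ...so it is interchangeable with an explicit `-> ()`.
fn log_len_explicit(words: &[&str]) -> () {
    log_len(words)
}
```

This is why an elided return type in a trait impl would need its own marker such as `-> _`: the "nothing written" spelling is already taken.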
Consider, on the other hand, that this means that the type of this function is not knowable just from looking at the defining crate, requiring an extra jump to the declaration.
Or, imagine that you change the types in a trait you own, and your implementation for your own type uses function signature inference. There is a small chance that the change somehow manages to still type-check, so you now have to rely on tests to catch the change. For example:
trait T {
    fn f(x: i32, y: i32) -> i32;
}
impl T for K {
    fn f(x: _, y: _) -> _ {
        x + y
    }
}
If you changed the signature of f to be fn(String, &str) -> String, you now cause an allocation to happen in a distant function. This is a contrived example, but shows the value of repeating yourself in the impl for both reading code and changing it.
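As a sketch of why the changed trait would still type-check, here is the post-change version written out with explicit types in today's Rust. `K` is a hypothetical unit struct standing in for the implementing type.

```rust
// Hypothetical stand-in for the `K` in the example above.
struct K;

// The trait *after* the signature change described above.
trait T {
    fn f(x: String, y: &str) -> String;
}

impl T for K {
    // The body is exactly what it was for `i32` arguments, yet it still
    // compiles: `String + &str` appends `y` to `x`, which may reallocate.
    fn f(x: String, y: &str) -> String {
        x + y
    }
}
```

With explicit types in the impl, the old `i32` signature would have stopped compiling and pointed you at this function; with inferred types the body would silently change meaning.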
I don’t really buy that argument. The same thing could happen today. Another contrived example:
struct Example(i32, i32);
// Elsewhere....
fn f(value: Example) {
    // Do something with value.0 + value.1
}
And now change Example to (String, &str) and you have the same situation.
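A rough sketch of that second scenario, assuming `f` returns the sum so there is something to observe. The lifetime parameter is something the `&str` field forces on the struct; the signature and body of `f` are otherwise left as they were.

```rust
// `Example` after the change; the lifetime is required by the `&str` field.
struct Example<'a>(String, &'a str);

// Elsewhere: `f` still names only `Example`, and `value.0 + value.1`
// still compiles. It now concatenates strings instead of adding integers.
fn f(value: Example) -> String {
    value.0 + value.1
}
```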
The point is, any time you have type inference at all this is going to happen to one degree or another. If you change the signature of something, yeah, you’re gonna get side-effects of code using it.
And as this only allows for inference in trait methods (where the type is already fully specified publicly elsewhere), that type changing is a public breaking change (of the module) and thus should already be carefully considered. And if the semantics change drastically enough to break code, then imho it was a poor fragile design anyway (though it would of course be better to fail loudly).
It’d be interesting to have a rustfixable clippy lint to specify the unspecified type. Iirc @Centril had an idea to allow _ in more type places as an “error later and tell me what you expected”, as well.
Correct. I believe that type inference in function signatures, which are public, is a much bigger mistake than type inference elsewhere. See also my comments about readability.
This is only true for a public trait, or for a place where you believe in semver; if you believe in live-at-head, this becomes less true.
Infer-my-types-for-now-but-fill-them-in-later-for-me is fine, but only if there is no flag to turn off the error. If you can turn it off, people will turn it off in their crates, which will lower the readability of the ecosystem.
Enum variant members are public, and struct members are sometimes (though less commonly at crate level) public, so I don’t really see function signatures as any “more public” than either of those. And I disagree that it makes it any less readable than all the other forms of type elision we already have.
I'm sure that this has been thought of before since traits are based on Haskell's type classes.
I personally think allowing type inference of where clauses and types in trait implementations is a good idea. I believe it should work both with _ and without by making $pat: $type into a valid pattern (see RFC: Generalized Type Ascription by Centril · Pull Request #2522 · rust-lang/rfcs · GitHub for a discussion). However, I also think that the time to consider this elision is not now as it doesn't fit with our current roadmap.
This is not like type inference in function signatures of inherent or free functions. The signature of the function is fully determined when it is inside a trait implementation. As such, it is not a matter of semver.
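To illustrate what "determined by the trait" looks like in practice, here is a small sketch with `Iterator`. `Countdown` is a hypothetical type invented for this example. Once the associated type is chosen, the trait definition dictates the signature of `next`, and an impl that declares an incompatible one is rejected at compile time.

```rust
// Hypothetical iterator type, used only for illustration.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    // Once `Item` is chosen, the trait definition dictates the rest of
    // `next`'s signature: `&mut self` in, `Option<Self::Item>` out.
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        if self.remaining == 0 {
            None
        } else {
            self.remaining -= 1;
            Some(self.remaining + 1)
        }
    }
}
```

So the written-out signature in the impl carries no information the compiler does not already have; the disagreement in this thread is about whether human readers need it there.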
When considering the elision proposed here, we should remember that Haskell has had this ability for nearly 30 years and it existed in Wadler's first paper on type classes ("How to make ad-hoc polymorphism less ad-hoc"). Indeed, it is not even possible to provide a type signature in a type class instance unless you enable -XInstanceSigs. It's therefore safe to say that there's considerable experience with allowing this inference and in my view it works well in Haskell.
I suggested allowing ? as a typed hole everywhere. _ should be used for inference as normal but it is fine if the compiler says "nope" and tells you the type instead if we don't want to permit inference somewhere.
It’s another example of the conflict between wanting terse syntax to write, and verbose syntax to read. This pattern comes up every time syntax sugar is discussed:
When writing code, people already have the exact meaning they want in their head, so ability to understand the code in written form doesn’t seem important. OTOH it feels unnecessary to type things that the compiler already knows. The writer would like to write just the absolute minimum required for the machine to understand the intent.
When reading the code, especially someone else’s code, people don’t always know the context and full meaning of the code. Even if the compiler understands it, the reader may not. The reader wants the code to be self-explanatory as much as possible, even if it makes the code verbose.
These are opposing requirements, so I don’t think there’s any syntax that satisfies both without a compromise.
But I think we should start thinking about satisfying both requirements by involving a code-rewriting tool like rustfmt to accept code written in a terse form, and output the code in a verbose form.
If you are familiar with the trait being implemented then the types and constraints don't add much extra context. Code verbosity in things that aren't important can also be in the way of comprehension when reading. Type inference in function bodies can also leave out too much context and make things harder to understand. In the end, I think the author should be empowered to decide what is important and what isn't. For example, you can provide types in function bodies if you like, or leave it inferred, it is your choice.
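A small sketch of that choice as it exists inside function bodies today. Both function names are made up; they compute the same thing, once with everything inferred and once with the intermediate types written out.

```rust
// Everything inferred: the closure parameter type and the iterator types
// are never written out.
fn total_inferred(xs: &[u32]) -> u64 {
    xs.iter().map(|&x| u64::from(x)).sum()
}

// Same computation with the intermediate types spelled out by choice.
fn total_annotated(xs: &[u32]) -> u64 {
    let widened: Vec<u64> = xs.iter().map(|&x: &u32| u64::from(x)).collect();
    let total: u64 = widened.iter().sum();
    total
}
```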
This misses my point. Having to jump crates to determine the signature of a function is too much overhead. You can't rely on tools like RLS or Kythe to be available to people reading code on either github or on a repo they cloned to their workstation.
Remember the fundamental rule of Readability: code is read ten, a hundred, a thousand times more than it is written. For each CL or commit you have dozens of people looking at what you wrote. Hence, one of these should win out more frequently in language design.
This assumption is too strong, and a stepping stone towards unreadability. I have a request bottleneck that my profiler has conveniently revealed for me. It's in a dependency's dependency. I want to figure out why this is a bottleneck, without having to understand the library's abstractions.
I do this type of thing all the time at my job, and even with world-class C++ indexing it's still painful. Do not assume your reader has read all your documentation before arriving at your implementation.
This assumes that you are not familiar with the trait, but if you are, e.g. if we consider Clone, From, Iterator, and so on, I think you don't have to jump anywhere. Moreover, traits are much fewer in number than implementations and type definitions.
Also, I don't think it's always necessary to determine the full signature to understand the underlying context and what the semantics of a function are.
How often is this done as a % of everyone's programming tasks? Who are we optimizing for...?
This applies equally inside function bodies. I'd argue it applies even more there, and is even a point against function abstraction in the first place. If a bunch of types are inferred and a bunch of methods are called, and readers don't understand those "before arriving at your implementation", then they will need to look those up, which results in having to "jump crates".
In my years of experience with Haskell, not having signatures in type classes has never been a problem in terms of readability for me.
This seems to suggest that we should only allow this type of inference for "well-known" (read: specifically-marked in std) traits, which, to me, is a sign that this feature is already too niche.
I disagree. The type of a function should tell me everything I need to know about its semantics, ideally; further details should be obtained from its name and from its doc comment. I should rarely have to glance at its context, and never peek inside of it. Much like optimizing compilers, most of us are far better at reasoning locally than globally.
This sort of dogmatic view towards explicitness is necessary, because different people have different opinions on what should be explicit, and I want my code to be skim-able by people who need more details than I do.
I expect everyone has to dive into software they've never read all the time; every engineer I've ever worked with has to do this, and archaeology is considered part of the job.
I want Rust to replace C++ in large projects, like Chromium, that are changed by hundreds of engineers at a time. I want to rid large software of memory bugs. To support that, we need to optimize for readability, and avoid repeating C++'s readability mistakes. Incomplete information at function boundaries (including, but not limited to, C++'s unconstrained, macro-like templates) is soundly one such mistake.
What point are you making here? Of course understanding a function will require looking at some dependencies, but this set should be minimal. Just because I need to look up the types of the arguments does not mean I should also be looking up the trait that those arguments came from. If I have some types to start with, I might only need to do minimal look ups if I can follow along with the transformation.
let type deduction is, in some cases, a readability concern, but that ship sailed long ago. We should not be making this problem worse.
I can argue the dual: in my years of experience with Scala, incomplete signatures resulted in a lot of code that was very unpleasant for future!me to read. I thought that signature inference was great for a long time, but have grown to strongly dislike it, because running a type checker in my head is hard. That's the compiler's job.
That said, what I describe above is not my specific experience, but the consensus of the engineering community I am part of, which agrees that readability trumps all (except for performance).
I think you know full well my opinions on this type of language feature; continuing this argument will not be productive.
I think this strategy would have even greater value. When tooling like RLS, rustfix & Co assist with creating the code instead of assisting with reading it you get:
More terseness/productivity when writing, as the IDE can populate the trait implementation with stubs for all required associated items.
More casual interface training, as the users will actually see the involved types when creating the implementations, and can remember them for usage.
More direct and local information when reading code, as the types are available when the code is read outside of an IDE, like on Github, in mailed diffs, in paste services and so on.
I believe that investing more time in editors and improving generating stubs like this is the right way to go. If I understand all the points correctly:
Pros:
Easier to read for people that are already familiar with the specific trait in question.
Easier to type without having to search up the signature.
I think having good IDE support for generating the stubs would probably defeat point two. I think the verbosity of the type parameters actually makes it clearer what is going on. I guess most of the time when reading through trait code, something is probably wrong (otherwise you wouldn’t be looking at it, I guess?), so additional information would help in understanding how things work. If you do not know the type of a value, it is harder to determine what the function is actually doing. Of course editors could show the types as well, but I personally prefer it the other way around.
I think that keeping full type information in trait method implementation is ideal for consistency as well, as they’re required in function headers everywhere else. (That said, pattern and type ascription generalization may provide a nice avenue to change this.)
Dynamic snippets for impl blocks are definitely a great tool to help eliminate the effort to write this local information.
That said, it could be interesting to allow trait implementations to “underbound” types to just what they use locally (say, Vec<_> if you only need .len()), which means local types could elide what information isn’t needed locally.