Idea: "language warts" RFC repo


#1

Motivation

As any long-lived programming language evolves, it is likely to accrue inconsistencies in its syntax. Some may come from logical conflicts in its design, others from grand designs, implemented iteratively, that lost their champion. Others may be vestiges of a bygone era, no longer required due to language or technology innovation. Still others may have seemed great in design, but had a greater negative effect on quality than imagined.

These inconsistencies fall into a category of “language warts”. They’re undesirable but ultimately cosmetic problems that don’t interfere with the day-to-day application of the language. Over short time frames, it is difficult, if not impossible, to justify excising a wart, due to its diminutive effect on programming. Over long time frames, the collection of these warts combined can give the impression of lost cohesion.

It is impossible to predict which features may introduce warts, or when the “critical mass” of warts may occur. It can be said, however, that it is likely, if not inevitable, that warts will occur. The goal of the “language warts” RFC repo is to collect examples of “language warts” nominated by the Rust community, to be dealt with in a future epoch.

Related

Continuing the discussion

I do not understand why only uncontroversial warts should be considered. The one-repo-per-RFC approach starts with a spitballing/brainstorming phase. The idea of the “language warts” RFC repo would be for this spitballing phase to last past the next epoch. With such a long lifetime, it’s conceivable that even the most controversial change would lose its edge. If anything, I’d assume the sudden merge of several wart fixes would present the most controversy, as it would be easy to argue that the epoch is “forking” the language.


[blog post] Proposal for a staged RFC Process
#2

https://github.com/rust-lang/rust/issues?q=label%3Arust-2-breakage-wishlist+is%3Aclosed


#3

That’s what the third “related” link was about: Why `*const T` and not `*T`?


#4

Here’s some further material!


#5

I was just thinking about (a, b, c) -> { a + b + c } instead of |a, b, c| { a + b + c } for closure syntax. I like the way it mirrors function declarations; however, I don’t know if there would be any parsing difficulties.
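For comparison, here is a minimal sketch of how a function declaration and a closure look side by side in today's Rust. The names `add3` and `add3_closure` are made up for illustration, and the arrow-style closure form suggested above appears only in a comment because it is not valid Rust.

```rust
// A named function: parameters in parentheses, return type after `->`, body in braces.
fn add3(a: i32, b: i32, c: i32) -> i32 {
    a + b + c
}

// The same computation as a closure in current syntax: parameters between pipes.
// (The `(a, b, c) -> { a + b + c }` form discussed above is NOT accepted by rustc.)
fn add3_closure() -> impl Fn(i32, i32, i32) -> i32 {
    |a, b, c| { a + b + c }
}

fn main() {
    let f = add3_closure();
    // Both forms compute the same result.
    assert_eq!(add3(1, 2, 3), f(1, 2, 3));
}
```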

On another note, I think AsRef and AsMut should have been called AsRef and AsRefMut. Everywhere else we add _mut for the mutable version, and this set the precedent for ref as an exception. Also, I think that StringSlice is a lot clearer than str for string slices. Call is better than Fn, mostly because it avoids the confusion about the difference between Fn and fn.


#6

Better yet, IMHO, would have been “Lambda”, “LambdaOnce”, and “LambdaMut” (or “Closure”, “ClosureOnce”, etc.). You don’t normally need to type these types, so, there is no real reason for the names to be short.


#7

I wouldn’t call such maybe-not-ideal naming cases a language wart. They’re just regular bikeshedding (e.g. I’d object to Lambda being jargon, and not as nice as widely understood Callable).

Contrast this with struct syntax which IMHO is a real wart. It makes parsing of if a weird special case, and type ascription ambiguous (structs are the only place where ident: Expr can exist, as opposed to ident: Type everywhere else). Real shame, because Rust avoided C’s mistake of “dangling else” parsing problem, and made its own :frowning:
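To make the `if` special case concrete, here is a small sketch. `Point` and `describe` are hypothetical names. A struct literal directly in the condition of an `if` is rejected, since the `{` could just as well open the `if` body, so it has to be wrapped in parentheses.

```rust
#[derive(PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

fn describe(p: Point) -> &'static str {
    // Writing `if p == Point { x: 0, y: 0 } { ... }` does not compile:
    // the parser cannot tell whether `{` starts a struct literal or the `if` body.
    // Parenthesizing the struct literal resolves the ambiguity.
    if p == (Point { x: 0, y: 0 }) {
        "origin"
    } else {
        "elsewhere"
    }
}

fn main() {
    assert_eq!(describe(Point { x: 0, y: 0 }), "origin");
    assert_eq!(describe(Point { x: 1, y: 0 }), "elsewhere");
}
```

Note that `x: 0` inside the literal is the `ident: Expr` form mentioned above, whereas in the struct definition `x: i32` is `ident: Type`.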


#8

Note that the point of the Fn* traits is (was?) that you can implement them manually for types. This is still unstable afaik, but considering that these traits are akin to operator overloading, I do agree that Call is the better choice here.


#9

I would say Call, CallMut and CallRef or similar.


#10

Interesting, thanks!

I also wrote my own personal list of annoyances a year ago. Looking back, I’d say I stand by everything.

Hopefully Rust 2 will sort out the worst ones, and doesn’t turn into some half-done “we lost the courage to fix things” release along the way.


#11

I actually like that the Fn traits have such short names. ^^’


#12

Why? How often are you mentioning the name? If infrequently, what makes a short name better? You’ll read it more than you’ll write it. That seems to argue for a long/descriptive/clear name? Why have the Fn* traits be so closely named to fn when they aren’t the same thing?


#13

@gbutler Whenever I define a function that takes a callback :slightly_smiling_face: Mind, I don’t dislike the name Call. If Rust 2021 wanted to change that I wouldn’t object. Unlikely to happen though because it would entail a lot of churn for negligible gain.


#14

I agree with that, but the same could be said about most “wart”-type changes. That being said, I see no reason to rename them; the name is just somewhat unfortunate, as it requires explaining that fn and Fn are not the same thing.
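For readers who have not run into the distinction yet, a short sketch may help. The function names `apply_pointer`, `apply_generic` and `double` are invented for this example. `fn(...)` is a concrete function-pointer type, while `Fn(...)` is a trait that closures with captured state also implement.

```rust
// `fn(i32) -> i32` is a concrete type: a pointer to a function
// (non-capturing closures can also coerce to it).
fn apply_pointer(f: fn(i32) -> i32, x: i32) -> i32 {
    f(x)
}

// `Fn(i32) -> i32` is a trait: any callable qualifies, including
// closures that capture variables from their environment.
fn apply_generic<F: Fn(i32) -> i32>(f: F, x: i32) -> i32 {
    f(x)
}

fn double(x: i32) -> i32 {
    x * 2
}

fn main() {
    let offset = 10;
    let add_offset = |x: i32| x + offset; // captures `offset`

    assert_eq!(apply_pointer(double, 4), 8);
    assert_eq!(apply_generic(double, 4), 8);
    assert_eq!(apply_generic(add_offset, 4), 14);
    // `apply_pointer(add_offset, 4)` would not compile: a capturing closure
    // cannot be coerced to a plain `fn` pointer.
}
```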


#15

Pretty often. It is one of the most frequently used traits I’d say :slight_smile:

I’m very happy with it being called Fn, tho I would have also been OK with Apply.

(Maybe start a new topic on this?)


#16

I would like to comment on several of your annoyances.

Generics with <>

Misuse of [] for indexed access. Having both () and [] doing roughly the same thing, especially since [] can be used to do arbitrary things, doesn’t make sense. Pick one, use the other for generics.

So if I understood you correctly, you propose to use () for indexing in addition to method calls, and [] for generics? I don’t think it’s a good proposal, as [] is almost universally used for indexing and slicing in other languages, and having buffer(10) would be quite confusing, as most programmers will think of function calls, not indexing.

:: vs. . is kind of unnecessary.

If you are talking about self-less methods, then I kinda see the point, but I like the coherence which makes bar.foo() and Foo::foo(bar) equivalent, i.e. . is a nice hint that self is implicitly used in a method call.
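A small sketch of that equivalence, using a made-up `Counter` type: the dot form passes the receiver implicitly, the path form passes it explicitly, and a function without `self` can only be reached through `::`.

```rust
struct Counter {
    count: u32,
}

impl Counter {
    // Associated function without `self`: callable only as `Counter::new()`.
    fn new() -> Counter {
        Counter { count: 0 }
    }

    // Method taking `&self`: callable with `.` or with `::`.
    fn incremented(&self) -> u32 {
        self.count + 1
    }
}

fn main() {
    let c = Counter::new();
    // `.` borrows `c` as `self` implicitly; `::` takes the receiver explicitly.
    assert_eq!(c.incremented(), Counter::incremented(&c));
}
```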

Closures could be made to look much closer to functions, but somehow aren’t.

Any concrete suggestions?

“associated” functions in trait impls. I’d prefer separating them from normal functions and drop the self where possible.

You mean trait methods with default implementations? Do not forget that those methods can be overridden by implementations, which is often used for optimizations.
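As an illustration of overriding for optimization, here is a sketch with invented names (`Shape`, `Triangle`, `Square`). One implementor relies on the default method, the other overrides it to skip an allocation.

```rust
trait Shape {
    fn side_lengths(&self) -> Vec<f64>;

    // Default implementation: works for any shape, but allocates a Vec.
    fn perimeter(&self) -> f64 {
        self.side_lengths().iter().sum()
    }
}

struct Triangle(f64, f64, f64);

impl Shape for Triangle {
    fn side_lengths(&self) -> Vec<f64> {
        vec![self.0, self.1, self.2]
    }
    // `perimeter` is inherited from the default.
}

struct Square(f64);

impl Shape for Square {
    fn side_lengths(&self) -> Vec<f64> {
        vec![self.0; 4]
    }

    // Override: same result as the default, without building a Vec.
    fn perimeter(&self) -> f64 {
        4.0 * self.0
    }
}

fn main() {
    assert_eq!(Triangle(3.0, 4.0, 5.0).perimeter(), 12.0);
    assert_eq!(Square(2.0).perimeter(), 8.0);
}
```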

Arbitrary abbreviations all over the place. It’s 2017, your computer won’t run out of memory just because your compiler’s symbol table stores Buffer instead of Buf.

The unneeded naming explicitness makes code harder to write and a little harder to read. I hope you don’t want to use function instead of fn and begin .. end instead of { .. }.

Also, having both CamelCase and methods_with_underscores?

I personally don’t see a problem with it. And I would’ve passionately hated CamelCase for method and function names.

iter(), iter_mut(), into_iter() … decide prefix or postfix style and stick with it.

Choice of prefix or postfix mostly depends on how nice it will be to read. For example, the listed methods can be read as “iterate”, “iterate mutably” and “convert into iterator”, while iter_into() would’ve been confusing: is it “iterate into something” or what?
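For reference, a minimal sketch of the three methods on a `Vec`. The helper `demo` and its data are made up for the example.

```rust
fn demo() -> (i32, Vec<i32>, Vec<String>) {
    let mut numbers = vec![1, 2, 3];

    // iter(): "iterate", yields shared references (&i32).
    let total: i32 = numbers.iter().sum();

    // iter_mut(): "iterate mutably", yields mutable references (&mut i32).
    for n in numbers.iter_mut() {
        *n *= 10;
    }

    // into_iter(): "convert into iterator", consumes the Vec and yields owned values.
    let words = vec![String::from("a"), String::from("b")];
    let shouted: Vec<String> = words
        .into_iter()
        .map(|mut w| {
            w.push('!');
            w
        })
        .collect();

    (total, numbers, shouted)
}

fn main() {
    let (total, numbers, shouted) = demo();
    assert_eq!(total, 6);
    assert_eq!(numbers, vec![10, 20, 30]);
    assert_eq!(shouted, vec!["a!", "b!"]);
}
```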

Type bounds are Sized by default, with some weird special syntax to opt out (?Sized).

The Sized trait bound is a rational default; otherwise we would’ve seen a lot of noisy Sized bounds in our code. And with such a default we need ?Sized for opting out. I think it’s a coherent choice alongside !Trait bounds.
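A short sketch of what the default and the opt-out mean in practice. The function names `describe_sized` and `describe_any` are hypothetical.

```rust
use std::fmt::Debug;

// Implicit `T: Sized`: accepts `&i32`, but `T = str` or `T = [i32]` is rejected.
fn describe_sized<T: Debug>(value: &T) -> String {
    format!("{:?}", value)
}

// `?Sized` lifts the implicit bound, so unsized types such as `str` and slices work.
fn describe_any<T: Debug + ?Sized>(value: &T) -> String {
    format!("{:?}", value)
}

fn main() {
    assert_eq!(describe_sized(&42), "42");
    assert_eq!(describe_any("hi"), "\"hi\"");
    assert_eq!(describe_any(&[1, 2][..]), "[1, 2]");
    // `describe_sized("hi")` would not compile: it infers `T = str`,
    // and `str` does not implement `Sized`.
}
```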

/// for normal documentation, //! for module level documentation. Documentation already uses Markdown, so maybe just let people drop a markdown file in the module dir? That would make documentation much more accessible when browsing through GitHub repositories.

And would’ve been a disaster for those who write code. It’s quite important to see documentation when you work with source code, and no, an IDE is not an answer.

Also, documentation can cause compiler errors … that’s especially fun if you just commented a piece of code for testing/prototyping.

Can you elaborate? Also, for commenting out code you usually use // or /* .. */, which have nothing to do with documentation.

Type alias misuse: In e.g. io crate: type Result<T> = Result<T, io::Error> … just call it IoResult.

As a workaround I usually use io::Result in my code; you can also write use std::io::Result as IoResult;. But yeah, I agree that IoResult would’ve been a better choice.
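Both workarounds can be sketched as follows. `read_all` and `read_len` are invented helper names. The alias and the module-qualified path name the same type, so they mix freely.

```rust
use std::io::{self, Read};
// Rename the alias on import so it cannot be mistaken for `std::result::Result`.
use std::io::Result as IoResult;

// Uses the renamed alias.
fn read_all(mut source: impl Read) -> IoResult<String> {
    let mut text = String::new();
    source.read_to_string(&mut text)?;
    Ok(text)
}

// Uses the module-qualified form; it is the same type as `IoResult<usize>`.
fn read_len(source: impl Read) -> io::Result<usize> {
    Ok(read_all(source)?.len())
}

fn main() {
    // `&[u8]` implements `Read`, so a byte slice works as an in-memory source.
    assert_eq!(read_all("hello".as_bytes()).unwrap(), "hello");
    assert_eq!(read_len("hello".as_bytes()).unwrap(), 5);
}
```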

Macros are not very good. They are over-used due to the fact that Rust lacks varargs and abused due to the fact that they require special syntax at call-site (some_macro!()).

How does special syntax result in abuse? And I don’t see a connection between “over-used” and “macros are not very good”. I agree that const generics, varargs and maybe some other features would’ve significantly reduced macro usage, but that does not tell us anything about the quality of the macro system. (Though I certainly would like to see a good procedural macro system on stable.)

println! and format! are very disappointing given that they use macros.

How do you propose to implement it instead? Don’t forget about format string checks.
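To illustrate what the format string checks buy, here is a minimal sketch. `greeting` is a made-up function. The macro accepts a variable number of arguments and validates them against the format string at compile time.

```rust
fn greeting(name: &str, unread: u32) -> String {
    // The format string is parsed at compile time: the number of placeholders
    // must match the arguments, and each argument must implement `Display`.
    format!("{}, you have {} unread messages", name, unread)
}

fn main() {
    assert_eq!(greeting("Ann", 3), "Ann, you have 3 unread messages");
    // `format!("{} {}", "only one")` would be a compile error, not a runtime
    // failure: two placeholders, one argument.
}
```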


#17

Thank you, everyone, for your contributions to this thread. It seems there are quite a few “warts” (or, at least, debatable warts) in the language already. Keep them coming!

@nikomatsakis, @aturon: Assuming the Rust core team decides to experiment with a TC-39-like RFC development process, I’d like to nominate the language warts repo to be a (perhaps permanent) test-case.

With that end in mind, I’d like to turn the conversation towards logistics.

Repo Structure

Something worth considering is how the language warts repo should be structured. I’m thinking something like the following:

  • README.md - Explains the goal and/or motivation of the RFC repo. Includes this file overview.
  • CONTRIBUTING.md - Instructions on use of GitHub issues, pull requests, etc.
  • LICENSE-APACHE.md - as in RFC repo
  • LICENSE-MIT.md - as in RFC repo
  • ideas/*.md - A specific idea under consideration. For the language warts RFC, this directory lists the warts themselves, one per file. The idea file would illustrate the wart, summarize the discussion-to-date, and link to the github issue thread and related discussions.
  • proposed/*.md - Contains ideas that have achieved consensus, and may be included in the next RFC.
  • accepted/*.md - Contains ideas that have achieved consensus, and have graduated to an accepted RFC.
  • closed/*.md - Contains ideas that are closed to discussion. This could be because an idea was rejected by the relevant team, cannot be implemented, conflicts with a proposed or accepted wart, was made irrelevant by other changes, is troll bait, etc.
  • 0000-language-warts-{edition}.md - the proposed RFC, in the same format as the current RFC repo. If the repo produces multiple RFCs, this may be a directory. {edition} is the name of the edition that fixes the warts. This RFC won’t be created until the edition is decided elsewhere.

Worth noting: The purpose of proposed, accepted, and closed is largely administrative. Ideas may move between folders when justified. (This applies especially to closed ideas.)

The RFC would use GitHub issues to discuss the “seed” warts (those listed in this thread, which I’ll copy into the repo). New warts could be discussed in their PR before merge, and in issues afterwards.


#18

I perceive the as-operator as a language wart.

It can be used for transmuting pointers into ints, lossy value conversion (for example between floats and ints), bitwise reinterpretation of integers… It’s kinda somewhere between std::mem::transmute and the Into trait, without a guarantee of correctness. It’s also not backed by a trait.

I see where it’s coming from (C-style casts), but IMO it’s a footgun and confusing for everyone who is not coming from C.
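A few concrete cases, sketched with hypothetical helper names (`truncate_to_u8`, `checked_to_u8`). `as` silently truncates, reinterprets or saturates, while the trait-backed conversions `From` and `TryFrom` either cannot fail or report the failure.

```rust
use std::convert::TryFrom;

// `as` never fails: out-of-range integers are truncated to the low bits.
fn truncate_to_u8(x: i32) -> u8 {
    x as u8
}

// The trait-backed alternative reports out-of-range values instead.
fn checked_to_u8(x: i32) -> Option<u8> {
    u8::try_from(x).ok()
}

fn main() {
    assert_eq!(truncate_to_u8(300), 44); // 300 mod 256
    assert_eq!(checked_to_u8(300), None);
    assert_eq!(checked_to_u8(200), Some(200));

    assert_eq!(-1i32 as u32, u32::MAX); // bitwise reinterpretation
    assert_eq!(3.99f64 as i32, 3); // fractional part dropped
    assert_eq!(1e10f64 as i32, i32::MAX); // out-of-range floats saturate
    assert_eq!(f64::NAN as i32, 0); // NaN becomes zero

    assert_eq!(i64::from(7i32), 7); // lossless widening via `From`
}
```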


#19

Agreed. Some shorter-term progress:


#20

My personal one is “mixing operator overloading with semantics”, that is, + is implemented by the Add trait which conveys addition, < is implemented by PartialOrd which conveys a partial order, etc.

This basically makes it very weird to do anything with operators when the mathematical meaning of what the implementation should do does not match with the meaning that the language attaches to them, e.g., using + or / for set operations, + for string concatenation, using < with SIMD types to return a “vector of bools”, … For example, if I wanted to implement < for a type, e.g. as part of a DSL, but where < does not imply a partial order, right now in Rust I just can’t. If I implement the trait, then there is no way for me to state that the rules of PartialOrd do not really apply to my type.

If we could go back in time, I would have pushed hard on naming the traits Plus, Less, … to convey just that a type implements an operator, without attaching any meaning to what the operator does. Those who want to attach a semantic meaning can easily do so by just implementing a different trait (e.g. trait Addition w/o : Plus).
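A rough sketch of how such a split could look. `Plus` and `Addition` are the hypothetical traits named above, written here as ordinary user-defined traits with no operator sugar attached, and `sum_all` is an invented helper. This is one possible reading of the idea, not an actual std design.

```rust
use std::collections::HashSet;

// Hypothetical, purely syntactic trait: "this type supports a plus operation".
trait Plus {
    fn plus(self, rhs: Self) -> Self;
}

// Hypothetical semantic trait: implementors promise that `plus` is real addition.
trait Addition: Plus {}

impl Plus for i32 {
    fn plus(self, rhs: i32) -> i32 {
        self + rhs
    }
}
impl Addition for i32 {}

// Set union uses the operation without claiming to be arithmetic addition.
impl Plus for HashSet<u8> {
    fn plus(self, rhs: Self) -> Self {
        self.union(&rhs).copied().collect()
    }
}

// Generic code that relies on the semantics asks for `Addition`, not just `Plus`.
fn sum_all<T: Addition>(first: T, rest: Vec<T>) -> T {
    rest.into_iter().fold(first, |acc, x| acc.plus(x))
}

fn main() {
    assert_eq!(sum_all(1, vec![2, 3]), 6);

    let a: HashSet<u8> = vec![1, 2].into_iter().collect();
    let b: HashSet<u8> = vec![2, 3].into_iter().collect();
    assert_eq!(a.plus(b).len(), 3);
    // `sum_all` cannot be called with sets: `HashSet<u8>` is `Plus` but not `Addition`.
}
```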