Idea: "language warts" RFC repo

This is a design choice. You want more flexibility, other people want code that is simpler to understand. Rust has chosen to tie its syntax to semantics, and that’s one of the ways to “tame” operator overloading in programming languages.

Do you have a link to where this deliberate design choice was made? I'd like to read any rationale, because we have been trying to come up with semantics for PartialOrd for a while now so that the is_sorted RFC can make some progress, without much success. Any discussion that gave semantic meaning to PartialOrd's operators could help us a lot there (last discussion about this is here).

Also (playground):

fn main() {
    println!("{}", "1".to_string() + "2");  // prints: 12 instead of 3
}

looks more like concatenation than addition to me. So what does impl Add for T mean? The trait is called Add which implies addition, but in some cases it implements concatenation instead, and the trait documentation doesn't really state anything about the trait semantics.

So sure, tying operator overloading to semantics is a possible direction a programming language might go. But doing so without concisely stating the semantic meaning (e.g. can I rely on associativity when using Add to constrain a T?), while giving the operators different meanings depending on the type, and giving some operators meanings and others none, would be a pretty weird design direction to choose if you ask me. So I am very skeptical that any of this was deliberate, but if it was, I'd like to know why it was decided to go this way.
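For instance, here is a sketch of the kind of generic code I mean (the function name is made up for illustration): nothing in the Add trait tells us whether the regrouping is sound, because the trait documents no algebraic laws.

```rust
use std::ops::Add;

// Hypothetical helper: only constrained on `Add`, nothing else.
fn combine3<T: Add<Output = T>>(a: T, b: T, c: T) -> T {
    (a + b) + c // may or may not equal `a + (b + c)`, depending on T
}

fn main() {
    println!("{}", combine3(1, 2, 3)); // 6
}
```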

2 Likes

looks more like concatenation than addition to me.

I'd argue that the design error is the overloading of Addition for concatenation, not that operators are tied to semantics.

Imo, push_str is the better solution for that use case and I'd be happy about a deprecation of this Add implementation.

I’d argue that the design error is the overloading of Addition for concatenation, not that operators are tied to semantics.

How so? I think that the current String implementation of Add makes sense for someone who understands string addition as string concatenation. Also, because the semantics of the Add trait are so loosely specified, this implementation does not violate any pre-conditions of Add, so it is valid. It feels that String is lying a little (*), yet many types implementing Add do, so this single deprecation wouldn't really solve much.

Currently, Add is too loosely specified to be a useful generic constraint. This wouldn't be a problem if that were the intent, but the fact that it comes with some semantic requirements on its implementations suggests otherwise. At the same time it is too strongly specified to allow types for which + makes sense to implement it without kind of having to lie.

We could constrain Add to mean addition in such a way that it is useful as a generic constraint, but then a lot of implementations of Add would become invalid. For example, we could specify it to denote a monoid, yet that would still allow String to implement it for concatenation, while not allowing floats to implement it because floating-point addition is not associative. And there lies the root of the problem: from everyday life and math we are used to using the same operators to denote completely different things that work under completely different rules (e.g. + for integer and floating-point addition) depending on context, that is, the types involved.
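A quick illustration of the float point:

```rust
fn main() {
    // Floating-point addition is not associative, so a monoid-style
    // contract for `Add` would rule these impls out:
    let left = (0.1_f64 + 0.2) + 0.3;
    let right = 0.1_f64 + (0.2 + 0.3);
    assert_ne!(left, right); // 0.6000000000000001 vs 0.6
    println!("{} != {}", left, right);
}
```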

So I think that Add should just have been Plus, used to overload +, period, without any semantics attached. This would make it pretty upfront that using Plus as a generic constraint is pretty much meaningless. Each concrete implementation can then add whatever semantics it wants, and if someone wants to add and use semantics generically they should use a different trait.
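Roughly what I have in mind, with made-up names (neither Plus nor AssociativePlus exists today): Plus would only wire up the + token, while a separate, opt-in trait would carry the algebraic contract for generic code that actually needs one.

```rust
// Hypothetical replacement for `Add`: just the `+` token, no laws.
trait Plus<Rhs = Self> {
    type Output;
    fn plus(self, rhs: Rhs) -> Self::Output;
}

// Hypothetical extra contract: "this `+` is associative".
trait AssociativePlus: Plus<Output = Self> + Sized {}

impl Plus for i32 {
    type Output = i32;
    fn plus(self, rhs: i32) -> i32 {
        self + rhs
    }
}
impl AssociativePlus for i32 {}

fn main() {
    let a: i32 = 1;
    println!("{}", a.plus(2)); // 3
}
```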

And Add is the simple case. PartialEq/PartialOrd tie the comparison operators to semantics that are currently not fully specified, and in very inflexible ways: e.g. < must return bool (which breaks for types for which < should return something else), a type can only implement PartialOrd once (meaning that floats, which have always had two different partial orders, must choose one, and there is no clear way to order them using the other relation), etc.
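Just to illustrate the float situation:

```rust
fn main() {
    // NaN is unordered with respect to everything, which is why floats
    // only get PartialOrd through the operator traits:
    let nan = f64::NAN;
    assert_eq!(nan.partial_cmp(&1.0), None); // "unordered"
    assert!(!(nan < 1.0) && !(nan > 1.0) && !(nan == 1.0));
}
```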

@leonardo's comments criticized C++, but in C++ orderings are a property of relations between types, not of types themselves, which means that floats implement two different ordering relations, and that's it. Operator overloading of relations in C++20 is, sadly, more sound, useful, and convenient than in Rust.


(*) mainly because concatenation is an operation on its own, and + is used for String concatenation, but + for integers means something else (instead of integer concatenation, which is also a thing).

If + gets deprecated I'll ask for ~ and ~= to be used, see below :slight_smile:

I was writing about the widely criticized C++ << used for I/O.

For string concatenation I prefer the D language's approach: it uses ~ and ~= for concatenation and appending, because I prefer + to be reserved for commutative operations (but old Rust used the tilde for something else).

2 Likes

But it's not actually string concatenation. It's "copy the contents of a view into a string to the back of a string buffer, possibly reallocating it".

That may be nitpicky (and I don't want to imply that I've said anything new to anyone here), but it is actually important in the context of a systems programming language. And there is certainly a lot of squinting involved in defining an operation between two different types as a monoid operation...
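Behaviorally it boils down to a push_str on the moved buffer. A small sketch (not the verbatim std impl, just the observable equivalence):

```rust
fn main() {
    let a = String::from("1");
    let b = "2";

    // `a + b` consumes the owned buffer `a` and borrows the view `b`...
    let sum = a + b;

    // ...and behaves like push_str onto the moved buffer:
    let mut c = String::from("1");
    c.push_str(b);

    assert_eq!(sum, c); // both are "12"
}
```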

I guess it's a... monoid action? Is that a thing? Like a group action, but for monoids?

I'm going to call it a thing. It's a thing now.


Edit: erp, nope, it's not even that, because you can't add &str + &str either

1 Like

I was wondering whether the language/libs teams feel that it’s worth spending some time on fixing some of these warts? Especially in the case of names in the standard library, it’s relatively straightforward to have some slow deprecation schedule (introduce new name in 2018 edition, warn-deprecate in 2021 edition, remove in 2024 edition) that improves cohesion in the long run at the cost of some churn. I tend to be optimistic about cost/benefit on these sorts of things, but that may not be widely shared.

2 Likes

That'd be redundant... std::io::IoResult. I myself use it as io::Result, which I find clean.

2 Likes

agreed... I found it glaring in my early days of Rust

interesting thought

2 Likes

Actually, closure syntax already uses ->. It doesn't demarcate the body; rather, it allows one to supply an optional return type annotation:

let func = || -> i32 { 4 };

No, they could have used lowerCamelCase. One of the negative effects of snake_case is that it creates additional overhead for function names with more than one word and therefore tilts the scale in favor of single-word names, even if a two- or three-word name would have been more expressive.

Yeuch. For the purposes of feeling "heavy," I consider camel case to be many times more fearsome, because for certain letter combinations, every time I type them there's a chance of it coming out DoubleUPpercase, which is one of the most brutally annoying mistakes to have to fix.

(Ironically, this almost never happens to me with underscores, even though they require shift on an American keyboard, because their location on the top row slows me down just enough to ensure that I release shift. I suspect that somebody with an Italian layout might have some choice words for me...)

I would have preferred it if the behavior of macros had been the responsibility of the macro author, with the general rule “if the user needs to know that something is a macro, your macro is wrong”. The current situation just punishes reasonable macro authors who write macros with predictable behavior on account of authors who want to abuse the hell out of it.

IMO, "predictable behavior" is an impossibly high standard for the vast majority of macros. Quite honestly, I feel that even the standard library itself violates the principle of least surprise in a number of its macros by implicitly borrowing the arguments (see format_args!, write!...). I'd hate to think that any function could do this! (or worse, that any function call might lazily evaluate its arguments!)
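For example, this compiles only because the macro borrows its argument behind the scenes (a small sketch):

```rust
fn main() {
    let s = String::from("hi");

    // Looks like a by-value call, but the macro expansion only borrows `s`...
    println!("{}", s);

    // ...so `s` is still usable afterwards, which no ordinary function
    // taking a `String` by value could allow.
    println!("{} has length {}", s, s.len());
}
```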

Macros masquerading as regular functions just ain't cool.

$ echo hi
hi
$ echo hi >stdout.log 2>stderr.log
$ time echo hi >stdout.log 2>stderr.log

real    0m0.000s
user    0m0.000s
sys     0m0.000s
$ wtf?

At the very least I would have expected println! and format! to allow referring directly to the interpolated values: println!("{name} is {height} tall") instead of println!("{} is {} tall", name, height). The current API is very unergonomic and error-prone.
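A small sketch of the footgun (names made up):

```rust
fn main() {
    let name = "Ferris";
    let height = "1m";

    // Swapped positional arguments still compile, which is exactly the
    // error-prone part:
    println!("{} is {} tall", height, name); // "1m is Ferris tall"
    println!("{} is {} tall", name, height); // "Ferris is 1m tall"
}
```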

Python finally got this just in the last year, and using it for the first time felt like Christmas morning. Once you've written in a language with proper string interpolation, you can't understand how you ever lived without it.

4 Likes

Also take a look at the D language: it uses () for type/const arguments, and !() for instantiation (plus a special rule allowing just the bang when there's only one argument).

Why don’t you like the D syntax solution? It’s sufficiently clean. My least favourite approach here is the C++ template syntax.

Perhaps you like Scala syntax a lot, but I don’t like Scala much, and I don’t think it’s a good idea to conflate function calls with array access. Even if in mathematics you can think of them as the same thing, in a systems language that often has to access memory at a low level, it’s good to have syntax that clearly tells the two things apart.

A lot of the supposed “Rust warts” in this thread aren’t warts in my opinion (Rust does have some warts; I have been writing posts about its problems since Rust 1.0, and there are many things I haven’t written about yet because I don’t think I have enough political power to change anything). Each one of them will need discussion, the good ones will need long discussion threads, and often the change will not be regarded as important enough to break compatibility.

3 Likes

There are other ways to get rid of the turbofish besides moving off <>, like using different syntax for function calls and variable access. Rust is able to avoid the turbofish in types because the parser knows the difference between types and expressions, and types can’t contain comparisons. If calls and variables had distinct syntax, the turbofish could be avoided in all cases, because function calls can’t contain comparisons and variable access can’t have generics.

But that would also be super-breaking, and add noise to the common cases, and turbofish can be avoided most of the time using inference anyway.

The turbofish is still totally a wart, though.
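For reference, a minimal illustration of where the turbofish is and isn’t needed:

```rust
fn main() {
    // In expression position, `"42".parse<i32>()` would be ambiguous with
    // the comparison operators, so the turbofish `::<>` is required:
    let n = "42".parse::<i32>().unwrap();

    // With enough type information, inference makes it unnecessary:
    let m: i32 = "42".parse().unwrap();

    assert_eq!(n, m);
}
```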

1 Like

Thus far, the impression I've gathered from the core team is that a major version hasn't been contemplated in any serious manner. If it were to occur, the scale of changes accepted in a major version would be determined like everything else: through community discussion, RFC development, and consensus among the Rust teams. Organizing and discussing the language warts is the first step along a journey that won't bear fruit until at least the next edition (and likely longer). It's too early to know if there should be a "Rust 2.0", much less to delimit what it may include.

There will never be a Rust 2.0.

1 Like

I don't know what the language of the year 2050 will look like, but I know it won't be called Rust 2.0.

--- Tony Hoare, 1982

6 Likes

To clarify for those less familiar with past discussions:

Although the editions RFC does not explicitly state “there will never be a Rust 2.0” (in the sense of a breaking change that, unlike editions, offers no interoperability between pre-2.0 and post-2.0 crates), it was almost universally interpreted that way (see, for example, this comment). The stability commitment this implied was one of the proposal’s huge benefits, and I believe the acceptance of the RFC does indicate a commitment to never having a Rust 2.0 in the above sense. I think something truly catastrophic would have to happen to Rust for us to seriously consider a 2.0 now.

5 Likes

I think we should explicitly separate two different categories here:

  • language warts that are just plain bad/annoying/made the wrong trade-offs and may or may not get fixed
  • lessons learnt from the design of the Rust programming language that could inform the design of other programming languages

If we were still a long while before 1.0, I’d “vote” for [] instead of <> for types. But a change that impacts nearly every single line of Rust code ever written… nah, won’t happen.

It’s good to think about it nonetheless, but it’s a different concern than a repository of small nitpicks that have a chance of being fixed.

5 Likes

If we're going to have operators tied to semantics, a consistent trait system for them could help code readability, because the kind of operator you're seeing would be associated with certain properties. Taking the idea from @leonardo here:

These are some ideas this gives me (square brackets mark optional restrictions that I think could also make sense):

  • +: commutative (and associative) operation [on a group, which also defines an inverse, and so on as described here].
  • ~: associative operation, like the one in a semigroup. These are the semantics of string concatenation.
  • *: defined with semantics similar to +.
  • (example) #: associative, invertible operation over a group, such as the one defined by invertible square matrices or by the quaternions.

More operators of this kind could probably be defined. Which ones am I missing? Probably at least 3 :sweat_smile:
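A rough sketch of how that could look as traits, with made-up trait and method names (none of this is an existing Rust API):

```rust
trait SemigroupConcat<Rhs = Self> {
    // `~`: associative, but not necessarily commutative.
    type Output;
    fn concat(self, rhs: Rhs) -> Self::Output;
}

trait CommutativeAdd<Rhs = Self> {
    // `+`: associative and commutative.
    type Output;
    fn add(self, rhs: Rhs) -> Self::Output;
}

// String concatenation would then get the `~` trait but not `+`:
impl SemigroupConcat<&str> for String {
    type Output = String;
    fn concat(mut self, rhs: &str) -> String {
        self.push_str(rhs);
        self
    }
}

fn main() {
    println!("{}", String::from("1").concat("2")); // "12"
}
```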

Operators that give boolean results can be seen as binary relations. This allows us to define semantics for such relations:

  • =: reflexive, symmetric and transitive, i.e. it defines an equivalence relation (trait Eq)
  • <=: reflexive, antisymmetric and transitive, i.e. it defines a partial order (trait PartialOrd)
  • <: defined by a < b <=> a <= b && !(a == b)

There are some operations mentioned in other posts that work over e.g. a vectorized input: Vec<f32> x Vec<f32> -> Vec<bool>. I don't know how the semantics of such operations reflect into algebra, so I'll need help there :sweat_smile:. I'm sure there is a general concept behind them :slight_smile:

1 Like