Associativity, commutivity, etc. for standard operators

ckaran · August 31, 2021, 6:35pm

I just ran into an interesting issue that I hope has already been handled by both the language and compiler teams. Namely, what assumptions does rust (as a language) and rustc make about the standard operators and optimizations that it's permitted to perform with regards to a type's associativity, commutivity, and distributivity?

This has an impact on statements like the following:

let d = a + b + c;

Based on the fact that String implements Add, I'm guessing that the compiler never assumes that Add is commutative, but I don't know if it makes any assumptions about associativity or distributivity. Are there any rules regarding this that is guaranteed by the language for each operator?

If there aren't any hard & fast rules, and it depends on the type, would it be worth it to have some new marker traits for associativity, commutivity, and distributivity? Would such information be of use to the compiler for optimization purposes?

mcy · August 31, 2021, 6:57pm

I'm guessing that the compiler never assumes that Add is commutative

At the trait level, no. Once monomorphization has occurred, the backend (LLVM, for the most part), is violently aware of commutativity, associativity etc. of operations. Every compiler worth using has the entirety of Hacker's Delight programmed into its strength reduction passes.

Combining this with inlining, you already get what you're describing for any numeric type you can build.

Would such information be of use to the compiler for optimization purposes?

Unclear. It would probably be more valuable to the compiler for it to be aware that certain operations are "pure", allowing more aggressive reordering/deduplication. This isn't specific to the operator-overloading traits, which are really just syntax sugar.

Note also that Rust cannot in general rewrite f(a, b) into f(b, a) even if it knows that f is commutative; ISTR that we made function arguments sequenced-after each other (but I could be wrong).

ckaran · August 31, 2021, 7:37pm

Yeah, but the operators are overloadable for any type you care to define... including ones that aren't associative, commutative, or distributive. So, my real question isn't for integer types where this a known fact from mathematics, but for homegrown types. If a + b + c always desugars into something like a.add(b.add(c)), then the order of operations is clear, and the problem goes away. But if desugaring is ambiguous and arbitrary, then things get to be more complex.

Even for numeric types things can get complicated (permalink):

fn main() {
    let a: u8 = u8::MAX - u8::MAX + u8::MAX;
    println!("{:?}", a);
}

This runs without any errors.

fn main() {
    let a: u8 = u8::MAX + u8::MAX - u8::MAX;
    println!("{:?}", a);
}

permalink

This won't even compile:

Compiling playground v0.0.1 (/playground)
error: this arithmetic operation will overflow
 --> src/main.rs:2:17
  |
2 |     let a: u8 = u8::MAX + u8::MAX - u8::MAX;
  |                 ^^^^^^^^^^^^^^^^^ attempt to compute `u8::MAX + u8::MAX`, which would overflow
  |
  = note: `#[deny(arithmetic_overflow)]` on by default

error: aborting due to previous error

error: could not compile `playground`

To learn more, run the command again with --verbose.

So, I guess that the operators by default cannot be assumed to be associative, distributive, or commutative.

As for being useful for compiler optimizations... I honestly have no idea if it would or wouldn't be useful, my real concern was that the optimizer might be erroneously assuming that types that implement the standard operators also satisfy some other combination of properties. As long as no such assumption is being made, then everything is fine.

ExpHP · August 31, 2021, 10:16pm

The desugaring should be fully specified by the operator precedence and associativity table in the rust reference.

Virtually all operators in rust are left-associative except for assignment. a + b + c will always desugar to a.add(b).add(c).

SkiFire13 · September 1, 2021, 7:27am

ckaran:

Even for numeric types things can get complicated (permalink):
fn main() {
    let a: u8 = u8::MAX - u8::MAX + u8::MAX;
    println!("{:?}", a);
}
This runs without any errors.
fn main() {
    let a: u8 = u8::MAX + u8::MAX - u8::MAX;
    println!("{:?}", a);
}
permalink

This won't even compile:

The reason is that the compiler does a bit of constant propagation, and that uses checked arithmetic, which is not associative, distributive, or commutative and can fail (like in this case), but when it doesn't it is equivalent to wrapping one with all its properties.

scottmcm · September 1, 2021, 7:40am

The only place where rust is allowed to assume something about a trait implementation is about Clone::clone on types which are also Copy: 1521-copy-clone-semantics - The Rust RFC Book

It never assumes associativity, commutativity, nor distributivity for general types. (Though of course during codegen when it knows the definitions of things it might prove that certain things are true and optimize based on that.)

H2CO3 · September 1, 2021, 8:10am

Once instantiated with a specific type, the optimizer can deal with that specific type perfectly well.

ckaran · September 1, 2021, 2:22pm

THANK YOU rust team for thinking about this and solving it long before I ever thought about it! So, we're guaranteed that rust will never do the wrong thing as all operations are mathematically sound.

On to the other part of my question: would knowledge that some type has certain properties be useful to the optimizer? Would knowledge that something is associative and distributive allow the optimizer to emit concurrent or parallel code, or do other, currently impossible, optimizations?

mathstuf · September 1, 2021, 2:53pm

I think, as mentioned above, the pure-ness is the more relevant thing here. Even knowing that (a+b)+c returns the same value as a+(b+c) doesn't mean that there isn't global state that can witness the change in behavior (logging, weird mutex stuff, etc.). Without this guarantee, the function calls cannot be reordered.

ckaran · September 1, 2021, 3:09pm

But if there is global state that can witness the difference, then the type isn't associative. If the programmer uses a marker trait (e.g., Associative) and then breaks the contract that implies, then they've cause UB. The only thing that the compiler needs to do is the same thing it does for Send, Sync, etc. as it checks up through the composite types.

mcy · September 1, 2021, 3:14pm

Telling the compiler that a certain operation is associative isn't deeply useful. LLVM will likely infer it when it is going to be a useful lemma, which only matters when it can perform other strength reduction transformations based on that reordering. Like, if I tell the compiler that my_cool_function(a, b) is associative, what is it supposed to do with that information? How can it measure that reassociating it is better (for complex user types, not for integers, which it violently understands on a deep level).

elidupree · September 1, 2021, 3:20pm

An example that comes to mind is if your function computes both (a+b)+c and a+(b+c). Then an optimizer that knows it's an associative pure function would be able to compute it only once and use the result twice. (Sure, the programmer might be able to manually deduplicate it, but this situation might occur due to inlining in a situation where manually optimizing it would make the code a lot more verbose, etc.)

EDIT: I should clarify that I don't think this would be a useful feature for Rust - I'm just answering the theoretical question.

ckaran · September 1, 2021, 3:37pm

Good example! Much better than the one I was thinking of...

OK, but why isn't this useful? If you're making arguments based on Amdal's Law, or arguments on the level of effort required, etc., etc., etc., I'm fine with that, but I can't tell if that is what you're arguing, or if you're saying that rustc is so good that this added information wouldn't help it any.

elidupree · September 1, 2021, 3:50pm

In my case, I'm thinking: This seems like it could be a useful feature in a language that has an explicit concept of pure functions, which the compiler can check - but Rust doesn't have that, and adding it would be a major undertaking. It sounds like the alternative you're suggesting is to make an unsafe promise that your function is a pure function with certain properties - but that seems very risky. As-written, carelessly adding a dbg! to an Associative function would cause immediate UB! That's terrible.

A middle option would be to make Associative merely permit the compiler to replace instances of (a+b)+c with a+(b+c), and vice versa (and have Pure permit the compiler to replace foo(args); foo(args) with foo(args)). Then, mistakes would not be UB, just unspecified ordering, and it could even be used in safe Rust. This seems vaguely plausible as a useful Rust feature, but I don't know enough about compiler internals to know how much work it would be to make the compiler able to take advantage of such permission.

elidupree · September 1, 2021, 4:04pm

After a bit of further thought, there are still a few gotchas to doing that -

If you do (a+b)+c; some_non_pure_function(); a+(b+c), the compiler does not necessarily know that a is the same value in the second expression as it was in the first.
You'd want all pure functions to be tagged as pure if possible, which would create a motivation for library authors to add "pure" to their functions, much like the const rollout, but if it's not compiler-verified, this creates a source of new bugs, even if they're not UB.
It's not completely implausible that Rust might add compiler-verified pure functions at some point in the future, so a feature like this might want to consider how it would interact with that, if that was added in the future.

ckaran · September 1, 2021, 4:28pm

So, something way down the road, if at all. That's fine, I have no evidence that this idea would be a major improvement over what is currently being done, and I suspect that there are other additions that are both easier to implement, and more productive.

elidupree:

It sounds like the alternative you're suggesting is to make an unsafe promise that your function is a pure function with certain properties - but that seems very risky. As-written, carelessly adding a dbg! to an Associative function would cause immediate UB! That's terrible.

A middle option would be to make Associative merely permit the compiler to replace instances of (a+b)+c with a+(b+c) , and vice versa (and have Pure permit the compiler to replace foo(args); foo(args) with foo(args) ). Then, mistakes would not be UB, just unspecified ordering, and it could even be used in safe Rust. This seems vaguely plausible as a useful Rust feature, but I don't know enough about compiler internals to know how much work it would be to make the compiler able to take advantage of such permission.

Yeah, I was thinking that these marker traits would have to be unsafe. There are a lot of corner cases that need to be addressed that I don't think the compiler can truly verify, so we'd be relying on crate authors to follow the rules, with lots and lots of warnings about how if they don't follow the rules, then UB awaits.

Given how easy it is to mess this up... hmmm.... maybe it isn't a good idea after all. Rust is supposed to be safer than C/C++, but if someone makes a mistake somewhere, it will cause absolute misery trying to figure out why the release mode is 'broken' when the debug mode is working correctly. And it will lead to a tons of spurious bug reports that have nothing to do with the compiler, and everything to do with how the traits were misapplied.

OK, I'm convinced; even though this could be useful provided all applications of the traits were correct, it will cause untold harm if misapplied. Better to wait until the compiler is able to derive all of this on its own than to take the risk of hoping that programmers know what they're doing.

scottmcm · September 1, 2021, 5:20pm

Note that this is only true for an unsafe trait.

For any trait that can be implemented with safe code -- which notably includes Add and Ord and Hash and Iterator and such -- then a poorly-behaved implementation cannot cause UB. It's allowed to not do something useful, like if Hash/Eq are wrong then HashMap doesn't have to work, but it cannot do something memory-unsafe.

So, for example, Rust's sort cannot rely on totality for memory safety, but it doesn't have to actually sort the slice if Ord isn't total.

mathstuf · September 1, 2021, 6:05pm

This is off-topic, but it knocked a memory loose. Wasn't there an issue with custom Any::type_id implementations being allowed to cause UB by lying and returning some other type's TypeId (and therefore getting cast improperly)? How was that resolved?

mbrubeck · September 1, 2021, 6:20pm

This was never a real issue, because the blanket impl of Any conflicts with all possible user impls, but it was brought up as a potential point of confusion in: Any trait should be unsafe · Issue #62303 · rust-lang/rust · GitHub

ckaran · September 1, 2021, 6:24pm

I know that the implementations are not supposed to cause UB, but my concern is that if someone marks something as Associative and they are wrong about that, then the optimizer will use that information in a manner that produces UB. Ultimately, the compiler will not be at fault, it will be the programmer that made a false claim. However, detecting and explaining this issue to the programmer would be extremely difficult; I suspect that if we could detect the issue perfectly every single time, then the compiler would be able to do the right thing on its own, much like it does with other marker traits.

Topic		Replies	Views
Coherence rules for generic binary operator traits language design	16	2099	March 25, 2019
Blog series: Lowering Rust Traits to Logic language design	26	8446	March 25, 2019
Feature Request: Interrelated Traits language design	5	783	February 8, 2022
Improved type inference for operators compiler	3	805	March 25, 2019
[Idea] Type erasure in Rust through parametricity language design	7	4298	March 25, 2019

Associativity, commutivity, etc. for standard operators

Related topics