Associativity, commutivity, etc. for standard operators

scottmcm · September 1, 2021, 6:48pm

Well that would be a compiler bug, unless Associative was an unsafe trait. (Under the general overarching semantic rule of "safe code can't cause memory unsafety".)

CAD97 · September 1, 2021, 6:50pm

This was Error::type_id, which is still unstable and doc(hidden) due to this, but was temporarily stable to implement, leading to unsound implementations being possible in safe code.

ckaran · September 1, 2021, 7:40pm

I'll admit it, I'm trying to create a counterexample in my head, but I haven't been able to. I'm hoping that what you're saying is correct, and it's not because we can't figure out a bad counterexample...

quinedot · September 1, 2021, 7:48pm

That's just a requirement of the language. Such bugs have arisen before. Sometimes it's just a simple patch to fix it, other times the language grows new features to address it.

RalfJung · September 8, 2021, 5:06pm

This RFC says

An implementation that does not uphold this shall not result in undefined behavior; Clone is not an unsafe trait .

So it's not really Rust assuming anything I would say. This is more like the PartialEq contract.

scottmcm · September 8, 2021, 7:16pm

It does allow the x.clone() -> (*x) transformation. Whereas I don't think the compiler is allowed to use Eq to replace x == x with true.

Maybe I should use a different word for that?

RalfJung · September 8, 2021, 7:35pm

I don't think it allows such a transformation, since the RFC explicitly says it is not UB to violate the property.

SkiFire13 · September 9, 2021, 8:01am

That transformation is not UB though, it's well defined, and it is already used in the stdlib, for example when specializing [T]::to_vec for Copy types.

HeroicKatora · September 9, 2021, 11:13am

The difference being it is part of the implementation, not the compiler. The choice of transformation is done by core::vec::Vec and not part of the language semantics. That doesn't mean it's right. A more cautious approach would obviously be to only specialize on types that deeply derive Clone (i.e. akin to needs_drop) but still I see the interface of vec being consistent enough even with that specialization. And it's only that type, not arbitrary user types or any replacement for Vec that you use for that consistency.

scottmcm · September 9, 2021, 4:34pm

I guess I have trouble seeing how one could write code that would be sound normally, but UB under that transformation. Because if Clone::clone(&x) -> *&x makes it unsound, then wouldn't vec![x].clone().pop().unwrap() also end up being unsound since that'd do the "same" transformation?

RalfJung · September 9, 2021, 11:23pm

UB is a property of a program, not of a transformation, so I am not sure what you mean -- there is a category error in your post.

My claim concretely is that: it is not okay for the compiler to do x.clone() → (*x). If the user calls a library function which uses specialization to do a copy instead of cloning, that is part of the contract between that library and its clients, so language semantics (and UB) are irrelevant. But if the compiler silently replaces explicit calls to a clone function by a mere copy, that means the optimized program has a different semantics than the unoptimized program, and that would be a compiler bug.

Similarly, a library can say that it expects PartialEq::eq and PartialOrd::partial_cmp to be coherent with each other, and can replace calls to one by the other in a library update or using specialization without this being considered a "breaking change" -- but that does not mean the compiler is allowed to replace calls to one by the other.

Soundness is a property of the safe API surface of a crate (and it roughly means "no safe client can cause UB even when invoking this crate in arbitrary ways"), so again I am not quite sure how to interpret this statement.

So let me try a concrete example: If I have a MyType: Copy, and its clone implementation prints to standard out, and I call that clone function, then the final program produced by the compiler must print the same thing to standard out. There is nothing that would give the compiler license to remove that print (as would be the case if the clone call would be replaced by a simple copy). Do we agree on this?

mbartlett · September 19, 2021, 12:12pm

The RFC says this:

If T: Copy, x: T, and y: &T, then let x = y.clone(); is equivalent to let x = *y;.

From my perspective, that would give the compiler license to remove the print.

withoutboats · September 19, 2021, 1:20pm

I think the RFC has two readings. I'm not sure which is the consensus of the active members of the language team:

It's not a breaking change for a library to stop calling a clone impl on a type that impls Copy and instead use the copy impl (or vice versa). This was the original motivation for the RFC; Vec's Clone impl was changed to stop calling clone and start using Copy semantics for Copy types; any change to program behaviour because of this change to std is not considered a breaking change.
The compiler is free to replace clone calls with memcpy for types that implement Copy, even if that would change program behaviour. This would be a stronger interpretation than the first reading, which could impact people writing unsafe code.

Personally I'm not convinced adopting the second reading is a good idea. But it wouldn't lead to UB in correctly written code, because a user who is relying on a call to clone() actually calling clone on a Copy type to uphold an invariant would have to have written some incorrect unsafe code under this reading (even if they control the type and its impls and the code where the clone call is).

The reason I think its probably not a good idea is that obviously it isn't great to have a list of special optimisations like this that the compiler is allowed to perform that users have to be informed of somewhere, even if the only reason they would run into them is that they wrote some pathologically bad code.

Aloso · September 19, 2021, 4:09pm

Here's my opinion:

When my code contains the line x.clone(), I absolutely expect that the clone method is called. When I write x.blorb(), I also expect that the blorb method is called.

When a function has the same behaviour as a memcpy, then the compiler can replace the function with a memcpy, but only then. The compiler is not allowed to change the behaviour.

The standard library's implementation can be modified in a way that changes behaviour of types that implement traits incorrectly. But I want to emphasize that this is a library change, not a compiler optimization. It has the potential to affect API stability, but not soundness or language semantics.

Clone is specified so that x.clone() is equivalent to *x for any Copy type that implements Clone correctly. The standard library is allowed to produce incorrect/unexpected results when a trait is implemented incorrectly. It is also allowed to make changes that are only backwards compatible iff the relevant traits are implemented correctly for your types. This might break a program that was previously working, but incorrect.

toc · September 19, 2021, 5:34pm

In an abstract sense only. I typically expect functions to be inlined where appropriate and I'd like to be able to write generic code with T : Clone without having to double specialize on Copy because of a performance gotcha. This was exactly the problem in Vec, in theory the compiler should be able to pierce the veil and realize that a (heavily misdirected) for loop of clones can be one big memcpy but in practice it was missing this and losing out on a basic optimization.

I think it's okay to make the performant thing the most ergonomic thing at the cost of making some wrong code even more wrong. There should probably be a warning added for impl Clone for AlreadyCopyable either way.

Aloso · September 19, 2021, 6:01pm

Yes. I'm talking about language semantics, not generated machine code. Semantically, an inlined function call is still a function call, even if no callq instruction is emitted.

RalfJung · September 22, 2021, 2:44am

That contradicts the part of the RFC where it says there is no UB.

You cannot just change the behavior of the program "by fiat" -- you cannot just proclaim some axioms and expect them to make any sense. Instead the Rust Abstract Machine should specify what happens step-by-step, and nothing there would allow the compiler to remove this print, except if the program somehow has UB or a special case is added for calls to Clone::clone to be different from any other function call ("any call to Clone::clone, if the receiver type is Copy, may non-deterministically be skipped and a memcpy executed instead", or something weird like that). I am strongly opposed to the latter. Certainly an RFC that alters the Rust Abstract Machine needs to do so explicitly and with the intent of fundamentally changing the language, not just as a side-effect of something that looks entirely like a library RFC.

Yes, this is what I think the RFC is intended to do.

I would personally go so far as to say it is a bad idea.

ckaran · September 22, 2021, 11:59am

Is there currently a mathematically complete, unambiguous definition for this? Just to be clear, I'm not trying to be snarky, I've been wanting to see something like this for years, pretty much since I started learning Rust. I would love to see a definition like this as it would answer questions that I have pretty much constantly... It would also answer questions like above, explaining what transformations are legal, and which aren't.

CAD97 · September 22, 2021, 1:23pm

Congrats, you're asking the guy who got a PhD for writing one (congrats again, @RalfJung!)

https://plv.mpi-sws.org/rustbelt/stacked-borrows/

This model isn't official, but is the working model implemented by miri. You can also check out wg-unsafe-code-guidelines for most of the open questions.

ckaran · September 22, 2021, 1:37pm

Congrats on the Ph.D. @RalfJung! I knew you were working on it, but I didn't know you finished it, good job!

Topic		Replies	Views
Coherence rules for generic binary operator traits language design	16	2099	March 25, 2019
Blog series: Lowering Rust Traits to Logic language design	26	8446	March 25, 2019
Feature Request: Interrelated Traits language design	5	783	February 8, 2022
Improved type inference for operators compiler	3	805	March 25, 2019
[Idea] Type erasure in Rust through parametricity language design	7	4298	March 25, 2019

Associativity, commutivity, etc. for standard operators

Related topics