How hard would it be to tweak the compiler to randomly reorder the derive attributes, and then do a crater run? I'd love to see how much code depends on the current behavior that's out in the wild.
That does not bring me joy. The following also doesn't bring me joy:
```rust
#[derive(Debug, Debug)]
struct Foo;
```
Compilation output from the playground:
```text
  |
1 | #[derive(Debug, Debug)]
  |          -----  ^^^^^ conflicting implementation for `Foo`
  |          |
  |          first implementation here
  |
  = note: this error originates in the derive macro `Debug` (in Nightly builds, run with -Z macro-backtrace for more info)
```
But at least the compiler lets me know what's going on so I can fix it. But is there a legitimate reason to try to derive the same thing multiple times in a row? I know that it's possible to write a macro that will generate code that compiles in that situation, but it feels like a mistake that should be discouraged.
## My take on this issue
Given that everyone who's answered on this topic so far is (AFAIK) an expert in the language, the fact that there is ambiguity of interpretation makes me... concerned. So, can we resolve the ambiguity in a sane manner?
My vote is for the following rules:
- Attributes in a composite derive are treated as an unordered set: the compiler is free to reorder them as it sees fit, and is required to deduplicate them into a set. Attributes within a composite derive are not permitted to see the output of other attributes in the same composite derive. All permutations must result in the same output.
- Separate derives are effectively within nested scopes. Outer scopes can see the results of execution of inner scopes, but not the other way around.
- The innermost scope is 'executed' first, replacing the scope with the execution results. This continues until all derives are executed.
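As a sketch of rule 1, a composite derive could be normalized before expansion by collecting it into a set. This is purely illustrative pseudologic, not compiler code; `normalize_composite` is a hypothetical helper name:

```rust
use std::collections::BTreeSet;

// Toy model of rule 1: a composite derive is an unordered set, so any
// permutation (or duplication) of the attribute list normalizes to the
// same canonical result. `normalize_composite` is hypothetical, not
// anything in rustc.
fn normalize_composite(derives: &[&str]) -> Vec<String> {
    derives
        .iter()
        .map(|d| d.to_string())
        .collect::<BTreeSet<_>>() // dedupes and imposes a canonical order
        .into_iter()
        .collect()
}

fn main() {
    let a = normalize_composite(&["Clone", "Clone", "Copy", "Debug"]);
    let b = normalize_composite(&["Debug", "Copy", "Clone"]);
    assert_eq!(a, b); // duplicates collapse; written order is irrelevant
    println!("{a:?}");
}
```

Because the normalized set is the same for every permutation, rule 1's "all permutations must result in the same output" falls out for free.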
So, given that, we can look at the following example:
```rust
#[derive(Clone, Clone, Copy, Debug, Eq, PartialEq, Ord, PartialOrd, Hash)]
#[derive(Quux)]
struct Foo;
```
is semantically similar to the following (if braces could be used this way)
```rust
#[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)] // Scope 2
{
    #[derive(Quux)] // Scope 1
    {
        struct Foo;
    }
}
```
- By rule 2, each derive line defines a separate scope.
- By rule 3, scope 1 will be executed first, producing the output of the `Quux` macro.
- Once the `Quux` macro is done, its output becomes available to the derive macros in scope 2.
- By rule 1, the duplicate `Clone` derives are reduced to a single instance. Also by rule 1, we can reorder the derive macros in whatever manner is convenient to the compiler.
- Finally, note that since the macros within a composite derive can't see each other's output, it is entirely possible to spawn futures for each derive macro and construct their output concurrently. I'm not suggesting that is necessary or a good idea, just that the rules permit it.
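The scoping rules above can be sketched as a small expansion loop: walk the derive lines innermost-first, appending each scope's output to what that scope can see. Again, this is illustrative only; "expansion" here is reduced to string concatenation, where real derive macros emit token streams:

```rust
// Toy model of rules 2 and 3: each `#[derive(...)]` line is a scope, the
// innermost scope (the one nearest the item) executes first, and outer
// scopes see the accumulated output of inner scopes.
fn expand(scopes: &[Vec<&str>], item: &str) -> String {
    // Scopes are listed outermost-first, as written in source, so we
    // walk them in reverse to execute the innermost scope first.
    scopes.iter().rev().fold(item.to_string(), |visible, scope| {
        let out: Vec<String> = scope
            .iter()
            .map(|d| format!("impl {d} for Foo {{ /* ... */ }}"))
            .collect();
        format!("{visible}\n{}", out.join("\n"))
    })
}

fn main() {
    // The composite derive is the outer line; #[derive(Quux)] is inner,
    // so Quux's output lands first and is visible to the outer scope.
    let expanded = expand(&[vec!["Clone", "Debug"], vec!["Quux"]], "struct Foo;");
    println!("{expanded}");
}
```

Running this prints the `Quux` impl before the `Clone` and `Debug` impls, matching the innermost-first execution order.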
The one issue that I see with this is how deriving the same macro multiple times in a row should be handled. E.g.:
```rust
#[derive(Quux)]
#[derive(Quux)]
#[derive(Quux)]
#[derive(Quux)]
struct Foo;
```
My vote is to not try to do anything clever, and to just expand the macros using the rules above. Once the macros are all expanded, if there are any conflicts, the compiler will emit an error as normal.
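Under that "expand naively, then let coherence complain" approach, four identical derives simply become four identical impls, and the existing overlap check (E0119) fires with no derive-specific special casing. A toy sketch, with hypothetical names:

```rust
use std::collections::HashSet;

// Toy sketch of "don't be clever": expand every scope, then count the
// resulting duplicate impls, mirroring the compiler's normal coherence
// error rather than any special-case dedup across separate derive lines.
fn conflicting_impls(derives: &[&str]) -> usize {
    let mut seen = HashSet::new();
    derives
        .iter()
        .map(|d| format!("impl {d} for Foo"))
        .filter(|imp| !seen.insert(imp.clone())) // keep only repeats
        .count()
}

fn main() {
    // Four separate #[derive(Quux)] lines expand to four identical impls;
    // everything after the first one conflicts.
    let n = conflicting_impls(&["Quux", "Quux", "Quux", "Quux"]);
    assert_eq!(n, 3);
    println!("{n} conflicting implementations");
}
```

Note that rule 1's deduplication does not apply here, because the duplicates live in separate scopes rather than in one composite derive; the conflict surfaces only after expansion.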
Given all of that, are the rules unambiguous? Are there any real-world issues that anyone can think of with implementing these rules? More importantly, does anyone disagree with the rules, or should I start writing an RFC along these lines?
FWIW, since this is a change to the current behavior (even though that behavior appears to be an accidental quirk of the compiler as it stands), my personal feeling is that these rules would have to wait for the 2024 edition before they could be stabilized (assuming that everyone here is interested in doing so, the RFC is accepted, etc., etc., etc.).