Editions and macros/code generation


#1

Hello

In general, the idea of being able to mix crates from different editions in the same project is a nice one and works great.

Except for one thing I’ve noticed: there are places where code from one edition may „leak“ into a crate of another edition. This happens, for example, with code generated by a build.rs script (if I use prost-build, for instance, it doesn’t know the edition of the crate I’m going to include! the resulting code into) or with macros (both procedural and macro_rules! ones). Compilation errors and other beasts may then get unleashed (I had to deal with some when migrating to the 2018 edition and had to find hacks and workarounds).

Currently, the solution seems to be trying to produce code that compiles in both. But this doesn’t feel nice and will probably not be a reasonable strategy with more than two editions.

I haven’t seen any plans/solutions, but that might be because I didn’t search well enough. Are there any? Or was it considered unimportant?

Brainstorming ideas

I have a few ideas. I’d like to hear whether they make some kind of sense and whether it would be possible to do any of them (or something similar) to help with the above problem. If so, what can I do to help out?

Knowing the edition

build.rs scripts and procedural macros could be given, e.g., an environment variable naming the edition of the crate being built, and decide which edition’s code to generate.
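A minimal sketch of how a build.rs could act on such a variable. The variable name `CRATE_EDITION_HINT` is invented for illustration (nothing in this thread says Cargo sets it), and the emitted strings are placeholders rather than real prost-build output.

```rust
use std::env;

/// Editions this hypothetical generator knows how to target.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Edition {
    Rust2015,
    Rust2018,
}

/// Interpret the raw value of the (hypothetical) edition variable.
/// Anything missing or unrecognized falls back to 2015, the oldest edition.
fn parse_edition(raw: Option<&str>) -> Edition {
    match raw {
        Some("2018") => Edition::Rust2018,
        _ => Edition::Rust2015,
    }
}

/// Emit placeholder source text tailored to the target edition.
fn generate(edition: Edition) -> String {
    let header = match edition {
        Edition::Rust2015 => "// generated for the 2015 edition\n",
        Edition::Rust2018 => "// generated for the 2018 edition\n",
    };
    format!("{}pub struct Generated;\n", header)
}

/// What a build.rs `main` would call: read the variable, then generate.
fn generate_from_env() -> String {
    // ASSUMPTION: `CRATE_EDITION_HINT` is an invented name, not a real Cargo variable.
    let raw = env::var("CRATE_EDITION_HINT").ok();
    generate(parse_edition(raw.as_deref()))
}
```

Falling back to the oldest edition when the variable is absent keeps the generator working under a Cargo that knows nothing about it.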

Similarly, it would be possible to add a #[cfg(edition = "...")] attribute. That would allow a macro/build.rs to generate both variants and let the compiler pick one accordingly.
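To make the "generate both variants" idea concrete, here is a sketch of a generator helper that wraps two edition-specific bodies in the proposed attribute. `#[cfg(edition = "...")]` is only the proposal from this post, not something current compilers define, and `emit_both` is a made-up helper name.

```rust
/// Wrap two edition-specific bodies in modules guarded by the proposed
/// `#[cfg(edition = "...")]` attribute. The attribute is only a proposal;
/// this function merely produces the text a generator would write out.
fn emit_both(module: &str, body_2015: &str, body_2018: &str) -> String {
    let mut out = String::new();
    for &(edition, body) in [("2015", body_2015), ("2018", body_2018)].iter() {
        out.push_str(&format!(
            "#[cfg(edition = \"{}\")]\nmod {} {{\n{}\n}}\n",
            edition, module, body
        ));
    }
    out
}
```

The compiler would then keep exactly one of the two `mod` items, so the generator never needs to know the downstream edition.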

Both of these should IMO be relatively easy to do and seem non-intrusive. They would at least make it possible for crate authors to support multiple downstream editions simultaneously, even if they don’t provide real future-proofing.

Overriding the edition per scope

Currently, the whole crate is compiled under the same edition. This idea would make it possible to mark a scope generated by a macro as being in some (possibly different) edition, e.g.:

// This is generated by the macro or build.rs
#[edition = "2015"]
mod generated {
    // No matter what is around, the code here gets compiled under the 2015 rules.
}

This would make it possible to „future-proof“ the generated code against crates in future, as yet unknown, editions. But it might be hard to do inside the compiler (I don’t know). Also, if the macro wanted to paste some piece of code it gets from its user’s crate, it would need to switch back, so it would need to know which edition the tokens come from. That could be a property on the TokenStream or on individual tokens, or maybe an automatic $edition variable in macro_rules, similar to the existing $crate.

Automatic macro edition hygiene

That’s basically an automatic version of the above ‒ the code created by the macro would be compiled under the edition of the macro’s crate, but pasted tokens would preserve the edition of the crate they come from.

But honestly, this one seems a bit crazy even to me ‒ probably very hard to do (I imagine people who know how the compiler works inside are screaming now), and it is harder still to decide what it would mean if the tokens’ editions just mixed arbitrarily (as opposed to, e.g., on a scope boundary). And it would probably be a backwards-incompatible change?


#2

I thought this was what was supposed to be implemented, is it not what we ended up with? (Some links to the issues you encountered would be useful to see what causes them).


#3

Unfortunately, I didn’t have the time to look into it more at the time, so this is the only thing I remember so far: https://github.com/danburkert/prost/issues/140

This one is caused by code generated from build.rs (by calling into prost-build) that relied on the way procedural derives are imported in the 2015 edition. There’s a solution: generate slightly different code for the 2018 edition. But I haven’t found a way to detect the edition from within a build.rs script. And obviously, macro hygiene doesn’t and can’t apply to code generated into a file this way.

I may have just assumed that the same problems happen with macros (and no documentation I’ve read regarding the 2018 edition proved me wrong), so maybe only the build.rs problem exists.


#4

Macros, as in macro_rules!, work perfectly cross-edition. The edition is checked per-span, the same as other hygiene.
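A cross-edition case needs two crates, so it cannot be shown in a single file. As a stand-in, this sketch shows the ordinary local-variable hygiene that the per-span edition tracking is being compared to: tokens keep the context of the place they were written. The macro and function names are made up for the example.

```rust
/// Expands to a block that declares its own `x` before evaluating `$e`.
macro_rules! with_hidden_x {
    ($e:expr) => {{
        let x = 100; // this `x` belongs to the macro definition's context
        let _ = x;   // silence the unused-variable warning
        $e           // an `x` written by the caller still means the caller's `x`
    }};
}

fn caller() -> i32 {
    let x = 1;
    // The macro's own `x = 100` is invisible to the tokens passed in here.
    with_hidden_x!(x + 1)
}
```

`caller()` evaluates to 2, not 101. By analogy, tokens written in a 2015 crate keep being treated as 2015 tokens when a macro pastes them into a 2018 crate.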

The problem with code generation arises when you can’t attach span information. This applies both to proc-macros (which currently only have call-site hygiene, and will be solved when def-site hygiene is usable) and especially to generated-for-include! code, which doesn’t currently have any way to provide span information.

The potential for #[edition = "2018"] { .. } depends entirely on when during parsing/processing it would need to happen. Because the edition is already tracked by span, the post-parse architecture exists; the remaining complexity is in generating the parse tree. Keywords differing between editions makes me fear that this annotation would have to be handled in the parser directly. (But it could be.)

The current position seems to be that it should always be possible to generate code that compiles without warnings in all editions you can support (cc @steveklabnik).
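As an illustration of what "compiles in all editions" can look like in practice, here is a small sketch that avoids edition-specific spellings. It relies on two things that, as far as I know, both the 2015 and 2018 editions accept: the `dyn Trait` syntax and raw identifiers such as `r#async`. The trait and type names are invented for the example.

```rust
trait Greet {
    fn hello(&self) -> String;
}

struct English;

impl Greet for English {
    fn hello(&self) -> String {
        "hello".to_string()
    }
}

// `dyn Trait` is accepted in both editions; a bare `Box<Greet>` would
// draw a lint warning under 2018.
fn boxed() -> Box<dyn Greet> {
    Box::new(English)
}

// `async` is a keyword in 2018 but a plain identifier in 2015.
// The raw identifier `r#async` names the same function in both.
fn r#async() -> &'static str {
    "not a keyword here"
}
```

A generator that always emits these spellings does not need to know which edition its output will be compiled under.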

I think that the ideal end-state solution is compile-to-Rust-TokenStream. If the build.rs steps can generate TokenStreams with correct Span information (including at least edition, and ideally the ability to point into other files), the problem disappears and code (pre) generation gains the superpowers of procedural macros. (This will require some way to serialize TokenStream with Span information if it wants to persist across compiles rather than doing the generation work every time.)


#5

Spans in proc macros 1.2 are “call-site” only for the purpose of name resolution.
All the utility information like edition info, or macro backtrace info is still taken from the def-site.
So, edition hygiene works correctly for proc macros (unless they are unlucky and fall back to pretty-printing or something).


#6

OK, so the problem mostly exists just for the build.rs scenario, which is good.

The compile-to-Rust-TokenStream idea from #4 sounds interesting. But it has two downsides:

  • It’s probably far in the future (this’ll take a lot of work and a long time to stabilize, I guess).
  • It’s a much more heavyweight solution. You just need to know what Rust looks like to spit something out to a .rs file, while for the full version, you need to discover the API, learn quote!… The entry barrier is higher.

Cargo already sets a bunch of environment variables when it runs build.rs. If I wanted to add one for the edition, would that require an RFC, or just a merge request (and discussion there) against the cargo repository?
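For context, this is roughly how a build script uses one of the variables Cargo already sets, `OUT_DIR`, to place generated code. It is a sketch, not prost-build’s actual implementation; the helper names and the file name used below are made up.

```rust
use std::env;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Read `OUT_DIR`, which Cargo sets when it runs a build script.
/// Returns `None` when run outside of Cargo.
fn out_dir_from_env() -> Option<PathBuf> {
    env::var_os("OUT_DIR").map(PathBuf::from)
}

/// Write generated source into `out_dir` and return the file's path.
fn write_generated(out_dir: &Path, file_name: &str, source: &str) -> io::Result<PathBuf> {
    let path = out_dir.join(file_name);
    fs::write(&path, source)?;
    Ok(path)
}
```

The consuming crate then pulls the file in with something like `include!(concat!(env!("OUT_DIR"), "/generated.rs"));`. An edition variable would be read the same way as `OUT_DIR` here.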


#7

An alternative that may allow reaching the same endpoint is for proc-macros to be able to depend on files that are not valid Rust syntax. For example, prost’s current build.rs integration could potentially be done as a proc-macro if it had some way to access and declare its dependency on the .proto files.

That’s likely to be similarly far in the future, so I think having an environment variable telling build.rs what edition its output is going to be compiled under would be useful now. (Though, as mentioned earlier, at least at this point it should be guaranteed that you can generate warning-free code that compiles under both editions, as that’s the mechanism by which rustfix works. Only once there’s a third edition should there be actually incompatible changes.)


#8

I was actually thinking along similar lines. I wonder if it might even work today ‒ if I understand it correctly, function-like macros don’t need valid Rust syntax; they only require the input to be tokenizable into Rust tokens. The chance that a .proto file is tokenizable is quite high, so maybe one could do:


mod proto {
    // prost_compile! is an imagined function-like macro, not an existing prost API.
    prost_compile!(include!("../proto/myproto.proto"));
}

But that indeed is not a general solution.
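To check the claim that proto-like syntax tokenizes as Rust tokens, here is a small sketch using macro_rules! as a stand-in for a real procedural macro. The macro name and the tiny grammar it accepts are made up, and it takes the message definition inline rather than through include! (as far as I know, a macro receives a nested include!(...) as unexpanded tokens, so that part would need separate handling).

```rust
/// Accepts a single proto2-style message and returns its name plus the
/// field names. Only the `label type name = tag;` field form is handled.
macro_rules! message_fields {
    (message $name:ident { $($label:ident $ty:ident $field:ident = $tag:literal;)* }) => {
        (stringify!($name), vec![$(stringify!($field)),*])
    };
}

fn person_schema() -> (&'static str, Vec<&'static str>) {
    // Everything between the parentheses is proto syntax, not Rust syntax,
    // yet it lexes as ordinary Rust identifiers, literals and punctuation.
    message_fields!(
        message Person {
            required string name = 1;
            optional int32 id = 2;
        }
    )
}
```

Here `person_schema()` yields the message name `"Person"` and the field names `"name"` and `"id"`. Real .proto files use more syntax than this, so the sketch only shows that the basic shape gets through the tokenizer.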