Help stabilize a subset of Macros 2.0!

Good morning Rustaceans! Hot off the heels of the Rust All Hands in Berlin last week, we’ve had numerous discussions about the stabilization of Macros 2.0 and what we can have ready for the Rust 2018 release. I’d like to detail what we discussed here, as well as solicit your help in stabilizing a “Macros 1.2”, possibly in the upcoming 1.27 release.

All-hands recap

In Berlin we had a session specifically about Macros 2.0 and what, if anything, we could stabilize for the edition release. While “Macros 1.1” is stable today, it doesn’t include features like:

  • Macros through attributes (aka #[my_proc_macro])
  • Custom bang-style macros (aka my_proc_macro!)
  • Span information, everything is lost today with to_string()
  • Rich diagnostics beyond panic!
  • Hygiene information to avoid name collisions programmatically instead of “hopefully” via __foo

These features have gone through a number of RFCs to date and have seen quite extensive testing throughout the community. In our discussion, however, we specifically ruled out two large features for the Rust 2018 release:

  1. Declarative macros (aka macro macros)
  2. Hygiene

Both of these features were deemed as too high risk to stabilize at this point, but fear not as Macros 2.0 is large and has many other features internally! We came up with a design (which I like to call “Macros 1.2”) which is aimed at stabilizing a subset of Macros 2.0 functionality which is sort of the 80% of what crates use today.

In summary, we concluded that the following subset is possible to stabilize for the Rust 2018 release:

  • Importing macros through the module system
  • Attribute macros, only attached to items but not modules.
  • Custom bang-style macros, but only invoked in module contexts, not as expressions or inside functions.
  • No hygiene information; macros explicitly request “call site” hygiene, aka copy/pasting code.
  • A large chunk of the proc_macro API to enable preserving span information on tokens.

The subset above we concluded was imminently stabilizable with no major open questions (unlike hygiene and declarative macros). And with that, let’s dive into these pieces!

Macros and the module system

The first item on this list is the ability to import procedural and other macros through the module system (aka use) rather than #[macro_use]. This dovetails very nicely with the edition’s attempt to remove the need for extern crate and is in general much nicer to use than #[macro_use] as well!

The tracking issue for this feature has now entered FCP for stabilization and the main takeaways are:

  • You’ll be able to import attributes via use as well as procedural bang-style macros
  • You’ll also be able to import Macros 1.0-style macros (aka macro_rules!)
  • Importing 1.0-style macros is unhygienic, so if the macro invokes another macro you’ll have to import that as well
  • You won’t use use to import macros within your own crate but will instead rely on today’s scoping rules.

While not a perfect stabilization the warts are specifically related to 1.0-style macros which are long-term going to be phased out with macro macros. In the meantime the experience of using 1.0-style macros changes very little and we’re mostly just rationalizing use with the backwards-compatibility of today’s macro system.

Macros and items

We discussed this at the work week and felt that the only position in which we could stabilize invoking a Macros 2.0 macro was the item position, such as an attribute on a function or an invocation inside a module scope. This appeared to cover the majority of use cases for procedural macros.
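To make the positions concrete, here is a rough sketch that uses macro_rules! as a stand-in for procedural macros, since it runs on stable today. The macro names make_answer and double are made up for illustration and are not part of any proposal.

```rust
// Stand-ins for procedural macros; `make_answer!` and `double!` are
// hypothetical names used only for this illustration.

// Expands to an item (a function definition).
macro_rules! make_answer {
    ($name:ident) => {
        fn $name() -> u32 {
            42
        }
    };
}

// Expands to an expression.
macro_rules! double {
    ($e:expr) => {
        $e * 2
    };
}

// Item position at module scope: the case the proposed subset targets.
make_answer!(answer);

fn compute() -> u32 {
    // Still expands to an item, but the invocation sits inside a function
    // body, which the proposed subset leaves out for now.
    make_answer!(local_answer);

    // Expression position: also left out of the proposed subset.
    double!(answer() + local_answer())
}

fn main() {
    println!("{}", compute());
}
```

With macro_rules! all three invocations already work; the restriction discussed here would apply only to procedural bang-style macros.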

Concerns related to hygiene were brought up related to expression macros (or those that could expand to expressions). I’m not personally too privy to these details but @nrc may be able to fill in more if others have questions about this! I think the general gist though is that “copy paste hygiene” is relatively straightforward for items where there’s already not a huge amount of hygiene in Macros 1.0, but expression copy/paste hygiene is trickier.

Lack of hygiene information

We universally agreed that hygiene was a good thing we’d like to stabilize one day. We basically all agreed as well though that it’s really hard to stabilize hygiene and the compiler is definitely not ready to do so. As a result we wanted to stabilize a solution that is forward-compatible with hygiene but still not actually stabilize anything hygienic today.

In rustc hygiene information today runs through Span, where each Span indicates what syntactical context it came from as well as the literal bytes in the original source it corresponds to. This is expressed as well in the proc_macro API with Span having constructors like call_site (copy/paste hygiene) and def_site (not accessible by expanded code hygiene).

Our conclusion here was that we would only stabilize the Span::call_site() function and no others. This way if you ever manufactured a span you’d be explicitly opting-in to copy/paste call-site hygiene. Eventually once we stabilize hygiene it’ll either be through spans or a separate field, and this way we should have avenues for inserting hygiene information, if necessary.

This is also highly related to…

proc_macro API changes

And finally the last piece of stabilization that would be required here would be the proc_macro API itself. @dtolnay, @nrc, and I took a long hard look at the current API with an eye towards stabilization and came up with a reorganized API which is now available in today’s nightly! The crate should provide everything it did before, only in a reorganized and more forward-compatible and ergonomic fashion. The main changes are:

  • The TokenNode enum has been removed and the public fields of TokenTree have also been removed. Instead the TokenTree type is now a public enum (what TokenNode was) and each variant is an opaque struct which internally contains Span information. This makes the various tokens a bit more consistent, require fewer wrappers, and otherwise provides good future-compatibility as opaque structs are easy to modify later on.

  • Literal integer constructors have been expanded to be unambiguous as to what they’re doing and also allow for more future flexibility. Previously constructors like Literal::float and Literal::integer were used to create unsuffixed literals and the concrete methods like Literal::i32 would create a suffixed token. This wasn’t immediately clear to all users (the suffixed/unsuffixed aspect) and having one constructor for unsuffixed literals required us to pick a largest type which may not always be true. To fix these issues all constructors are now of the form Literal::i32_unsuffixed or Literal::i32_suffixed (for all integral types). This should allow future compatibility as well as being immediately clear what’s suffixed and what isn’t.

  • Each variant of TokenTree internally contains a Span which can also be configured via set_span. For example Literal and Term now both internally contain a Span rather than having it stored in an auxiliary location.

  • Constructors of all tokens are called new now (aka Term::intern is gone) and most do not take spans. Manufactured tokens typically don’t have a fresh span to go with them and the span is purely used for error-reporting except the span for Term, which currently affects hygiene. The default spans for all these constructed tokens is Span::call_site() for now.

    The Term type’s constructor explicitly requires passing in a Span to provide future-proofing against possible hygiene changes. It’s intended that a first pass of stabilization will likely only stabilize Span::call_site() which is an explicit opt-in for “I would like no hygiene here please”. The intention here is to make this explicit in procedural macros to be forwards-compatible with a hygiene-specifying solution.
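For readers who have not followed the API churn, here is a toy model of the shape described in the list above. It is only a sketch: these are stand-alone types that imitate the design, not the real proc_macro crate, and details such as the Span(u32) representation, the field names, and the render helper are assumptions made for illustration.

```rust
// A toy model only: these types imitate the *shape* described above and are
// NOT the real `proc_macro` types. Field layouts and `Span(u32)` are assumptions.

/// Opaque span; only `call_site` can be constructed, mirroring the planned subset.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Span(u32);

impl Span {
    pub fn call_site() -> Span {
        Span(0)
    }
}

/// Opaque identifier-like token: the constructor requires an explicit span.
pub struct Term {
    name: String,
    span: Span,
}

impl Term {
    pub fn new(name: &str, span: Span) -> Term {
        Term { name: name.to_string(), span }
    }
    pub fn span(&self) -> Span {
        self.span
    }
    pub fn set_span(&mut self, span: Span) {
        self.span = span;
    }
}

/// Opaque literal token: the span is stored inside and defaults to the call site.
pub struct Literal {
    text: String,
    span: Span,
}

impl Literal {
    pub fn i32_suffixed(n: i32) -> Literal {
        Literal { text: format!("{}i32", n), span: Span::call_site() }
    }
    pub fn i32_unsuffixed(n: i32) -> Literal {
        Literal { text: n.to_string(), span: Span::call_site() }
    }
    pub fn span(&self) -> Span {
        self.span
    }
}

/// Public enum whose variants are opaque structs (the role `TokenNode` used to play).
pub enum TokenTree {
    Term(Term),
    Literal(Literal),
}

/// Hypothetical helper that prints a token's source text.
fn render(tt: &TokenTree) -> String {
    match tt {
        TokenTree::Term(t) => t.name.clone(),
        TokenTree::Literal(l) => l.text.clone(),
    }
}

fn main() {
    let mut term = Term::new("x", Span::call_site());
    term.set_span(Span::call_site()); // spans stay adjustable after construction
    let lit = Literal::i32_suffixed(1);
    assert_eq!(lit.span(), term.span());
    let tokens = vec![
        TokenTree::Term(term),
        TokenTree::Literal(lit),
        TokenTree::Literal(Literal::i32_unsuffixed(1)),
    ];
    for tt in &tokens {
        println!("{}", render(tt));
    }
}
```

The point of the opaque structs is that fields can be added or changed later without breaking users, while the public enum still allows ordinary pattern matching.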

These changes are all reflected in proc-macro2, syn, and nightly itself as of today.


Final push for stabilization

My hope is that this is the final cut of the proc_macro API before stabilization. If all goes well I’d like to FCP this subset of macros 2.0 towards the end of the current nightly cycle (1.27) and then stabilize for the 1.27 release. If discussions show that we’re not quite ready for stabilization then we will definitely switch to a different schedule.

I think that this subset is going to enable quite a few use cases seen in the wild today to work on stable Rust while also providing a solid base for us to extend and grow with hygiene/diagnostic/etc information in the future. In the meantime though we need your help in testing out the new APIs. If you’ve got a procedural macro crate or custom derive, consider updating to syn 0.13 or proc-macro2 0.3 and give us your feedback! We’re particularly interested in bugs and ergonomic issues with the current API.

Here’s to hopefully stabilizing Macros 1.2 in Rust 1.27.0!

If possible, it would be great for info on how some of these things work to go into the rustc guide. The chapters on macro expansion are currently a bit lacking (mostly because I wrote them :stuck_out_tongue: )…

Can you clarify about the hygiene stuff?

  • What exactly does it mean to “stabilize hygiene”? It seems like a fairly nebulous goal.
  • What are “copy-paste hygiene” and “call-site hygiene”? Is this the hygiene that macro_rules experiences today or different?

Moreover, can you elaborate on the concerns that make it more palatable to stabilize proc macros in module-level item position than in any other position? It seems a bit problematic to me if some macros can only be used at module scope. For example, if let’s say something like lazy_static! became a proc macro then I couldn’t use it to declare a function-local static, which is a break from the way other kinds of items work, and it means a user has to care what “kind” of macro it is, which is weird.

For another example, it’s my understanding that std::assert! is becoming a proc macro due to RFC 2011 so if you can’t invoke that as an expression that seems… uh… bad. :slight_smile: And if the answer is that compiler-internal proc macros will be special (I mean, they already are, obviously, like format! and stringify!), I’d just say that it still shows there is clearly a use-case for function-like proc macros.

This is exciting! At a high level, what amount of rocket’s proc macro usage would this allow to work on stable?

Oh, another random thing that may figure into proc macro APIs: what about the general desire for “collector” macros, that is, some kind of facility for aggregating info from annotated items and feeding that into another macro (see pre-rfc, and RFC 2318 proposes a restricted step in the same direction). This can be useful for things like test harnesses to grab all the tests, or web frameworks to grab all the routes, etc. You can currently hack this in with global variables in the proc macro crate and careful ordering of macro invocations, but that’s obviously brittle and doesn’t work at all in the presence of incremental compilation. Is there a solution for this use-case on the horizon?

The nebulousness is one of the reasons as to why we're not stabilizing it!

Sorry, what I mean in those two phrases is the same thing, and it's the same as macro_rules, sort of, except that there basically isn't any hygiene. It's as if the exact tokens produced are simply copy/pasted into the module, name conflicts and all.

I think @nrc or @nikomatsakis may be able to help out with this. Remember though that this is a slice of stabilizations. We will continue to stabilize more features over time, such as macros in expression positions. Just because they may not be stable in this first pass doesn't mean we're removing support for them.

Well, certainly expanding to expressions means that the lack of hygiene becomes much more visible as a concern.

That's a good point. We might be able to accommodate expanding to items within a fn (as opposed to expanding to an expression) without undue concern over hygiene, since they don't share name resolution scopes.

Wow! I was not expecting this.

Conveniently, item macros also happen to be the case where proc-macro-hack falls particularly flat. After what I was working on last week, this is more or less a dream come true. I am off to go make a guinea pig of myself!

I spoke with Sergio about this, and it’s a lot of it. He still needs some of the stuff not planned here, but he also might be willing to champion that work, which is what’d be needed to get some of it to move forward.

Hmm, okay that certainly seems like a step backwards since we often talk about macro_rules hygiene (imperfect though it is) as avoiding one of the mistakes of C macros. But yeah that makes sense why we don't want to have expression macros yet!

I get that but I worry about trumpeting "Macros 1.2" with so many caveats (you'll get name resolution conflicts, you can't use them in functions, concat_idents! still doesn't work, etc).

The fact that parts of the design space (e.g. hygiene) are still nebulous while we are stabilizing other parts is a bit worrying. While everything is unstable, we can play around to our hearts’ content, but once we stabilize something, remaining design choices are forced to work around it.

Personally, I see hygiene as a major part of macros in rust, and I feel like we should be really reluctant to release macros 1.2 without it. As much as I want macros 2.0 to be available, I also want it to be a clean, polished, and mostly-complete feature set when it is released, as I’m sure all of us do (especially those who worked so hard to build it :slight_smile: ). This all leads to the next thing:

IMHO, there are a couple things that make it hard to contribute to macros currently:

  • Things seem to have kind of bypassed the normal RFC process. The original Macros 2.0 RFC was probably the vaguest RFC I’ve ever seen. It basically says “we will do a macros 2.0 of some sort” and that’s all. I am not really aware of any public discussion of the design issues and the design space.
    • In fact, I’m not even aware of what the design issues are…
    • As a result, I am not aware of any list of unresolved questions, design points that were rejected and why, etc…
    • So it’s hard to know what exactly needs doing/polishing…
  • Documentation is sparse. There are only a handful of people TMK who actually know how macro expansion works in the compiler and fewer still who know how hygiene, proc-macros, custom-derive, and other magical systems really work and fit together.

Personally, I would really like to see a full-blown macros 2.0 RFC which details exactly what features macros 2.0 has, what changes to the compiler are necessary (and of course, as I mentioned before, it would be great to get content in the guide about how macros work under the hood), and what the stages of stabilization should be.

I’m sorry if this comes off as a rant. I really do appreciate all of the work that has gone into making this work, but I feel like there is an unusual lack of transparency and clarity around what exactly macros 2.0 should contain, which naturally makes me feel nervous about any macros 1.2…

Will this change have a negative effect on lints?

Currently, Clippy uses spans / hygiene information to decide whether to analyze a particular expression. But if the call-site span is used here, then I'm concerned that Clippy might lint generated code as if the user wrote it themselves.

I don't have a convenient way to check this right now, but I want to be sure that this isn't a problem before stabilization!

I agree with those saying it sounds a bit scary to stabilize something without hygiene. In particular, the bit about invoking another macro and having to import that as well sounds a bit scary and fragile.

Also will be very good if Rocket can finally join stable!

I suppose macros 1.2 will also do away with the restriction about only allowing a single proc_macro from a crate? (I forget what the exact restriction is, but it was always this weird unergonomic paper cut.)

I upgraded Askama to syn-0.13.0 already last week, was a trivial change, so all good on that front.

I’ll try upgrading rental to syn 0.13 soon. Hopefully it goes more smoothly than the upgrade from 0.11 to 0.12, which was pretty harsh.

In any case, with this I’ll probably be able to remove procedural-masquerade as a dependency, which will be nice, and hopefully significantly improve error messages as well.

It copy pastes the tokens not the characters so it is still better than C.

FWIW, C also copies and pastes tokens and not characters.

Hmm... doesn't rustc actually copy/paste token trees? That means you cannot mess up precedence, as you can in C. For example,

C:

#define MULTWO(a) (a * 2)

MULTWO(4 + 1) // `6`
MULTWO((4 + 1)) // `10`

rust:

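macro_rules! mul_two {
    ($a:expr) => { $a * 2 };
}

fn main() {
    // `$a` is substituted as one expression node, so both calls compute `(4 + 1) * 2`.
    assert_eq!(mul_two!(4 + 1), 10); // always `10`
    assert_eq!(mul_two!((4 + 1)), 10); // always `10`
}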

I already expressed my concern about stabilising the use-based macro importing in GitHub ( https://github.com/rust-lang/rust/issues/35896#issuecomment-378495910 ), but I want to rehash it here.

With the new use-based import, one is able to selectively import just one macro, which is great. However, because the macro_rules! macros don’t have hygiene with regard to other macros, all the macros the imported macro expands to also need to be in the caller’s scope. Expanding to other macros that are considered implementation details is actually quite a common thing to do, so I’m concerned about the ergonomics of the use-based importing, especially if #[macro_use] is going to fade away together with the extern crate syntax.

An example: To use slog's info macro, I imported it like this: use slog::info;, but after that, I also had to import these macros for it to work: use slog::{o, kv, log, record, b, record_static};. This makes the new syntax, as nice as it is, actually ickier to use than the old one, which is something that needs to be considered if we actually want people to move to using the new syntax.

I was told that as this is how macro_rules! works, there’s not much we can do about it. However, I keep thinking that wouldn’t it be possible to keep backwards compatibility while enabling a kind of plan b:

If a macro_rules! macro is imported using the new use mechanism AND the normal expansion fails with a “cannot find macro foo! in this scope” error, the compiler tries to expand it again AS IF the macros from the crate the expanding macro originates from were in scope. (That is, as if there had been a #[macro_use] attribute.)

Why this is good:

  • It behaves like it used to be when using the old import mechanism; new functionality is only enabled with the new use style importing.
  • It behaves like it used to be when the macro expansion succeeds, so it’s backwards compatible with code that compiles at the moment.
  • It doesn’t import all of the macros wholesale like #[macro_use] does, as they are considered only inside macros, when the normal expansion fails.
  • It enables the use of the new kind of syntax, without the use slog::{o, kv, log, record, b, record_static}; kind of ergonomics downsides.
  • These rules would apply only for macro_rules! macros, so in the future, macro macros could, and should, be actually hygienic.

Downsides:

  • There might be macros that expect some other macros to be imported from other crates by the users. This scheme doesn’t magically help in this case. (The user still needs to import the other macros so nothing changes, though.) I don’t expect this to be a common case, though.
  • It doesn’t make macro_rules! macros actually hygienic; that’s something that can hardly be changed backwards compatibly. It just makes importing them easier and cleaner.
  • I’m unaware of how much churn in the compiler this would require.

I think @nrc or @nikomatsakis may be able to help out with this. Remember though that this is a slice of stabilizations. We will continue to stabilize more features over time, such as macros in expression positions. Just because they may not be stable in this first pass doesn’t mean we’re removing support for them.

This is a purely practical decision, there is no theoretical distinction. In practice, not having hygiene at the statement level is terrible and can lead to many subtle bugs (as C macros have taught us). However, at item level, we think there is much less opportunity for hygiene bugs because names are typically intended to be used at the use-site, for example consider two example macros:

macro foo($y: expr) {
    let x = 42;
    x + $y
}

macro bar() {
    fn x() -> u8 {
        let z = 42;
        z
    }
}

foo is intended to be used at statement level; x is an implementation detail of the macro and should not be nameable where foo is used (and indeed hygiene ensures this). bar is intended to be used at item level, and in this case x is intended for use where bar is used (i.e., other functions in the same module as the use of bar should be able to call x). In this case z is an implementation detail, but the usual scoping rules prevent it being named.
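Since the macro syntax above is not available on stable, here is a rough analogue written with macro_rules!, which is hygienic for local variables but not for items. Treat it as a sketch of the distinction rather than of how Macros 1.2 itself behaves; the statement_level function is a name invented for the example.

```rust
// `foo!` and `bar!` mirror the two macros above, rewritten with `macro_rules!`.

macro_rules! foo {
    ($y:expr) => {{
        let x = 42; // implementation detail: not nameable by the caller
        x + $y
    }};
}

macro_rules! bar {
    () => {
        fn x() -> u8 {
            let z = 42; // hidden by the ordinary scoping rules
            z
        }
    };
}

// Item-level use: `x` is meant to be callable from this module.
bar!();

// Hypothetical caller showing the statement-level case.
fn statement_level() -> i32 {
    let x = 1;
    // The caller's `x` (1) and the macro's `x` (42) are distinct bindings.
    foo!(x)
}

fn main() {
    println!("{} {}", statement_level(), x());
}
```

Without hygiene for locals, foo!(x) would compute 42 + 42; that is the kind of statement-level bug that item-level expansion mostly avoids.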

Obviously there are more complex examples where things can go wrong, but we believe that these can be corrected in future versions of macros backwards compatibly.

Note that it is not really about whether a macro is used in a module or a function, but whether a macro expands to an item or a statement, so expanding a macro to a function or static inside a function should be supported under macros 1.2. However, it is not clear to me if that distinction is actually possible to support.
