[Pre-RFC] Extend "macros 1.1" to support `foo!` style macros


#1

I like the simple but forwards-compatible design which has been put forth and accepted in “macros 1.1” in order to support custom derive macros in a stabilize-able way, but I also want to be able to use the same tools in order to implement procedural foo! style macros. This is a pre-RFC for a (I think) small addition to “macros 1.1” which adds support for these “bang” macros.

I’d love feedback as to why this is a bad idea, or ideas for improvements. I don’t plan to propose this as a PR or expect that this will be accepted until at least after unstable “macros 1.1” support lands in rustc, but that seems to be happening soon, as there is already a PR open with a basic implementation.

Summary

Extend the subset of “macros 2.0” (A.K.A. “macros 1.1”) defined in RFC #1681 to also support traditional-style foo! macros, while maintaining a small enough additional surface area to ensure that it does not pose a future maintenance burden on the compiler.

Motivation

Currently, traditional foo! style macros are supported through the macro_rules! system. This uses a pattern matching style to allow defining custom macros in user defined code. Unfortunately, some types of macros are impossible to support with this style, as they require the ability to run procedural code. In addition, many complex macros can easily cause the macro recursion limit to be exceeded while parsing, and slow down compilation, due to their complexity.

As with “macros 1.1”, this RFC does not aim to architecturally improve on the current procedural macro system. Namely, this system will not support hygiene. The goal with this RFC is to allow for parallel experimentation with custom derive and procedural macros built on top of the TokenStream system which was originally proposed in “macros 1.1”, and will be extended over time to bring it closer to “macros 2.0”.

This feature would likely be stabilized separately from custom derive, as the need for it in the ecosystem is less serious. However, it will provide a platform for experimentation, and ensure that the TokenStream APIs which are developed also work well for procedural macro writers.

A stabilized version of procedural macros, even one as limited as this one, would enable new types of macros which could not previously be defined in stable user code. For example, a user library could add utf16!() to define UTF-16 string literals (which are useful for efficient C code interop with existing code, such as Gecko, which uses UTF-16 extensively rather than UTF-8), or define new macros like format_args!() and regex!() which parse string arguments, and use them to generate code.
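To make the utf16!() idea a bit more concrete, here is a rough sketch of the string-to-string step such a macro might perform. The name utf16_impl is hypothetical and is not part of any proposed API; it only shows the kind of transformation that would sit between input.to_string() and source.parse() in a macro function. It assumes the input is a single plain string literal and deliberately rejects escape sequences.

```rust
/// Hypothetical helper (not part of the proposal): turns the source text of a
/// string literal, e.g. `"hi"`, into the source text of an array of `u16`
/// code units. Escape sequences are not handled in this sketch.
fn utf16_impl(source: &str) -> Result<String, String> {
    let trimmed = source.trim();

    // Strip the surrounding double quotes, or report what was found instead.
    let inner = trimmed
        .strip_prefix('"')
        .and_then(|rest| rest.strip_suffix('"'))
        .ok_or_else(|| format!("expected a string literal, found `{}`", trimmed))?;

    // A real implementation would need to decode escapes; this one refuses.
    if inner.contains('\\') {
        return Err("escape sequences are not supported in this sketch".to_string());
    }

    // Encode as UTF-16 and print each code unit as a suffixed integer literal.
    let units: Vec<String> = inner
        .encode_utf16()
        .map(|unit| format!("{}u16", unit))
        .collect();

    Ok(format!("[{}]", units.join(", ")))
}
```

Under these assumptions, utf16!("hi") would be replaced by the expression [104u16, 105u16].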

Detailed design

Like custom derive macros defined in “macros 1.1”, these procedural macros will be defined as functions which have the signature:

fn(TokenStream) -> TokenStream

and are annotated with the attribute #[rustc_macro_bang(foo)]. For example, the following is an implementation of a procedural macro definition for foo!:

#![crate_type = "rustc-macro"]
#![crate_name = "foo"]

extern crate rustc_macro;

use rustc_macro::TokenStream;

#[rustc_macro_bang(foo)]
pub fn foo(input: TokenStream) -> TokenStream {
    let source = input.to_string();

    // Parse `source`, and build up new source code which should replace the 
    // macro in the resulting program.
    let source = foo_impl(&source);

    // Parse this back to a token stream and return it
    source.parse().unwrap()
}

Like in “macros 1.1”, this attribute may only occur within “rustc-macro” crates, and can be imported from the “rustc-macro” crate with #[macro_use].

It can be used within a consumer crate as follows:

#[macro_use]
extern crate foo;

// ...

foo!(...);
foo![...];
foo!{...}

The TokenStream which is passed to the #[rustc_macro_bang(foo)] function may or may not contain the enclosing [], {}, or () of the invocation site (See Unresolved Questions).
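If the enclosing delimiters were included in the input, a macro could tell the three invocation forms apart. The following is only a sketch under that assumption: Delimiter and split_delimiters are hypothetical names, and the helper works on the stringified input rather than on any real TokenStream API.

```rust
/// The three invocation forms: `foo!(...)`, `foo![...]` and `foo!{...}`.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Delimiter {
    Paren,
    Bracket,
    Brace,
}

/// Hypothetical helper: assuming the stringified input still carries its
/// enclosing delimiters, split it into the delimiter kind and the inner
/// argument text. Returns `None` if no matching pair of delimiters is found.
fn split_delimiters(source: &str) -> Option<(Delimiter, &str)> {
    let trimmed = source.trim();

    // Pick the expected closing delimiter from the opening one.
    let (kind, close) = match trimmed.chars().next()? {
        '(' => (Delimiter::Paren, ')'),
        '[' => (Delimiter::Bracket, ']'),
        '{' => (Delimiter::Brace, '}'),
        _ => return None,
    };

    // The opening delimiter is a single ASCII byte, so slicing at 1 is safe.
    let inner = trimmed[1..].strip_suffix(close)?;
    Some((kind, inner.trim()))
}
```

If the delimiters are stripped before the function is called, a helper like this has nothing to look at, and all three forms behave identically, as they do for macro_rules! today.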

Drawbacks

This adds a small amount of complexity to this temporary “macros 1.1” system. Depending on how far off this system is from the future “macros 2.0”, this may be undesirable. If this feature is stabilized, it will mean that it must be supported into the future.

This macro system also doesn’t support hygiene or other features which we would hopefully want to be able to support in macros in the future. Unlike custom derive macros, foo!-style macros often define variables which could potentially conflict with other names in the environment. Macro writers using this system will have to be careful not to create conflicts.
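As an illustration of the kind of conflict meant here, consider a hypothetical swap!(a, b) macro that generates its output as a string. The expand_swap function below is an assumption made for this example only; it shows how a temporary introduced by the macro can collide with a name passed in by the caller, and how choosing an unlikely name reduces (but does not eliminate) the risk.

```rust
/// Hypothetical expansion for `swap!(a, b)`: builds the replacement source
/// text, using `tmp_name` as the name of the macro's own temporary variable.
fn expand_swap(a: &str, b: &str, tmp_name: &str) -> String {
    format!(
        "{{ let {tmp} = {a}; {a} = {b}; {b} = {tmp}; }}",
        tmp = tmp_name,
        a = a,
        b = b
    )
}

/// Naive choice: the temporary is simply called `tmp`.
fn expand_swap_naive(a: &str, b: &str) -> String {
    expand_swap(a, b, "tmp")
}

/// Mitigation by convention only: an unlikely name, with no real guarantee.
fn expand_swap_careful(a: &str, b: &str) -> String {
    expand_swap(a, b, "__swap_tmp")
}
```

With the naive version, swap!(tmp, x) expands to { let tmp = tmp; tmp = x; x = tmp; }, where the macro's temporary shadows the caller's tmp, so the assignments no longer refer to the caller's variable. The careful version produces { let __swap_tmp = tmp; tmp = x; x = __swap_tmp; }, which only works as long as nobody else uses that name.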

Alternatives

The main alternative to implementing this, or something like it that exposes the procedural TokenStream transformer to foo! style macros as well as custom derive, is to not implement it. That forgoes the advantages described in the Motivation section, but adds no extra complexity to our existing macro system.

Unresolved questions

  • Should the TokenStream which is passed to the macro contain the (), [], or {} which enclose the parameters to the macro? If it does not, then the macro will be unable to distinguish between the different formats, like an existing macro_rules! macro. Is this desirable?

  • All of the unresolved questions related to “macros 1.1” also apply to this RFC. The answers which would be used for “macros 1.1” will likely also apply to this system unless there is a good reason to act otherwise.

PS. This is one of my first RFCs - so I probably messed some stuff up. I’d love feedback :slight_smile:


#2

This is definitely the long-term goal (supporting foo! macros) – once the “Macros 1.1” approach has been proven out, I’d be in favor of extending support beyond derive-style macros. One question mark for me is how we are going to handle macro naming – I’d like to be moving towards the “new style” of importing macro identifiers like any other identifier (also for derive-style macros, actually). But I agree that the underlying “technology” of invoking a foo! macro is the same as a derive-style macro, and it’s natural to think of adding support for those.


#3

The main reason why I brought up this RFC is that it seems to me like derive style macros are so similar to foo! style macros, that it would make sense to develop them (in unstable rust) in parallel, so that we can make sure that things which make sense for derive style macros also make sense for foo! style macros. This would have, from my understanding, a fairly small maintenance burden, and allows another set of users currently stuck on the very-unstable plugin system to move to this less-unstable system.

With regard to naming, and the old vs new style of importing macro identifiers, I think that the ship has sailed with regard to maintaining the old style #[macro_use] system. Given that we will already have to support this style of macro imports forever due to macro_rules!, and also derive style macros if those are stabilized, it seems like a small cost to also use it for these early stage foo! style macros. I would also love the new style macro imports, but those seem further off. If we get something like that in tree before we get derive style macros stabilized, I would say that we should never stabilize this old system, but if we already need everything for derive style macros, it seems to me like we should just handle foo! style macros too.


#4

I think the intention is to deprecate both macro_rules! and 1.1 derive macros once the 2.0 system has been implemented and stabilized. Though Rust will have to support these systems as they exist to avoid breakages, it doesn’t have to support them in the sense that new features for macros wouldn’t have to consider them. Obviously ideally the upgrade path to 2.0 shouldn’t be too dire.

I think derive is a particular situation where whole use case categories of Rust were stuck dealing with very serious churn because serde was based on compiler internals. I don’t know of any similar case for function style procedural macros, and I’d like to avoid expanding the scope of features that are stable but planned for obsolescence.


#5

There’s clearly no need for this, we can just wrap the entirety of each source file in a dummy struct declaration with a “derive” plugin that does whatever processing is required. :wink:


#6

I am aware that we intend to deprecate this feature in the future, but the 2.0 system seems far out, and it seems strange to avoid implementing a foo! system until then if all of the tools are already available and destined for stabilization.


#7

I’m aware that we can do some pretty awful hacks with macro_rules! and custom derive plugins, but I would rather have a way to do this type of procedural macro writing without resorting to hacks like that.


#8

I’m strongly opposed to this, at least in the short-term. We should put all of our energies into macros 2.0 to get a properly working procedural macro system, rather than trying to expand the 1.1 approach.

So, the main reason for this is hygiene. Hygiene is basically unimportant for custom derive; in fact you nearly always want to be manipulating the hygiene to be identical to (or at least very close to) the unhygienic result. For function-like macros, hygiene is really important. IMO we should never add any kind of unhygienic macro to Rust (either function-like or attribute-like); they are way, way too dangerous. It is a happy coincidence that custom derive works pretty well without hygiene and that it is the most urgent thing to address.

The reason we are doing macros 1.1 at all is that there is very strong motivation - in particular the de facto standard way of doing serialisation in Rust requires custom derive (I personally feel very bad that we ended up in this situation, but it’s too late for that now). So I’m all for the macros 1.1 approach here, in particular because there is no real downside. With function-like macros there is a downside and there is not a strong motivation, so I’d rather not.

There is also the adjustment to consider: we already have an uphill battle ahead to persuade people to move to macros 2.0 when it comes, and having a stable alternative will only make that harder.


#9

Hmm. What you’re saying makes sense, for sure. What does it mean more concretely? It seems like it means hacking on the rustc_macro library so that we can generate tokens that carry scope information? What other changes are you envisioning?


#10

A few minor things, but the big thing is developing hygiene - implementing sets of scopes and the support library functionality in rustc_macro. Also adding functionality in general to rustc_macro to allow direct manipulation of tokens rather than using strings (a lot of this is there already thanks to cswords), and getting experience with libs like Aster using these facilities.


#11

What you’re saying makes some sense, and perhaps we want to hold off on moving foo! style macros onto a stabilization path. I imagine that it might be useful to have something like this in nightly as an unstable testing ground for what a better TokenStream API which preserves hygiene would look like. Having a more-stable-than-plugins-but-still-unstable testing ground with a function signature similar to what we want for macros 2.0, even if it never gets stabilized in favor of the finished macros 2.0 solution, might be nice. It also provides a transition path for crates which need procedural macros to get off of plugins and into a system which will be easier to move to macros 2.0.

If that is the case though, we probably don’t want to bother landing this until we get an RFC accepted with a first pass on hygiene for procedural Rust macros.


#12

For the record I found a trick that makes it possible to use foo!-style macros with macros 1.1: https://github.com/tomaka/vulkano/issues/256


#13

I also have a prototype of a version of rust cpp which uses the same trick. I was mostly wanting a nicer way to do it without having to have so many crates (rust cpp would need the codegen pass, the macro by example and the custom derive crate).