Discussion: Adding grammar information to Procedural Macros for proper custom syntax support in the toolchain

Actually, it's already the case that a macro invoked as macro!() will have its arguments formatted if they look like fully syntactically valid Rust code. (Though the exact rules as to what syntax is allowed can be a little opaque.) Invocations as macro!{} get left as-is. I don't see/use macro![] enough to determine what heuristics it uses for when to format its arguments.

I've been bitten by rustfmt deciding to remove a for<T> from a macro invocation for some reason before. It quietly just works most of the time, and it's only really noticeable when things go wrong or suddenly change.

Try running rustfmt on this example.

Input:

macro_rules! m {($($t:tt)*) => {}}

m!(
    fn f ( ) -> i32 { 0 }
);

m!{
    fn f ( ) -> i32 { 0 }
}

m![
    fn f ( ) -> i32 { 0 }
];

Output:

macro_rules! m {
    ($($t:tt)*) => {};
}

m!(
    fn f() -> i32 {
        0
    }
);

m! {
    fn f ( ) -> i32 { 0 }
}

m![
    fn f() -> i32 {
        0
    }
];

For Slint, I have developed an LSP server. Some editors such as vscode support having several language servers for the same file. So rust-analyzer takes care of everything, while slint-lsp takes care of what is in the slint! macro.

Oh, interesting... thanks for clarifying that for me.

I'd somehow convinced myself that it was the case that eprintln! wasn't formatted correctly, but others are, but now that I'm actually testing that hypothesis I see that I was wrong.

... I guess I'll have to keep a closer eye on when rustfmt doesn't do anything. I didn't think we used brace-style macros in our code that much, but maybe we do more than I realize.

For a small (or even a complex, self-contained) DSL such as JSON or Slint, this approach of using a separate formatter makes a great deal of sense. For something like impl_scope! (essentially just Rust with a couple of tweaks), it doesn't.

This approach would be simpler and likely be sufficient for impl_scope! (some macro input might be rejected by a strict Rust parser, but most would not be).

OK, circling back: the use case where I ran into this was with cfg_if, which is supposed to look like real Rust code, but isn't actually. That's why I thought it was an issue with rustfmt, as opposed to "hard technical problem to get right".

So in this case, the cfg_if crate would somehow need to communicate to rustfmt what inside a macro is considered a "normal" AST, which is... well, hard.

cfg_if::cfg_if! {
    if #[cfg(target_arch = "wasm32")] {

        // stuff in here will be ignored by rustfmt
        // because it's inside a braced macro

        let loader = super::wasm::load_aws_config().foo().bar();

        loader
    } else {
        aws_config::defaults(BehaviorVersion::latest())
    }
}

Isn't if cfg!(target_arch = "wasm32") the way to handle this rather than a macro to allow if #[cfg(…)]? Maybe I'm missing the use case for cfg_if here…

Looked at the docs…seems that it also supports having function/method definitions inside of the blocks, so it works at top-level, trait, and impl "scopes" rather than just inside of functions.

if cfg!() requires all the branches to be compilable at the current target, cfg_if!() doesn’t.
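
To make the difference concrete, here is a minimal sketch that does not depend on the cfg_if crate. The function names (wasm_only, backend, backend_checked) are made up for illustration. The #[cfg] attributes stand in for what cfg_if!() ultimately produces: the item for the non-matching target is removed before it is checked, so it may refer to things that do not exist on the current target. The cfg!() version is an ordinary if on a boolean, so both branches have to compile everywhere.

```rust
// Hypothetical item that exists only when compiling for wasm32.
#[cfg(target_arch = "wasm32")]
fn wasm_only() -> &'static str {
    "wasm32"
}

// `#[cfg]` removes the non-matching item entirely, so this version may call
// `wasm_only()` even though that function is missing on every other target.
#[cfg(target_arch = "wasm32")]
fn backend() -> &'static str {
    wasm_only()
}

#[cfg(not(target_arch = "wasm32"))]
fn backend() -> &'static str {
    "native"
}

// `cfg!` expands to a plain `true` or `false`, so both branches are still
// compiled on every target. Calling `wasm_only()` in the first branch would
// be a compile error when building for anything other than wasm32.
fn backend_checked() -> &'static str {
    if cfg!(target_arch = "wasm32") {
        "wasm32"
    } else {
        "native"
    }
}
```

Both functions return the same string on any given target; the difference is only in what each one is allowed to reference.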

cfg_if!() could use less weird syntax though, like

cfg_if! {
    if cfg(target_arch = "wasm32") { ... }
}

or even

cfg_if! {
    if target_arch == "wasm32" { ... }
}

The reason cfg_if doesn't get formatted is that rustfmt by design doesn't do macro expansion, name resolution, or anything else that requires looking beyond a single file. If rustfmt were able to perform macro expansion, it should not be too hard for it to figure out how to format the code inside the if's (though not the if condition). There is no need to add actual grammar information support to proc macros for cfg_if and other macros that copy-paste the part that should be formatted verbatim to the output.
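
As a rough illustration of "copy-paste verbatim", here is a toy single-branch macro in the spirit of cfg_if. The name when! and its exact pattern are assumptions made for this sketch, not how the cfg_if crate is actually implemented. The point is that the items in the body reach the output unchanged, with only a #[cfg] attribute put in front of them, which is why an expansion-aware formatter could in principle treat the body as ordinary Rust.

```rust
// Toy macro, hypothetical: accepts `if #[cfg(...)] { items }` and emits each
// item unchanged, with the cfg predicate attached as an attribute.
macro_rules! when {
    (if #[cfg($meta:meta)] { $($body:item)* }) => {
        $( #[cfg($meta)] $body )*
    };
}

when! {
    // `all()` with no arguments is always true, so this item is always kept.
    if #[cfg(all())] {
        fn answer() -> i32 { 42 }
    }
}
```

The body of when! above is deliberately left in the unformatted one-line style that rustfmt would leave alone inside a braced invocation.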

There could be some sort of conventional syntactic indication that the code inside of a macro invocation is considered to be normal Rust code (like maybe macro! {{ }} (double braces)) or something. It would enable formatting at the cost of slight syntax weirdness while keeping rustfmt simple and separated from the compiler.

Using () parentheses instead of {} does exactly that.

TIL. The formatting for a cfg_if!()-like use case is a bit awkward though:

cfg_if!(if cfg(target_arch = "wasm32") {
    do_something();
} else {
    do_something_else();
});

Going to comment here rather than clutter up #8. Thanks for starting (continuing?) this discussion.

serde_json::json!, tokio::select!, sqlx::query!, leptos::view! are all extremely common macros within their respective domains. I'd classify tokio::select! in particular as being quasi-syntactical, as in practice it's almost guaranteed that some version of it is going to end up used in any non-trivial async Rust codebase. The reason I bring this up again is because I suspect the impact of the pain of not having proper formatting support for these is greatly underestimated and we need more concrete data on the subject so we can motivate rustfmt devs to prioritise the issue.

I also think there's a thread here to pull at around the precedent of dioxus-cli, leptosfmt etc. being standalone tools. In particular leptosfmt being a drop-in replacement for rustfmt is quite interesting. It seems to indicate a different path to take towards sustainably integrating custom macros into the formatting ecosystem. Instead of trying to modify core tools like rustfmt and rust-analyzer directly, they could introduce an extensions system that macro developers could use to provide third-party plugins, similar to what cargo has. Of course, that adds its own complexity, but it's bounded complexity, while the use cases it enables (even outside of formatting) are unbounded.

Clippy suffers from a very similar lack of extensibility, which has also motivated drop-in alternatives like Dylint.

Does anyone foresee any particularly difficult challenges with a plugin approach?

The most challenging part will probably be agreeing on a stable API surface for handling the plugins. And the lack of name resolution can still cause surprises. But having plugins to handle nonstandard macro formatting is certainly more doable than teaching rustfmt about "common enough" macro names (like how vec! is handled).

A secondary thing is that "silently" running new binaries during compilation has security implications. Say you've done cargo install cool-cli. It's fine, and you've even used cool for a month or so. Now you upgrade your Rust toolchain, and surprise, you also installed a second binary, rustfmt-vec, which now gets run during compilation without you ever having known about it, because cool-cli was compromised to also distribute that formerly silent payload.

So some sort of opt-in is probably desired in order to mitigate that, even if only for current editions. It may be annoying, but one option that would also mitigate surprises around textual activation is to list all active fmt addons in rustfmt.toml, keyed by the macro name that they're activated for. So something like:

[macros]
json = "rust-json-fmt"
select = "tokio fmt"
query = "sqlx fmt"
view = "leptos fmt"

As for what the interface would actually be, think of how you can define an appropriate narrow waist. I think an appropriate one could be for rustfmt to slice to just the macro! { … } text, remove any block indentation contextually required, pass that through the addon via the same CLI that rustfmt uses, and then re-add the contextual block indentation before splicing it back into the formatted source.
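
A minimal sketch of the dedent / format / re-indent round trip described above, under some loud assumptions: indentation is spaces only, the addon is modelled as a plain closure rather than an external process, and the function names (dedent, reindent, format_macro_body) are invented here. None of this is an existing rustfmt API.

```rust
/// Strip the common leading indentation (spaces only) from every non-blank
/// line. Returns the dedented text and the width that was removed.
fn dedent(block: &str) -> (String, usize) {
    let width = block
        .lines()
        .filter(|line| !line.trim().is_empty())
        .map(|line| line.len() - line.trim_start_matches(' ').len())
        .min()
        .unwrap_or(0);
    let text = block
        .lines()
        .map(|line| if line.trim().is_empty() { "" } else { &line[width..] })
        .collect::<Vec<_>>()
        .join("\n");
    (text, width)
}

/// Put `width` spaces back in front of every non-empty line.
fn reindent(block: &str, width: usize) -> String {
    let pad = " ".repeat(width);
    block
        .lines()
        .map(|line| {
            if line.is_empty() {
                String::new()
            } else {
                format!("{}{}", pad, line)
            }
        })
        .collect::<Vec<_>>()
        .join("\n")
}

/// Hypothetical glue: the addon only ever sees text starting at column zero.
/// In a real design `addon` would be an external formatter; here it is a
/// closure so the sketch stays self-contained.
fn format_macro_body(body: &str, addon: impl Fn(&str) -> String) -> String {
    let (flat, width) = dedent(body);
    reindent(&addon(&flat), width)
}
```

For example, format_macro_body("    a\n      b\n    c", |s| s.to_uppercase()) hands "a\n  b\nc" to the closure and splices the result back at the original four-space indentation. Keeping the addon unaware of the surrounding indentation is what makes the interface narrow: it formats a standalone snippet and nothing else.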

That feels like it could be reasonably straightforward to implement unstably (like most of rustfmt.toml cfg is) and experiment with in rustfmt, if someone wants to propose that to the team and do the work needed to actually implement it.

Do we have any new progress now?

There's not been any new development here.

Well, there's been some movement towards preparing cfg_match! for stabilization. rustfmt will do the mod discovery that it does for cfg_if! now, but there's no special formatting handling for those or any other macro.

Also still relevant is that rustfmt doesn't do any kind of name resolution.

I wrote a program to format json! macros.

It first parses the original input token stream into a syn tree, then dumps the parsed syn tree to a JSON string, and finally writes that string to the specific location in the .rs file.

It's not fast, but it uses syn as the parser.