Support for macros in suffix / postfix position?

Custom piping can be trivially implemented for any given type:

let shared = state
  .to(Mutex::new)
  .to(Arc::new)
  .to(Some);

Not having even a roughly similar solution for macros, which are forced into prefix syntax, feels rather out of place - to say the least.
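For reference, a minimal sketch of how such a `.to(...)` pipe could be written today. The trait name `To` and the helper `share` are assumptions made for illustration, not taken from any particular crate:

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical extension trait: pipes `self` into any function or constructor.
trait To: Sized {
    fn to<R>(self, f: impl FnOnce(Self) -> R) -> R {
        f(self)
    }
}

// Blanket impl, so that every sized type gets `.to(...)`.
impl<T> To for T {}

/// Wraps a value the same way as the snippet above.
fn share(state: u32) -> Option<Arc<Mutex<u32>>> {
    state.to(Mutex::new).to(Arc::new).to(Some)
}

fn main() {
    let shared = share(7).expect("always Some");
    let value = *shared.lock().unwrap();
    println!("{}", value);
}
```

Functions and tuple-struct constructors fit into this for free; macros do not, since a macro cannot be passed as a value.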

The RFC on the matter has been stuck in limbo for over 6 years now. The topic has been brought up time and time again. A separate library has been made to cover for the lack of any significant attention in this regard.

From the discussions I can see so far, there doesn't seem to be any major downside / drawback / issue to the idea itself, either. It's the exact implementation / syntax specifics that different people have different positions on.

Can we attempt to settle them once and for all, instead of sporadically commenting on the topic every once in a while, continuously postponing the underlying specifics to some other time?


Option #1

Do not introduce any additional syntax to the existing macro declarations. Merely allow the macros with one argument or more to be used in a postfix notation - as long as a provided AST node matches the expected argument of the macro itself.

macro_rules! debug {
  ($e:expr) => { dbg!($e) }
}
fn usage() {
  let example = 2021.debug!();
}
  • Pros: no change in the existing syntax.
  • Cons: auto-suggestion pollution in the code-analyzer/LSP recommendations.

Option #2

Same as #1, yet with a prior #[macro_suffix] / #[macro_postfix] annotation, required to use any given macro in its postfix form. Macros not explicitly marked as such will not be processed by the compiler in postfix position and will produce a compile-time syntax error, as they do today.

#[macro_suffix] // or #[macro_suffix(only)]
macro_rules! debug {
  ($e:expr) => { dbg!($e) }
}
fn usage() {
  let example = 2021.debug!();
}
  • Pros: with a manual attribute opt-in LSP pollution would be minimized; the attribute itself could provide documentation, outlining the details of the underlying implementation; the #[macro_suffix(only)] could enforce the use of macro only/exclusively in postfix position.
  • Cons: with no reference to the $self, the #[macro_suffix] alone might be confusing.

Option #2.5

Combines options #2 and #3 in one.

#[macro_suffix]
macro_rules! debug {
  ($self:expr) => { dbg!($self) }
}
fn usage() {
  let example = 2021.debug!();
}
  • Pros: with a manual attribute opt-in LSP pollution would be minimized; the attribute itself could provide documentation, outlining the details of the underlying implementation; the #[macro_suffix(only)] could enforce the use of macro only/exclusively in postfix position; the $self syntax closely mirrors the standard associated methods, with the support for both the postfix call expr().macrofy!() and the regular macrofy!(expr()) call.
  • Cons: verbosity.

Option #3

Require an explicit $self declaration as a first argument of a given macro, alongside the type of node the macro will be processed for, in its postfix format. As with #2, any macro that doesn't declare an explicit ($self:?) as a first argument will not be processed in a postfix position.

macro_rules! debug {
  ($self:expr) => { dbg!($self) }
}
fn usage() {
  let example = 2021.debug!();
}
  • Pros: the $self syntax closely mirrors the standard associated methods, with the built-in support for both the postfix call expr.macrofy!() and the regular macrofy!(expr).
  • Cons: potential confusion as to whether the piece of AST passed will be evaluated once or multiple times - at each placement, whether it is bound by value/ref, as lvalue/rvalue, etc.

Option #4

Similar to #3, yet without an explicit node type for the $self, with an optional :self allowing for a custom name in place of the $self itself.

macro_rules! debug {
  ($self) => { dbg!($self) }
  // same as
  ($e:self) => { dbg!($e) }
}
fn usage() {
  let example = 2021.debug!();
}
  • Pros: clean, idiomatic, consistent with the associated methods.
  • Cons: unclear whether the code to be processed is an expr, an ident, or anything else.

Option #5, @idanarye + @DragonDev1906

In addition to macros-as-suffix explicitly binding the passed-in expression either by value (self) or by reference (&(mut) self), an additional let / match in postfix position would allow matching against more deeply nested patterns. The end result may look something like:

macro_rules! a(&self) {
    .() => {
        .match {
            this => {
                println!("{:?}", this.a);
                println!("{:?}", this.b);
            }
        }
    };
}
  • Pros: clear explicit binding by value/ref; inline pattern matching.
  • Cons: requires additional syntax implementation, alongside suffix macros themselves.

In the spirit of (pre) RFCs, feel free to either :+1: or :-1: the functionality itself (1) and your own preferred implementation of it (2), alongside your line of reasoning on the matter.

If possible, give an example of a project you have personally worked on, wherein having such a feature would greatly help/streamline/spare you from unnecessary time/effort/cognitive load.

I'll try my best to organize the incoming pros/cons in an at least somewhat comprehensive manner, in the meantime.


Quick poll, with up to 3 selectable options - in case you'd be fine with more than one impl.

  • Option #1
  • Option #2
  • Option #2.5
  • Option #3
  • Option #4
  • Option #5

(it's best to have the primary content here, if you want to discuss it here)

The main thing your notes fail to consider is how exactly the "receiver" for the macro gets captured. Does $self just expand to the entire receiver expression? If so, then the macro could evaluate $self zero times or more than once, or after evaluating "arguments" to the macro, modifying control flow outside of the visibly delimited macro invocation. If the receiver expression is evaluated first and $self names the result of that expression, how is that done? Is $self a value (C++ rvalue) or is it a place (C++ lvalue), and if it's a place, how is it determined by what binding mode that place is referenced? Nowhere else in the language is it possible to use one place subexpression in more than one containing expression, but a place bound to $self would enable exactly that.

Separately, field/method name lookup is currently always type directed. By what rationale should macros break that trend and do name lookup differently in this one position? Sure, type-directed macro expansion can be quite problematic since macro expansion can change name resolution even before considering type inference, but that's a reason to not do postfix macros too hastily, not a reason to violate what would otherwise be intuitively expected.


Got it, I'll keep that in mind. Moved the whole message here, in the meantime.

I'd imagine so, the exact same way rust-analyzer currently captures the expression preceding the dbg suffix and automatically expands it, like so:

[Before/After screenshots]

Forgive my ignorance on the matter, but is there any other macro that actively evaluates expressions prior to / while preprocessing their code? Isn't the whole point of macros to merely streamline / rearrange / replace one part of code with another one? Why would there be any need to evaluate $self or the arguments in it? Or am I missing the implicit "hygiene" part of it, here?

How is it done within the rest of macros nowadays? Why not simply mirror that behavior?

In addition, would this (perfectly valid and necessary) line of concerns extend to the first two possible implementations as well? If not, why not just stick with the #2 - keeping the overall complexity and additional considerations to the minimum, while preventing the pollution of the existing LSP autocomplete with a whole bunch of user-defined macros, as previously mentioned?

Intuitively, enforcing one single evaluation (1) alongside the value category (rvalue/lvalue) mirroring the expression itself (2) and the binding mode derived from it just as well (3) seems the most reasonable. Forcing a single end placement (4) to disallow repeated uses, which would necessarily expand into repeated evaluations for anything other than literals, is also perfectly viable.

For any use case requiring repeated access to $self, manual rebinding is always an option:

macro_rules! debug {
  ($self:expr, $($val:expr),*) => {
    // bind once, so that $self is evaluated a single time
    let expr = $self;
    $(
      println!("{}: {}", &expr, $val);
    )*
  }
}
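The same rebinding can be tried in today's prefix syntax. The sketch below is only an approximation of the idea: the names `debug_all`, `recv` and `calls_after_debug_all` are made up for illustration, and a `Cell` counter is used to observe how often the "receiver" runs:

```rust
use std::cell::Cell;

// Prefix-form stand-in for the postfix macro: the first argument plays
// the role of `$self` and is bound to a local exactly once.
macro_rules! debug_all {
    ($recv:expr $(, $val:expr)*) => {{
        let recv = $recv; // single evaluation of the receiver
        $(
            println!("{}: {}", &recv, $val);
        )*
    }};
}

/// Returns how many times the receiver expression was evaluated.
fn calls_after_debug_all() -> u32 {
    let calls = Cell::new(0);
    let receiver = || {
        calls.set(calls.get() + 1);
        "state"
    };
    debug_all!(receiver(), 1, 2, 3);
    calls.get()
}

fn main() {
    println!("receiver ran {} time(s)", calls_after_debug_all());
}
```

Even though `recv` is printed three times, the receiver expression itself runs once, because only the local binding is repeated.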

Agreed. Wouldn't a compile-time enforced precedence tackle this issue, without much additional hassle? Or would disallowing/ignoring any macros-as-suffix, whose names match the existing associated method of a given type, be too heavy of a burden on the compiler - despite the capability of hygienic preprocessor, currently in place and (as far as I understood so far) quite capable of custom evaluation/binding/setup - as required by each individual use case?

To clarify using some code (if it helps):

It does just rearrange the code, yes. But that's the issue (see the Compiler Explorer link). Most people probably wouldn't expect some().long().function().chain().doing().something() or even just something() to be evaluated twice, just because it is followed by a macro. And then there is the question of whether only something() or the entire chain should be evaluated twice. Neither is really intuitive, so you'd probably want $self to only be the return value itself (which leads to the question of value vs place).
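The double evaluation is easy to reproduce with today's prefix macros. This is a small sketch of my own, not the linked example; `print_twice` and `calls_after_print_twice` are hypothetical names:

```rust
use std::cell::Cell;

// Plain token substitution: `$e` is pasted twice, so it is evaluated twice.
macro_rules! print_twice {
    ($e:expr) => {{
        println!("{:?}", $e);
        println!("{:?}", $e);
    }};
}

/// Returns how many times `something()` ran inside the macro.
fn calls_after_print_twice() -> u32 {
    let calls = Cell::new(0);
    let something = || {
        calls.set(calls.get() + 1);
        42
    };
    print_twice!(something());
    calls.get()
}

fn main() {
    println!("something() ran {} times", calls_after_print_twice());
}
```

With a postfix spelling such as `something().print_twice!()`, the same substitution would silently re-run whatever precedes the dot.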

You've linked to my suggestion but did not include it in your list of possible implementations, even though the main idea there is vastly different from all the options you did list. I also think it's worth mentioning because its purpose is to address the first concern @CAD97 brought up - the handling of $self.

TL;DR: instead of capturing $self in the macro, we'll add a new simple syntax to the language (postfix let in my original suggestion, but many commenters argued it should be a postfix match instead) that will make postfix macros powerful enough even if they don't have any access to $self.


Perhaps too simplistic/limiting, but what if postfix macros worked like a function with regard to self? Would that be too restricting?

impl MyType {
    macro_rules! a(&self) {
        ($self:expr) => {
            println!("{:?}", self.a);
            println!("{:?}", self.b);
        };
    }
}

do_something().a!();

In terms of capturing self this could be equivalent to the following (perhaps always getting inlined):

impl MyType {
    fn a_macro(&self, <token_tree>) {
        <token_tree> // With access to self
    }
}

do_something().a_macro(...);

This of course doesn't work in Rust today, unless the macro creates some struct containing all the data (see std::fmt::format_args!()), but unless I'm mistaken this would solve the problem of how self should be passed (place vs value), while at the same time working similarly to the familiar functions.

It would of course be a bit restricting in terms of self:

  • You cannot run/execute the self argument later or conditionally (which would be confusing)
  • You are limited to &self, &mut self, self and similar

Note: I haven't thought a lot about this (especially in regards to macro expansion), so this could have lots of issues/pitfalls.

If I understood idanarye's post correctly it's the same suggestion, with the main difference being that the macro defines how it needs self instead of the call-site.

Not really. In my suggestion the macro does not define how it needs self - in fact, it has no macro-level access to self (the expression before the macro) at all! All it does is generate tokens based on the macro arguments and replace the macro invocation (and it alone - not the tokens that come before it, though the . does count as part of the macro invocation and gets replaced) with these tokens.

If I were to convert your example to my suggestion, it'd look like this (I'm using the style described in this comment):

macro_rules! a(&self) {
    .() => {
        .match {
            this => {
                println!("{:?}", this.a);
                println!("{:?}", this.b);
            }
        }
    };
}

And then this:

do_something().a!();

Will resolve to:

do_something().match {
    this => {
        println!("{:?}", this.a);
        println!("{:?}", this.b);
    }
}

Which is equivalent to today's:

match do_something() {
    this => {
        println!("{:?}", this.a);
        println!("{:?}", this.b);
    }
}
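The last form can already be wrapped in a prefix macro today, which gives a feel for the binding behaviour. A rough sketch, where `Pair`, `sum_fields` and `sum_and_count` are names invented for this illustration:

```rust
use std::cell::Cell;

struct Pair {
    a: u32,
    b: u32,
}

// The scrutinee of `match` is evaluated once and bound to `this`,
// so the body can mention `this` as often as it likes.
macro_rules! sum_fields {
    ($recv:expr) => {
        match $recv {
            this => this.a + this.b,
        }
    };
}

/// Returns (sum of the fields, number of times the receiver was evaluated).
fn sum_and_count() -> (u32, u32) {
    let calls = Cell::new(0);
    let do_something = || {
        calls.set(calls.get() + 1);
        Pair { a: 1, b: 2 }
    };
    let sum = sum_fields!(do_something());
    (sum, calls.get())
}

fn main() {
    let (sum, calls) = sum_and_count();
    println!("sum = {}, evaluations = {}", sum, calls);
}
```

Here the single evaluation comes from `match` itself, not from any special treatment of the macro argument.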

Understood, thanks for the pointer.

Though I understand the reasoning and the intent, trying to predict (and fit into) the expectation of the end user - instead of giving them a clear-cut, consistent "this is the way it's processed" way, alongside a distinct do/don't guide in a separate comment - may only confuse them further.

Why would we introduce a different evaluation mechanic for the ($self:expr) alone, when people will have gotten used to the regular code substitution in standard macro_rules!, after all?

We would be breaking an established, consistent convention, of the macros themselves - merely to make it more similar to a different branch of functionality altogether: that of associated methods.

A double requirement (of both the attribute and $self) can thus serve as yet another option.

/// Hovering over the attribute would
/// automatically bring up the documentation, as in:
/// ---
/// Take note that the `$self` will be substituted as-is,
/// with no prior evaluation. This means, in code:
/// 
/// struct Mutable {
///   num: u32,
/// }
/// 
/// impl Mutable {
///   fn plus_one(&mut self) -> u32 {
///     self.num += 1;
///     self.num
///   }
/// }
/// 
/// let mut mtb = Mutable { num: 0 };
/// 
/// mtb.plus_one().println_twice!(); // 1 first, 2 afterwards
/// ---
#[macro_suffix] 
macro_rules! println_twice {
    ($self:expr) => {
        println!("{:?}", $self);
        println!("{:?}", $self);
    };
}

Any additional lvalue/rvalue considerations, prior single evaluation, and/or binding modes - would only add to the mental overhead people would need to keep track of later on.


Postfix macros sound perfect for an experimental rfc. The high-level outline makes sense, but the usability and corner cases are harder to determine without using it.


Personally I would expect to want both capture-by-value and capture-by-expression, with the latter somehow delineated:

a().b().c!().d()
// equivalent-ish in current rust to
let _temp = a().b();
c!(_temp).d();

{{ a().b() }}.c!().d()
// equivalent-ish in current rust to
c!(a().b()).d()

This enables the common case, in particular embedding the macro in a long method chain, without disallowing one of the important capabilities macros have. That said, perhaps we don't even know what the common case is yet without an implementation to play with.
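One capability that capture-by-expression keeps is letting the macro decide whether the receiver runs at all. A hedged sketch in current Rust, with a made-up `when!` macro and `evaluations` helper standing in for `c!`:

```rust
use std::cell::Cell;

// Hypothetical macro that only evaluates its operand when `$cond` holds.
macro_rules! when {
    ($cond:expr, $e:expr) => {
        if $cond { Some($e) } else { None }
    };
}

/// Counts how often the operand runs under each capture style.
fn evaluations(by_value: bool) -> u32 {
    let calls = Cell::new(0);
    let expensive = || {
        calls.set(calls.get() + 1);
        7
    };
    if by_value {
        // capture-by-value: the operand is evaluated before the macro sees it
        let temp = expensive();
        let _ = when!(false, temp);
    } else {
        // capture-by-expression: the macro may skip the operand entirely
        let _ = when!(false, expensive());
    }
    calls.get()
}

fn main() {
    println!("by value: {}", evaluations(true));
    println!("by expression: {}", evaluations(false));
}
```

With the temporary, the operand runs once even though the macro discards it; passed as an expression, it never runs.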


I assume this is to make it possible to "overload" macros so they do one thing in the postfix position and another when invoked normally, but it seems a bit odd.

Basically you're allowing macros to do limited token lookbehind, which seems not in the spirit of macros. Macros can appear in several different contexts (e.g. expression context and pattern context), but they are currently not allowed to tell what context they were called in. If context-based overloading is desired, perhaps it can be added separately.
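To illustrate the point about contexts, here is a tiny sketch (the `answer!` macro and `classify` function are hypothetical) where the same macro is expanded once as a pattern and once as an expression, with no way for it to tell which:

```rust
// Expands to the same tokens regardless of where it is invoked.
macro_rules! answer {
    () => {
        42
    };
}

fn classify(n: u32) -> &'static str {
    match n {
        answer!() => "the answer", // pattern context
        _ => "something else",
    }
}

fn main() {
    let x: u32 = answer!(); // expression context
    println!("{}", classify(x));
}
```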

It's also worth noting that postfix_match is already an experimental feature in the language.

I do like the idea of simply allowing macros to appear in the method call position without giving any extra significance to this, instead only allowing them access to the arguments that are actually within the macro call.

Agreed, my bad - linked it into the edited version of the post.

It does feel like a bit of an overkill to me - as we're venturing a bit beyond the macros themselves and the preprocessor which would need to handle them - yet if others like it even more, why not?

The point of having the . be part of the macro expansion is that if I have something like:

macro_rules! foo {
    .() => {
        .bar()
    }
}

baz.foo!();

It'd expand to baz.bar() (replacing .foo!() with the macro result) and not baz..bar() (replacing only the foo!() without the .).

Of course, if the . wasn't part of the expansion one could do:

macro_rules! foo {
    .() => {
        bar()
    }
}

But it looks weird that bar() appears as a free function here instead of a method.