Macro Keyword

Keeping the bracket-matching rules and string literals should suffice to find the end of an arbitrary macro, even for dumb tools.
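As a rough illustration of what such a "dumb tool" would need, here is a minimal sketch of a scanner that finds the end of a bracketed group while skipping over string literals. The function name `find_group_end` is made up for this example, and the sketch assumes only plain `"..."` strings with backslash escapes; raw strings, char literals, and comments are deliberately left out.

```rust
/// Returns the byte index just past the bracket that closes the group
/// opening at `open`, or `None` if `open` is not an opening bracket or the
/// group is unbalanced. Hypothetical helper, not part of any real tool.
fn find_group_end(src: &str, open: usize) -> Option<usize> {
    let bytes = src.as_bytes();
    // The scan must start on an opening bracket.
    if !matches!(bytes.get(open).copied()?, b'(' | b'[' | b'{') {
        return None;
    }
    // Stack of the closing brackets we still expect to see.
    let mut stack: Vec<u8> = Vec::new();
    let mut i = open;
    while i < bytes.len() {
        match bytes[i] {
            b'(' => stack.push(b')'),
            b'[' => stack.push(b']'),
            b'{' => stack.push(b'}'),
            c @ (b')' | b']' | b'}') => {
                // A closer must match the most recent opener.
                if stack.pop() != Some(c) {
                    return None;
                }
                if stack.is_empty() {
                    return Some(i + 1);
                }
            }
            b'"' => {
                // Skip to the closing quote, honoring backslash escapes,
                // so brackets inside the string are not counted.
                i += 1;
                loop {
                    match bytes.get(i).copied()? {
                        b'\\' => i += 2,
                        b'"' => break,
                        _ => i += 1,
                    }
                }
            }
            _ => {}
        }
        i += 1;
    }
    None // ran out of input before the group was closed
}
```

Given the text `html! { <a href="}">x</a> } rest` and the index of the first `{`, this returns the index just past the final `}`, because the `}` inside the string literal is skipped rather than counted.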

I'm also interested in macros that work on entire files/modules, e.g.:

#[parse_as_html]
mod homepage_template_dot_html;

AFAIK, currently such a macro will only see the mod declaration rather than the file body, and it wouldn't be able to create meaningful Spans even if it did a horrible thing and did its own filesystem access in a proc macro.

2 Likes

Mini counter (unbaked) proposal:

There have been many requests for attribute-like macros to allow arbitrary syntax within the item they're decorating. Just as an easy example off the top of my head:

#[asm]
fn isr_3() {
    // this is just asm
    pushad
    call increment_breakpoint_count
    popad
    iretd
}

This isn't possible in current Rust, because attribute macros require the item they decorate to be syntactically valid.

Item-like macros, then, would be $path! $item, except that the compiler does not attempt in any way to parse the contents of any brackets in the item's top-level syntax. So the same example:

```rust
asm! fn isr_3() {
    // this is just asm
    pushad
    call increment_breakpoint_count
    popad
    iretd
}
```

Or bitflags:

bitflags! struct Flags {
    A = bit 1,
    B = bit 2,
    C = bit 3,
    ABC = A | B | C,
}

Is this a good idea? I don't know. But it shows off a need, a practical definition, and some potential applications, so it's possible to discuss constructively.

No offense intended, @redradist, but it sounds like you're in the ideation phase for your proposal, whereas this forum (for better or for worse) prefers to discuss more concrete proposals on their technical merit.

The rust-lang development discord has a #design channel that might be more fruitful for ideation and initially discovering obvious improvements to ideas than irlo, where you're liable to get jumped on by all the regulars in parallel.


I've been an advocate for a TokenStream::open(&Path) API for a while now. Having to pass the Rust lexer isn't actually that problematic for most formats you'd want to read in a proc macro, and getting new spans for a new file would be amazing.

And controlled by the compiler if you proxy through it.

5 Likes

This is a discussion of a possible future RFC. https://github.com/rust-lang/rfcs/blob/master/README.md suggests that any proposal should first be discussed with the community, to reduce the number of RFCs that end up not being accepted :wink:

I'm not sure how the corner cases there are supposed to work out. Is it supposed to be a CharStream, where the proc macro decides when it ends, or is it a raw string-like form, where it scans to the first }, or is it a token tree?

If it's a CharStream, then it produces undecidable lexing, as all CharStream-like forms do.

If it's like a raw string, then it's not obvious how it's supposed to work with nested parentheses. For example, with a javascript! macro:

javascript! function alert_later(v) {
    setTimeout(function() { alert(v) } /* does it end here? */, 1000);
} /* or here? */

Raw strings need that "ugly looking" delimiter syntax.

And you really can't just say "it's going to match parentheses without lexing," because then you lose string literal handling. Which again means you couldn't embed arbitrary JavaScript like the OP wants to.

Token trees would solve most of the problems, with the restriction that it needs to be a valid token tree (even if it's not otherwise valid Rust).
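To make the token-tree restriction concrete, here is a small sketch using today's `macro_rules!`. The macro name `count_tts` is invented for this example; it accepts any sequence of token trees and counts the top-level ones, without caring whether they form valid Rust.

```rust
// Counts top-level token trees. A bracketed group counts as one tree,
// no matter what it contains. `count_tts` is a made-up name.
macro_rules! count_tts {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count_tts!($($tail)*) };
}

// The JavaScript example from above happens to lex as Rust tokens, and its
// brackets are balanced, so it is a valid token tree sequence:
// `function`, `alert_later`, `(v)`, and the `{ ... }` body.
const JS_TREES: usize = count_tts!(
    function alert_later(v) {
        setTimeout(function() { alert(v) }, 1000);
    }
);

// A stray bracket inside a string literal is fine, because the whole
// string is a single token.
const WITH_STRING: usize = count_tts!(a "}" b);
```

Here `JS_TREES` is 4 and `WITH_STRING` is 3. The flip side is that input which does not lex as Rust is rejected before the macro ever sees it, which is why token trees cover most, but not all, embedded-language cases.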

I think that, as a first stage, we could discuss a proc_macro_keyword that works with valid Rust tokens:

#[proc_macro_keyword]
pub fn strange_struct(attr: TokenStream, item: TokenStream) -> impl Iterator {
    // ...
}

The javascript! example that consumes character input could be discussed a little later ...

#[proc_macro_keyword]
pub fn javascript(attr: TokenStream, item: CharStream) -> impl Iterator {
    // ...
}
1 Like

... And yes, the javascript! macro function (which is called from the proc_macro_keyword function) will decide when it should stop

Well, in this theoretical world, it would in fact create a TokenStream, but non-Rust tokens would also be encoded in the TokenStream. But I think that feature might even be useful if it requires everything to lex properly.

And my counter-proposal does require something that superficially looks like a Rust item; it's only the contents of any brackets in that item that are merely constrained to be a valid TokenStream.
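For comparison, something close to this can already be approximated with a function-like macro, at the cost of an extra pair of delimiters. The sketch below is only an illustration: `asm_item` is a made-up name, and instead of emitting assembly it just exposes the body tokens as text.

```rust
// Matches something shaped like `fn name() { ... }`, where the body only
// has to be a valid sequence of token trees, not valid Rust.
macro_rules! asm_item {
    (fn $name:ident () { $($body:tt)* }) => {
        // Stand-in expansion: return the raw body tokens as a string.
        fn $name() -> &'static str {
            stringify!($($body)*)
        }
    };
}

asm_item! {
    fn isr_3() {
        pushad
        call increment_breakpoint_count
        popad
        iretd
    }
}
```

Calling `isr_3()` returns text containing the four instructions. The proposed item-like form would drop the outer `{ }`, but the constraint on the body would be the same.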

Sorry, but no. That creates an undecidable parse, which we need to avoid. It must be possible to parse Rust code without executing arbitrary user code.

2 Likes

It will only work that way if the user actually uses this kind of macro function in their code?!

Yeah, should we try to split this into two topics?

These seem like separate proposals.

4 Likes

I was thinking, why not have macros that operate on a raw string without parentheses?

  • macro!"text"
  • macro!macro!"text"

This could be similar to the internationalization stuff used like _("text") but we could make it t!"text". Or we could have:

html!"#
<html>
  <title>Hello world</title>
</html>
#";

struct Model {
    xxx;
}

impl Xxx for Model {
    ...
}

css!"#
body {
    margin: 0;
}
#";

Then we could have something like Vue in Rust. I'm not sure about the API, but something along those lines.

1 Like

In other words, you propose that a macro's input can be surrounded by single quotes, double quotes, and string literals, in addition to the current parentheses, square brackets, and curly braces.

I don't believe that single quotes ('), which delimit single characters, were being proposed.

Neither were double quotes (only string literals), but I fail to see why one may be a good idea and not the others. As a user, and unless there is a strong reason not to be able to do it, I would expect to have either all 3 possibilities or, as currently, none.

Imagine if you're offering a macro and you add a new option. css!("blah") still works when you support css!("blah" with option), but css!"blah" has no way to support the option, and you have to use parens or a different macro anyway.
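A minimal sketch of that evolution with `macro_rules!` follows. The `Stylesheet` type and the `with minify` option are invented for illustration; the point is only that the delimited form has room for a second arm.

```rust
// Hypothetical result type for the example.
#[derive(Debug, PartialEq)]
struct Stylesheet {
    source: &'static str,
    minify: bool,
}

macro_rules! css {
    // Original form: just the string.
    ($src:literal) => {
        Stylesheet { source: $src, minify: false }
    };
    // Later addition: the same string plus an option. Existing callers of
    // the first form keep compiling unchanged.
    ($src:literal with minify) => {
        Stylesheet { source: $src, minify: true }
    };
}
```

With this, both `css!("body { margin: 0; }")` and `css!("body { margin: 0; }" with minify)` work, whereas a delimiter-free `css!"..."` form would have nowhere to put the option.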

Rust parses quoted text as a single token IIRC, they're not delimiters for the token stream like normal brackets are.
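This is easy to check with a small `macro_rules!` probe (the name `is_single_literal` is made up for this example): a whole string matches the `literal` fragment as one token, while the same text written without quotes is a sequence of several tokens.

```rust
macro_rules! is_single_literal {
    // Exactly one literal token, e.g. a string or a number.
    ($lit:literal) => { true };
    // Anything else: zero tokens, several tokens, or a non-literal.
    ($($other:tt)*) => { false };
}

// The quoted CSS is one token; the unquoted CSS is several.
const QUOTED: bool = is_single_literal!("body { margin: 0; }");
const UNQUOTED: bool = is_single_literal!(body { margin: 0; });
```

`QUOTED` is `true` and `UNQUOTED` is `false`.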

1 Like

Yes, but there are cases where it definitely will not have extra arguments.

That really wouldn't matter as is. Because strings are parsed as a single token, every 'macro' that accepts a string literal could be implemented as a function taking &'static str. Either that, or Rust would have to start lexing with context sensitivity (to know whether a string is being parsed for a special kind of macro or for something else, and it certainly couldn't lex CSS or any other non-Rust language), or string literals would no longer be able to contain text that Rust can't lex, which is absurd.

In other words, Rust can't lex arbitrary strings, so there's nothing for it to pass your macro except for a string literal. And you can already pass string literals to functions, so there's no point in doing it with a macro.
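As a sketch of that point, here is an ordinary function standing in for a string-only `css!` macro. The name `css_selectors` and its deliberately naive parsing are assumptions made for this example, not a real CSS parser.

```rust
/// Hypothetical stand-in for a string-only `css!` macro: returns the
/// selector of each rule. The parsing is naive on purpose.
fn css_selectors(src: &'static str) -> Vec<&'static str> {
    src.split('}')
        // Everything before the first `{` of a rule is its selector.
        .filter_map(|rule| rule.split('{').next())
        .map(str::trim)
        .filter(|selector| !selector.is_empty())
        .collect()
}
```

Calling `css_selectors("body { margin: 0; } p { color: red; }")` yields `["body", "p"]`, with the same call-site shape as `css!("...")`.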

We could just say that macro! token gets desugared to macro! { token }. That way any macro can take a single token and not need unnecessary delimiters. (This may even be nice for custom literals, say bigint!120381204803284123, or s!"hello world" for String literals)

This way if you need to call a macro with multiple tokens, yes you will need delimiters, but only if you need multiple tokens

edit: I'm going to say a single token is something matched by the tt macro specifier, but excluding grouping tokens for back-compat.
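For reference, here is a sketch of what the desugared forms look like with delimiters, which is what compiles today. Both macros are made up for this example, and `bigint!` uses `u128` as a stand-in for a real big-integer type.

```rust
// `s!("...")` builds an owned String from a string literal.
macro_rules! s {
    ($text:literal) => {
        String::from($text)
    };
}

// `bigint!(...)` takes a single integer literal token. A real version
// would construct an arbitrary-precision integer; u128 is a placeholder.
macro_rules! bigint {
    ($digits:literal) => {{
        let value: u128 = $digits;
        value
    }};
}
```

Under the proposed sugar, `s!"hello world"` would be rewritten to `s! { "hello world" }` and `bigint!120381204803284123` to `bigint! { 120381204803284123 }`. The delimiter-free spellings are the proposal, not current Rust.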

4 Likes

Actually I like this !!

It would solve any need without losing readability:

html!r#"
<html>
  <title>Hello world</title>
</html>
"#;

css!r#"
body {
    margin: 0;
}
"#;

javascript!r#"
function alert_later(v) {
    setTimeout(function() { alert(v) }, 1000);
}
"#;

python!r#"
class MyClass:
    def __init__(self):
        pass
"#;

cpp!r#"
class MyClass {
 public:
    MyClass () = default;
    // ... some other stuffs
};
"#;

But it would be nice to delegate it to a procedural macro, because the logic behind such a macro could be non-trivial ...

Why is

html!r#"
<html>
  <title>Hello world</title>
</html>
"#;

a significantly better alternative to the existing

html!(r#"
<html>
  <title>Hello world</title>
</html>
"#);

such that it justifies the costs to the "weirdness budget"?

7 Likes

(Which is why I brought up custom literals, s!"Hello World" is way better than s("Hello World"))

Edit: I made a new topic to discuss this sugar specifically

2 Likes