[Idea or pre-RFC] Arbitrary Token Stream Region

Motivation

Macros are powerful tools. However, they currently have a fundamental limit: the input code of the macro must first be correctly parsed (or at least lexed) as Rust.

By introducing "arbitrary token stream regions", programmers can mark regions of tokens that are interpreted and used solely by macros. Note that comments are not recognized by the lexer within such a region.

Guide-level explanation

arbitrary token stream expression region

They’re introduced by r#( ... ) regions. You can put any kind of token between the parentheses; the only restriction is that all parentheses within the region must be properly paired.

arbitrary token stream item region

They’re introduced by r#{ ... } regions. You can put any kind of token between the braces; the only restriction is that all braces within the region must be properly paired.
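To make the two forms concrete, a hypothetical use might look like this (proposed syntax, not valid Rust today; `my_expr_macro!` and `my_item_macro!` are made-up names):

```rust
// Expression position: everything inside r#( ... ) reaches the macro
// as raw tokens, with no Rust lexing rules beyond parenthesis pairing.
let x = my_expr_macro!(r#( 1 <=> 2 ~~> 3 ));

// Item position: everything inside r#{ ... } reaches the macro
// as raw tokens, with no Rust lexing rules beyond brace pairing.
my_item_macro! r#{
    rule greeting ::= "hello" | "hi" ;
}
```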

diagnostics

If such a region does not get consumed and replaced by a macro, the compiler will emit an error.

changes to proc-macro apis/syn crate

TBD

Example #1

See below.

Could a macro not interpret the contents of an include_bytes!() or a literal? The interpreted input would then likely not be located in the same file (or would be an awfully formatted b"" string), but that would encourage separating files by the syntax of their content, which I would regard as a feature.

2 Likes

That’s already possible: you can just define a proc macro that takes a file path as input. The proc macro loads the file and does whatever it wants with it.
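The proc-macro shell is omitted here, but the core of such a macro might look like the following sketch, written as an ordinary function (`expand_from_file` is a made-up name, and counting lines stands in for whatever custom interpretation the macro would actually do):

```rust
use std::fs;

// Sketch of the core of a file-loading macro: given the path from the
// macro's literal argument, read the file and interpret its contents
// with any custom grammar. In a real proc macro the returned string
// would be parsed into a TokenStream of generated code.
fn expand_from_file(path: &str) -> String {
    let source = fs::read_to_string(path).expect("macro input file not found");
    format!("const LINES: usize = {};", source.lines().count())
}
```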

It is not at all clear to me what is being suggested here and much is left as guesswork for a reader of this thread. Please try to incorporate the idea into some example of how it may be used in its proper context.

3 Likes

I agree you should be more precise on what you want to do with this feature and provide examples.

If I guess right, you want to accept code that does not match the Rust parsing rules. It might be useful for embedding other languages in Rust code. If I had to accept arbitrary code, I think this syntax would be even more useful:

my_macro!#{
    code that does not have any restriction. 
    not even matching } or )
}#

But if you accept code the Rust compiler cannot lex, I’m not sure it can be treated as a TokenStream. I think it would raise a lot of problems (particularly with hygiene).

1 Like

What is the advantage of a non-Rust-parsable token stream over a string?

What if we just made the Rust lexer available as a crate that you could invoke on a string?
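A minimal sketch of what such a string-lexing API might look like (purely illustrative: the real lexer distinguishes far more token kinds and handles string literals, comments, multi-character punctuation, spans, and so on):

```rust
// Toy lexer: splits a string into identifier, number, and punctuation
// tokens, skipping whitespace. A real "rust-lexer-as-a-crate" would
// return richer tokens with source spans.
#[derive(Debug, PartialEq)]
enum Tok {
    Ident(String),
    Num(String),
    Punct(char),
}

fn lex(src: &str) -> Vec<Tok> {
    let mut toks = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
        } else if c.is_alphabetic() || c == '_' {
            // Identifiers: a letter or underscore, then alphanumerics.
            let mut s = String::new();
            while let Some(&c) = chars.peek() {
                if c.is_alphanumeric() || c == '_' { s.push(c); chars.next(); } else { break; }
            }
            toks.push(Tok::Ident(s));
        } else if c.is_ascii_digit() {
            // Numbers: a run of ASCII digits.
            let mut s = String::new();
            while let Some(&c) = chars.peek() {
                if c.is_ascii_digit() { s.push(c); chars.next(); } else { break; }
            }
            toks.push(Tok::Num(s));
        } else {
            // Everything else is a single punctuation token.
            toks.push(Tok::Punct(c));
            chars.next();
        }
    }
    toks
}
```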

2 Likes

Yes, I admit that I should have given an example first; let me try to make one now. The problem the example itself addresses is not meant to be discussed in this thread; I only show it as a potentially useful mechanism. So this functionality is kind of meta.

Example #1 Delegation

Imagine I want to write a macro to do some method forwarding (not to change Rust the language, but only to meet my own needs in my own little project). I’ve chosen the following syntax:

struct Button;
impl Button {
    fn set_text(&mut self, text: &str) {}
    fn set_state(&mut self, state: ButtonState) {}
}

struct ImageButton(Button, Image);

#[delegation]
impl ImageButton r#{
    delegate self => self.0 {
         fn set_text(&mut self, text: &str);
         fn set_state(&mut self, state: ButtonState);
    }
    fn set_image(&mut self, image: Image) {}
}

So under the current design, this code needs to be correctly parsed before the delegation macro can process it. (I don’t know whether it will parse; let’s assume it won’t. And even if it can be parsed now, maybe it won’t parse in Rust 1.45, who knows.)

Without this feature, I’ll have to modify the syntax somehow to make it more like ordinary Rust, using more attributes and reducing the keyword-like structures.

However, with this feature, as the r#{ usage in the example shows, the parser will simply pack whatever is between the braces together and send it to the macro. The delegation macro can then reparse it and generate whatever code it likes to implement this functionality.

As for the surface syntax, I think strings would give others the wrong impression about the text within them (it looks like data!), whereas “bare” tokens with braces give the impression that this is some nonstandard extension to the language. And IDEs could provide some basic support here.

Rust macros are already overpowered as it is. I cannot think of a situation where readability of a proc macro is improved by being able to process something that doesn’t lex as Rust.

1 Like

I’d much rather see a delimiter that gets treated as a string, without the presumption of lexing. IDEs would give counterproductive help if they assume the contents should look like valid Rust.

Any change to not require matching brackets of all three types would require major changes to the Rust lexer.

Function-like macros can already contain any syntax that Rust can lex.
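For instance, a declarative macro happily accepts token soup that lexes as Rust but is nowhere near valid Rust syntax (a small demonstration, not from the thread):

```rust
// A function-like macro that accepts any sequence of token trees and
// turns it back into a string. The input only has to lex as Rust
// tokens and have balanced delimiters; it need not parse as Rust.
macro_rules! swallow {
    ($($t:tt)*) => {
        stringify!($($t)*)
    };
}
```

Here `swallow!(delegate self => self.0)` expands to a string even though its argument is not a valid Rust expression or item.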

Attribute macros specifically take valid Rust item syntax as input because they don’t wrap their contents, and it needs to be clear both to the compiler and the human what the scope of the macro is.

The ability to lex external files and get proper spans for it would be invaluable for macros.

For your specific use case, it might be more productive to argue for function-like macros in more positions, for syntax like:

#[delegation]
impl ImageButton {
    delegate! { self => self.0;
         fn set_text(&mut self, text: &str);
         fn set_state(&mut self, state: ButtonState);
    }
    fn set_image(&mut self, image: Image) {}
}

Even better, and potentially valid today (didn’t check) (but definitely close):

#[delegation]
impl ImageButton {
    #[delegate(self => self.0)]
    fn set_text(&mut self, text: &str);
    #[delegate(self => self.0)]
    fn set_state(&mut self, state: ButtonState);
    fn set_image(&mut self, image: Image) {}
}
3 Likes