Making AST to text conversion non-lossy

canndrew · March 24, 2019, 3:13am

The README for cargo expand contains the following disclaimer:

Be aware that macro expansion to text is a lossy process. This is a debugging aid only. There should be no expectation that the expanded code can be compiled successfully, nor that if it compiles then it behaves the same as the original code.

For instance the following function returns 3 when compiled ordinarily by Rust but the expanded code compiles and returns 4 .
fn f() -> i32 {
   let x = 1;

   macro_rules! first_x {
       () => { x }
   }

   let x = 2;

   x + first_x!()
}

This bugs me. I've been bitten by it before when trying to debug my own macros. I think we should try to make it so macro code can always expand to correct code when translated back to text.

In this case a possible solution is to add a syntax form for referring to shadowed variables. For example, since the ~ sigil isn't currently used for anything, the syntax could be:

let x = 1;
let x = 2;
let x = 3;
println!("{} {} {}", ~~x, ~x, x); // prints "1 2 3"

I don't think people should actually write code like this (they should receive a warning if they do), but I think that maybe macro expansion should. Note that this could also be extended to break/continue hygiene if we allow the '_ lifetime on break/continue to refer to the innermost anonymous loop (ie. the same as with the lifetime omitted), then '~_ to refer to the next-innermost anonymous loop, etc. eg:

macro_rules! do_three_times(($e:expr) => {{
    let mut vec = Vec::new();
    for i in 0..3 {
        vec.push($e);
    }
    vec
}}

loop {
    let vec = do_three_times!(break);
}

/// this(↑↑) loop would expand to this(↓↓)

loop {
    let vec = {
        let mut vec = Vec::new();
        for i in 0..3 {
            vec.push(break '~_);
        }
        vec
    };
}

Thoughts? Obviously the syntax is up for debate. But should we add something like this?

comex · March 24, 2019, 10:18pm

on making it non-lossy in some way, but as long as it’s only meant for debugging rather than real code, dedicating ~ to it feels like a waste. Rust has more keyboard symbols currently unused than some other languages do, but there’s still a finite supply of them…

Also, I vaguely remember hearing something about a stack pinning macro relying on the fact that shadowing makes a variable inaccessible to enforce safety. If that’s correct, we would need to find a replacement…

Edit: Silly idea: Use ↑ as the sigil.

CAD97 · March 24, 2019, 10:41pm

pin_mut! does rely on shadowing but it also moves the value out of its original binding, defeating this way of referring to shadowed names, thankfully.

Based on the comment, I suspect this was accidental.

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=e37072ed803067810f274e760e6075e3

bjorn3 · March 27, 2019, 7:32pm

What about replacing every variable created inside a macro or being shadowed with the original name + some random id. Eg:

fn f() -> i32 {
   let x = 1;

   macro_rules! first_x {
       () => { x }
   }

   let x = 2;

   x + first_x!()
}

=>

fn f() -> i32 {
   let x__enioerio = 1;

   let x = 2;

   x + x__enioerio
}

mcy · March 27, 2019, 8:21pm

Rust is way too sigiltastic already. I think we should introduce a macro-like syntactic construct instead, e.g. unshadow!(x, 1). unshadow!(x, 0) is just x.

That said, I think this is an altogether bad idea, and cargo expand should emit either code which does not compile (clang -E doesn’t…) by e.g. turning x + first_x!() into x + #[shadowed = line_number] x. (I believe this is not a legal position for an attribute, but I don’t think there is any reason to be able to express this in rust).

CAD97 · March 28, 2019, 12:30am

I believe RFC#2602 would allow the attribute there IIRC.

I agree that making the macro-expanded code compile correctly isn’t necessarily a meaningful goal, but I think making it not compile and give the wrong result is.

The other case that this ignores is macros that expand to an identifier that’s in scope but with a different span local to the macro definition. That’s actually the common case of cargo expand's output not being “correct”. I’m pretty sure that the only way to make this fully general is to append a short hash of the Span of each identifier’s resolution to each identifier. That, unfortunately, sounds like it’ll hurt the utility of cargo expand rather than help.

petrochenkov · March 28, 2019, 5:40am

The important part of the span is syntactic context, and syntactic contexts are interned, so we can append the context's index in the interner instead of hash, like x__2131.
(Debug impl for Ident already prints it.)
(Of course, it can still conflict with x__2131 written manually, but this can probably be solved by conventions good enough.)

That's true.
Perhaps a separate pretty-printing flag will help, for-tools-not-for-humans or something.

EDIT: That won't work with macros 2.0 though (the printed code won't compile) because they adjust contexts before resolving identifiers.

comex · March 28, 2019, 9:57am

The problem with munging the identifiers themselves if that if you’re compiling a library crate, its exports have to have the right names.

bjorn3 · March 28, 2019, 6:49pm

Items are not hygienic anyway, so there would be no need for changing their idents.

system · June 26, 2019, 6:49pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Converting Rust's AST back to code compiler	2	1025	June 25, 2021
Rustdoc suggestion: Optionally show macro expansions (within a dropdown) documentation	3	877	November 7, 2022
On macros future-proofing, FOLLOW sets, and related stuff language design	33	5479	March 25, 2019
Spans from built-in macro expansions are not "from expansion" compiler	5	739	February 9, 2020
Macros vs incremental parsing compiler	13	2322	May 7, 2019

Making AST to text conversion non-lossy

Related topics