Making AST to text conversion non-lossy

#1

The README for cargo expand contains the following disclaimer:

Be aware that macro expansion to text is a lossy process. This is a debugging aid only. There should be no expectation that the expanded code can be compiled successfully, nor that if it compiles then it behaves the same as the original code.

For instance the following function returns 3 when compiled ordinarily by Rust but the expanded code compiles and returns 4 .

fn f() -> i32 {
   let x = 1;

   macro_rules! first_x {
       () => { x }
   }

   let x = 2;

   x + first_x!()
}

This bugs me. I’ve been bitten by it before when trying to debug my own macros. I think we should try to make it so macro code can always expand to correct code when translated back to text.

In this case a possible solution is to add a syntax form for referring to shadowed variables. For example, since the ~ sigil isn’t currently used for anything, the syntax could be:

let x = 1;
let x = 2;
let x = 3;
println!("{} {} {}", ~~x, ~x, x); // prints "1 2 3"

I don’t think people should actually write code like this (they should receive a warning if they do), but I think that maybe macro expansion should. Note that this could also be extended to break/continue hygiene if we allow the '_ lifetime on break/continue to refer to the innermost anonymous loop (ie. the same as with the lifetime omitted), then '~_ to refer to the next-innermost anonymous loop, etc. eg:

macro_rules! do_three_times(($e:expr) => {{
    let mut vec = Vec::new();
    for i in 0..3 {
        vec.push($e);
    }
    vec
}}

loop {
    let vec = do_three_times!(break);
}

/// this(↑↑) loop would expand to this(↓↓)

loop {
    let vec = {
        let mut vec = Vec::new();
        for i in 0..3 {
            vec.push(break '~_);
        }
        vec
    };
}

Thoughts? Obviously the syntax is up for debate. But should we add something like this?

1 Like

#2

:+1: on making it non-lossy in some way, but as long as it’s only meant for debugging rather than real code, dedicating ~ to it feels like a waste. Rust has more keyboard symbols currently unused than some other languages do, but there’s still a finite supply of them…

Also, I vaguely remember hearing something about a stack pinning macro relying on the fact that shadowing makes a variable inaccessible to enforce safety. If that’s correct, we would need to find a replacement…

Edit: Silly idea: Use ↑ as the sigil.

2 Likes

#3

pin_mut! does rely on shadowing but it also moves the value out of its original binding, defeating this way of referring to shadowed names, thankfully.

Based on the comment, I suspect this was accidental.

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=e37072ed803067810f274e760e6075e3

1 Like

#4

What about replacing every variable created inside a macro or being shadowed with the original name + some random id. Eg:

fn f() -> i32 {
   let x = 1;

   macro_rules! first_x {
       () => { x }
   }

   let x = 2;

   x + first_x!()
}

=>

fn f() -> i32 {
   let x__enioerio = 1;

   let x = 2;

   x + x__enioerio
}
1 Like

#5

Rust is way too sigiltastic already. I think we should introduce a macro-like syntactic construct instead, e.g. unshadow!(x, 1). unshadow!(x, 0) is just x.

That said, I think this is an altogether bad idea, and cargo expand should emit either code which does not compile (clang -E doesn’t…) by e.g. turning x + first_x!() into x + #[shadowed = line_number] x. (I believe this is not a legal position for an attribute, but I don’t think there is any reason to be able to express this in rust).

2 Likes

#6

I believe RFC#2602 would allow the attribute there IIRC.

I agree that making the macro-expanded code compile correctly isn’t necessarily a meaningful goal, but I think making it not compile and give the wrong result is.

The other case that this ignores is macros that expand to an identifier that’s in scope but with a different span local to the macro definition. That’s actually the common case of cargo expand's output not being “correct”. I’m pretty sure that the only way to make this fully general is to append a short hash of the Span of each identifier’s resolution to each identifier. That, unfortunately, sounds like it’ll hurt the utility of cargo expand rather than help.

0 Likes

#7

The important part of the span is syntactic context, and syntactic contexts are interned, so we can append the context’s index in the interner instead of hash, like x__2131.
(Debug impl for Ident already prints it.)
(Of course, it can still conflict with x__2131 written manually, but this can probably be solved by conventions good enough.)

That’s true.
Perhaps a separate pretty-printing flag will help, for-tools-not-for-humans or something.

EDIT: That won’t work with macros 2.0 though (the printed code won’t compile) because they adjust contexts before resolving identifiers.

0 Likes

#8

The problem with munging the identifiers themselves if that if you’re compiling a library crate, its exports have to have the right names.

0 Likes

#9

Items are not hygienic anyway, so there would be no need for changing their idents.

0 Likes