Pre-expanding proc macros

Re @matklad avoiding serde in projects:

I wonder if crates could expand proc macros back to source code, off-line, before being published. This way a crate could use any number of proc macros without exposing that fact to its users. It wouldn't need syn/quote/etc. for every compilation downstream, only once before the crate is published.
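To make the idea concrete, here is a rough sketch of what a published, pre-expanded file might contain in place of a derive. The type `InvalidHeader` is made up for illustration, and the hand-written impls only approximate what a thiserror-style derive generates; the real expansion differs in detail.

```rust
use std::fmt;

// What the author would write before publishing (needs the derive crate,
// and transitively syn/quote, at build time):
//
//     #[derive(Debug, thiserror::Error)]
//     #[error("invalid header: {0}")]
//     pub struct InvalidHeader(pub String);
//
// What could be published instead: plain Rust with no proc-macro dependency.
// `InvalidHeader` is a hypothetical type used only for this sketch.
#[derive(Debug)]
pub struct InvalidHeader(pub String);

impl fmt::Display for InvalidHeader {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "invalid header: {}", self.0)
    }
}

impl std::error::Error for InvalidHeader {}
```

Downstream users would see an ordinary type with ordinary trait impls and would never compile the derive machinery.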

In the JavaScript world such a strategy has been used for CoffeeScript and ES6: they could be compiled down to ES5 and published as a pure-ES5 package without any dependency on translation machinery.

I would love this. I have deliberately avoided syn due to compile time, and it definitely makes writing macros more tedious. Having a background in JavaScript, I can confirm that it absolutely works there.

One concern, though, is conditional compilation.

This doesn’t even need any kind of special support from the compiler or Cargo. The only thing needed is for the proc-macro to also be published as a library operating on a proc-macro2 token stream.

The “driver” code to std::fs::read_to_string the .rs files, parse them with syn, and find the structs to apply derives to doesn’t seem too hard.

The consumer library can then include a test which checks the freshness of the generated code and updates it if it isn’t fresh.
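A minimal sketch of such a freshness check, using only the standard library. The function name `ensure_fresh` is made up, and the `generated` string stands in for whatever the syn-based driver would produce.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Compares the checked-in file at `path` with freshly generated code.
/// Returns Ok(true) if the file was already fresh, and Ok(false) if it was
/// missing or stale and has just been rewritten.
/// (Hypothetical helper; `generated` stands in for the driver's output.)
pub fn ensure_fresh(path: &Path, generated: &str) -> io::Result<bool> {
    let existing = match fs::read_to_string(path) {
        Ok(text) => Some(text),
        // A missing file is treated the same as a stale one.
        Err(e) if e.kind() == io::ErrorKind::NotFound => None,
        Err(e) => return Err(e),
    };
    if existing.as_deref() == Some(generated) {
        return Ok(true);
    }
    fs::write(path, generated)?;
    Ok(false)
}
```

A test in the consumer crate could assert that this returns true, so CI fails when someone forgets to regenerate, while a local run repairs the file as a side effect.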

On T-cargo, we did some high level brainstorming for build.rs and proc macro generated results to be published with a package.

The way I understand your proposal is that you basically suggest running cargo-expand on the code and then publishing the output. I am not sure that works well, given that macros distinguish between the macro scope and the use scope for hygiene reasons.

Also, duplication can be avoided if the compiler has special infrastructure for this likely rather common case.

My personal choice would be a human-readable text format for macro-fragment files. When rustc is passed a macro-fragment directory via a new compiler switch, it would try to find a suitable macro fragment there before invoking the actual macro code generation.
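A rough sketch of that lookup order, assuming fragments are keyed by macro name plus a hash of the input tokens. Everything here (`fragment_path`, `expand_with_fragments`, the file naming scheme) is hypothetical; no such rustc switch exists today.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs;
use std::hash::{Hash, Hasher};
use std::path::{Path, PathBuf};

/// Hypothetical naming scheme: one text file per (macro, input) pair.
/// DefaultHasher is not guaranteed to be stable across Rust releases, so a
/// real design would need a fixed, specified hash instead.
pub fn fragment_path(dir: &Path, macro_name: &str, input: &str) -> PathBuf {
    let mut hasher = DefaultHasher::new();
    macro_name.hash(&mut hasher);
    input.hash(&mut hasher);
    dir.join(format!("{}-{:016x}.txt", macro_name, hasher.finish()))
}

/// Looks for a pre-expanded fragment first and only falls back to running
/// the macro (modelled here as a closure) when no fragment is found.
pub fn expand_with_fragments<F>(dir: &Path, macro_name: &str, input: &str, run_macro: F) -> String
where
    F: FnOnce(&str) -> String,
{
    match fs::read_to_string(fragment_path(dir, macro_name, input)) {
        Ok(fragment) => fragment,
        Err(_) => run_macro(input),
    }
}
```

Because the fallback is the ordinary macro invocation, a missing or outdated fragment directory would only cost compile time rather than break the build.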

To be clear, this isn't completely sufficient. The macro also needs to be written such that it does not rely on anything other than call-site hygiene. We don't have access to def-site hygiene yet, but we do have stable access to mixed-site hygiene. Additionally, it can be the case that the hygiene of passed in tokens matters.
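A small illustration of why hygiene matters here, using macro_rules (which has mixed-site hygiene: local variables introduced by the macro are separate from the caller's). The macro and function names are invented for the example. Pasting the expansion back in as plain source text silently changes the result:

```rust
// The macro's own `x` is hygienic: it cannot capture or be captured by
// an `x` at the call site.
macro_rules! double_plus_one {
    ($e:expr) => {{
        let x = 1;
        $e * 2 + x
    }};
}

/// Real macro expansion: `$e` refers to the caller's `x` (10).
pub fn hygienic() -> i32 {
    let x = 10;
    double_plus_one!(x) // 10 * 2 + 1
}

/// The same tokens pasted in as ordinary source: the inner `let x`
/// now shadows the caller's `x`, so both uses see 1.
#[allow(unused_variables)]
pub fn naive_textual_expansion() -> i32 {
    let x = 10;
    {
        let x = 1;
        x * 2 + x // 1 * 2 + 1
    }
}
```

A macro that only relies on call-site hygiene has no such hidden distinction, which is why its output can be written back as source without changing meaning.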

I've used watt before, and it's awesome tech, but quite cumbersome to use. I've toyed with attempts at making it easier to use -- without committing a binary wasm blob into a git repo -- without breaking path dependencies, but never quite managed to push a solution over the finish line.

The main thing preventing a builtin wasm abi and runner for proc macros is aiui just developer time. Though if you're interested in working on wasm+proc-macro, please do ping @eddyb (and maybe me? :pleading_face::point_right::point_left: I'm interested too) as they've done a decent amount of design thinking around it.

Bonus points if you can make proc_macro usable outside the proc-macro server at the same time

Just an fyi, dtolnay put out a call for help/comments on Twitter a few days ago, but the exact post has since eluded me.

You'll want to ping @mystor for that (and the wasm stuff ofc), she's done most of the legwork lately to which I'm more of a bystander.

@dtolnay @sunfishcode

Note that it will need a fallback wasm interpreter as wasmtime only supports x86_64, aarch64, s390x and riscv64gc right now.

Not necessarily -- the fallback can just be the native compilation used today.

That would require a special case in every build system to check if rustc supports wasm proc macros on the current target and if not switch to compiling native dylibs. It would also mean that we can't remove the -Zdual-proc-macro hack that is necessary for cross compiling rustc, but breaks cross compiling tools linked against rustc itself.
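To show the kind of special case meant here, a sketch of the decision each build system would have to replicate. The names `ProcMacroBackend` and `choose_backend` are hypothetical, and the architecture list is simply the one quoted above, so it may be out of date.

```rust
/// How a proc macro would be built and run on a given host (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProcMacroBackend {
    /// Run a wasm build of the macro in a JIT runtime.
    Wasm,
    /// Compile the macro to a native dylib, as is done today.
    NativeDylib,
}

/// Picks a backend from the host target triple, e.g. "x86_64-unknown-linux-gnu".
pub fn choose_backend(host_triple: &str) -> ProcMacroBackend {
    // Architectures the thread lists as supported by wasmtime (an assumption
    // taken from the discussion, not checked against current releases).
    const JIT_ARCHES: [&str; 4] = ["x86_64", "aarch64", "s390x", "riscv64gc"];
    let arch = host_triple.split('-').next().unwrap_or("");
    if JIT_ARCHES.contains(&arch) {
        ProcMacroBackend::Wasm
    } else {
        ProcMacroBackend::NativeDylib
    }
}
```

Every build system that drives rustc directly would need its own copy of this check, which is the cost being pointed out.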

Just posted another topic on this today... Sorry, I didn't find this topic because of the title!

I think the major point is the pre-expansion of the usage crate, and whether the proc macro crate itself is pre-compiled is less important.

Pasting the major points from that post:

I'm thinking maybe it's possible to add a new kind of dependency in Cargo.toml, maybe called [expansion-dependencies], that acts just like normal dependencies. The difference is that, before uploading to crates.io, the proc macros defined in these crates are expanded and the expansion results are inlined into the original source code. Only the expanded source code is uploaded to crates.io, and the dependency is automatically removed.

A proc macro crate can declare whether or not it is able to be pre-expanded. I think crates like thiserror and displaydoc are a perfect fit for such pre-expansion.
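As a sketch of the publish-time rewrite, modelled in plain Rust rather than in Cargo itself: the `Manifest` and `ExpansionDep` types and the `pre_expandable` flag are all hypothetical stand-ins for whatever Cargo.toml syntax would be chosen.

```rust
/// A dependency listed under the proposed [expansion-dependencies] table.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ExpansionDep {
    pub name: String,
    /// Whether the proc macro crate declared itself safe to pre-expand.
    pub pre_expandable: bool,
}

/// A heavily simplified manifest (hypothetical; real manifests carry
/// versions, features, and more).
#[derive(Debug, Clone)]
pub struct Manifest {
    pub dependencies: Vec<String>,
    pub expansion_dependencies: Vec<ExpansionDep>,
}

/// Computes the dependency list of the published crate: pre-expandable
/// macro crates disappear, anything that opted out stays a normal dependency.
pub fn published_dependencies(manifest: &Manifest) -> Vec<String> {
    let mut deps = manifest.dependencies.clone();
    deps.extend(
        manifest
            .expansion_dependencies
            .iter()
            .filter(|dep| !dep.pre_expandable)
            .map(|dep| dep.name.clone()),
    );
    deps
}
```

Under this model a crate using only pre-expandable derives would publish with no proc-macro dependencies at all.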

But instead, before uploading to crates.io,

In the above linked zulip thread, we were discussing the build.rs version of this to be a publish.rs with publish-dependencies. This naming opens the door for both build.rs and proc macro handling at publish time.

I have seen hesitance in adding more types of dependencies as it would be disruptive to the ecosystem to handle them. I could see these falling under dev-dependencies as this is for development but maybe this justifies a new dependency table.

Depending on how we handle this, a potential pitfall is if we merge "expansion dependencies" in with the regular "dependencies" during the build. This would cause features to be activated that wouldn't be activated for the published version, which would make it harder to identify or reproduce problems locally. We'd probably have to do an expansion pass locally as well.
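A toy model of that pitfall, assuming feature requests are simply unioned per crate. The crate name `shared` and its features `std` and `extra` are made up, and `unify` and `local_only` are hypothetical helpers, not Cargo's resolver.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Activated features per crate after unification.
pub type Features = BTreeMap<String, BTreeSet<String>>;

/// Unions (crate, feature) requests, roughly the way feature unification
/// merges requests from all dependents of a crate.
pub fn unify(requests: &[(&str, &str)]) -> Features {
    let mut out = Features::new();
    for (krate, feature) in requests {
        out.entry(krate.to_string())
            .or_default()
            .insert(feature.to_string());
    }
    out
}

/// Lists features that are active in the local build but would not be
/// active for someone building the published crate.
pub fn local_only(local: &Features, published: &Features) -> Vec<(String, String)> {
    let mut extra = Vec::new();
    for (krate, features) in local {
        for feature in features {
            let in_published = published
                .get(krate)
                .map_or(false, |set| set.contains(feature));
            if !in_published {
                extra.push((krate.clone(), feature.clone()));
            }
        }
    }
    extra
}
```

If the regular dependencies request `shared/std` and an expansion dependency additionally requests `shared/extra`, the merged local build has `extra` enabled while the published crate does not, which is exactly the mismatch that is hard to reproduce.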

Something I don't think I've seen addressed yet is how feasible the proc-macro side of this is within the compiler / cargo. While we have "cargo expand", I'm assuming we can't use it 100% the same way, e.g. we would only be expanding some macros.

I do think we can just use dev-dependencies for this rather than a new kind of dependency.

That seems expensive in situations such as when used as a git dependency. You would start having to build all dev-dependencies for all your unpublished dependencies.

What about a field on regular deps to indicate they belong in this (virtual) table? I would worry about publishing untested source code if these tables get desynced (and the publish-generated code doesn't match what anything else is actually using).

Shouldn't it be build dependencies? The macro would run on the host machine, not the target machine.

Good point about host dependencies. The challenge with build dependencies is knowing which build dependencies we can strip and which we have to leave, since not everything will be able to be pre-expanded, even within a single crate.