Pre-RFC: procmacros implemented in wasm

ishitatsuyuki · September 8, 2019, 1:06am

I'm not in favor of this, mainly because I think having another runtime (WASM) is overkill in this case.

Sandboxing isn't something only achievable through WASM - using Linux as an example, you can trivially restrict network access and boom, a lot of the attack surface is gone. In general, we should only do sandboxing when it's low-hanging fruit, and leave the rest to endpoint security software.

Determinism and dependency tracking aren't things you should do with WASM either - you don't have to "automate" dependency tracking so much that you hook into the filesystem APIs. Proc macro developers can do that reasoning on their own.

Finally, about IDEs - I think it might be a fair point, although are there many proc macros that run in unbounded time? The current issue is that Cargo doesn't optimize them at all; in reality they shouldn't really cause problems for editors.

Please also note that the proc-macro RPC APIs are fairly high-overhead; they can only run without a performance hit when the macro runs in the same process and on the same thread. See https://github.com/rust-lang/rust/issues/56058.

est31 · September 8, 2019, 1:37am

IIRC wasm runtimes allow you to run the wasm code in the same process.

quininer · September 8, 2019, 3:53am

Does it allow proc macro to write files? I am trying to generate some trace files at compile time using proc macro.

gnzlbg · September 8, 2019, 12:08pm

I think "trivially" doesn't convey how easy this is to do today.

If you install cross (cargo install cross), then cross build will actually build your crate, including its dependencies, inside a Docker container with limited permissions, e.g., the root crate directory is read-only, only the target/ directory can be written to, no directory except for the root crate can be read, only those programs available inside the Docker container can actually be executed, you control which environment variables are available to the Docker container via a toml file, etc.

When using cross build, all of this applies not only to proc macros, but also to build scripts, and no modifications in the compiler side are necessary.

It wouldn't be hard to extend cross to support, e.g., reading a configuration file that allows configuring the permissions of the Docker containers in a more fine-grained way.

These restrictions aren't only useful for security. For example, I've noticed that they also encourage good practices like requiring crates to generate files into the OUT_DIR instead of putting them in the root crate directory, resulting in an "always-clean" root crate directory.

comex · September 8, 2019, 4:27pm

Easy if you’re targeting Linux, maybe.

gnzlbg · September 8, 2019, 4:52pm

Easy if you are using a Linux host; many Docker containers are provided, so you can target a lot of platforms with it.

AFAIK nobody has tried to add support for using cross on any platform other than Linux, but Windows and OSX also support Docker, and other operating systems support similar tools (e.g. FreeBSD has jails).

pietroalbini · September 8, 2019, 7:59pm

Docker for macOS only supports Linux containers and runs them inside a Linux VM, so you can't develop native things with it. Docker for Windows does support Windows containers, but only on Windows 10 Pro (or higher) if you can get Hyper-V running (so it's not trivial inside a VM). I wouldn't consider any of those a practical solution.

CensoredUsername · September 9, 2019, 11:57am

I'd like for it to stay possible for proc-macros to persist some kind of state between invocations. dynasm-rs (CensoredUsername/dynasm-rs on GitHub, a dynasm-like tool for Rust) needs to be able to recall the current target architecture, target features and any register aliases between invocations, for instance.

jsgf · September 9, 2019, 3:46pm

I had been thinking about each individual invocation being done with clean state, but this is an interesting case. I guess there are no real determinism concerns about this. It could be implemented either by having the procmacro return a state back to the compiler which passes it in next time, or just by preserving the wasm state from call to call.

For now I'd leave this as a case which can't be sandboxed, but it doesn't sound too hard to handle.
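
A minimal sketch of the first option (the procmacro returns its state and the compiler passes it back in on the next call). Everything here is hypothetical: `MacroState`, `expand` and `run_all` are invented names, the `.arch` directive is only a stand-in loosely modeled on the dynasm-rs case, and it assumes the compiler would fix some invocation order, which is not guaranteed today.

```rust
/// Hypothetical state a macro wants to remember between invocations.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct MacroState {
    pub target_arch: Option<String>,
}

/// One sandboxed invocation, written as a pure function of
/// (input, previous state). It returns the expansion plus the state
/// the compiler should hand back on the next call.
pub fn expand(input: &str, mut state: MacroState) -> (String, MacroState) {
    if let Some(arch) = input.strip_prefix(".arch ") {
        // A directive only updates the state and expands to nothing.
        state.target_arch = Some(arch.trim().to_string());
        (String::new(), state)
    } else {
        let arch = state.target_arch.as_deref().unwrap_or("unknown");
        let output = format!("/* assemble `{}` for {} */", input, arch);
        (output, state)
    }
}

/// Stand-in for the compiler side: it owns the state and threads it
/// through the invocations in a fixed order.
pub fn run_all(inputs: &[&str]) -> Vec<String> {
    let mut state = MacroState::default();
    let mut outputs = Vec::new();
    for input in inputs {
        let (output, next) = expand(input, state);
        state = next;
        outputs.push(output);
    }
    outputs
}
```

Because the state is an explicit input and output, replaying the same invocations in the same order gives the same expansions, so there is nothing hidden for the sandbox to lose between calls.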

jsgf · September 9, 2019, 4:05pm

Using Docker-style containers is a useful technique for certain build situations, and I think there's a fruitful conversation to be had about it elsewhere.

However, it isn't a good solution for the cases I'm targeting (ie, fine-grained build determinism). Specifically:

It's coarse-grained - it's relatively easy to encapsulate an entire build, but not at a finer grain
It's extremely heavyweight - it requires an entire virtual environment to be set up and maintained
It's very system-specific. While Docker supports a limited number of OS environments, other solutions exist for other OSs. But there isn't a single solution for all OS environments.
It's relatively leaky - you can control network access and file access to a filesystem level, but you can't prevent processes from accessing time of day or randomness.
It doesn't integrate with other build systems. It assumes that the build target is mostly Rust, which isn't necessarily the case.

While this proposal requires changes to the compiler, it is ultimately completely optional. If you don't want the weight of a wasm runtime, then I'd expect it can be configured out. It won't have any operational effect on what procmacros can be used or what crates can be built.

jsgf · September 9, 2019, 4:12pm

Good question. The way I see it would be:

Firstly, the Cargo.toml for the procmacro would have a flag indicating whether the procmacro supports this kind of sandboxing at all. Procmacros which are not intended to be deterministic (talk to network services, depend on real time, etc) would exclude themselves. Setting this flag would also indicate that it's a candidate for prebuilt-caching, though I think we'd want an additional separate control for that (including options like what feature flags it should have set).

Then a user of the crate would have three options:

Never sandbox
Sandbox where possible
Disallow non-sandboxed procmacros

depending on their requirements. (I don't think "never use sandboxable procmacros" is a useful option.)
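
To make the interaction between the two settings concrete, here is a rough sketch. Nothing in it exists in Cargo or rustc: `SandboxPolicy`, `Decision` and `decide` are made-up names, and `supports_sandbox` stands for the hypothetical Cargo.toml flag set by the procmacro's author.

```rust
/// The three user-side options listed above (hypothetical names).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SandboxPolicy {
    Never,
    WherePossible,
    Require,
}

/// What the build would do with one particular procmacro.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Decision {
    RunNative,
    RunSandboxed,
    Reject,
}

/// `supports_sandbox` models the flag the procmacro author would set
/// in Cargo.toml; the policy is chosen by the user of the crate.
pub fn decide(policy: SandboxPolicy, supports_sandbox: bool) -> Decision {
    match (policy, supports_sandbox) {
        (SandboxPolicy::Never, _) => Decision::RunNative,
        (SandboxPolicy::WherePossible, true) => Decision::RunSandboxed,
        (SandboxPolicy::WherePossible, false) => Decision::RunNative,
        (SandboxPolicy::Require, true) => Decision::RunSandboxed,
        // Only this combination fails the build.
        (SandboxPolicy::Require, false) => Decision::Reject,
    }
}
```

Under this reading, only the strictest policy can reject a crate, and only for a procmacro that has opted out of sandboxing.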

Perhaps, but I think we'd need a solid amount of practical experience to see how it actually works out. I definitely consider this an experiment for now. However if it turns out that non-sandboxable procmacros are not actually useful in practice - or are only used for very specialized cases - then maybe it would make sense to assume sandboxable and sandboxed by default.

pietroalbini · September 11, 2019, 9:18pm

Just adding a note: any proposal about pre-building things on the Rust project infrastructure should be approved by the infra team, both on the capacity/budget side and on the security side.

pygy · September 14, 2019, 5:33pm

Wouldn’t it make sense to leave crates.io as is, and develop an independent proxy, that defers to crates.io for source code and compiles/caches relevant binaries?

eddyb · September 26, 2019, 12:38pm

I wish we broke that when we had the chance because this was never officially supported. Among other things, the evaluation order isn't guaranteed (cc @petrochenkov can we randomize it?).

I'm not sure how to start a broader discussion on this but we do have plans for at least linting against that sort of thing (if we can bring the false positives down low enough).

petrochenkov · September 26, 2019, 2:19pm

Partially.
Some order is guaranteed - if macro call x expands into a macro call y or y's definition, then y certainly expands after x.
(I don't think we'll ever want piece-wise expansion of a single macro, which is what would be required to break that guarantee.)

CensoredUsername · October 1, 2019, 12:14am

I understand and it's one of the reasons this project is still completely unstable. But as it allows for some pretty cool features I'm just asking if people would be open for an official way of supporting it.

eddyb · October 1, 2019, 10:42am

Usually you make something like this work by putting the entire scope within which you need some information in an invocation - attribute macros work great for this.

You don't even need to process most of the source yourself, usually you can inject a const or macro_rules! definition that provides that contextual information and then the other expanded code relies on that instead of state within the proc macro.
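
A small illustration of that idea, with a `macro_rules!` macro standing in for the attribute proc macro so the snippet compiles on its own. The names `with_target`, `TARGET_ARCH` and `describe` are invented for this sketch and are not taken from dynasm-rs.

```rust
/// Stand-in for an attribute macro wrapping a whole scope: it injects a
/// const carrying the context, then emits the wrapped items unchanged.
macro_rules! with_target {
    ($arch:literal; $($items:item)*) => {
        // The context lives in the expanded code, not in macro-side state.
        pub const TARGET_ARCH: &str = $arch;
        $($items)*
    };
}

with_target! {
    "x64";

    // Code in the wrapped scope reads the injected const instead of
    // relying on the macro remembering an earlier invocation.
    pub fn describe() -> String {
        format!("assembling for {}", TARGET_ARCH)
    }
}
```

Since items are not subject to `macro_rules!` hygiene, the injected const is visible to the wrapped code, and each invocation stays independent of every other one.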

luser · October 23, 2019, 1:56pm

I was looking through the sccache source for something today and I realized that implementing this would benefit users of sccache's distributed compilation mechanism. Currently sccache will refuse to distribute any Rust compilation that uses a proc macro if the client system isn't exactly the same OS and CPU architecture as the build server (currently only x86-64 Linux is supported for the build server) because even with cross-toolchains rustc on a Linux server obviously can't load a Windows dll or macOS dylib proc macro plugin. If it was wasm this would work, however!
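
The rule described above, as a simplified model. This is not sccache's actual code: `Platform`, `ProcMacroKind` and `can_distribute` are hypothetical, and the `Wasm` variant only exists under this proposal.

```rust
/// Host platform of a machine taking part in the build (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Platform {
    pub os: &'static str,
    pub arch: &'static str,
}

/// How the compilation's proc macros, if any, are packaged.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ProcMacroKind {
    NoProcMacro,
    NativeDylib,
    Wasm,
}

/// Can this compilation be sent to the build server?
pub fn can_distribute(client: Platform, server: Platform, kind: ProcMacroKind) -> bool {
    match kind {
        // A cross-toolchain on the server is enough in these cases.
        ProcMacroKind::NoProcMacro | ProcMacroKind::Wasm => true,
        // A native plugin built for the client only loads on an
        // identical OS and CPU architecture.
        ProcMacroKind::NativeDylib => client == server,
    }
}
```

For example, a Windows x86-64 client and a Linux x86-64 server give `false` for `NativeDylib` and `true` for `Wasm`.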

system · January 21, 2020, 1:56pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
