Pre-RFC: procmacros implemented in wasm

I'm not in favor of this, mainly because I think having another runtime (WASM) is overkill in this case.

Sandboxing isn't something only achievable through WASM. Using Linux as an example, you can trivially restrict network access, and a large part of the attack surface is gone. In general, we should only do sandboxing where it's low-hanging fruit and leave the rest to endpoint security software.

Determinism and dependency tracking aren't things you need WASM for either: you don't have to "automate" dependency tracking to the point of hooking into the filesystem APIs. Proc macro developers can do that reasoning on their own.

Finally, about IDEs: I think it might be a fair point, but are there many proc macros that run in unbounded time? The current issue is that Cargo doesn't optimize them at all; in reality they shouldn't cause problems for editors.

Please also note that the proc-macro RPC APIs are fairly high-overhead; they can only run without a performance hit when the macro runs in the same process and on the same thread. See https://github.com/rust-lang/rust/issues/56058.

1 Like

IIRC wasm runtimes allow you to run the wasm code in the same process.

Would it allow proc macros to write files? I am trying to generate some trace files at compile time using a proc macro.

1 Like

I think "trivially" doesn't convey how easy this is to do today.

If you install cross (cargo install cross), then cross build will build your crate, including its dependencies, inside a Docker container with limited permissions. For example: the root crate directory is read-only; only the target/ directory can be written to; no directory except for the root crate can be read; only programs available inside the Docker container can be executed; and you control which environment variables are available to the container via a toml file.

When using cross build, all of this applies not only to proc macros, but also to build scripts, and no modifications in the compiler side are necessary.

It wouldn't be hard to extend cross to support, e.g., reading a configuration file that allows configuring the permissions of the Docker containers in a more fine-grained way.

These restrictions aren't only useful for security. For example, I've noticed that they also encourage good practices like requiring crates to generate files into the OUT_DIR instead of putting them in the root crate directory, resulting in an "always-clean" root crate directory.
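As an illustration of the OUT_DIR practice, here is a minimal sketch of what a build script might do. The helper names (write_generated, generate_from_env) and the generated file name are made up for this example; the only real convention assumed is that Cargo sets the OUT_DIR environment variable when it runs a build script.

```rust
use std::env;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: write a generated file under `out_dir` and return its path.
/// Nothing is ever written to the (possibly read-only) root crate directory.
pub fn write_generated(out_dir: &Path, file_name: &str, contents: &str) -> io::Result<PathBuf> {
    fs::create_dir_all(out_dir)?;
    let path = out_dir.join(file_name);
    fs::write(&path, contents)?;
    Ok(path)
}

/// Hypothetical entry point that a build script's `main` could call.
/// Cargo sets OUT_DIR for build scripts; outside of Cargo this returns an error.
pub fn generate_from_env() -> io::Result<PathBuf> {
    let out_dir = env::var_os("OUT_DIR")
        .map(PathBuf::from)
        .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "OUT_DIR is not set"))?;
    write_generated(&out_dir, "generated.rs", "pub const ANSWER: u32 = 42;\n")
}
```

The crate itself would then pull the file in with include!(concat!(env!("OUT_DIR"), "/generated.rs")), so the source tree stays clean.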

3 Likes

Easy if you’re targeting Linux, maybe.

2 Likes

Easy if you are using a Linux host; many Docker containers are provided, so you can target a lot of platforms with it.

AFAIK nobody has tried to add support for using cross on any platform other than Linux, but Windows and OSX also support Docker, and other operating systems support similar tools (e.g. FreeBSD has jails).

Docker for macOS only supports Linux containers and runs them inside a Linux VM, so you can't develop native things with it. Docker for Windows does support Windows containers, but only on Windows 10 Pro (or higher) if you can get Hyper-V running (so it's not trivial inside a VM). I wouldn't consider any of those a practical solution.

5 Likes

I'd like for it to stay possible for proc-macros to persist some kind of state between invocations. https://github.com/CensoredUsername/dynasm-rs, for instance, needs to be able to recall the current target architecture, target features and any register aliases between invocations.

1 Like

I had been thinking about each individual invocation being done with clean state, but this is an interesting case. I guess there are no real determinism concerns about this. It could be implemented either by having the procmacro return a state back to the compiler which passes it in next time, or just by preserving the wasm state from call to call.

For now I'd leave this as a case which can't be sandboxed, but it doesn't sound too hard to handle.
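To make the first option concrete (the procmacro returns a state and the compiler passes it back in next time), here is a rough sketch. Everything in it is hypothetical: MacroState, expand and run_invocations are illustration names, not an existing proc-macro API, and plain strings stand in for token streams.

```rust
use std::collections::HashMap;

/// Hypothetical opaque state that a macro hands back to the compiler.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct MacroState {
    pub values: HashMap<String, String>,
}

/// Hypothetical sandboxed entry point: a pure function of (input, previous state).
pub fn expand(input: &str, prev: MacroState) -> (String, MacroState) {
    let mut state = prev;
    // A directive such as "arch = x64" updates the state and expands to nothing.
    if let Some((key, value)) = input.split_once('=') {
        state.values.insert(key.trim().to_string(), value.trim().to_string());
        return (String::new(), state);
    }
    // Anything else is expanded using whatever the state currently records.
    let arch = state.values.get("arch").map(String::as_str).unwrap_or("unknown");
    (format!("/* {} for {} */", input.trim(), arch), state)
}

/// Compiler side: thread the returned state through successive invocations.
pub fn run_invocations(inputs: &[&str]) -> Vec<String> {
    let mut state = MacroState::default();
    let mut outputs = Vec::new();
    for input in inputs {
        let (output, next) = expand(input, state);
        state = next;
        outputs.push(output);
    }
    outputs
}
```

Because each call only sees its input and the state from the previous call, the result stays deterministic as long as the invocation order is fixed.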

Using Docker-style containers is a useful technique for certain build situations, and I think there's a fruitful conversation to be had about it elsewhere.

However, it isn't a good solution for the cases I'm targeting (i.e., fine-grained build determinism). Specifically:

  • It's coarse-grained - it's relatively easy to encapsulate an entire build, but not at a finer grain
  • It's extremely heavyweight - it requires an entire virtual environment to be set up and maintained
  • It's very system-specific. Docker supports a limited number of OS environments, and other solutions exist for other OSs, but there isn't a single solution for all OS environments.
  • It's relatively leaky - you can control network access and file access at the filesystem level, but you can't prevent processes from accessing time of day or randomness.
  • It doesn't integrate with other build systems. It assumes that the build target is mostly Rust, which isn't necessarily the case.

While this proposal requires changes to the compiler, it is ultimately completely optional. If you don't want the weight of a wasm runtime, then I'd expect it can be configured out. It won't have any operational effect on what procmacros can be used or what crates can be built.

3 Likes

Good question. The way I see it would be:

Firstly, the Cargo.toml for the procmacro would have a flag indicating whether the procmacro supports this kind of sandboxing at all. Procmacros which are not intended to be deterministic (talk to network services, depend on real time, etc) would exclude themselves. Setting this flag would also indicate that it's a candidate for prebuilt-caching, though I think we'd want an additional, separate control for that (including options like what feature flags it should have set).

Then a user of the crate would have three options:

  • Never sandbox
  • Sandbox where possible
  • Disallow non-sandboxed procmacros

depending on their requirements. (I don't think "never use sandboxable procmacros" is a useful option.)
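As a sketch of how the crate-side flag and the three user-side options could combine, assuming the names below (MacroSupport, UserPolicy, Decision and decide are all invented for illustration and are not part of Cargo or rustc):

```rust
/// What the procmacro crate declares about itself (the hypothetical Cargo.toml flag).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MacroSupport {
    Sandboxable,
    NotSandboxable,
}

/// What the user of the crate asks for.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum UserPolicy {
    /// Never sandbox.
    Never,
    /// Sandbox where possible.
    WherePossible,
    /// Disallow non-sandboxed procmacros.
    Require,
}

/// What the compiler would end up doing for one procmacro.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Decision {
    RunNative,
    RunSandboxed,
    Reject,
}

/// Combine the user's policy with what the procmacro declares.
pub fn decide(policy: UserPolicy, support: MacroSupport) -> Decision {
    match (policy, support) {
        (UserPolicy::Never, _) => Decision::RunNative,
        (UserPolicy::WherePossible, MacroSupport::Sandboxable) => Decision::RunSandboxed,
        (UserPolicy::WherePossible, MacroSupport::NotSandboxable) => Decision::RunNative,
        (UserPolicy::Require, MacroSupport::Sandboxable) => Decision::RunSandboxed,
        (UserPolicy::Require, MacroSupport::NotSandboxable) => Decision::Reject,
    }
}
```

Only the last combination fails the build; in every other case the crate still compiles, which matches the idea that the feature is optional.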

Perhaps, but I think we'd need a solid amount of practical experience to see how it actually works out. I definitely consider this an experiment for now. However, if it turns out that non-sandboxable procmacros are not actually useful in practice - or are only used for very specialized cases - then maybe it would make sense to assume sandboxable and sandboxed by default.

1 Like

Just adding a note: any proposal about pre-building things on the Rust project infrastructure should be approved by the infra team, on both the capacity/budget side and the security side.

6 Likes

Wouldn’t it make sense to leave crates.io as is, and develop an independent proxy that defers to crates.io for source code and compiles/caches the relevant binaries?

2 Likes