Sandbox build.rs (and possibly proc-macro) by providing a runner as env variable

I am one of the maintainers of cargo-quickinstall, and one of the challenges we face is sandboxing the compilation process, since build.rs and proc-macro can essentially do whatever they want, going even as far as trying to access GHA tokens or mess with the workflow itself.

Thus, I think we should have a sandboxing mechanism in Rust for build.rs and proc-macro.

What I propose is a new environment variable RUSTC_BUILD_OVERRIDES_RUNNER.

When this environment variable is present, instead of running build.rs or the proc-macro directly, rustc will run the command specified by the env variable and pass the path to build.rs and its args to it, like this:

$RUSTC_BUILD_OVERRIDES_RUNNER build-rs /path/to/build.rs /path/to/build.rs/crate args-for-build.rs...

For proc-macro, this is a bit harder since it is loaded as a dynlib, so the natural thing to do is to pass rustc or a shim binary that loads the proc-macro crate and then communicates with rustc over stdin and stdout:

$RUSTC_BUILD_OVERRIDES_RUNNER proc-macro /path/to/shim/or/rustc /path/to/proc-macro/dynlib args...
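To make the two invocation shapes above concrete, here is a minimal sketch of how a runner binary might parse its arguments before deciding how to sandbox. Everything here (`RunnerRequest`, `parse_request`, the field names) is hypothetical and only mirrors the command lines proposed in this post; nothing like this exists in rustc or cargo today.

```rust
use std::path::PathBuf;

/// What the (hypothetical) runner was asked to execute.
#[derive(Debug, PartialEq)]
enum RunnerRequest {
    /// `build-rs <build script> <crate dir> [args...]`
    BuildRs { script: PathBuf, crate_dir: PathBuf, args: Vec<String> },
    /// `proc-macro <shim or rustc> <proc-macro dynlib> [args...]`
    ProcMacro { host: PathBuf, dylib: PathBuf, args: Vec<String> },
}

/// Parse the arguments that follow the runner's own name (argv[1..]).
fn parse_request(argv: &[String]) -> Result<RunnerRequest, String> {
    match argv {
        [kind, first, second, rest @ ..] => {
            let first = PathBuf::from(first);
            let second = PathBuf::from(second);
            let args = rest.to_vec();
            match kind.as_str() {
                "build-rs" => Ok(RunnerRequest::BuildRs {
                    script: first,
                    crate_dir: second,
                    args,
                }),
                "proc-macro" => Ok(RunnerRequest::ProcMacro {
                    host: first,
                    dylib: second,
                    args,
                }),
                other => Err(format!("unknown runner mode: {other}")),
            }
        }
        _ => Err("expected: <mode> <path> <path> [args...]".to_string()),
    }
}
```

A real runner would then pick a sandbox policy per variant (for example, no network for `ProcMacro`) and spawn the target inside it; that part is deliberately left out because it depends entirely on the sandbox you choose.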

For proc-macro, it is certainly better to have them compiled down to wasm and run in an interpreter, but this seems to be taking longer than expected and has an additional problem:

In order to update transitive dependencies of the proc-macro crates, you would have to wait for upstream to re-compile and publish another release, which seems cumbersome to me.

It also adds overhead, as many proc-macro crates depend on proc-macro2 and syn.

I think in order for this to work, we would have to figure out how to link proc-macro crates with their dependencies dynamically, so that users can upgrade them at will while reducing the size of the proc-macro crates on crates.io and on users' computers.

But repr(interoper) is simply not there yet and, judging by the RFC, it would likely take years to finish. That would block compiling proc-macro to wasm for a long time, and there is demand to sandbox them today.

So I propose this simple mechanism for both build.rs and proc-macro right now, and if proc-macro to wasm is supported in the future, we can simply retire RUSTC_BUILD_OVERRIDES_RUNNER support for proc-macros that are compiled to wasm.

Have you considered isolating the entire build process in a container? That seems like a much simpler option that wouldn't require any extra dev work in cargo/rustc.

That is certainly doable, but then I can't use sccache to speed up the CI. Part of the reason I want build.rs and proc-macro sandboxed is so that I can use sccache.

Also, isolating build.rs and proc-macro separately from the rest of the build has additional benefits:

They will not be able to tamper with the build process, e.g. by accessing the source code of other crates or by interfering with the build process via /proc.

A whole-build sandbox can make the source directories read-only so it couldn't tamper with other crates' source. One thing I have seen in the wild that this could additionally prevent is a build script that reaches up out of its OUT_DIR and searches for a different crate's OUT_DIR (though, technically this should be allowed, at least read-only, if the dependency crate uses links and passes a path into its OUT_DIR through a DEP_ variable).
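As an illustration of the "reaches up out of its OUT_DIR" case, here is a rough sketch of the kind of check a sandbox policy could apply to a path a build script asks for. The helper names (`normalize`, `stays_inside`) are made up for this example, and the check is purely lexical: it does not resolve symlinks, so a real sandbox would still need kernel-level enforcement (mount namespaces, read-only binds, etc.) rather than relying on something like this.

```rust
use std::path::{Component, Path, PathBuf};

/// Lexically resolve `.` and `..` without touching the filesystem.
/// Assumption: symlinks are ignored, so this is only an illustration.
fn normalize(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for comp in path.components() {
        match comp {
            // `..` drops the last pushed component (no-op at the root).
            Component::ParentDir => {
                out.pop();
            }
            // `.` changes nothing.
            Component::CurDir => {}
            other => out.push(other.as_os_str()),
        }
    }
    out
}

/// Does `requested` (relative to `out_dir`, or absolute) stay inside `out_dir`?
fn stays_inside(out_dir: &Path, requested: &Path) -> bool {
    // `join` replaces the base entirely when `requested` is absolute.
    normalize(&out_dir.join(requested)).starts_with(normalize(out_dir))
}
```

The DEP_ exception mentioned above would need an extra allow-list of read-only paths on top of this; the sketch only covers the default "stay in your own OUT_DIR" rule.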

Why can't you use sccache in a container?

Will this sandboxing continue to allow build.rs to generate code in support of the crate; for instance libraries in other languages or FFI code or ... ?

I would love to see something like a build.toml (IDK about the name) that states the privileges build.rs needs.

  • You put build.rs into a sandbox with exactly the right privileges
  • Tools and humans can inspect the privileges.
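To sketch what "stated privileges" could look like once loaded into a tool, here is a purely hypothetical in-memory model. No `build.toml` format exists; the struct, its fields and the method names are all assumptions made up for illustration, and the file parsing itself is left out.

```rust
/// Hypothetical privileges a build script could declare up front.
#[derive(Debug, Default)]
struct BuildPrivileges {
    /// May the build script open network connections?
    network: bool,
    /// Programs the build script is allowed to spawn (e.g. "cc").
    exec: Vec<String>,
}

impl BuildPrivileges {
    /// Would a sandbox built from this declaration let `program` run?
    fn allows_exec(&self, program: &str) -> bool {
        self.exec.iter().any(|p| p == program)
    }

    /// One-line summary that a human or an audit tool can inspect.
    fn summary(&self) -> String {
        format!(
            "network: {}, exec: [{}]",
            if self.network { "yes" } else { "no" },
            self.exec.join(", ")
        )
    }
}
```

The same declaration would serve both bullet points: a sandbox can derive its rules from it, and reviewers can read `summary()`-style output before trusting a crate.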

It's not that we cannot use it, but rather we cannot reuse the sccache cache on other packages.

Since build.rs and proc-macro can tamper with the build process however they want, we cannot trust the cached content of sccache anymore.

Yes, since the mechanism I described here enables you to decide how to sandbox the build.rs and proc-macro.

Yes, build.rs tampering with other crates' OUT_DIR or the build process is what I want to prevent here.

I believe sandboxing each build.rs and proc-macro is the only way to ensure they don't tamper with other crates' builds.

I don't think that sandboxing build.rs and proc macros would be enough for you to be able to trust a shared sccache for cargo-quickinstall.

The threat model for standard supply chain attacks is "someone tampers with one of the packages that I have written down as trusted, or that one of those packages has written down as trusted (transitively)". Each of these packages is trusted to execute arbitrary code on your target, so as soon as someone tampers with one of these packages, the game is lost. There typically isn't much point in thinking up mitigations once this has happened. Things like seccomp and wasm might help a bit, but they're basically security theatre once a supply chain attack has happened.

Once you're in this mindset, there doesn't seem to be much point in hardening your compiler against malicious code. It is so easy for malicious code to do damage at runtime that guarding against malicious code at compile time feels pointless. Any vulnerability in the compiler would rightfully be assigned low priority as a result. This is why sandboxing build.rs and proc macros is so low on the priority list.

Now consider the implications for the shared sccache use case (assuming that you have sandboxed build.rs and proc macros already):

  • Someone finds a bug in the compiler that allows arbitrary code execution at compile time. It is marked as low priority, so they probably have ages to work out how to exploit it.
  • They use it to take over an sccache runner by uploading a malicious package to crates.io and triggering a build. They know the hash of a package configuration for a popular trusted package that has not been built by sccache yet, so they pretend that they have built that package, and upload their own version of it to the cache.
  • Users of the shared sccache download what they think is the trusted package but end up executing malicious code that they never asked for.
  • The shared sccache project now has a critical security incident that is caused by a low priority bug in the Rust compiler.

The only way to avoid this is to sandbox the sccache builder sufficiently tightly that:

  • It is only ever allowed to build a single package before being destroyed.
  • It never has access to any secrets that can be used to upload to the shared cache - some other process outside the sandbox must be responsible for doing this.
  • The sandbox is bulletproof. Docker containers won't cut it - it needs to be firecracker or runsc or some similar grade of VM-strength sandbox.

This is basically what cargo-quickinstall does for bin crates. We use GitHub runners as our sandbox boundary. There is no documentation on how strong the sandbox is for these things, as far as I can see. I've not looked into how hard this would be for sccache. I tried doing something similar in the form of cargo-quickbuild, but my understanding of proc macro crates and the cargo dependency resolver stumped me so I gave up.

(Sorry if this is unintelligible - it was written at 4am on an overnight coach from London to Durham, because it turns out I can't sleep on coaches)

If you're building a binary intending to run it, I agree. And that's the case here. (But the point of sandboxing here isn't for security AIUI, it's for reproducibility.)

cargo run is obviously unsafe for untrusted code. But buildscript/procmacro code isn't just run for cargo run. It's also run for cargo check. It's much less obvious that cargo check is an attack vector, and cargo check often gets run automatically by IDEs or other processes (e.g. cargo watch).

It's a smidgen better than npm where the cargo add equivalent runs an install script[1], but not much — combined with an IDE doing cargo check on save or with proc macro support enabled, cargo add typosquatted can still end up running untrusted code.

The crates.io team does remove actively malicious crates exploiting this attack vector once they're known, but there's always a period of time between upload and removal where the attack is possible. Even though it's not a complete solution, the ability to run buildscripts/procmacros in a wasi sandbox or otherwise would still be a material improvement over the current status quo.


  1. Since JavaScript doesn't have a separate build step, this is the moral equivalent of a buildscript.

@alsuren Thanks for the writeup. I suppose for quickinstall we probably want a sandbox per rustc invocation, which can be accomplished with RUSTC_WRAPPER.
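For reference, Cargo invokes a RUSTC_WRAPPER as `<wrapper> <path-to-rustc> <rustc args...>`, so a per-invocation sandbox boils down to re-launching that command line under some sandbox launcher. Below is a minimal sketch of that step; `wrap_rustc` is a made-up helper and the launcher (`my-sandbox` in the usage note) is a placeholder for whatever tool you actually use, not a real program.

```rust
use std::ffi::OsString;
use std::process::Command;

/// Build the command a RUSTC_WRAPPER could spawn.
///
/// `sandbox_prefix` is the (hypothetical) launcher plus its own flags;
/// `wrapper_argv` is what Cargo passed to the wrapper, i.e. the path to
/// rustc followed by rustc's arguments. Returns None if either is empty.
fn wrap_rustc(sandbox_prefix: &[&str], wrapper_argv: &[OsString]) -> Option<Command> {
    let (launcher, launcher_args) = sandbox_prefix.split_first()?;
    let (rustc, rustc_args) = wrapper_argv.split_first()?;

    let mut cmd = Command::new(launcher);
    // Resulting argv: <launcher> <launcher flags> <rustc> <rustc args...>
    cmd.args(launcher_args).arg(rustc).args(rustc_args);
    Some(cmd)
}
```

In an actual wrapper binary, `main` would collect `std::env::args_os().skip(1)`, call something like `wrap_rustc(&["my-sandbox", "--"], &argv)`, run the command with `status()`, and exit with the child's exit code so Cargo sees rustc's result unchanged.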

But IDK how to use sccache while sandboxing; I suppose that needs sccache itself to support sandboxing in its config/env.

I still think having a per-build.rs/proc-macro sandbox would be great for general use, especially for developers, as pointed out by @CAD97.

I just realised that compiling build.rs to WASI is completely useless for sandboxing purposes once you allow it to execute external commands, which it absolutely needs in order to compile and link against system-wide libs, unless you have cc/cxx, ld/ldd, make, ninja, cmake, meson and pkg-config available as libraries to link with.

So for build.rs, we can only use the methods described in this post: either use RUSTC_WRAPPER to sandbox the entire rustc invocation, or have a dedicated sandbox for build.rs.