Sandbox build scripts and proc macros

In npm, node-ipc's author recently launched a supply chain attack using npm's install-script functionality. npm is working on removing these install scripts, but Rust has the same issue with build scripts and proc-macros.

Rust already has Miri and the CTFE engine. We should use these to implement sandboxing of build scripts and proc-macros, to protect build systems against these attacks.

These must be opt-in due to backward compatibility, but one day they may be restricted entirely in an edition, so that Rust does not repeat npm's security problems. Build scripts and proc-macros have unrestricted access to files and the network, and cannot even be opted out of.
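To illustrate why the unrestricted access matters: a build script is just an ordinary Rust program, so nothing stops it from reading arbitrary files. A minimal, hypothetical sketch (the function name `leaked_len` is illustrative, not from any real crate):

```rust
use std::fs;

// In a malicious build.rs, main() could call this on paths like
// ~/.ssh/id_rsa and send the contents over a std::net::TcpStream;
// nothing in cargo forbids either step.
fn leaked_len(path: &str) -> usize {
    fs::read_to_string(path).map(|s| s.len()).unwrap_or(0)
}
```

The same applies verbatim to proc-macros, which run inside the compiler process at build time.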


Sandboxing is already being discussed. But note that even if build scripts and proc macros are sandboxed, a library could still include malicious code in the actual functions people call, or in _init or similar mechanisms run on load, or many other places.

Something like WASI module linking where each library has minimal privilege would help with that. But in the absence of that, a malicious library is dangerous whether or not you sandbox what it can do at build time.


Note that cargo doesn't really have an equivalent to npm's install scripts. The equivalent of npm install is cargo add, and adding a dependency in cargo does not run any custom code.

There's a general positive opinion of sandboxing build scripts and proc macros (the typical suggestion is using wasm; this has the added benefit of being able to provide pre-compiled versions.)

That said, here's the general argument for why this isn't higher priority:

Generally, doing cargo build implies that you're going to cargo run. At that point, it doesn't matter whether cargo build was sandboxed, because you're running the code on your machine anyway.

For the purpose of reviewing an untrusted library, you can avoid running any untrusted code just by not running cargo check. (You can assume that it builds.) I believe both VSCode and IDEA now have a feature that asks you if you trust the author of a workspace, and does not enable any features that require running untrusted code (i.e. check on save, proc macro support) if you don't mark the workspace as trusted.


That's not true.

We could use extra commands to make cargo run safer.

E.g., we could unplug the network cable to disable any potential network transfer during cargo run, but with cargo build we need, by default, to keep the network connection up to download crates.

What's more, sudo -u nobody ./target/debug/bin is a good replacement for cargo run, but we have no replacement for cargo build, since sudo -u nobody cargo build generates

error: could not create home directory: '/.rustup': Permission denied (os error 13)

Actually, "run as nobody" could be good, since it minimizes the damage a malicious program can do.

Sadly, there is no such option in cargo.


You could also pre-download them with cargo vendor (see "cargo vendor" in The Cargo Book).

Certainly. But you can already do things like running cargo run inside a container or VM, which will be more secure than anything that cargo itself can do.

Which is why, as CAD97 mentioned, doing partial sandboxing as part of cargo hasn't been a high priority.


It should use a dedicated user IMO, as two programs running as nobody can still interfere with each other in a potentially dangerous way. For example, I believe that on Windows the Java updater ran its downloader at a low integrity level. This allowed malware running at a low integrity level to manipulate the downloaded Java installer, which would then run at the regular integrity level, making it easy for malware to escalate from low to regular integrity.

It depends.

Sometimes a sandbox is preferable to a VM.

Think about a silly question: how many VMs should you have?

I wrote a crate that could modify rustc if it's accessible.

Which means that if you have several projects and one of them has been poisoned, then all your projects are affected.

There is no official cargo container, and it is not easy to create multiple VM copies to contain the possible damage from malicious crates.

What's more, some crates should be built only on the host machine (e.g., torch-sys needs to bind to CUDA, download the Torch DLL, etc.; doing such things in either a container or a VM does not seem like a wise choice).

The interference between processes running as nobody might not be a critical problem, since it would not affect the main environment.

Here, IMHO, the Java downloader had the same permissions as other regular programs, which actually matches your situation: two programs running as nobody can still interfere with each other in a potentially dangerous way.

Maybe we are talking about the same thing, but you have misunderstood my point.

I mean: cargo runs under normal permissions, and cargo build compiles the build script under normal permissions, too. But once build.rs is compiled, the execution of the resulting build script should run under the permissions of "nobody" -- a lower permission level, so it won't affect the main cargo process.

After the build script has executed, the normal compile procedure runs under normal permissions, too, which might be what you want.


That's a silly question indeed: it's not relevant to the point being discussed. Moreover, with today's technology (hardware and software), you can run a large number of VMs or containers on a single physical machine without any issue.

In practice, build scripts and proc-macros didn't turn out to be a security issue. Usefully sandboxing build scripts and proc-macros is hard, because what are you going to forbid them from doing? Strip them of access to the filesystem? Then how would cc (which writes object files) or pest (which reads an external grammar spec) work? Deny them networking? I legitimately needed that once too.
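To make the constraint concrete, here is a sketch of the legitimate access pattern a sandbox must not break: reading an input file from the package (as pest reads its grammar) and writing generated output under OUT_DIR (as cc does with object files). `generate` is a hypothetical helper that a real build.rs would call from main() with env::var("OUT_DIR"):

```rust
use std::{fs, io, path::Path};

// Writes a generated source file into the build's output directory.
// Any sandbox policy would still have to allow exactly this: reads of
// package-local inputs, writes under OUT_DIR.
fn generate(out_dir: &Path, grammar: &str) -> io::Result<()> {
    fs::write(
        out_dir.join("generated.rs"),
        format!("// generated from {} bytes of grammar\n", grammar.len()),
    )
}
```

A filesystem-sandboxed cargo would presumably whitelist the package directory (read-only) and OUT_DIR (read-write), which covers this pattern while blocking the "read ~/.ssh" case.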

The issue of sandboxing builds came up a long, long time ago. It's not like you thought of this idea first and it should be implemented today, without further consideration, just to please the paranoid. There are far easier security measures available today, which can and should be applied. In particular, code review of the crates you trust is not optional, and it never will be, however cleverly it might be possible to sandbox cargo one day.


Something is always better than nothing. Allowing cargo run as nobody while preserving all the environment variables might solve most of the problems on Linux.

IMHO, running cargo build with nobody's UID/GID and temporarily granting nobody full permissions under the target dir would be great. Since nobody's permissions are very weak, it won't be able to read or write any sensitive file: cc works just fine, networking works, and only insensitive information could be leaked.

Here, nobody is always an option; if it fails, we could switch to the current user with something like cargo build --user $(whoami) --group $(id -gn) on Linux, or a similar command on Windows (--group here could be something like an integrity level on Windows).

You're absolutely correct, but it really depends. Without techniques like sandboxing, we HAVE TO distrust every package we depend on.

Even if we have manually checked them, we could not safely execute cargo build after accidentally updating the index (e.g., by executing cargo install empty-library), since crates might be poisoned.

What's more, runtime malware is easier to detect than build-time malware.

Build-time malware could manipulate rustc, inject malicious code, and then pretend to be a normal crate.

We could not debug such a procedure, since we are not willing to debug cargo.

But if the malicious code occurs in an executable file, at least we could debug it and figure out what injected the malicious code.

We surely cannot trust every crate, but at least a sandbox could save us a lot of undesirable checking.


Yet. If you look at NPM…

They may be granted access to files below the project directory and those under /bin, /lib and /usr, but not to any other files.

Do you need to be able to bind to arbitrary ports at build time? Send e-mail in a build script? Surely not.

I think even a GET-only HTTP client might need to be restricted for reasons that don’t directly pertain to security, like reproducibility of builds.


I'm wary of this argument.

Like, if you need to solve a strategic problem, with 10 sub-problems, and you need to solve all 10 sub-problems in any order to succeed, then you can always say "solving sub-problem X isn't high priority, because even if we do it, we'll still have 9 other sub-problems and we won't have solved anything" and then you never make any progress.

If you believe that achieving sandboxing of cargo dependencies has strategic value (and I do believe that), then getting started on the first step has just as high a priority as getting the entire thing done.

Yes, there are other sub-problems that are a lot harder to tackle (eg preventing arbitrary system calls from dependencies), but these problems will seem a lot less intractable once they're better defined, and they'll seem better defined once we've picked the low-hanging fruit and started sandboxing some parts of the compilation process.

(Also, I'm hoping other features like std-aware cargo and const traits will help)


I don't see how you would trust build scripts even with sandboxing.

Isn't this solved by Cargo.lock? And even then, you can require the exact version of a dependency in Cargo.toml.
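For reference, pinning an exact dependency version in Cargo.toml looks like this (the crate name and version are illustrative):

```toml
[dependencies]
# `=` requires exactly this version; Cargo.lock then records the matching
# checksum, so a tampered-with copy of the crate would fail verification.
some-crate = "=1.2.3"
```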

I don't see your point. What's preventing runtime malware from pretending to be normal code? And why do you have to debug cargo or a runtime program? That's too late, the malware already ran and did damage, if you want to be safe you need to check it before execution, i.e. in the source code, and this is equally possible for build scripts, proc macros and normal library crates.

You'll have to do that check anyway if you want to be safe. Sandboxing will just give you a false sense of security.


Sorry for my unfamiliarity with Cargo.lock.

What about using nm and checking whether strange symbols occur?

E.g., if you find your program has a symbol named


but you are not using anything related to the filesystem, that could be suspicious.

IMHO this is the only safe way to check filesystem access. Another way could be auditing unsafe code: unnecessary unsafe code could be suspicious, too.
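The nm idea above can be sketched as a filter over demangled symbol output (assuming `nm -C`/rustfilt-style demangling; the marker strings and function name are assumptions for illustration):

```rust
// Hedged sketch of the proposed symbol audit: given demangled `nm` output,
// flag symbols that pull in filesystem or network APIs for a crate that
// claims to do no IO.
fn suspicious_symbols(nm_output: &str) -> Vec<&str> {
    nm_output
        .lines()
        .filter(|l| l.contains("std::fs::") || l.contains("std::net::"))
        .collect()
}
```

In practice this is only a heuristic: inlining, LTO, and generic instantiation can remove or rename symbols, as discussed further below.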

Such analysis is only available when you build within a sandbox. Without a sandbox, rustc could be tampered with before it compiles the main function; it could then inject unsafe code and obfuscate symbol names so the result pretends to be a safe program.

The safer a build procedure is, the less we must do to ensure safety. If we cannot run cargo build as nobody, we must check both the build script and the source code.

By running cargo build as nobody, we only need to check the actual code we use.

That is not merely a false sense of security.

Note that Cargo already supports “runners” which replace the normal process of directly invoking the compiled executable. This is usually used for cross-compilation, but I don't see why it wouldn't work when the target triple happens to be the same as the host, to perform “copy this executable into a fresh VM and run it”. Thus, together with build script and proc macro sandboxes, everything could be isolated.
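For example, the runner hook could point at an isolation wrapper via .cargo/config.toml (the wrapper script path here is hypothetical):

```toml
# Hypothetical: hand every compiled binary/test to a wrapper that copies it
# into a fresh VM or container before executing it.
[target.x86_64-unknown-linux-gnu]
runner = "/usr/local/bin/run-in-fresh-vm"
```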

An advantage of this over running a VM which contains the entire build process (or running as nobody), as has been suggested here as a currently-existing option, is that the code being built/run cannot corrupt future iterations of the build process, and correspondingly you never need to throw out any of your package directory or your cache data in ~/.cargo/ and target/. Thus, it allows a more un-interrupted workflow. (However, there would be the additional risk of malicious output from build scripts and from proc macros allowing sandbox escape via bugs in cargo and rustc. But since this is Rust, we can hope that it's mostly filesystem paths and such needing care, and not lurking memory unsafety bugs.)

So, unless I've forgotten another way user-supplied code runs, @PoignardAzur's point applies here and there are only three subproblems, one of which already has the hook needed to solve it.

I'd also like to see build scripts and proc macros sandboxed not for security purposes, but for well-definedness: people writing build-process-component code occasionally have interesting ideas about what they should read or write, and having a sandbox would ensure that such activities have a documented and enforced scope.


This is where WASI would be helpful, as you could give access to only the OUT_DIR (or other developer-defined directories if needed). Networking does pose a problem at the moment, as it hasn't been fully implemented yet, but it is being worked on (both raw TCP/UDP sockets and HTTP-only access).


Watt was an interesting experiment in this area:

It executes procedural macros inside a WASM interpreter. It claims to provide better performance than the traditional proc macro system.


If multiple people run builds as nobody, then one person can escalate privileges to another if the other person runs the resulting program. That is why you need a separate build user for every real user.


But what if your program is supposed to do IO?

You're still trusting the result of the build script. Are you saying you prefer to check whatever shared library/code it produces instead of the build script itself?

Delete the IO-related code and check whether such symbols still exist.

With LTO, such symbols could be removed automatically.

Actually, I prefer to trust the whole crate when I use it. And if the build script could not execute arbitrary code, we could trust it, since malicious code would have to occur in other parts of the code, which we would check if we felt it might be dangerous.

What we should not take on trust is whether the crate will work, and whether the crate will damage my system.

The former question is quite easy, since we may have plenty of test cases; the latter could be handled by some simple anti-virus scanner, or just by scanning with the simple nm .. | find .. approach above.

Without modification of rustc, it is harder to convince the crate maintainers.

Sidenote: In the latest comments everyone seems to assume that sandboxing would be done with sudo -u nobody. This has the following disadvantages:

  • If multiple people run as nobody, then I think one person can escalate privileges to another if that person runs the resulting program (but let's not discuss this here)
  • Switching to nobody requires sudo privileges

These options seem better:

  • Somehow interpret the code: Use Miri, as proposed in the original post, or use WASM.
  • Use unprivileged Linux user namespaces (as Bubblewrap, which Flatpak uses, does)