Compile-time sandbox

Having a compile-time sandbox could limit the damage of a malicious takeover of a popular crate, like what happened with xz.

The compile-time sandbox would work by introducing per-crate capabilities. By default, a crate would have the full set of capabilities; it could opt out of the default and declare only the capabilities it needs:

[capability]
default-capability = false
capability = ["unsafe", "fs", "net"]

Disabling a capability would:

  • disable the corresponding interface in the stdlib (for the unsafe capability it would disable the unsafe keyword); see the sketch after this list
  • disable the capability in any of its dependencies
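
A minimal sketch of what this would mean in practice. The capability check is hypothetical; the snippet below is ordinary Rust that compiles today, and the comment marks where the proposed error would fire.

// In a crate that opted out of the "fs" capability, any use of the disabled
// stdlib interface would be rejected at compile time under this proposal.
fn read_config() -> std::io::Result<String> {
    std::fs::read_to_string("config.toml") // proposed error: crate lacks the `fs` capability
}

fn main() {
    println!("{}", read_config().unwrap_or_default());
}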

Capabilities could be combined with features to request a capability conditionally:

[features]
a = ["cap:fs"]

Capabilities could also be applied to individual dependencies:

[dependencies]
a = { version = "1", capability = ["net"] }

The capabilities for crate "a" would be unified during dependency resolution, i.e. if another crate specifies a = { version = "1", capability = ["fs"] }, then a would end up with both the net and fs capabilities.

The capability constraints put on a would also apply to its normal dependencies.

Note that this does not apply to proc-macro and build-dependencies; those would require a separate build-time sandbox.

Supported capabilities:

  • unsafe, capability to use the unsafe keyword
  • fs, capability to use any fs operations
  • fs-readonly, capability to use read-only fs operations
  • env, capability to use std::env
  • env-readonly, capability to read environment variables
  • net, capability to use any net operations
  • process, capability to spawn processes

I don't see how this can be reliably enforced. There are too many compiler bugs that allow unsafe things without writing the unsafe keyword (or anything else linted against by the unsafe_code lint, like #[no_mangle]). In addition, fs implies both env (through /proc/self/environ) and process (through writing e.g. ~/.bashrc), and through process every other capability. And net implies process (through talking to systemd over a unix domain socket).
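
To make the fs point concrete, here is a hedged sketch showing that std::fs alone is enough to read the environment and to stage command execution. The path and payload are placeholders; a real attack would target e.g. the user's ~/.bashrc rather than a file in /tmp.

use std::fs;
use std::io::Write;

fn main() -> std::io::Result<()> {
    // "env" via the filesystem: on Linux the environment is readable
    // through procfs, no std::env needed.
    let environ = fs::read("/proc/self/environ")?;
    println!("read {} bytes of environment via std::fs", environ.len());

    // "process" via the filesystem: append a line to a shell startup file
    // (a stand-in path is used here instead of the real ~/.bashrc).
    let mut rc = fs::OpenOptions::new()
        .append(true)
        .create(true)
        .open("/tmp/stand-in-bashrc")?;
    rc.write_all(b"echo 'arbitrary command runs at next shell startup'\n")?;
    Ok(())
}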

6 Likes

Basically, the only reliable ways of sandboxing are OS-level sandboxing, which works at process granularity, or something like wasm, which is effectively process granularity too if you consider each wasm module to be a process. Both Java and C# gave up on sandboxing individual libraries.

8 Likes

Also note that if capabilities aren't transitively required (i.e. my crate needing the unsafe capability in order to use a crate with the unsafe capability), then the scheme is borderline ineffectual (e.g. because I can just add an indirection crate to hide capabilities); but if they are transitively required, then they (at least unsafe) also become nearly ineffectual, since most crates want to use some core library crate that correctly uses unsafe for performance reasons.

For unsafe specifically, when you want to limit the extent of the damage it can do, what you want is cfg(ub_checks), which adds O(1) checks to _unchecked operations where that's reasonable, and which is being worked on.
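
A rough sketch of the idea (the cfg_ub_checks feature gate is, I believe, nightly-only at the time of writing, and this is not how std itself spells its checks): an _unchecked operation keeps its fast path, but gains a cheap O(1) precondition assertion when UB checks are compiled in.

// Nightly-only illustration; the feature name and semantics may change.
#![feature(cfg_ub_checks)]

/// # Safety
/// `n` must be less than or equal to `slice.len()`.
pub unsafe fn sum_first_unchecked(slice: &[u32], n: usize) -> u32 {
    // Compiled in only when ub_checks is enabled (typically tied to debug
    // builds), so release users of the _unchecked API pay nothing.
    #[cfg(ub_checks)]
    assert!(n <= slice.len(), "sum_first_unchecked: n out of bounds");

    let mut total = 0;
    for i in 0..n {
        total += unsafe { *slice.get_unchecked(i) };
    }
    total
}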

5 Likes

This needs sandboxing like WASM and/or process isolation, because the whole native code stack trusts the programmer, and has never been a security boundary. The compiler toolchains can't even be trusted to handle file names with spaces safely.

Rust's standard library does not mediate interaction with the OS the way browser JS does, so denying access to one of its modules doesn't deny access to the same functionality from elsewhere. Rust's libstd is just one of many ways of executing arbitrary code and telling the OS to do things. And non-experimental OSes don't have real capability systems that could precisely enforce access.

Rust's separation between crates and modules is not a security boundary. It's only a helpful illusion, but it stops existing when the code is compiled. It's a mere naming scheme when the objects are sent to the linker. The linker doesn't even know Rust is a thing, and just smushes all the objects together, assuming they're trusted input.

Rust can block some obvious bypasses like no_mangle and arbitrary linker flags, but the whole stack was never meant to handle malicious inputs, so I'm afraid there will be many, many loopholes from all the less obvious ways of breaking compilers or linkers, or of influencing their configuration.
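
As one concrete illustration of the no_mangle point (hedged: whether the interposition actually takes effect depends on the platform and linker), the following compiles without the unsafe keyword, yet it can replace a symbol the whole process relies on, which is exactly why #[no_mangle] falls under the unsafe_code lint and why the 2024 edition requires spelling it #[unsafe(no_mangle)].

// No `unsafe` keyword anywhere, but exporting an unmangled symbol lets
// "safe" code collide with, and potentially replace, a foreign definition
// such as the C allocator at link time.
#[no_mangle]
pub extern "C" fn malloc(_size: usize) -> *mut u8 {
    // A hijacked allocator that always fails; a cleverer body could hand
    // out aliasing or undersized memory instead.
    std::ptr::null_mut()
}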

Rust's safety is designed only for catching mistakes of cooperative programmers, and is defenceless against malicious code.

16 Likes

And not just compiler bugs, but soundness bugs in any crate containing unsafe code. If you have any crate in your dependency graph with the unsafe capability, then a soundness bug in that library could lead to arbitrary code execution by other crates, even ones that don’t contain unsafe code themselves.

This is one more reason that, as @kornel says, Rust’s safety checks are not suited to stop malicious programmers.

2 Likes

Thanks, in that case I think we should go with something simpler:

What about a pure and safe-only crate?

I.e. a crate that cannot use unsafe, fs, env-mutating, net, or process-related APIs, and cannot pull in any crate that uses them.

By putting

[sandbox]
pure-and-safe-only = true

into a crate, it would forbid:

  • use of the unsafe keyword (and enable cfg(ub_checks) to catch any potential UB)
  • use of the std::fs module
  • use of std::env::remove_var
  • use of std::env::set_current_dir
  • use of std::env::set_var
  • use of the std::net module, except for socket address types
  • use of the std::process module

And the limit would apply transitively to all its dependencies; if any dependency violates it, the entire build would fail.

This would enable a subset of the dependencies to be marked as pure-and-safe-only, and thus removed from the list of crates that need to be reviewed manually.
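
For what it's worth, the unsafe half of this is already expressible per crate today; here is a minimal sketch. The fs/env/net/process restrictions and the transitive, build-failing check on dependencies are the genuinely new parts of the proposal.

// Rejects `unsafe` blocks, unsafe impls, and attributes like #[no_mangle]
// anywhere in this crate; dependencies are not covered, which is the gap
// the proposed [sandbox] key would close.
#![forbid(unsafe_code)]

pub fn count_zero_bytes(data: &[u8]) -> usize {
    data.iter().filter(|&&b| b == 0).count()
}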

Combined with a build-script/proc-macro sandbox (or tools to list all crates with build scripts), and with tools like cargo-tree to filter out pure-and-safe-only crates, a maintainer could get a list of crates they need to review that is a subset of their full dependency tree.

It won't try to "sandbox" the entire process, but rather reduce the maintenance burden of auditing dependencies.

I wonder how many crates could enable that lint in practice though, and for those that could enable it by changing some dependencies, what would be the ergonomic and performance impact of doing so.

I think serialisation/deserialisation crates and container/algorithm crates like tinyvec, which forbids unsafe, could be marked as pure?

Some previous discussion of ideas along these lines:

It’s tempting to think this, but it is not totally true.

For example, tinyvec is a crate with no default dependencies, no unsafe code, and no system I/O. It could be marked pure-and-safe-only in your proposed scheme. Suppose I write a crate that uses unsafe code to call a C library that returns raw pointers. I store a collection of these pointers in a tinyvec::TinyVec, and after I free all of the pointers, I use TinyVec::clear to ensure I don’t use them again.

What if a bug or a malicious backdoor is inserted into tinyvec that prevents TinyVec::clear from actually removing all the items from the vector in some circumstances? Now my crate could have a use-after-free bug leading to undefined behavior, thanks to a bug in a completely safe crate.
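
A minimal sketch of that scenario, with a hypothetical collection module standing in for the safe dependency (nothing here is tinyvec's real code): the dependency contains no unsafe code, yet its buggy clear turns the caller's otherwise-correct unsafe code into a use-after-free.

mod collection {
    // Stand-in for a safe container dependency; no unsafe code anywhere.
    pub struct PtrList(Vec<*mut u32>);

    impl PtrList {
        pub fn new() -> Self { PtrList(Vec::new()) }
        pub fn push(&mut self, p: *mut u32) { self.0.push(p); }
        pub fn clear(&mut self) {
            // Buggy (or backdoored) "clear": silently keeps the first element.
            self.0.truncate(1);
        }
        pub fn iter(&self) -> std::slice::Iter<'_, *mut u32> { self.0.iter() }
    }
}

fn main() {
    let mut ptrs = collection::PtrList::new();
    let raw = Box::into_raw(Box::new(42u32));
    ptrs.push(raw);

    // Free the allocation, then rely on `clear` so the dangling pointer is
    // never touched again.
    unsafe { drop(Box::from_raw(raw)) };
    ptrs.clear();

    // The caller's reasoning was sound, but the safe dependency broke it:
    // a stale pointer survived `clear`, and this dereference is UB.
    for &p in ptrs.iter() {
        unsafe { println!("{}", *p) };
    }
}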

This is the sort of thing we mean by “crate boundaries are not security boundaries.”

8 Likes

rustc currently assumes that its input is trusted, and any sort of escape is not considered a vulnerability. Making this airtight is super hard, and then keeping those guarantees is even harder, with the worst part being LLVM, which is of course full of segfaults and other memory safety issues and therefore easy to exploit. The LLVM project does not treat its backend as a security boundary, so using the LLVM backend for compiling untrusted crates is out of the question. Memory-safe backends like Cranelift are more realistic, but it still seems unlikely that we'd want to commit to making rustc a security boundary.

6 Likes

Thanks, yeah, it seems like a terrible idea to provide any guarantee; code audits and careful dependency review are necessary.

I have a tool called cackle that attempts to do a lot of this: per-crate capabilities, sandboxing of build scripts and rustc, etc. There's also a blog post where I walk through using the tool. I don't think a tool like this can ever really be perfect, but not being perfect doesn't mean it's not still worth doing.

3 Likes

Thanks that's awesome!

I checked it before but didn't notice that it has sandboxing.

What kind of sandboxing does it use for build-script and rustc invocation?

I personally think using gVisor with Docker is pretty good sandboxing, with networking turned off and the image mounted immutably.

Currently the only sandbox supported is bubblewrap (bwrap). By default it blocks network access and only allows writing to the output directory (for build scripts).

Thanks, I think an implementation using gVisor + Docker would be beneficial, since Linux namespaces have had a few bugs reported before, and gVisor is designed to prevent that.

Though since it is designed to be used with Docker, it'd be a bit difficult to use directly; the takeaway is that implementing a userspace filesystem might be safer than relying on Linux kernel namespaces alone.