RFC: Implement a sandbox for environment variables and files


#1

RFC for PR 49387. It introduces some new command-line options to sandbox env!() and include!() directives.


This PR introduces some simple sandboxing for process environment variables and for include files (collectively “system environment”).

This is primarily to allow a build system to more precisely control the inputs to rustc which may affect the generated output. Rust has two mechanisms by which an input source can access ambient properties of rustc’s system environment: env!() and option_env!() for reading environment variables, and include!()/include_str!()/include_bytes!() for reading arbitrary files.

(This PR specifically does not intend to address any actual security concerns, since there are many other avenues that it does not attempt to control, such as compiler plugins/proc macros. However, it does help with unintentional problems which could result in later security problems if unaddressed.)

Environment Variables

rustc allows source code to directly access its process environment variables via the env!() and option_env!() (pseudo-)macros. This poses a few problems:

  • the build system has no idea what variables the code is using for the purposes of tracking its inputs
  • code can read arbitrary contents from arbitrary variables, which may pose unwanted information leakage (from build environment to final deployment environment, for example)
  • it combines the environment rustc needs for its own operation (PATH and LD_LIBRARY_PATH, for example) with environment the compiled code might want, and doesn’t allow them to be set independently

We introduce the following new command-line options to allow the apparent environment to be controlled at a fine-grained level:

  • --env-clear - completely empty the logical environment visible to env!()/option_env!(), causing them all to fail/return None. Without any other options, this will completely disable environment variable access.
  • --env-allow NAME - allow a specific environment variable to be read from the process environment
  • --env-define NAME=VAL - define a logical environment variable. This does not need to be present in the actual process environment, or if it is, its value is overridden

By default, the environment is completely open, leaving the existing behaviour unchanged. Once one of the options above is specified, accesses to environment variables become controlled accordingly.
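
To make the interaction between the three options concrete, here is a minimal sketch of how a lookup against the logical environment could behave. It is not the code from the PR: the LogicalEnv type and its fields are hypothetical names, and it assumes that --env-define always takes precedence and that --env-allow only matters once --env-clear has been given.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical model of the logical environment seen by env!()/option_env!().
/// The type and field names are illustrative, not taken from rustc.
#[derive(Default)]
struct LogicalEnv {
    /// True once --env-clear has been given.
    cleared: bool,
    /// Names passed via --env-allow.
    allowed: HashSet<String>,
    /// NAME=VAL pairs passed via --env-define.
    defined: HashMap<String, String>,
}

impl LogicalEnv {
    /// Resolve `name`, given a snapshot of the real process environment.
    fn lookup(&self, name: &str, process_env: &HashMap<String, String>) -> Option<String> {
        // A defined variable overrides whatever the process environment holds.
        if let Some(val) = self.defined.get(name) {
            return Some(val.clone());
        }
        // Assumption: after --env-clear, only allowed names fall through.
        if self.cleared && !self.allowed.contains(name) {
            return None;
        }
        // Default (open) behaviour: read the process environment unchanged.
        process_env.get(name).cloned()
    }
}
```

With LogicalEnv::default() every lookup falls through to the process environment, which matches the unchanged default behaviour described above.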

Paths

Rust allows arbitrary other files to be directly included, either as more Rust source code (include!()), a text string (include_str!()) or arbitrary binary data (include_bytes!()). These macros take a string literal that is used as a path, which may be absolute. They therefore allow any file that rustc has permission to access to be used in the compiled output.

(This differs from separating a crate into multiple source files, as those files are always relative to the top-level lib.rs/main.rs source.)

This causes a couple of problems:

  • the build system can’t know or constrain what files are actually inputs to the compiler
  • the code can unintentionally leak state from the build environment to the deployment environment

To address this, we introduce a couple of command-line options:

  • --clear-include-prefixes - clear all allowable prefixes, effectively disabling all include*!() macros
  • --include-prefix PATH - add PATH to the set of valid prefixes. Every included path must match one of the valid prefixes before it can be opened.

All paths are canonicalized before matching, so they must exist at the time they’re specified.

By default, all path prefixes are valid, leaving the current behaviour unchanged. They are only constrained once one of the options above is specified.

Note that a “path prefix” can be an entire pathname, allowing these options to explicitly specify which individual files may be included.
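
As an illustration of the matching rule, the check could look roughly like the sketch below. The function name and the use of Option to represent "no restriction" are assumptions made for the example, not the PR's actual code; the prefixes are assumed to have been canonicalized already when the options were parsed.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical check for an include*!() path.
/// `None` means no option was given (everything is allowed);
/// `Some(prefixes)` means only paths under one of the prefixes may be opened.
fn include_allowed(requested: &Path, prefixes: Option<&[PathBuf]>) -> io::Result<bool> {
    let prefixes = match prefixes {
        // Default behaviour: all path prefixes are valid.
        None => return Ok(true),
        Some(p) => p,
    };
    // Canonicalize so that `..` components and symlinks cannot escape a prefix.
    // This fails if the requested file does not exist.
    let canonical = requested.canonicalize()?;
    // Path::starts_with compares whole components, so "/src/data" does not
    // match "/src/database". A prefix naming a single file matches only itself.
    Ok(prefixes.iter().any(|prefix| canonical.starts_with(prefix)))
}
```

An empty prefix list, as produced by --clear-include-prefixes alone, rejects every path, which is how the include*!() macros end up disabled.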


#2

It would be good if every attempt to access a blocked environment variable or include path were reported as a list of the blocked items, formatted as the options that would enable them. That way, you could see exactly what the build wants by using the --env-clear and --clear-include-prefixes options, and the compiler would produce a set of “--include-prefix PATH” and “--env-allow” entries that you can review and then include in the build once you approve them.
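
A rough sketch of the kind of report meant here, assuming the compiler could collect every blocked access. The suggest_options function is hypothetical, and it simply suggests --env-allow for each variable and the full path for each file, which is only one possible policy.

```rust
/// Hypothetical formatter: turn blocked accesses into suggested options.
/// Assumes each blocked variable maps to --env-allow and each blocked file
/// is reported by its complete path.
fn suggest_options(blocked_vars: &[&str], blocked_paths: &[&str]) -> Vec<String> {
    let mut suggestions = Vec::new();
    for name in blocked_vars {
        suggestions.push(format!("--env-allow {}", name));
    }
    for path in blocked_paths {
        suggestions.push(format!("--include-prefix {}", path));
    }
    suggestions
}
```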


#3

That might be a nice refinement, but it’s a bit awkward for a few reasons.

One is that I’m not sure that parsing can continue after env or include failure - the current behaviour is to stop pretty quickly.

Secondly, it would be hard to know what to report. For --include-prefix, for example, what prefix should it report? Should it try to find something that’s common to all the include failures? Or just the complete path to each file?

And similarly, there’s no clear way to know whether to suggest --env-allow or --env-define.

My particular use-case has well-defined values for all of these, set by the build system, so simply reporting the first failure is the correct response and any suggestion would be wrong (other than fixing the code so that it either does not include the bad path or does not use the unknown variable).


#4

I think this seems like a good feature, but I feel it is complex enough to merit an RFC. I’d like to have the cargo team take a look, for example.


#5

RFC PR https://github.com/rust-lang/rfcs/pull/2391


#6

How do you plan to use it? Isolation of Cargo from its environment could break builds in Xcode.

For example, when Xcode launches cargo, it sets MACOSX_DEPLOYMENT_TARGET. It affects rustc, and also has to affect clang launched by cc-rs from build.rs scripts. If the same value is not set for every object in the project, it may generate buggy executables or executables that don’t work on less-than-the-latest macOS version.


#7

These options have no effect on the actual environment inherited by subprocesses - they only affect the apparent environment visible to env!() etc. So if build.rs has an env!("MACOSX_DEPLOYMENT_TARGET") then the options would affect that, but if the build executable uses std::env::var("MACOSX_DEPLOYMENT_TARGET") or invokes a subprocess which uses that variable, they will work unchanged.
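
To spell out the distinction, here is a small sketch with two illustrative helper functions (the names are made up for this example). The first is expanded by rustc at compile time and so would see the logical environment; the second runs in the compiled program and reads the real process environment.

```rust
use std::env;

/// Expanded by rustc while compiling this file, so it reads the environment
/// rustc exposes to env!()/option_env!() and would be subject to the
/// proposed options.
fn deployment_target_at_compile_time() -> Option<&'static str> {
    option_env!("MACOSX_DEPLOYMENT_TARGET")
}

/// Evaluated when the compiled program runs, reading the real process
/// environment. The proposed options do not touch this.
fn deployment_target_at_run_time() -> Option<String> {
    env::var("MACOSX_DEPLOYMENT_TARGET").ok()
}
```

The value returned by the first function is fixed when the crate is built, while the second reflects whatever environment the running process inherited.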

My initial concern is sandboxing the system environment when using buck to build Rust code. I haven’t given cargo much consideration, but it looks like there are others who have.