Does include_bytes! need to support arbitrary absolute paths? (outside of workspace, .cargo, or target dir)

This seems like a surprisingly powerful feature.

I don't think crates-io dependencies have any business poking around the disk. I know dependencies already have the ability to do sneaky things via build.rs or proc macros, but include_bytes! adds more things to look out for (and maybe someday the other things will be sandboxed).

In local workspaces, literal absolute paths in the source code seem like a risky practice, since they make projects dependent on disk layout or a specific OS.

include!(concat!(env!("OUT_DIR"), "")) is common and could technically be an absolute path, but it's not an arbitrary one.
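To illustrate that pattern, here is a minimal sketch of the usual build-script flow (the file name `generated.rs` and the helper `generate_into` are hypothetical, not from this thread). The point is that OUT_DIR is an absolute path, but cargo chooses it, so it is not an *arbitrary* absolute path:

```rust
use std::env;
use std::fs;
use std::path::{Path, PathBuf};

/// Write a generated source file into the given output directory
/// (hypothetical helper; in a real build.rs the directory is OUT_DIR).
fn generate_into(out_dir: &Path) -> PathBuf {
    let dest = out_dir.join("generated.rs");
    fs::write(&dest, "pub const ANSWER: u32 = 42;\n").expect("write failed");
    dest
}

fn main() {
    // Fall back to the temp dir so this sketch runs outside a cargo build.
    let out_dir = env::var("OUT_DIR")
        .map(PathBuf::from)
        .unwrap_or_else(|_| env::temp_dir());
    let dest = generate_into(&out_dir);
    // Absolute, yes -- but picked by cargo, not by the crate author.
    assert!(dest.is_absolute());
    // The crate side then pulls it in with something like:
    //   include!(concat!(env!("OUT_DIR"), "/generated.rs"));
}
```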

How about adding a lint or warning for absolute paths outside of usual places like the workspace root, target dir, and .cargo/registry/src subdirs?


I can imagine semi-legitimate reasons why a crate might include things from /usr/lib or /usr/include (probably using a path detected or configured via some mechanism), and of course a user can pass an arbitrary path in an environment variable.

But in general, warning (in a way the crate can't suppress) for unusual paths seems reasonable. This could align with whatever sandboxing system we end up using for build scripts, where the user can extend the permitted accesses from the sandbox if they expect those accesses and want to allow them.

I think the primary question is whether there's value in doing the initial step before the full sandboxing. Because in the absence of sandboxing, any crate could easily pull in a proc macro that does exactly what include! or include_bytes! does but without the sandbox.

Don't focus too much on the absolute paths -- relative ../ paths can have the same effect if you try hard enough...
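A quick illustration of that point: `Path` does not collapse `..` components lexically, so a naive "reject absolute paths" check is easy to fool with a relative path that climbs out of the crate.

```rust
use std::path::{Component, Path};

fn main() {
    // This passes a naive "no absolute paths" check...
    let p = Path::new("assets/../../../../../../etc/passwd");
    assert!(p.is_relative());
    // ...but with enough `..` components it reaches the filesystem root
    // when resolved against any base directory a few levels deep.
    let ups = p
        .components()
        .filter(|c| matches!(c, Component::ParentDir))
        .count();
    assert_eq!(ups, 6);
}
```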


Symlinks are a problem too, but I think cargo's crate (tar) extraction blocks that at least.


Escaping relative paths, as well as symlinks, can also be caught. A check of whether a path is within the allowed directories will require path canonicalization anyway.
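A minimal sketch of such a check (the function name `is_within_allowed` is mine, not from the thread): canonicalization resolves both `..` components and symlinks, after which a prefix test against the allowed roots suffices.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical check: is `candidate` inside one of the `allowed` roots
/// once `..` components and symlinks have been resolved?
/// Note: canonicalize() requires the path to exist.
fn is_within_allowed(candidate: &Path, allowed: &[PathBuf]) -> io::Result<bool> {
    let resolved = candidate.canonicalize()?;
    Ok(allowed.iter().any(|root| resolved.starts_with(root)))
}

fn main() -> io::Result<()> {
    let root = std::env::temp_dir().canonicalize()?;
    let inside = root.join("within_allowed_demo.txt");
    fs::write(&inside, b"ok")?;
    // A file directly under the allowed root passes.
    assert!(is_within_allowed(&inside, &[root.clone()])?);
    // `<root>/..` exists, so canonicalize succeeds -- and resolves
    // outside the allowed root, so the check rejects it.
    assert!(!is_within_allowed(&root.join(".."), &[root.clone()])?);
    fs::remove_file(&inside)?;
    Ok(())
}
```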


At work, I know we read from directories above the workspace. We have an existing legacy C++ application and have recently added some Rust. In particular, I know we include protobuf files from something like ../../proto.

I don't remember if the mechanism is build.rs or include! or whatever. But I consider that sort of thing a somewhat legitimate usage for mixed build system setups where cargo isn't the top build system. (Currently cmake fills that role, but the plan is to transition to bazel.)


My view on this is that anything which makes the malicious stuff look weird or out-of-place is a benefit: if I'm pulling in a less popular crate, I'll at least glance over the source code to see if it feels like a quality implementation.

The implementation of this would involve Cargo (which knows how to classify paths) passing data to rustc (which expands the macro). Unless it is done in a weird way, it should at least be possible for an outer build system to add non-warning paths using RUSTFLAGS.


We have an RFC for env variable sandboxing for rustc. I think it would be interesting to also have FS sandboxing and exposing the two of these in cargo so packages declare what they need access to. I'd want to see easy opt-in for "no env except what cargo gives me" and "no paths outside the root mod's parent directory" (wording is specific so we can sandbox #[path]).

While there is still the hole left by build.rs and proc macros (until we can sandbox those), at least we can statically determine whether those exist in the dependency tree and mark them as needing audits.


The best fix for this is using cap_std.


Is that ever realistic considering backward compatibility and things like sqlx? The same goes for the other types of sandboxing: it has to be opt-in. Maybe you could do something with editions, but then you also need to consider that you should be able to mix editions.

Of course there will need to be some opt-in/opt-out and a migration process. For proc macros this seems tractable: there's already been a WASM proof of concept, and it could automatically control file system access. build.rs seems much more difficult, because sys crates may legitimately need to poke around the system.

However, for me the concern is that a crate that has no macros and no build.rs may seem less risky, but surprisingly it still has file system access.


That makes sense. However, in that case I assume that all Rust crates and C++/protobuf/etc code live in a single folder, in some subfolders. I.e. they are not just pulled from a central registry into arbitrary places. I thus think it's a reasonable default if all crates from crates.io were sandboxed in a way which prevents filesystem access outside of their OUT_DIR. It also seems like a reasonable default for crates in custom registries, because putting those in arbitrary folders and expecting a certain structure from the filesystem is likely to be a bad idea.

Also, the issue is most dire specifically for public crates on crates.io or GitHub, rather than private registries and repositories. In those cases one can reasonably assume that only employees have access to the registry, and that they are responsible for the code they put there.

I.e. if this sandboxing is applied exclusively to dependencies, it would be a significant improvement to security.

One interesting mixed case is where someone pulls a public, potentially malicious dependency from GitHub and opens it in an IDE, which will execute all build scripts & macros. While modern IDEs ask to enable this functionality, they are barely useful without it, so one can expect users to execute compilation scripts in the projects they open. They may do some cursory checking of build.rs or proc macro dependencies, but more complex attacks like include_bytes! injection could easily slip by. However, at that point the crate is on the user's filesystem, so how could we distinguish this case from the one where people are working on their own trusted crates and they need unsandboxed filesystem access?

Which makes me think that perhaps Cargo could have an override switch for sandboxing of local crates, so that all crates are sandboxed by default, but trusted projects which need unrestricted filesystem access could use it. Enabling this switch would be an obvious security flag that one could check, even automatically, to signal that the project requires potentially more scrutiny.


This seems like a reasonable step towards having sandboxed builds. I agree it should work in conjunction with the availability of sandboxing (or at least with a lot of disclaimers to avoid giving a false sense of security). Without sandboxing, it only defeats the laziest possible attacks.

I also want to note that having all of

  • filespace sandboxing during build
  • env sandboxing during build
  • build.rs sandboxing
  • proc macro sandboxing

will not protect anyone from a malicious dependency once they run the built executable. After the first such run, one can't assume any builds are clean (or assume much of anything else really). And I imagine most people who develop and build in an unfortified environment also run the produced binaries in that environment.

cargo run and friends also having some form of auditing sandbox would help some (and be worth having IMO), but would still be a best effort sort of feature as far as the produced binary goes.[1]

Any documentation/help around these features should make clear to the programmer that there's only so much that Cargo and the compiler can do; they can never completely protect the programmer from a malicious upstream. Those who care should be guided towards fortifying their development environments, CI best practices, crev, etc.

  1. More sophisticated attacks will try to detect being in the sandbox, etc., so a clean audit doesn't mean a safe executable.


Just pointing out another thing...

#[path = "/etc/passwd"]
mod foo;

I think that any limitation or warning on include_str! should be viewed as a convenience lint and not a security feature. That's because build.rs can do arbitrary things, like running programs (such as cat /etc/passwd), and there is no way to statically detect that with 100% reliability without also running the build.rs itself (that's Rice's theorem). So any lint on include_str! can't improve the security of the build (unless one posits that malware developers will not test their code and tinker until they make it work).

Because of that, if you need to defend against malicious code in the build step itself and the build is not sandboxed, you have already lost.
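To make that concrete, here is a minimal build.rs-style sketch (harmless by construction, and assuming a Unix `echo` is available): a build script is an ordinary program, so it can spawn processes and read files regardless of what include_str! is allowed to do.

```rust
use std::process::Command;

fn main() {
    // A build script can spawn any program; a lint on include_str! sees
    // none of this. (`echo` stands in for something nastier, e.g.
    // `cat /etc/passwd`.)
    let out = Command::new("echo")
        .arg("lint cannot see me")
        .output()
        .expect("failed to spawn echo");
    assert_eq!(
        String::from_utf8_lossy(&out.stdout).trim(),
        "lint cannot see me"
    );
}
```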

On the other hand, from the developer's point of view, this kind of lint may sometimes be annoying (there may be legitimate use cases for reading from /etc), so there should be a way to turn it off.

I think that not fully sandboxing the build by default was a mistake. I get that some builds need to access system files (for example, to use distro-provided -dev libraries in /usr/include and /usr/lib rather than vendoring dependencies inside the crate itself), but there should at least be a manifest detailing every such dependency, rather than it being a free-for-all by default.

Right now the only way forward I can think of is to run the build inside a sandbox. For example, I think that something like naersk fits the bill (but afaik it doesn't use cargo).

Maybe there should be a cargo-sandbox command that builds normally, but in a sandbox.


If this also applied to build.rs it would break sys crates (anything using pkg-config or calling the system C compiler, for example), as @kornel pointed out. That is a major problem.

Other than that I think your idea seems sensible.

My interpretation of a sandbox environment includes everything you need to build "normal" crates, and nothing that could leak private information on the host. This is a bigger scope than "just run it in WASM", but there may not be an alternative.

Emphasis then on sandboxing being a reasonable default. Ideally escaping the sandbox has various easy and explicit mechanisms to do so, some of which get printed out when the sandbox causes an error.

And ideally the sandbox does become the default, and not an option that simply almost everyone should enable.


One other solution would be to have a scanning feature at publish time which checks for obvious things and makes them visible to the user.

For example, using base64-encoded strings could generate a warning, as it's not that typical in a library. Or any occurrence of /etc/passwd could generate a severe warning, etc.

I definitely prefer this approach over sandboxing, because as soon as you invoke the compiler, which is typical for build.rs, you can do everything anyway.

And an easier way to report crates with malicious intent might also be a good idea. (Which I guess is not implemented yet for spam-prevention reasons.)

Test suites may contain them. Docstrings with examples using Base64 (HTTP header examples come to mind) may exist as well.

Again, test suites and comments may exist. And anything simple is going to miss concat!("/etc", "/passwd").
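That concat! evasion is literal: the macro splices the pieces together at compile time, so the suspicious string never appears in the source text a scanner would grep.

```rust
fn main() {
    // A scanner grepping the source for "/etc/passwd" finds nothing here,
    // yet the compiled constant is exactly that string.
    let path = concat!("/et", "c/pas", "swd");
    assert_eq!(path, "/etc/passwd");
}
```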