Pre-RFC: procmacros implemented in wasm

pygy · August 31, 2019, 11:21am

Another thing I just though of: without a sandbox, a proc macro could also leak build server secrets into the binaries.

bjorn3 · August 31, 2019, 12:18pm

Any rust code can:

fn get_gh_token() -> &'static str {
    env!("GITHUB_TOKEN")
}

pygy · August 31, 2019, 12:34pm

What about file system read access?

theduke · August 31, 2019, 12:51pm

include_str

pygy · August 31, 2019, 1:19pm

Then what about "scour the file system looking for bitcoin wallets"? I'm new to Rust (still learning) and quite surprised by what appears to be a lack of a security model around the compiler. The include_str documentation doesn't mention restrictions on legal paths (crate-local ones would seem a good default)... Likewise, granting arbitrary access to environment variables seems crazy (Oh principle of the least privilege, where art thou?).

That being said, I'm veering off topic here. Would this be appropriate as a new topic, or has this been discussed already and deemed "wontfix"? Assuming the latter, I'll shut up (until I grow some more rusty hair ;-).

bjorn3 · August 31, 2019, 1:29pm

That won't work. There are no dir listings and accessing a non-existent file would cause a compile error.

How do you define crate-local? Without cargo support the compiler won't know if files from the OUT_DIR are allowed (build script output). It also won't know if you are allowed to access dirs one level up relative to lib.rs/main.rs are allowed. (most crates put them inside the src subdir, but it is not required. Eg rustc puts them in the same dir as Cargo.toml

pygy · August 31, 2019, 2:01pm

Thanks @bjorn3, so there's interest, even though the corresponding RFC seems to have stalled (only so many hands/eyes in the team, I suppose).

Re. root path, the compiler could accept a project root as a parameter, and reject IO attempts by the code that is being compiled outside of that path. Defaulting to the directory of the main file (whatever comes before main.rs/lib.rs). OUT_DIR should a distinct preference, pertaining to the ultimate output of rustc, not to IO by the code that is being compiled (though waters become muddy for macros that generate external files (think cbindgen as mentioned above)).

jsgf · August 31, 2019, 3:52pm

Yeah I need to get back to that RFC. I want to split it into separate file and env RFCs.

Though, like the proposal in this thread, the primary motivation is to have tighter control over the inputs to a build to get better determinism.

They will also have some incidental security benefits but ultimately I don't see it as cargo/rustc's job to implement a bespoke security model - that's something that should be solved on a system level in the context of some specific threat model.

pygy · August 31, 2019, 4:17pm

I'd argue that rustc should have a security model, and should be secure by default. Something along the lines of "it shouldn't be able to read/write info or execute code outside of what's provided in the current directory unless given explicit permission".

Especially when RLS relies on rustc to do its job.

ratmice · August 31, 2019, 6:46pm

In theory at least with WASI design @sunfishcode mentioned above, it may be possible to have a different mechanism for allowing usage of secrets which doesn't reveal them to proc_macros.

This would involve setting up a program which has the secret e.g. via argument, and then giving WASI a filehandle to that programs stdin file descriptor like a pipe.

Thats a rather different scheme than e.g. travis/github assume with their environment passing though.

derekdreery · September 2, 2019, 3:16pm

Yeah I would also love some sort of badge that says "this matches what is at the repository" maybe by looking for a tag matching the version number. You'd have to continually check this though in case the history changed.

est31 · September 3, 2019, 2:19am

In fact cargo already stores the git hash when uploading a .crate file to crates.io. All that's needed now is to:

Add a button to crates.io linking to the specific commit on github (my cargo-local-serve tool already has such a feature)
Add a big red warning if the vcs commit does not match the content that lives on crates.io. It can maybe even generate an automated rustsec warning or something like that.

CAD97 · September 3, 2019, 3:47am

I'd much prefer a soft warning to a hard warning and especially an automated rustsec issue.

A decent couple of projects I've worked with use somewhat complicated code generation we want to spare the downstream users of running, so we bundle that code in the published crate but don't check it into the VCS.

This means the published content is not the same as the development version, though they build to the same thing.

And even ignoring that esoteric use case, any scheme definitely needs to be careful to respect the include/exclude keys and not error on extra content in the VCS.

pygy · September 3, 2019, 9:06am

In that case, you could, as part of your release process, create a temp branch, commit the generated code, tag that commit with the semver number, delete the branch and push the tag as a detached head. That way you don't clutter your repo with useless branches, but you still have the info out there.

This can off course be scripted, so it is a one time effort.

pygy · September 3, 2019, 3:36pm

Searching for "capability" lead me to relevant threads, and ultimately to the Rust Secure Code Working Group. I'll review the various threads and other ongoing work, and I'll try to see if/what I can contribute.

sgrif · September 5, 2019, 7:37pm

Just to make sure discussion doesn't get too focused on a dead end, having crates.io compile binaries is not likely to be something we could prioritize any time in the near future.

dtolnay · September 5, 2019, 9:13pm

Compiling in docs.rs and pushing to crates.io (for now) could be a much smaller architectural change.

pietroalbini · September 7, 2019, 5:33pm

I'd strongly prefer not to push other responsibilities to docs.rs.

The build and sandboxing code of Crater was recently extracted into a standalone library called rustwide and we're in the process of integrating it with docs.rs (replacing the existing build code) to provide a consistent build environment between the two (which should benefit Crater's coverage a lot) and to reduce the overral maintenance burden.

Thanks to that building a standalone service to precompile proc-macros to wasm wouldn't be a huge undertaking, as rustwide already deals with maintaining a build environment and keeping everything sandboxed and secure. Of course the work for it has to be done, and infra resources have to be allocated to ensure it's secure and to operate it, so we shouldn't just assume it will get done.

gnzlbg · September 7, 2019, 8:01pm

How would users use this ?

AFAICT, these restrictions would need to be opt-in, at least on the current edition, because otherwise that would be a breaking change. That is, at least initially, users would need to write down somewhere (e.g. in their cargo.conf) that they don't want to allow any macros to do this or that. Is that correct?

Do the proposals include making these restrictions opt-out instead to protect users by default?

If so, how would the migration to an opt-out model work ? I could imagine that a future Rust2021 edition could make these restrictions opt-out, but in a dependency graph proc macro crates using different Rust editions would need to be able to co-exist. Would the opt-out be dependent on the edition of the crate a Rust macro gets expanded in ? (this would require users to upgrade all their dependencies to Rust2021 to get protection by default) Maybe on the edition of the proc macro crate? (this would require users to verify that they do not have any proc macro dependencies using an older edition) Something else?

Dushistov · September 7, 2019, 10:04pm

But isn't idea of precompiled source code make all idea insecure? There are a lot of infrastructure between end user that download wasm binary and crate source code, like random machine in the cloud that build source code, crates.io server and random machine in CDN that may cache wasm binary. So to make sure that all secure user should recompile crate source code before the first use, may be better just deliver source code as usual and compile wasm on user's machine?

Even if wasm binary run inside sandbox, it may just starts add "bad" code into generated code, and to hide its maleficent intent it may start do it not right now, but when current datetime will be >= some constant.

Topic		Replies	Views
Pre-RFC: Sandboxed, deterministic, reproducible, efficient Wasm compilation of proc macros language design	60	19531	April 5, 2024
Is executing arbitrary code at build time a good idea? language design	16	1969	April 24, 2019
Deterministic isolated proc-macros	27	1774	September 5, 2024
Caching of Proc-Macro expansions libs	45	1293	September 25, 2024
Sandbox build.rs (and possibly proc-macro) by providing a runner as env variable cargo	13	1192	January 25, 2023

Pre-RFC: procmacros implemented in wasm

Related topics