Making crates.io verify code against repository?

droundy · February 21, 2021, 4:29pm

This attack requires compromising the git repository. The security benefit is precisely to require that. It's not so hard to make the git repository more secure than the crates.io tokens are: just encrypt your ssh private key. Since we really want to secure our git repositories in any case, making crates.io at least as secure seems a major benefit, even if it's opt-in.

One can hope that the maintainers of the most widely used crates will take the effort to secure their repository keys.

kornel · February 21, 2021, 6:11pm

When I said that, I had a different attack in mind, one where a malicious actor would publish their own crate and their own repo, except the repo would have source code that looks innocent while the crate would contain something else. This is a way to mislead users who review source on GitHub instead of reviewing the actual crate source.

I am very worried about insecurity of access tokens, but IMHO that should be addressed directly with 2FA on crates-io, and not indirect and more complicated repo<>crate matching machinery. There's no guarantee that the repo is well protected, but crates-io could enforce its own 2FA.

ratmice · February 21, 2021, 7:29pm

Indeed, I think the most useful and least misleading thing would be for crates.io to tell you the SHA corresponding to a crate. Trying to attest to tags/branches seems absolutely the wrong thing to do.

atagunov · February 21, 2021, 10:26pm

Isn't it true that an attacker who has gained access to publish to crates.io is free to publish a version of a crate that references the attacker's own GitHub repo?

pietroalbini · February 22, 2021, 9:51am

If the goal is to make it harder for attackers to publish malicious versions of an existing crate with compromised access tokens, I think implementing some sort of 2FA or staged uploads on the crates.io side would offer better protection than adding optional code verification against a git repository.

While not impossible, implementing code verification wouldn't be trivial either. Cargo modifies the source code of the crate when it generates .crate files (for example it moves the original Cargo.toml to Cargo.toml.orig, changing the contents of Cargo.toml), so we'd have to detect which version of Cargo was used to upload the crate (which is not possible for older Rust versions), download that version (potentially a nightly) and run cargo package on the repository. Cloning git repositories could also be really slow, clogging the crates.io background jobs queue when a lot of crates are published in a short amount of time.
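To make the packaging differences concrete, here is a minimal sketch of a comparison that tolerates them. It assumes both sides have already been read into dictionaries mapping relative paths to bytes. The function name `compare_crate_to_repo` and the `GENERATED` set are hypothetical, and treating `Cargo.lock` as generated is an assumption of this sketch, not something stated above.

```python
# Files that Cargo writes or rewrites when packaging. This set is an
# assumption for the sketch, not an exhaustive or authoritative list.
GENERATED = {"Cargo.toml", ".cargo_vcs_info.json", "Cargo.lock"}


def compare_crate_to_repo(crate_files, repo_files):
    """Return sorted paths whose packaged contents differ from the repository.

    Both arguments map relative paths to file contents (bytes).
    """
    mismatches = []
    for path, packaged in crate_files.items():
        if path == "Cargo.toml.orig":
            # The author's original manifest is shipped under this name,
            # so it is compared against the repository's Cargo.toml.
            if repo_files.get("Cargo.toml") != packaged:
                mismatches.append(path)
        elif path in GENERATED:
            # Generated by Cargo; reproducing it byte for byte would
            # need the same Cargo version, so it is skipped here.
            continue
        elif repo_files.get(path) != packaged:
            mismatches.append(path)
    return sorted(mismatches)


crate = {
    "Cargo.toml": b"# rewritten by cargo\n",
    "Cargo.toml.orig": b"[package]\nname = \"demo\"\n",
    "src/lib.rs": b"pub fn f() { evil() }\n",
}
repo = {
    "Cargo.toml": b"[package]\nname = \"demo\"\n",
    "src/lib.rs": b"pub fn f() {}\n",
}
print(compare_crate_to_repo(crate, repo))  # ['src/lib.rs']
```

Skipping the rewritten `Cargo.toml` sidesteps the need to find the exact Cargo version, at the cost of not checking that file at all.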

eminence · February 22, 2021, 2:48pm

I'm a little confused about the proposed logistics here. Assume that "cargo publish" did the above-described validation. What would happen in the following scenario:

I, a malicious crate publisher, insert some malicious code into my crate. I commit the malicious code via mercurial/pijul/git/whatever. Due to the aforementioned validation, I'm required to push this code to some public repo before publishing. I do that. (I assume I also have to record the name of a version control reference in my Cargo.toml file, for example the name of a git tag.) Then I run cargo publish.

Crates.io on the backend will validate that the publicly available code matches what is uploaded in the .crate file (taking into account the differences described by pietroalbini).

But after that validation happens, I immediately delete the commit/tag that I just published, and perhaps publish a new one in its place (one that doesn't have the malicious code).

What happens now? Anyone can still download the crate file to inspect the code (edit: but the initial code verification is now useless). Will there be additional tools that will allow a downstream user of this malicious crate to say "the code uploaded at package time used to match some public commit, but now that commit cannot be found"? This is too expensive to do every time a crate is downloaded. Maybe cargo-crev could learn to do this?

FWIW, I am somewhat skeptical of this code-comparison approach, and I also think that 2FA or something similar is the better approach to protect against the misuse of compromised crates.io credentials.

droundy · February 22, 2021, 3:12pm

Not if my idea is implemented and a previous version referred to the repository. The idea is to use the previous version's repository configuration to prevent that from happening.
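As a rough illustration of that rule, the check below accepts a new version only if it declares the same repository as the previous version. This is a sketch under assumptions: `normalize_repo` and `publish_allowed` are hypothetical names, and the URL normalization is deliberately simplistic.

```python
def normalize_repo(url):
    """Normalize trivially different spellings of the same repository URL."""
    url = url.strip().rstrip("/")
    if url.endswith(".git"):
        url = url[:-4]
    return url


def publish_allowed(previous_repo, new_repo):
    """Accept a new version only if it keeps the previously declared repository.

    `previous_repo` is None when the prior version declared no repository,
    in which case this check does not apply.
    """
    if previous_repo is None:
        return True
    if new_repo is None:
        return False
    return normalize_repo(previous_repo) == normalize_repo(new_repo)
```

With this rule, a stolen crates.io token alone would not be enough to redirect verification to a repository the attacker controls.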

droundy · February 22, 2021, 3:14pm

I agree that would provide more security if it were used. If it were optional, it might well provide less security. Security approaches that are complicated or inconvenient aren't used.

Nemo157 · February 22, 2021, 3:18pm

I think that cargo-crev could (somewhat easily) add support for helping with this, so I opened an issue about it:

github.com/crev-dev/cargo-crev

Verify crate contents against repository

opened 03:16PM - 22 Feb 21 UTC

closed 04:44AM - 15 Jan 22 UTC

Nemo157

There are two very helpful steps that `cargo crev` could provide as a starting point to review a crate, probably as separate subcommands that you can use while in the context of `cargo crev goto`:

1. Show differences between the published crate files and what has been recorded in `.cargo_vcs_info.json` (I'll just refer to this as the _identified commit_ from now).
2. Verify that the repository contains a tag for the version number pointing to the identified commit.

These would both not indicate anything malicious if they fail, but depending on how they fail the reviewer may be able to draw additional data to add to their review.

My suggestion for implementation:

1. Check that the `.cargo_vcs_info.json` exists and has an _identified commit_ to use, with a known VCS.
2. Attempt to checkout the url given as `package.repository` in the metadata into a temporary sandbox (with pre-confirmation as connecting to the server may be a privacy leak).
   a. If checkout fails, allow user to specify the url in case they are able to derive/find a usable URL.
3. Verify that the repository contains the identified commit.
   a. If it does, generate a diff from the published crate files to the repository content at that commit for the user to peruse.
4. Print all tags (with name, commit, and signature details if signed) that:
   a. Contain the version number as a substring
   b. Reference the identified commit (both directly and looking through a signed tag).

<sub>Inspired by discussion in <https://internals.rust-lang.org/t/making-crates-io-verify-code-against-repository/14075></sub>
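Steps 1 and 4 of that suggestion can be sketched in a few lines. This is not how `cargo crev` implements anything: the function names are hypothetical, the tag list is assumed to be already resolved to commits, and treating 4a and 4b as alternatives (either one is enough to list a tag) is one reading of the issue text. The only format detail relied on is that `.cargo_vcs_info.json` stores the commit under `git.sha1`.

```python
import io
import json
import tarfile


def identified_commit(crate_bytes):
    """Return the git SHA recorded in a .crate archive, or None if absent."""
    with tarfile.open(fileobj=io.BytesIO(crate_bytes), mode="r:gz") as archive:
        for member in archive.getmembers():
            # Entries sit under a single "<name>-<version>/" directory.
            if member.name.split("/", 1)[-1] == ".cargo_vcs_info.json":
                info = json.load(archive.extractfile(member))
                return info.get("git", {}).get("sha1")
    return None


def candidate_tags(tags, version, commit):
    """Pick tags that mention the version or point at the identified commit.

    `tags` maps tag names to the commit each one resolves to.
    """
    return {
        name: sha
        for name, sha in tags.items()
        if version in name or sha == commit
    }
```

A reviewer would then look for a tag that satisfies both conditions. A version tag pointing elsewhere, or no tag at all, is not proof of malice, but it is worth noting in the review.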

riking · February 25, 2021, 9:48am

Not crates.io itself, but certainly a Rust-specific crate source viewer website would be an excellent place to implement Rust-only features like running rust-analyzer or building registry-wide crossreferences to symbols. If this site became the standard for casual source review of Rust crates, then the "I'm not actually looking at the published source" problem goes away.

carols10cents · February 25, 2021, 3:49pm

Again, docs.rs already implements source viewing: rand 0.8.3 - Docs.rs

RalfJung · February 26, 2021, 8:39am

What is the right place to submit feature requests for this? For example, when pointing to a particular function, it would be good to be able to link to a line in the code, similar to what the rustdoc, GitHub, or GitLab source viewers do.

Nemo157 · February 26, 2021, 12:56pm

The source view is definitely very barebones at the moment. IMO it'd be great to have an overhaul of it to make it more useful.

jyn514 · April 30, 2021, 7:53am

I don't know how useful it is to have both docs.rs source view and the auto generated rustdoc source. Docs.rs will never be able to offer go to definition or anything like that, it doesn't have any info about the code itself. IMO what we should do instead is make it easier to navigate between the Rust source that rustdoc generates, and the non-rust code hosted by docs.rs that rustdoc ignores.

(To be clear, when I say non-rust code, I mean things like README.md, build.rs, Cargo.toml.)

jhpratt · April 30, 2021, 9:26am

I can neither remember the name of the project nor find it relatively quickly, but I recall a project from a while back that generated static webpages where the code contained go-to-definition style links, type annotations on hover, and similar. It was a clever thing that I wish were more widely used. (Yes, I know not remembering the name isn't helpful.)

I would presume something like this could theoretically be tied into the docs.rs build setup, though it would certainly extend the already stressed build times.

bjorn3 · April 30, 2021, 9:30am

Rustdoc excludes source files that aren't used while compiling the current crate. That includes source files used on different platforms, build scripts and files read by build scripts.

matklad · April 30, 2021, 10:27am

jyn514 wrote: "Docs.rs will never be able to offer go to definition or anything like that, it doesn't have any info about the code itself."

I am more optimistic. It'll take some time, but we'll get there one day. Just six years ago we didn't have goto definition in the editors.

EDIT: actually, I am somewhat surprised that we just don't have this already widely deployed? Opened https://github.com/rust-analyzer/rust-analyzer/issues/8696.

jhpratt · April 30, 2021, 10:33am

That's the one! I guess I was mistaken that it generated static pages, but that should be possible if truly desired.

system · July 29, 2021, 10:34am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
