(Copying some stuff from the optional namespaces thread)
There's been some recent discussion about a Medium post by a security researcher about the supply-chain attacks he performed on NPM, PyPI and RubyGems.
While the specific attack isn't applicable to Cargo, I think it is a good illustration of the problem with vulnerabilities based on supply-chain attacks, typo-squatting in particular: they accumulate quietly for a while, because attacks aren't really feasible in a small ecosystem; and when attacks do happen, the ecosystem has become too big to change course quickly.
In this case, the researcher made over $130,000 in less than 6 months, from bug bounties alone. Given the high-profile nature of the targets, one shudders to imagine how much he could have made by selling these vulnerabilities to malicious actors, or how many similar attacks exist undetected in the wild, camouflaged as bugs.
Granted, the supply-chain attacks Cargo is vulnerable to are subtler, and harder to exploit. But they very much exist, and the only thing keeping them from being exploited is that Rust isn't used on the same scale as Node or Ruby. This will change sooner rather than later.
Rust seriously needs to have a stronger supply-chain-security story. Part of that story would be better tools to control the capabilities given to dependencies (e.g. forbid unsafe code in dependencies, forbid arbitrary system calls or filesystem access in dependencies, WASI-style); and a large part of it would be strong defenses against typo-squatting.
After I posted this, people offered a few suggestions to mitigate these attacks. I don't think any of them really address the root of the problem:
- Hamming distance covers some, but not all typo-squatting attacks (for instance, foobar_async as a squat of async_foobar), and has false positives.
- TUF is opt-in, and doesn't cover supply-chain attacks on open-source dependencies.
- cargo-supply-chain is opt-in and only gives coarse information about package security.
- cargo-crev is opt-in, and requires a manual review process (more on this later).
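To make the Hamming distance point concrete, here is a rough sketch of what such a check looks like and where it falls short. The function names are mine, "serbe" is a made-up typo of serde, and `same_words_reordered` is a hypothetical extra heuristic, not something any registry is known to run:

```rust
/// Number of positions at which two equal-length names differ.
/// Returns `None` when the lengths differ, since Hamming distance
/// is only defined for strings of the same length.
pub fn hamming(a: &str, b: &str) -> Option<usize> {
    if a.chars().count() != b.chars().count() {
        return None;
    }
    Some(a.chars().zip(b.chars()).filter(|(x, y)| x != y).count())
}

/// Hypothetical extra heuristic: are the two names made of the same
/// `_`/`-` separated words, just in a different order?
pub fn same_words_reordered(a: &str, b: &str) -> bool {
    // Split a name into lowercase words and sort them so order is ignored.
    fn words(s: &str) -> Vec<String> {
        let mut w: Vec<String> = s
            .split(|c: char| c == '_' || c == '-')
            .map(|part| part.to_lowercase())
            .collect();
        w.sort();
        w
    }
    a != b && words(a) == words(b)
}
```

With these definitions, `hamming("serde", "serbe")` is `Some(1)`, so a one-letter typo is flagged, but `hamming("foobar_async", "async_foobar")` is `Some(12)`: every position differs, so a distance threshold sees two unrelated names. The word-order check catches that particular case, but it is one more heuristic, and an attacker can pick the next pattern it doesn't cover.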
I'm not sure how to communicate this, but I think that Cargo needs to form a threat model, and these suggestions don't really have one. Some things to consider:
- The threat model must assume that code can come from anybody, and libraries that accept code from unvetted strangers will outcompete libraries that only accept code after a rigorous vetting process (e.g. I'm currently contributing to Raph Levien's druid; for all Raph knows, I'm a DGSE agent planted to introduce vulnerabilities in his code; Raph has done none of the thorough background checks that would be needed to prove this isn't the case; yet he's still taking my PRs).
- The threat model must assume that people will be as lazy as they can afford to be when pulling dependencies. If people have a choice between a peer-reviewed dependency without the feature they need, and an unreviewed dependency with the feature, they will take the latter.
- The threat model must assume that both attackers and legitimate developers can write code faster than people can review it.
- The threat model must assume that some attackers will be sneaky and determined; if the ecosystem defends against supply-chain attacks with heuristics, they will learn to game these heuristics. If the ecosystem only checks new crates for suspicious code, they will write non-suspicious code at first and add the actual vulnerability months later.
We need a zero-trust model that starts from the assumption that any dependency code is untrusted, and works from there. That means a capability model, where we have the ability to restrict dependencies from performing broad types of actions (opening sockets, reading arbitrary memory, etc).
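As a very rough sketch of what "deny by default" could mean, here is a toy policy table. Everything in it (`Capability`, `Grants`, the crate name) is hypothetical and only models the bookkeeping; nothing in Cargo or rustc enforces anything like this today, and real enforcement would need compiler or sandbox support:

```rust
use std::collections::HashSet;

/// Broad classes of action a dependency might want (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Capability {
    Network,
    Filesystem,
    UnsafeCode,
}

/// Capabilities granted to one dependency. Zero trust: nothing by default.
#[derive(Debug, Default)]
pub struct Grants {
    allowed: HashSet<Capability>,
}

impl Grants {
    /// Start with an empty grant set.
    pub fn none() -> Self {
        Self::default()
    }

    /// Explicitly grant one capability.
    pub fn with(mut self, cap: Capability) -> Self {
        self.allowed.insert(cap);
        self
    }

    /// Reject any capability that was not explicitly granted.
    pub fn check(&self, krate: &str, cap: Capability) -> Result<(), String> {
        if self.allowed.contains(&cap) {
            Ok(())
        } else {
            Err(format!("crate `{}` was not granted {:?}", krate, cap))
        }
    }
}
```

The point of the shape is that `Grants::none().check("some_dep", Capability::Network)` is an error unless someone wrote the grant down, so a dependency that starts opening sockets in a later release shows up as a policy failure rather than slipping through review.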
Just being able to set a cargo option to transitively forbid unsafe code would be helpful, because it would mean crate maintainers would be more likely to get "I can't use that crate because I have a transitive-forbid-unsafe cargo config, please fix" types of issues.
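For reference, the building block that exists today is the `unsafe_code` lint, which a crate author can forbid for their own code. A minimal sketch follows; the module and function are made up for illustration, and a crate would more typically put `#![forbid(unsafe_code)]` at the top of lib.rs:

```rust
// `forbid` cannot be overridden further down with `#[allow(unsafe_code)]`,
// so any `unsafe` block inside this module is a hard compile error.
#[forbid(unsafe_code)]
pub mod parser {
    /// Safe-only helper: returns the first byte of the input, if any.
    pub fn first_byte(input: &[u8]) -> Option<u8> {
        input.first().copied()
    }
}
```

This only covers the crate that opts in. What's missing is a way for the downstream user to impose the same rule on the whole dependency tree.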