The problem
I assume that most readers here are aware of the recent flatmap-stream attack on Node. If you’re not, here’s the short version: the developer who managed a much-used library turned out to be malicious and injected malicious code, which flew under the radar and was installed as a dependency in gazillions of applications.
The Rust ecosystem is vulnerable to the exact same kind of attack: one could imagine a malicious developer taking over, say, Itertools or some other popular crate through social engineering, and using it to attack Servo users, for instance.
I believe that there are several ways to mitigate such attacks, at several levels, and I would like to suggest a purely-technical security approach, based on permissions.
Permissions
I would like to add the ability to tag crates by what they can do, and have cargo (or perhaps crates.io?) verify these tags when updating dependencies.
For instance, let us assume that we can differentiate the following crates by static analysis:
- crates that could perform I/O (including any crate that contains unsafe code, or crates that depend on crates that could perform I/O);
- crates that never perform I/O.
(I know that we would need to special-case things such as logging or build-time I/O, but please bear with me for the moment)
A crate InMemory that never performs any I/O is declared in Cargo.toml as follows:
[package]
...
permissions = []
If we write a crate that depends on InMemory, then the first time cargo build builds a given version of InMemory, it runs static analysis to ensure that neither InMemory nor its dependencies contain unsafe code or call into std::fs or std::process.
If this invariant is violated, compilation fails.
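To make the check above a bit more concrete, here is a minimal sketch of what a first, very naive pass could look like. It is a plain textual scan, not a sound static analysis: it ignores comments, string literals, macros, re-exports and `use` aliases, all of which a real checker would have to handle. The function name `find_violations` and the constant `FORBIDDEN_PATHS` are hypothetical; nothing like this exists in cargo today.

```rust
/// Hypothetical list of paths that a crate declaring `permissions = []`
/// may not mention. The two entries are the ones named in this proposal.
const FORBIDDEN_PATHS: [&str; 2] = ["std::fs", "std::process"];

/// Returns one message per forbidden construct found in `source`.
/// Naive textual scan only: comments, strings, macros and aliases are
/// not understood, so this can both over- and under-report.
pub fn find_violations(source: &str) -> Vec<String> {
    let mut violations = Vec::new();
    for (index, line) in source.lines().enumerate() {
        let line_number = index + 1;
        // Match `unsafe` as a whole word by splitting on every character
        // that cannot be part of an identifier.
        let has_unsafe = line
            .split(|c: char| !(c.is_alphanumeric() || c == '_'))
            .any(|word| word == "unsafe");
        if has_unsafe {
            violations.push(format!("line {}: unsafe code", line_number));
        }
        for path in FORBIDDEN_PATHS {
            if line.contains(path) {
                violations.push(format!("line {}: use of {}", line_number, path));
            }
        }
    }
    violations
}
```

A build tool following this proposal would fail the build of a crate declaring `permissions = []` whenever the returned list is non-empty for any of its source files.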
Conversely, a crate OnDisk which does perform I/O is declared in Cargo.toml as follows:
[package]
...
permissions = ["io"]
We do not need to check anything for this crate, but using it taints its dependents.
We now develop a crate MyCrate that depends on InMemory and OnDisk:
[package]
permissions = ["io"] # Since we depend on a package that requires io
[dependencies]
ondisk = { version = "*", permissions = ["io"] }
inmemory = { version = "*", permissions = [] }
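The tainting rule used in this example can be sketched as a small check over the dependency graph. This is only an illustration under my own assumptions: the `CrateInfo` type and the `missing_permissions` function are hypothetical, and I assume the tool already knows each crate's declared permissions and direct dependencies.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// A crate as a hypothetical permission checker would see it.
pub struct CrateInfo {
    /// Permissions listed under `[package]` in the crate's Cargo.toml.
    pub declared: BTreeSet<String>,
    /// Names of the crate's direct dependencies.
    pub dependencies: Vec<String>,
}

/// Returns the permissions that the direct dependencies of `name` declare
/// but that `name` itself does not. Looking at direct dependencies is
/// enough if every crate in the graph is checked the same way, since each
/// dependency must then already declare what its own dependencies need.
pub fn missing_permissions(
    name: &str,
    graph: &BTreeMap<String, CrateInfo>,
) -> BTreeSet<String> {
    let mut missing = BTreeSet::new();
    let info = match graph.get(name) {
        Some(info) => info,
        None => return missing, // unknown crate: nothing to report here
    };
    for dep in &info.dependencies {
        if let Some(dep_info) = graph.get(dep) {
            for permission in &dep_info.declared {
                if !info.declared.contains(permission) {
                    missing.insert(permission.clone());
                }
            }
        }
    }
    missing
}
```

With the three crates above, `missing_permissions("mycrate", &graph)` is empty as long as mycrate declares `"io"`, and contains `"io"` if mycrate declares no permissions.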
What we gain
If the static analysis is sound, we have safely decoupled code that can perform I/O from code that can’t. In practice, this provides a (minimal) form of crate-level type system.
In particular, if inmemory is ever updated to a version that performs I/O, the developer of mycrate will find out about this, as mycrate will stop building due to a now-invalid dependency on inmemory.
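As a rough sketch of the update scenario: the permissions granted in a `[dependencies]` entry are compared with what the resolved version of the dependency declares, and the build is rejected if the dependency asks for more. The function `check_grant` and its error message are hypothetical, made up for illustration.

```rust
/// Checks one `[dependencies]` entry against the permissions declared by
/// the resolved version of that dependency. Hypothetical helper: `granted`
/// is what the dependent crate wrote, `declared` is what the dependency
/// itself lists under `[package]`.
pub fn check_grant(
    dep_name: &str,
    granted: &[&str],
    declared: &[&str],
) -> Result<(), String> {
    // Permissions the dependency needs but the dependent did not grant.
    let excess: Vec<&str> = declared
        .iter()
        .copied()
        .filter(|permission| !granted.contains(permission))
        .collect();
    if excess.is_empty() {
        Ok(())
    } else {
        Err(format!(
            "dependency `{}` requires permissions not granted: {}",
            dep_name,
            excess.join(", ")
        ))
    }
}
```

If a new release of inmemory starts declaring `"io"` while mycrate still grants it `permissions = []`, `check_grant("inmemory", &[], &["io"])` returns an error, which is the point at which the build of mycrate would stop.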
Limitations
This is by no means a complete solution to malicious code. However, I believe that it would already mitigate all the simple cases.
The example above uses I/O as a permission. It is, of course, possible to think of other permissions.
We probably want to allow some specific releases of some crates to drop some permissions (e.g. log performs console I/O, which sounds fair enough) after passing some kind of audit. To be discussed.
Precedents
About 20 years ago, the MMM web browser used this mechanism to guarantee that OCaml extensions could do no harm (for some definition of harm). I have not heard of any problem caused by this policy, although I admit I haven’t attempted to follow this closely.