[pre-pre-rfc?] Solving Crate Trust


#21

Very nice pre-pre-RFC, although I don’t understand why many capabilities are opt-out/allow-by-default/blacklist instead of opt-in/deny-by-default/whitelist. Given that we are explicitly talking about a security/safety/trust problem here, shouldn’t we just be extra vigilant and as strict as possible?

For example, I would very much like "allowed to use unsafe" to be denied by default. In my experience, many crates that use unsafe really don’t need the unsafe. Some programmers use it for “optimization” purposes; others are C programmers trying to avoid learning how to borrowck; and very few are actual, legitimate users who are trying to call into a C library or write a safe primitive to build on.

Related: I am currently working on a project that will explicitly advertise itself as "transitively free of unsafe". For now, that means I have gone through every dependency of my project one by one, recursively, and verified whether and how each uses unsafe in any of the functions/types I am using from it. In the process, I’ve also submitted pull requests that remove unnecessary unsafes from several of the crates I depend on. In the future, if I update any of these crates, I will need to go through it again.

To mitigate this, I think many of us would appreciate some sort of tooling, e.g. cargo safe, that would allow one to query and track uses of unsafe in the transitive dependencies.
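As a rough illustration of what the core of such a tool might do, here is a minimal sketch in Rust. The helper name `count_unsafe` and the text-based approach are assumptions made for illustration only; this is not an existing Cargo subcommand. A real tool would parse the source rather than scan text, since this version ignores block comments, string literals, and macro-generated code.

```rust
/// Count whole-word occurrences of `unsafe` in Rust source text,
/// ignoring anything after `//` on a line.
/// Hypothetical helper: a naive text scan, not a parser.
fn count_unsafe(source: &str) -> usize {
    source
        .lines()
        .map(|line| {
            // Drop line comments (naive: also cuts at `//` inside string literals).
            let code = line.split("//").next().unwrap_or("");
            // Split on anything that cannot be part of an identifier,
            // so `unsafe_code` or `is_unsafe` do not count.
            code.split(|c: char| !(c.is_alphanumeric() || c == '_'))
                .filter(|word| *word == "unsafe")
                .count()
        })
        .sum()
}

fn main() {
    let sample = "\
// unsafe in a comment is ignored
#![forbid(unsafe_code)]
fn read(p: *const u8) -> u8 {
    unsafe { *p }
}
unsafe impl Send for Wrapper {}
";
    println!("unsafe occurrences: {}", count_unsafe(sample));
}
```

Running something like this over every source file of every transitive dependency, and storing the counts per crate version, would give a baseline to diff against on each update.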


#22

Because we don’t know which security problem the user is concerned about; it is up to them to pick. This RFC is providing a framework that lets users decide what issues they care about and then audit for those.

Also, if this is built into Cargo, making capabilities deny-by-default would be a breaking change.

For example, I would very much like “allowed to use unsafe” to be denied by default. In my experience, many crates that use unsafe really don’t need the unsafe.

This does not match my experience. Plenty of crates need it, especially for FFI. There are amateurish crates out there, sure, but that’s a problem anyway; you should be picking good crates.

To mitigate this, I think many of us would appreciate some sort of tooling, e.g. cargo safe, that would allow one to query and track uses of unsafe in the transitive dependencies.

This is … basically what is being proposed here? A superset solution.


#23

I see this might be useful (I’m not really trying to decide if this exact set of capabilities is best, I mean the general idea). I’m a bit worried about the initial scope ‒ would it make sense to start small first?

Also, this is just the first step. This tells me which dependencies I might want to audit, but it doesn’t help me with the audit itself. I understand that is not the goal, but maybe it would be worth stating explicitly as a non-goal (when I read the title, I kind of expected this to be addressed as well).

I also think some kind of signatures/reviews/audits published to crates.io (or elsewhere) could help with the other half ‒ at least for individual developers without the means to audit everything they depend on. Firefox has a large team and the resources. If I want to write my own little program, I want it to be safe, but I don’t have the means to check everything myself. But yes, that’s probably a different part of the puzzle.


#24

Maybe use Google’s PageRank
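For what it’s worth, a PageRank-style score over the dependency graph could be sketched roughly like this. Everything here is an assumption for illustration: edges point from a crate to its dependencies, the damping factor is the conventional 0.85, the iteration count is fixed, and the crate names are made up. It measures how heavily a crate is depended upon, which is not the same thing as whether it can be trusted.

```rust
/// Power-iteration PageRank. `deps[i]` lists the indices of the crates that
/// crate `i` depends on, so rank flows from dependents to their dependencies.
fn pagerank(deps: &[Vec<usize>], damping: f64, iterations: usize) -> Vec<f64> {
    let n = deps.len();
    let mut rank = vec![1.0 / n as f64; n];
    for _ in 0..iterations {
        // Crates with no dependencies spread their rank evenly over all crates.
        let dangling: f64 = (0..n)
            .filter(|&i| deps[i].is_empty())
            .map(|i| rank[i])
            .sum();
        let base = (1.0 - damping + damping * dangling) / n as f64;
        let mut next = vec![base; n];
        for (i, out) in deps.iter().enumerate() {
            for &dep in out {
                // Each crate splits its rank equally among its dependencies.
                next[dep] += damping * rank[i] / out.len() as f64;
            }
        }
        rank = next;
    }
    rank
}

fn main() {
    // Made-up graph: 0 = "app", 1 = "http-client", 2 = "log-facade".
    let names = ["app", "http-client", "log-facade"];
    let deps = vec![vec![1, 2], vec![2], vec![]];
    let rank = pagerank(&deps, 0.85, 50);
    for (name, score) in names.iter().zip(&rank) {
        println!("{name}: {score:.3}");
    }
}
```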


#25

Another failure mode:

This could potentially happen with Cargo: the author of a popular crate may decide to sell it or accept a lucrative deal to “monetize” it. This means that even the age and popularity of a crate are no guarantee that the crate won’t become malware :frowning:


#26

Trust is not transitive. In the end, it will probably come down to auditing and community endorsement, I think…


#27

My biggest concern with this is that crates often seem to extend their functionality and thus acquire additional dependencies that I do not want and will not use. The ergonomics around splitting the functionality of a crate are not great right now.


#28

Would Cargo features help with optional functionality and its dependencies? It would be an additional aspect complicating the RFC, but it might be worth mentioning.


#29

About a year ago I read a post about the reproducibility of rustc releases (there is also an issue for this). I think the same approach applied to the crates.io libraries could make them more secure. At the least, anyone would be able to reproduce a build and compare it with the one from crates.io (or it could be done automatically).

I know we depend on LLVM, but as discussed in the post I mentioned above, everything can be solved.


#30

There are a few challenges with this. Namely, as far as I understand, releasing a crate right now involves building it, which may involve procedural macros, and in that case it’s tantamount to RCE. It’d be nice if such a build system weren’t a terrifying single point of compromise.

I covered some ways to address this, specifically through the use of reproducible builds and scoped credentials, in my Rust Bay Area Meetup talk on Macaroons last year:

(start at 32:15)


#31

Hello, I’m looking into this as a possible topic for my master’s thesis. I’d like to do something like an overview of the whole problem (“trust in third-party libraries and dependencies”), with applicability and implementation specifically for Rust and crates.io.

I’m at the stage where I’m not even entirely sure what the end result would look like. At the moment, the following features look interesting:

  • a mechanism to tie a published package on crates.io to a specific git(hub) commit (which can be signed, and this signature maybe used in place of, or in addition to, a separate crate signature?)
  • a mechanism to attach third-party signatures to this thing - for “trusted audits” that @kornel talked about
  • something about auditing subsequent commits, so that it’s easier to upgrade a package from a known good version
  • TUF-inspired mechanism(s) that would ensure integrity of the whole scheme, key revocation, etc.

I’m mostly thinking about the trust model and the general “allowed to exist” capability; more granular capabilities could maybe piggyback on the same basic underlying scheme (audit related to unsafe code only? to code harm overall? build scripts?), but I specifically don’t aim to solve granular enforcement, such as sandboxing.

I’m still trying to connect all the dots and form a more complete picture of the problem. I’ve been reading about TUF (and https://github.com/rust-lang/crates.io/issues/75) and similar. Are there more resources and/or related discussions that you could point me towards?


#33

Hello, I’m looking into this as a possible topic for my master’s thesis

This sounds great! Let me know how I can help!

FWIW @withoutboats is working on stuff in this space already

but I specifically don’t aim to solve granular enforcement, such as sandboxing.

I think that’s fine – I think having the ability to specify granular capabilities is important, but a different tool can enforce the capabilities. So your tool may allow users to specify/allow/deny various sandboxing-related capabilities, and we can have the build system query it to actually sandbox things.

I’m mostly thinking about the trust model and the general “allowed to exist” capability; more granular capabilities could maybe piggyback on the same basic underlying scheme (audit related to unsafe code only? to code harm overall? build scripts?),

I suspect it will be easier to do in the other direction – solve the problem for the general notion of capabilities (in general I think trust is very multidimensional), and then “allowed to exist” and “allowed to use unsafe” are specific variants of this. Enforcing “allowed to use unsafe” can be done later, but it would be nice to have some way of specifying “I am okay with this crate but it is not allowed to use unsafe” now, which can later interface with a tool that forbids unsafe.
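To make the “general capabilities first” idea a bit more concrete, here is a minimal sketch of how such decisions might be recorded. The `Capability` variants, the `Policy` type, and the crate names are all hypothetical and not taken from the RFC; the sketch only stores decisions and leaves enforcement to whatever tool queries `allows`.

```rust
use std::collections::BTreeMap;

/// Hypothetical capability names, loosely following the thread.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Capability {
    Exist,       // the crate may appear in the dependency graph at all
    UseUnsafe,   // the crate may contain `unsafe` code
    BuildScript, // the crate may run a build script
}

/// Per-crate overrides on top of a default answer.
struct Policy {
    default_allow: bool,
    overrides: BTreeMap<(String, Capability), bool>,
}

impl Policy {
    fn new(default_allow: bool) -> Self {
        Policy { default_allow, overrides: BTreeMap::new() }
    }

    /// Record an explicit decision for one crate and one capability.
    fn set(&mut self, krate: &str, cap: Capability, allowed: bool) {
        self.overrides.insert((krate.to_string(), cap), allowed);
    }

    /// Look up a single decision, falling back to the default.
    fn lookup(&self, krate: &str, cap: Capability) -> bool {
        self.overrides
            .get(&(krate.to_string(), cap))
            .copied()
            .unwrap_or(self.default_allow)
    }

    /// A crate that is not allowed to exist is allowed nothing else either.
    fn allows(&self, krate: &str, cap: Capability) -> bool {
        self.lookup(krate, Capability::Exist) && self.lookup(krate, cap)
    }
}

fn main() {
    let mut policy = Policy::new(true);
    // "I am okay with this crate but it is not allowed to use unsafe."
    policy.set("some-crate", Capability::UseUnsafe, false);
    policy.set("sketchy-crate", Capability::Exist, false);
    for krate in ["some-crate", "sketchy-crate"] {
        for cap in [Capability::Exist, Capability::UseUnsafe, Capability::BuildScript] {
            println!("{krate} {cap:?}: {}", policy.allows(krate, cap));
        }
    }
}
```

With this shape, “allowed to exist” is just one capability among several, and flipping `default_allow` switches between allow-by-default and deny-by-default without changing the data model.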

Are there more resources and/or related discussions that you could point me towards?

I don’t know where boats’ crates.io signature stuff is, but TUF and that are the only two things I can think of.


#34

This is probably the worst attack model Cargo has right now, since it means people trying to audit their crates can easily end up not actually auditing the code they’re using (like that NPM-related article). Ideally, crates.io would refuse to publish a crate if it links to GitHub but doesn’t have the same code as the repo it links to.
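The check itself could be sketched roughly as follows. This is only an illustration under stated assumptions: both file trees are already loaded into memory as path-to-bytes maps, the name `unverified_files` is hypothetical, and files that exist only in the repository are ignored because a package need not ship everything. A real check would also have to account for files that Cargo rewrites or generates when packaging.

```rust
use std::collections::BTreeMap;

/// Return the paths in the published crate whose contents differ from the
/// repository, or that do not exist in the repository at all.
fn unverified_files(
    published: &BTreeMap<String, Vec<u8>>,
    repo: &BTreeMap<String, Vec<u8>>,
) -> Vec<String> {
    published
        .iter()
        // Keep a file only if the repository has no byte-identical copy at the same path.
        .filter(|(path, bytes)| repo.get(*path) != Some(*bytes))
        .map(|(path, _)| path.clone())
        .collect()
}

fn main() {
    let mut repo = BTreeMap::new();
    repo.insert(
        "src/lib.rs".to_string(),
        b"pub fn add(a: u32, b: u32) -> u32 { a + b }".to_vec(),
    );

    // The published copy silently changes one file and adds another.
    let mut published = repo.clone();
    published.insert(
        "src/lib.rs".to_string(),
        b"pub fn add(a: u32, b: u32) -> u32 { a - b }".to_vec(),
    );
    published.insert("src/extra.rs".to_string(), b"// not in the repository".to_vec());

    for path in unverified_files(&published, &repo) {
        println!("mismatch: {path}");
    }
}
```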