[Pre-RFC] Cargo Safety Rails

Ok, so I am confused here. What is the threat model this would protect us from? Who is doing what to whom, and how would this help?

I think this feature is a blunt tool. It’s going to create a lot of wasted effort, and overall it will be an ineffective method of ensuring the safety of Rust dependencies.

  • Having each user individually decide “is it safe?” for each Cargo.toml creates a lot of fragmented and duplicated work (it confuses trust with safety; one is subjective and distributed; the other — in Rust terms — is an objective property of the code itself).

  • Auditing someone else’s code is hard, and a “drive-by” check is going to have low accuracy (not enough time and familiarity to spot hidden errors, and OTOH safe blocks of code may be unnecessarily rejected if they look scary to an outsider).

  • In code reviews, 1-line changes usually get picked apart, while 500-line changes get “LGTM”. The same effect is likely to make users just accept large crates or deep dependency trees and hope for the best.

So in the end it may just create a lot of work spent micromanaging a moat in Cargo.toml, with shallow peeks at the code and going with a hunch.

The effort could be better spent on shared community audits, like the Libz Blitz, so that code is audited more carefully, with a more in-depth look, and that effort is done once, not duplicated by every crate user.

And instead of a blunt "has/doesn’t have unsafe" heuristic, it may be better to use more specific signals such as test coverage, static analysis tools, fuzzing, runtime sanitizers, etc.

2 Likes

I think I’m interpreting this very differently from other commenters. I would assume (since it would be a breaking change otherwise) that every crate is trusted by default. This would allow users to specify that they only trust certain crates to contain unsafe code, and don’t want other crates to introduce unsafety.

This seems eminently reasonable to me, since unsafe code is required to uphold complex invariants to guarantee that you don’t get undefined behavior. While there is obviously still the potential for logic bugs, a key goal of Rust is to eliminate a category of very high risk bugs, and that relies on all unsafe code in your dep graph upholding all of these invariants.

I could easily see projects deciding that they only trust std & rayon, for example. And I could see people wanting to be notified when unsafe code is introduced into a dependency when they upgrade, so that they can audit it and see if they actually want to upgrade.

Obviously it’s a blunt instrument, but how is that worse than no instrument? I don’t see where anyone has suggested we make all crates untrusted by default. This would be something users would opt into.

12 Likes

That matches my reading of the original proposal:

I've tried to come up with a design that doesn't change the experience for Cargo users who wouldn't use this feature (e.g. dependencies should be trusted by default).

And I very much agree with it, as much as I agree with the feature proposed here. Having deps untrusted by default would not just be a breaking change; such a check is also only useful for a subset of use cases. In many (most?) use cases it is not useful and, as many correctly point out, even harmful, so not even future epochs should make all dependencies untrusted by default. There is only a small subset of the cases Rust gets used in where there is a strong bidirectional "unsafe" <=> "may have vulnerabilities" relation, and such crates can opt into the feature.

However, it would be cool if leaf crates could opt in to treating all their dependencies as untrusted by default, except for the trusted exceptions they list in Cargo.toml. This would spare them from having to attach "untrusted" to each and every untrusted dependency.

You mean this one? It is a vulnerability in the binary tool components of ImageMagick, and tool code is of course a different use case from a decoding/encoding library, which is supposed to be agnostic about where its data comes from (whether it’s an https download or a read from a file or whatever).

Yes, I see unsafe as merely a statement of “caution - safeguards removed”. It merely highlights an area where the handholding stops; it does not mark the code as bad.

I think this is an awesome idea.

Using the unsafe keyword doesn’t automatically mean the code is untrustworthy, but IMHO the keyword exists exactly to allow and warn about some unsafe code, and it would be nice to have this tracked over the dependency tree as well.

+1 for @konstin's solution, but I would replace [security] with [safety] to avoid confusion.

[safety]
allowed_unsafe = ["foo"]

Also, maybe people are disagreeing just because of this confusion. Unsafe code doesn’t mean it’s insecure. Marking your crate as unsafe doesn’t mean it is insecure.

6 Likes

What is the status on this? I strongly agree with the suggested RFC, and here is why:

About Rust’s “safety” / security:

  • correct usage of unsafe code requires a good understanding of ALL the possible pitfalls (and who knows them?), for instance (but not limited to):

    • standard raw memory management pitfalls / typical C pitfalls: buffer overflows, use-after-free (and double frees), dereferencing after pointer arithmetic (which may involve indices), explicitly using uninitialized memory, …;

    • allocating/freeing memory across FFI boundaries;

    • panic!-ing (unwinding the stack) across FFI boundaries;

    • generic memory-management code that handles ZSTs, or that converts references to generic data into raw pointers without accounting for references to DSTs being fat pointers and thus containing more data than just a single memory address;

    • uninitialized/zeroed memory vs. compiler optimizations based on bit layouts (i.e. the compiler may implicitly rely on data that you have unsafe-ly (un)initialized. For instance, with a non-nullable type such as a reference, Some::<&_>(::std::mem::zeroed()) is UB).

      If this situation seems unlikely, know that all it takes is mem::uninitialized() / mem::zeroed(), a generic type, and an enum somewhere (see the sketch just below this list);

    • “exception” safety;

  • Of course, non-unsafe code is plenty capable of bugs (and thus security issues) as well, such as:

    • logic bugs;

    • integer overflows (e.g. downcasting, extending without attention to the sign, and unchecked integer arithmetic that silently wraps in release builds, for those not aware of the available tools to handle it), which may, in turn, lead to logic bugs.

      This is especially dangerous with indices (there is little difference between an index and a raw pointer, except that “dereferencing” an invalid index causes a panic instead of straight UB, provided the indexed collection is well coded);

    • Deadlocks;

    • Poisoning, which may, in turn, lead to deadlocks;

    • OS-level race conditions (which may, in turn, lead to logic bugs);

    • Hash-based collections using DoS-vulnerable hashing logic (::std::hash::Hasher);

    • Memory leaks;

Thus non-unsafe does not mean “safe”!
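
To make the zeroed-memory pitfall from the unsafe list above concrete, here is a minimal sketch (my own illustration, not code from the pre-RFC); the commented-out line shows the UB pattern, and MaybeUninit shows the sound way to handle not-yet-initialized data:

use std::mem::MaybeUninit;

fn main() {
    // UB pattern from the list above (kept commented out on purpose): a
    // reference is a non-nullable type, so materializing an all-zero bit
    // pattern for it is undefined behavior even before the value is used,
    // e.g. because the compiler may use the null pattern to encode `None`:
    //
    //     let bad: Option<&u8> = Some(unsafe { std::mem::zeroed() });
    //
    // The sound alternative is MaybeUninit, which never claims the bytes
    // hold a valid value until assume_init() is called:
    let mut slot: MaybeUninit<u64> = MaybeUninit::uninit();
    slot.write(42);
    // Safety: `slot` was fully initialized by the `write` call above.
    let value = unsafe { slot.assume_init() };
    assert_eq!(value, 42);
}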

Now that that’s been cleared up, and after reading all the comments, I still think there is a need for crates to be labelled according to whether they use unsafe or not (imho some kind of label / badge on crates.io would already be very helpful).

Why is that? Because of small crates from unknown (and potentially novice) people; they should be encouraged to publish crates, which will only really work if people use them. And that should only happen if their crates can be trusted.

Obviously an audit should be done, but realistically there may not be time for that. With a flag saying “hey, I know you don’t know me, and I know I’m still grasping Rust, but you can still use my crate without fear since I have not used unsafe”, it would be easier for people to depend on newer and more exotic crates.

“But you just said that not using unsafe does not guarantee safety!” - you may retort. Well, if you look more closely at the above enumeration of safety issues, you will realize that, except for some weird logic bug or a faulty crypto implementation, most safety issues in exclusively non-unsafe code are controlled crashes, i.e., denial-of-service attacks. A panic/DoS is not UB (which is like adidas, where “impossible is nothing”: anything can happen with UB, including remote code execution!). A panic/DoS is something non-production code can afford.

Conclusion: when you want to use / depend on some exotic / poorly known crate, and when your library/application can afford minor safety issues such as panics (but cannot allow UB in your code), it would be very helpful to be able to do so without fear thanks to a “uses-unsafe”-based mechanism, such as the one proposed in this pre-RFC. Currently, you have no choice but to hope that the crates you depend on do not invoke UB, which leads to depending only on trusted (e.g. popular) crates.

An unsafe badge just makes you fear crates, and doesn’t offer any help in checking whether they are actually safe or not. And as you’ve noted, absence of the unsafe badge doesn’t mean they’re safe either.

Instead of FUD-badging crates, try crev. It is a much more positive tool: instead of blind trust in crates and fear of the unknown, it gives you a real way to actually trust the code you're running.

4 Likes

But crates should be feared! It is code executed on your machine we are talking about! I agree that a community-rating tool such as cargo-crev is more than welcome to help in that matter (I didn’t know of it; thank you for the reference). But it does not solve the issue at hand: the exotic / unrated crate that just appeared. I am sorry to say so, but this kind of crate is more fearsome if it uses unsafe than if it does not. In the same way, a C program is more fearsome than a Python script. Both can be harmful, of course! But it is far more likely for the former.

Regarding cargo-crev vs. labelling unsafe usage, I ask: why not both? If a crate had a “uses-unsafe” label, then using a tool like cargo-crev would be a good way to override the initial mistrust.

I even imagine the following labelling system:

enum SafetyLabel {
    // Enough community reviews exist: show the aggregated rating.
    CommunityReviewed {
        safety_rating: f32,
    },

    // Not enough reviews yet: fall back to whether `unsafe` is used.
    NotEnoughReviews {
        does_it_use_unsafe: bool,
    },
}

So, if a crate got enough ratings, that would be the badge shown, with the associated average grade or something like that; otherwise, it would get a badge stating whether unsafe was used. Then the creator of an unsafe-using crate, in order to promote it and earn those positive reviews, would add an explicit justification of the unsafe usage and why they think the usage is sound.
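
A small usage sketch of the enum above, just to make the badge-selection logic concrete (the threshold of reviews, the 5-point scale, and the wording are made up for illustration):

fn badge_text(label: &SafetyLabel) -> String {
    match label {
        // Enough reviews: show the community rating.
        SafetyLabel::CommunityReviewed { safety_rating } => {
            format!("community reviewed: {safety_rating:.1}/5")
        }
        // Otherwise fall back to the blunt unsafe-usage signal.
        SafetyLabel::NotEnoughReviews { does_it_use_unsafe: true } => {
            "not enough reviews, uses unsafe".to_string()
        }
        SafetyLabel::NotEnoughReviews { does_it_use_unsafe: false } => {
            "not enough reviews, no unsafe".to_string()
        }
    }
}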

Because currently, this is sadly not always the case. In the current situation, people can hide the fact that unsafe is used, which, in turn, leads security-aware people to mistrust all exotic crates by default, without any kind of hierarchy among them whatsoever.

1 Like

While I have no strong opinions about the details of whatever mechanisms and tools we end up using here, I do think it's really important that we agree (or discover we don't agree) on this point:

A crate that uses unsafe code, given that the soundness of its unsafe code has been proven beyond question, should not be treated any differently from a crate that has only safe code.

Simply showing badges for presence/absence of unsafe code, in the absence of any other information, obviously violates that. If we treat known-sound unsafe code differently from safe code, that makes known-sound unsafe code seem more dangerous than it really is, and makes safe code seem more safe than it really is. This is what I believe @nrc meant in his earlier post when he said "I think this would both demonise unsafe code beyond what is reasonable and give a false sense of security", which I completely agree with.

Obviously, we need to nail down what "proven beyond question" actually means, but that's the kind of detail I don't have any strong opinions on.

1 Like

No, they should be treated with due care, but not driven by fear. "I'm afraid of what it may do" is just unhelpful in a security context.

You fear unsafe and UB, but others fear "safe" std::process::Command stealing their SSH keys. There have been suggestions to block access to std::io and std::process because of that fear. And there's another thread about sandboxing build.rs, because running someone else's code during cargo build feels more scary than running someone else's code during cargo run. And I fear someone will bypass all of them with a symbol collision via no_mangle or some clever flag in ./.cargo/config.

And I'm afraid that even if you add a badge for every suspicious Rust feature, someone will make a crate with no scary badges, which manages to do nasty things by taking advantage of a bug in another "safe" or "trusted" dependency.

Rust is full of scary things. Just being afraid of the majority of Rust's features gets you nowhere:

  • you'll end up rejecting lots of perfectly good crates out of unsubstantiated fear. Anything could be bad, but you don't know and you don't check, so the only option is to assume everything is bad.
  • but then you'll run into a dependency you can't live without and can't reinvent on the spot (for example, you'll need TLS connections), and you'll end up installing a ton of "bad" things anyway, and can only hope they're not as bad as you fear.
5 Likes

Note that there are two active threads discussing this same general idea:

1 Like

The case for flagging unsafe is more about common programming errors that render programs vulnerable despite the programmer’s good intent than about malicious intent (e.g. with process::Command and/or build.rs scripts). When I talk about mistrust, my personal motto is errare humanum est. And since a random bug in unsafe code leading to remote code execution may be harder to spot than, for instance, a directory traversal within a file access, I am more fearful of unsafe-using / C code than of non-unsafe-using / Python-like code.

Still, what you said is fair enough; tunnel-visioning on unsafe alone is not enough. Those are very interesting points, and they show that the issue of trusting crates is not as simple as it seems. I didn’t know of the two threads mentioned by @bascule, which seem like a better starting point for discussing this.

I am especially interested in the “crate capability lists”, since that is a solution broad enough to cover the different axes you mentioned: an unsafe capability, an io::{Read, Write} capability, a process::Command (or more broadly, any execve) capability, a build.rs capability, etc. These could be enabled by default but toggled off by more paranoid individuals.

1 Like

I don't want to derail the discussion but I think we could also discuss solutions where we allow libraries to do specific I/O using a more constrained API. I did not really think this through, this is just an example:

A TcpStreamFactory type that can be used by a library to create a TcpStream whenever it wants to. The caller of the library (the actual application) provides the TcpStreamFactory object. The caller can set various filters on the factory, for example allowing only specific addresses/hosts, limiting the number of TcpStreams created, etc. More elaborate solutions can be designed. If we started to design common traits that give limited capabilities to libraries, and the crate ecosystem actually used them, then we could simply forbid the use of the overly powerful general APIs in libraries. A benevolent library could work with those. I think this solution is even simpler and more powerful, and it does not actually need any language- or Cargo-level change or feature.
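
A rough sketch of what such a constrained factory could look like (the type name, fields, and policy checks here are just an illustration of the idea above, not an existing API):

use std::io;
use std::net::{SocketAddr, TcpStream};

// Hypothetical capability object handed to a library by the application.
// The library can only open connections that the caller's policy allows.
pub struct TcpStreamFactory {
    allowed_hosts: Vec<String>,
    max_streams: usize,
    opened: usize,
}

impl TcpStreamFactory {
    pub fn new(allowed_hosts: Vec<String>, max_streams: usize) -> Self {
        Self { allowed_hosts, max_streams, opened: 0 }
    }

    // The only way the library obtains a TcpStream: the host and the
    // connection count are checked against the caller-supplied policy first.
    pub fn connect(&mut self, host: &str, addr: SocketAddr) -> io::Result<TcpStream> {
        if !self.allowed_hosts.iter().any(|h| h == host) {
            return Err(io::Error::new(io::ErrorKind::PermissionDenied, "host not allowed"));
        }
        if self.opened >= self.max_streams {
            return Err(io::Error::new(io::ErrorKind::Other, "connection limit reached"));
        }
        let stream = TcpStream::connect(addr)?;
        self.opened += 1;
        Ok(stream)
    }
}

The application constructs the factory with its own policy and passes it in, so the library never needs the ambient authority of calling TcpStream::connect directly.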

1 Like

I really like that. There are plenty of examples of this that are already best practices anyway -- like taking Read instead of a File -- and it would let a much simpler "this library isn't scary" check suffice for quite some time.
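
A tiny illustration of that existing best practice (the function and file names here are made up for demonstration):

use std::fs::File;
use std::io::{self, Read};

// Accepting `impl Read` instead of `File` means the function gets no
// filesystem authority of its own: callers can hand it a file, a socket,
// or an in-memory buffer, and tests never need to touch the disk.
fn count_lines(mut input: impl Read) -> io::Result<usize> {
    let mut text = String::new();
    input.read_to_string(&mut text)?;
    Ok(text.lines().count())
}

fn main() -> io::Result<()> {
    let from_file = count_lines(File::open("Cargo.toml")?)?;
    let from_memory = count_lines("one\ntwo\n".as_bytes())?;
    println!("{from_file} lines in Cargo.toml, {from_memory} lines in the buffer");
    Ok(())
}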

1 Like

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.