[Pre-RFC] Cargo Safety Rails

I assure you the spirit of the proposal is neither to be unfriendly nor antagonistic to the health of the ecosystem, though we can disagree about whether it would, in practice, help or hurt it.

My position is that given how deep dependency graphs already are, it's unreasonable to actually audit every single crate I depend on in many cases. If I care about ensuring type-safety (for security or reliability reasons, for example), it just means I'm not going to use anything from crates.io.

For me, this isn't theoretical; it's the reason we don't use crates.io in Tock (yet, anyway). So from my perspective, not having a mechanism to take care of this is harmful to the ecosystem.

Of course, that could be too special a case to solve for generally, and I don't think it's an unreasonable conclusion to say that this feature is better off overall in, say, a fork of cargo. But anyway, that's my justification for the proposal being in service of the ecosystem rather than hostile to it.

I wonder if some of the other forms of the proposal mitigate this concern at all. For example, @lxrec suggested having an explicit opt-in in the top-level crates being compiled (allow_unsafe_default = false), which would (or could?) mean that if the end-user doesn't explicitly opt-in, there would be zero impact even if some crate along the dependency chain uses it.

FWIW, in Safe Haskell I don't think it really has this effect (although Safe Haskell has module-level granularity, so a library author can declare certain parts trusted or safe). In practice, what happened there is that the vast majority of Hackage packages were already safe, but some packages (Vector, aeson, binary, bytestring, wai, etc.) are not, for performance or ergonomic reasons, and they are simply marked as Trustworthy. Perhaps it would be helpful to empirically assess what that would look like on crates.io as well? That kind of comes down to whether this is completely an objection in principle or a pragmatic one...

I also wonder if @Nemo157's suggestion of having a way for dependent crates to detect if they are compiled with unsafe allowed or not would mitigate this concern. For example, if you are making a non-breaking change that uses unsafe as a performance improvement, you could simply gate that change on whether you're "allowed" to use unsafe.
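To make that concrete, here's a minimal sketch of what such gating could look like, assuming (purely hypothetically) that Cargo exposed a cfg flag named `unsafe_allowed` to dependencies; neither the flag name nor the mechanism exists today:

```rust
// Hypothetical: assumes Cargo would pass `--cfg unsafe_allowed` when the
// end user has not restricted unsafe code. The flag name is made up.
#[cfg(unsafe_allowed)]
fn fill_zero(buf: &mut [u8]) {
    // Unsafe fast path, only compiled when unsafe code is permitted.
    unsafe {
        std::ptr::write_bytes(buf.as_mut_ptr(), 0, buf.len());
    }
}

#[cfg(not(unsafe_allowed))]
fn fill_zero(buf: &mut [u8]) {
    // Safe fallback used when the end user has opted out of unsafe deps.
    for b in buf.iter_mut() {
        *b = 0;
    }
}
```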

(I'm also realizing that maybe the fact that we're using "allow" here is triggering negative reactions, because of course as a library author you can use whatever you want; it's just that if I want to depend on your library without trusting it, that would only be possible if your library compiles with -F unsafe_code.)
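For reference, the in-source counterpart of that flag is the crate-level forbid attribute; a crate that carries it simply won't compile if it contains any unsafe code:

```rust
// Equivalent in effect to compiling with `rustc -F unsafe_code`: any use of
// `unsafe` anywhere in this crate becomes a hard compile error.
#![forbid(unsafe_code)]

fn main() {
    // unsafe { /* ... */ } // uncommenting this line would fail to compile
    println!("no unsafe in this crate");
}
```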

I completely agree. This proposal only practically works if unsafe is used sparingly (I believe it is, but am not sure). Though, I think having some technical support to assist library users in figuring out which dependencies need the most auditing is complementary to reducing the amount of unsafe code.

If I'm writing C, I basically need to be fairly confident about every line of code I compile into my binary (whether I or someone else wrote it). Part of the goal of a strong type system (different people have different justifications, but I think this is one of them) is to mitigate that. If I have no idea whether a dependency is using unsafe, that subverts this goal.

2 Likes

Also, happy internals "anniversary" @Gankra! (according to a nice birthday cake icon near your handle)

4 Likes

It seems very reasonable to me for users to want to decide they don't trust (some) ecosystem authors to write unsafe code. Unsafe code is not bad, but it requires a lot more trust to use than safe code.

2 Likes

There are many kinds of such "breaking changes" already:

  • If you start depending on nightly Rust, your crate breaks for all users of stable/beta Rust.
  • If you start depending on features of Rust only available in newer releases, e.g. the ? operator, your crate breaks for all users of older Rust.
  • If you add a crate with a copyleft license like the LGPL or the GPL, or even the MPL, your crate breaks for projects with a licensing policy like servo's or rustc's (servo allows MPL crates, though).
  • Similarly, if you add a crate licensed under Apache License 2.0 only (no dual licensing with MIT) as a dependency, it may make your crate unusable for (L)GPLv2-licensed crates.
  • If you previously supported no_std and now you don't support it any more.
  • If your crate provides bindings to a native C dependency via pkgconfig and starts requiring a newer version of it.
  • If your crate suddenly has more than 4 kB of code and therefore doesn't fit onto some chip any more.
  • If you use crates from outside crates.io via a git dependency, and the server suddenly stops hosting the git repo.

I think it will never really be possible to guarantee that no users will notice issues when updating patch releases; some users will always run into issues.

Also, adding unsafe to a crate that has never used unsafe before should be pretty rare; if someone adds a patch to optimize something using unsafe, the crate most likely already contains unsafe.

I don't think this will encourage micro-crates (I agree that too-small crates are hurting the ecosystem cough iron cough). If it encourages anything, it will be features like unsafe that turn on an unsafe version of some function. And that would IMO be a great thing to happen, because it gives users access to the performance <-> security tradeoff, which they can decide upon based on their individual needs/policies.
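As a sketch of that pattern (the feature name `unsafe_opt` is made up, and the performance benefit is only illustrative), a crate could declare a feature in its Cargo.toml and gate the unsafe variant on it:

```rust
// Assumes the crate's Cargo.toml declares:
//
//     [features]
//     unsafe_opt = []
//
// Users who accept the tradeoff enable the feature; everyone else gets the
// safe implementation.
#[cfg(feature = "unsafe_opt")]
pub fn sum(xs: &[u32]) -> u32 {
    // Unsafe variant: skips bounds checks via get_unchecked.
    let mut total = 0u32;
    for i in 0..xs.len() {
        // SAFETY: `i` ranges over 0..xs.len(), so it is always in bounds.
        total = total.wrapping_add(unsafe { *xs.get_unchecked(i) });
    }
    total
}

#[cfg(not(feature = "unsafe_opt"))]
pub fn sum(xs: &[u32]) -> u32 {
    // Safe default implementation.
    xs.iter().copied().fold(0u32, u32::wrapping_add)
}
```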

Crates may have different positions towards unsafe; this new feature will only enable leaf crate authors to have the final say, which I think is the right place for it.

I think this would both demonise unsafe code beyond what is reasonable and give a false sense of security. While I agree that evaluating the quality level of a program including its dependencies is a good goal, I think focussing on unsafety as a way to do so is somewhat naive. I would prefer to rely on social solutions such as audits (e.g., the libs blitz, or some security-focussed variant) and formal approaches such as the RustBelt work.

I believe that the majority of unsafe code is not a problem - it is small, self-contained, and innocuous. I further believe that in general leaf crate authors are not going to be able to judge the safety of such code, i.e., whether such code is trusted. Furthermore, they have to trust not only that the use of unsafe code is safe today, but that it will remain safe for evermore (unless they check every transitive dependency every time it is updated).

Next, unsafe code is not the only source of bugs. If you care so much about safety/quality that you would check every dependent crate for unsafe code, you should almost certainly be checking them all for logic bugs too. This proposal makes that less likely by giving a false sense of security that the code is safe if it passes these checks.

To put this another way, while this sounds superficially useful, I believe that it will encourage over-cautiousness where it is not warranted, and where caution is warranted this only makes a minor improvement (and possibly discourages best practices). This is security theatre.

11 Likes

I find this attitude surprising, given the potential scope for a vulnerability exposed by unsafe code. This sounds like the arguments against Rust that claim that since it doesn't fix logic errors then it's not worth the extra hassle just to address memory safety.

8 Likes

I think there is a subtle but important difference in the context here. Given a single crate, written from scratch, of course I think you should avoid unsafe code and when you have to use it, you have to look more closely at it. Thus it makes perfect sense for tools to bring your attention to it (e.g., Servo's Highfive). And if you give me a crate which you claim is well-written, audited, and tested and has a small amount of unsafe code, if I were doing my own audit I would pay close attention to the unsafe bits.

BUT if you give me two crates and you claim both are well-written, audited, and tested, one has some unsafe code and one doesn't, and I have to choose between them, then I think it would be wrong to pick the crate with no unsafe code without doing any further investigation.

That is kind of extreme, but it seems to be what is being encouraged here.

It's not what's being encouraged. What's being encouraged (by me, anyway) is a mechanism to surface dependencies (importantly, they may not be direct) where, if you choose to incorporate them, you may want to either audit them for soundness violations or just choose to trust the author (which may be totally reasonable too, in the same way that most people don't audit the Rust compiler, but just Trust The Process). Thankfully, that's fairly easy in the cases you pointed out: small, self-contained, innocuous.

This is a false dichotomy. While it's of course true that soundness/type violations are not the only source of bugs, a caller can guarantee very strong properties about isolation if they know the callee cannot violate type safety (given the right software architecture, of course). This is true in plenty of cases where the caller either doesn't care whether the functionality is correct, or can verify functionality well enough with testing (we can test expected cases, but testing for adversarial inputs is much harder, which is one reason type-safety is so useful in the first place).

There is a long history of research using precisely this kind of safety to isolate extensions (Spin, Singularity, Jif, all the IFC work in Haskell, etc) as well as practical examples in your exact area of expertise---browser-based JavaScript applications (the JavaScript API exposed through the DOM simply doesn't have an unsafe equivalent, but that's just a different mechanism).

As I mentioned in a previous response, I think a totally reasonable response to this is "get your hands out of my package manager, make your own if that's what you want" (though I'll be offended if you actually say it like that). Maybe you think that as well, but I think it's a very different argument than dismissing the utility of enforcing type-safety at the package level altogether.

1 Like

I can see that it is not what is intended to be encouraged, but I worry that it might be encouraged anyway :slight_smile:

This is a false dichotomy...

I think you are assuming this facility will only be used by experts, or at least people who understand these issues as well as you do. I worry about the 'gamification effect' where people want a 'no unsafe' 'badge' for the sake of it.

OK, so let me suggest an alternative, since I believe the goal here is reasonable but I have been negative about the suggested solution: I would like to see a tool implemented as a cargo plugin (i.e., you run it with cargo unsafe-audit or something) which looks at every crate (including deps) and gives you a nice JSON output of every unsafe block, function, and impl, with the crate, file, and line where it occurs (based on the AST, not grep, so we'd see through procedural macros, etc.). So with some nice frontend you can see which crates have no unsafe code and which have a little or a lot. So when you do an audit of your code, it makes it easy to find the unsafe code, but it makes it harder to make a naive 'unsafe bad, no unsafe good' judgement about a crate.
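A rough sketch of the per-occurrence record such a hypothetical cargo unsafe-audit tool might emit (the field names are illustrative; in practice the report would be serialized to JSON, e.g. via serde):

```rust
#[derive(Debug)]
enum UnsafeKind {
    Block,
    Function,
    Impl,
}

#[derive(Debug)]
struct UnsafeOccurrence {
    krate: String, // name of the (possibly transitive) dependency; `crate` is a keyword, hence `krate`
    file: String,  // source file within that crate
    line: u32,     // line where the `unsafe` token appears
    kind: UnsafeKind,
}

fn main() {
    // One example entry of the kind the report might contain.
    let entry = UnsafeOccurrence {
        krate: "some-dependency".to_string(),
        file: "src/lib.rs".to_string(),
        line: 42,
        kind: UnsafeKind::Block,
    };
    println!("{:?}", entry);
}
```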

7 Likes

To me this feels more likely to have negative side effects. It's hard to judge unsafe code by itself; you'd usually have to see it in its wider context to make sure it is safe, especially when it depends on some value or conditional that is outside the direct scope of the unsafe block.
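A small illustrative example (not from any real crate) of why the surrounding context matters: the unsafe block below is only sound because of a check that happens outside of it, so reading the block in isolation tells you little:

```rust
fn first_byte(data: &[u8]) -> Option<u8> {
    if data.is_empty() {
        return None; // the soundness of the unchecked read hinges on this check
    }
    // SAFETY: `data` is non-empty, so index 0 is in bounds.
    Some(unsafe { *data.get_unchecked(0) })
}
```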

I do find the original proposal appealing, because I still feel a lot more comfortable grasping safe Rust than unsafe Rust, and thus auditing it. There is (and probably always will be) a difference between the dangers sleeping in the safe and unsafe parts. I can understand things when it's just about using unsafe functions to achieve a goal that is aligned with the functions' definitions. But I wouldn't yet dare judging something dealing with direct memory access, inline assembler, and things on that level.

So for me the big plus would be noticing the switch from "safe" to "unsafe" in a dependency. Because right now (and I assume others will be at this point in the future as well), it is equivalent to "auditable for me" switching to "possibly need to defer judgement to others".

But I'll admit I would be more likely to trust a safe crate due to the above than an unsafe one.

1 Like

I'd say it's dependent on the use case.

If you have a security library, like maybe openssh or openssl, then almost every logic bug is automatically also a security bug. Similar for operating system kernels, and for web services. If you write a web server in Rust, you can expose the entire user database with logic bugs or allow people to log in as admin. There are many examples of logic bugs in websites affecting security, like this one. Safe Rust doesn't prevent that.

However there are other use cases for Rust. Think of media players or image viewers. These tools need to be able to handle untrusted data, and display it to the user. Here there is a big difference between logic bugs, which in the worst case give you a black/corrupted image, and memory safety bugs, which may give you a "your files have been encrypted" message. The only places where memory safety bugs can be introduced are inside the unsafe blocks, so it does make sense to minimize them, and if possible, provide an option to get rid of them completely.

As an example, think of the nautilus thumbnail bug which gained code execution privileges by exploiting memory unsafety, only from viewing a directory with a prepared file in the file explorer. There are definitely parts of a file explorer where you might have security-relevant logic bugs, but for displaying thumbnails, there is no real logic bug which is of relevance for security, at least not directly (you can use a wrong thumbnail to convince people that a file is something else than it really is, but that is a far less severe security issue than giving an attacker user-level execution privileges on your machine), while all memory safety bugs are relevant for security. If it had been all safe Rust, it wouldn't have happened. If it had been mostly safe Rust with unsafe components, it might have happened.

Sure, a malicious crate author might also add malicious code to an image decoding library, but that assumes actual malicious intent from the author, which is much less prevalent and can be recognized far more easily during review than susceptibility to making mistakes when writing unsafe code (which affects most people, sadly). If I wanted to hide a security vulnerability, I would do it in unsafe code, not in safe code.

Unsafe code also makes logic bugs worse. Sometimes unsafe code has conditions under which it works safely, and if those are violated, safety is as well.
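A hypothetical illustration of that point: if the index computation below contained an ordinary off-by-one logic bug, the unsafe block relying on it would turn that bug into undefined behavior rather than just a wrong answer:

```rust
fn last_element(v: &[u32]) -> Option<u32> {
    if v.is_empty() {
        return None;
    }
    let idx = v.len() - 1; // if this were `v.len()` (a plain logic bug),
                           // the unchecked read below would be undefined behavior
    // SAFETY: `idx` is the last valid index of a non-empty slice.
    Some(unsafe { *v.get_unchecked(idx) })
}
```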

To summarize, being cautious about which libraries are allowed to use unsafe is, for many use cases, by no means "security theater" but a legitimate strategy to minimize attack surface.

6 Likes

Stigmatizing unsafe is based on flawed premises that code using unsafe is not trustworthy, and that code without unsafe is safe.

The last critical security vulnerability in ImageMagick was caused by passing unsanitized arguments to command-line helpers, and such a vulnerability is also possible in Rust. Rust is not a sandbox.

2 Likes

Then why have the unsafe keyword at all, if it has no safety implications beyond normal code and there's no need to review unsafe code differently than normal code?

Code using unsafe is less trustworthy to me. Because I cannot trust it, because I don't have the experience to trust it.

I feel like there's a bit of a rift in the community about the safety/unsafety aspect of Rust. I've been reading statements similar to "as long as you don't use unsafe, you should not run into memory safety issues in Rust" multiple times in the past, together with statements about the awfulness of memory safety errors.

3 Likes

There are different levels of unsafe use. After reading about all the ways you can get burned using unsafe, it's quite clear to me that I don't want to use any crate written by a novice and using unsafe. But we hope standard library unsafe use is sound (although there have been cases in the past where that wasn't true). At the other extreme we have C code automatically converted to unsafe Rust, which is probably as bad as the original (or worse) in safety terms. So a "uses unsafe" label doesn't give enough information. It requires a human to evaluate how unsafe is used. A simple label doesn't really help.

2 Likes

@jimuazu But wouldn't the first part of your comment imply that a "doesn't use unsafe" mark would be helpful?

@phaylon, yes, along with a heuristic for detecting novices! (or automatically translated code).

Almost every crate uses standard library unsafe code directly or indirectly. So all crates should have a "uses unsafe" label. Avoiding that means 'blessing' certain blocks of unsafe code as trusted, e.g. all standard library code. So we're back to different levels of unsafe. If someone wants to go around auditing unsafe blocks and 'blessing' them, then we could have a "doesn't use any unaudited unsafe blocks" label. I think that's the best you could do. I would also like to avoid people getting an 'unsafe' allergy and ending up avoiding really good external crates that make very careful use of unsafe, and whose use has been well-audited.

But, as you said, the trust towards the standard library is different from the trust one has in any random crate. For example: I'm fine with std's Vec using unsafe. I'm fine with crates using std's Vec, but I would be wary of a crate implementing their own Vec using unsafe.

So, Rayon's an external crate. It uses plenty of unsafe. But it's written by someone who knows his stuff. Are you going to bless that too? And then what else?

In the end we end up needing some authority to go around auditing things to make a simple label actually meaningful. Otherwise we need something much more detailed than a label, like a report of all the unsafe use in a crate and its dependencies that the developer needs to read over and assess for themselves, like @nrc suggested.

2 Likes

Yes, I'd make the decision, specifically for the rayon crate, that I have heightened trust in it and am fine with it doing unsafe things. This is based on: the fact that I have heightened trust in the developers; that rayon is a well-known crate with many eyes on it; and that it is widespread enough that if there is an issue with unsafe that I run into, there's a good chance I'll find someone who'll be able to help. These things aren't true for me for any random crate, and it's an individual per-crate decision.

I actually do check out code from crates I'm interested in to see if there's any unsafety going on. So in a sense, I'm already doing what a flag would be helping me with. I'm just doing it manually all the time, probably missing things, and likely to not notice if a crate that previously did mundane things suddenly starts using unsafe.

I did not say that. I mean that code using unsafe may be trustworthy. Mere presence of unsafe doesn't mean a crate is unsafe, and absence of unsafe doesn't mean a crate is safe, so banning the unsafe keyword is a very blunt auditing tool that will have false positives and false negatives.

1 Like