Safe Library Imports


#1

I want to be able to import libraries and know that they cannot access files or the network or use unsafe code. In particular, I want this for codecs, so that I can use them without having to audit the code. If I import a png library, I want to know it can only access the data I give it, do some computation and give data back to me. I can take care of network and file access for it.

It could be a new keyword or keyword-alike:

safe mod rustpng

Ideally, there’d be a known-safe subset of the standard library and safe-mods could only import other safe-mods and the safe subset of the standard library. It would give me a lot of peace of mind around the safety of the programs I create that happen to read/write media files, knowing that they couldn’t possibly have design flaws that could escalate to system access. It also seems to fit the core Reliability/safety tenet of Rust’s goals.

In the longer term, you may want to let them import other whitelisted mods as well, but that could be a future enhancement.


#2

Enforcing this seems spectacularly hard. You have two choices for implementation:

  • Anything that transitively calls a banned syscall is a compile error (bad: implementation changes become breaking changes).
  • Anything such a module calls must recursively be marked as such, leading to spectacular churn: either every function that is safe must be marked as such (almost all of them), or we introduce a breaking change where every function must publish whether it does IO.

This exists: you can apply #[forbid(unsafe_code)] to completely ban the unsafe keyword. The only part you’re missing is being able to enforce "I will use this library only if it is marked #[forbid(unsafe_code)]".


#3

What you want would indeed be nice to have, but I’d argue it doesn’t need to (and thus shouldn’t) be a keyword; since this property is literally an attribute of the unit of code (module, crate, whatever it is) you would be using, why couldn’t it be an attribute?

Thinking about it more, I think the current #![deny/forbid(unsafe_code)] attribute is already a pretty clear marker; what would be useful (and more orthogonal) is a new attribute on extern crate and/or use declarations that transitively errors on the import of any code that does not contain #![forbid(unsafe_code)].


#4

Ideas like this come up semi-regularly. My current understanding is that:

Most recent threads about this seem to be focused on sandboxing build scripts in particular, though from what little I know the issues there seem very similar to runtime sandboxing.

The Rust Secure Code Working Group has seemingly active issues on Build-time sandboxing, Safety-oriented static analysis tooling, Reduce the use of unsafe in the ecosystem and so on, which imo are more cost-effective/less ecosystem-splitting solutions for improving our overall security than a language-level feature.


#5

See also:


#6

We have talked here before about adding an Effect system to Rust – basically, transitive restrictions at a granular level (such as individual functions) on things like:

  • io
  • allocation
  • syscalls
  • panicking
  • unsafe

It would be nice to have for safety and reliability concerns but would also require a major effort.


#7

My goodness, that’s quite a lot to read. I’m working on it. Also, there are so many tangential sources of confusion that the debate gets sidetracked onto:

  1. There seems to be some confusion between posix-style ambient capabilities and object capabilities, which doesn’t help. With object capabilities, you can call code that can access files (say, for logging) from code that cannot, and still know it to be safe. Posix capabilities instead seemingly dictate what a thread/process can do, independent of execution path, effectively preventing actions.

  2. Some seem to think that the unsafe blocks in the standard library should also be transitively inaccessible, but that doesn’t really follow. When you’re trying to secure a system, there is some code you trust (like vectors in the standard library) even if they do use unsafe blocks, and some that you don’t.

  3. Some seem to say that it cannot be perfect and hence shouldn’t be done, but that same argument says we should give up on memory safety, since there may always be vulnerabilities in code generation. The existence of bugs and vulnerabilities should not prevent us from tightening security. Similarly, network and file access are very different things, even if one can escalate to another in many cases.

  4. Some say that such features may engender a false sense of security, for multiple reasons, but the same could again be said for memory safety in general. There’s benefit in improving security, even if we can’t reach perfection.

  5. Some want it to be part of a code review and static analysis system, but the right few language and cargo features would make that work unnecessary in many cases. I don’t want to have to find a trusted reviewer just to use a png library when a few compile-time booleans would be enough.

  6. Sandboxing is related, but is generally aimed at allowing the running of code that does unsafe things and then preventing it from doing them, similar to posix capabilities. In this case, we’re talking about preventing a system that would break the sandbox from being compiled, moving a runtime concern to a compile-time one, a la static typing.

  7. For highest security, even access to the system clock, random numbers, threading libraries and other sources of non-determinism and side-channel amplifiers would want to be controlled, but that’s definitely out of scope in this discussion.

I hope that a simple, desirable and achievable motivating case might keep this discussion on track. But I’ve got a lot of reading to do and maybe threads to comment on.

I do agree that an attribute could be used instead of a new keyword.


#8

If the goal here is restricted to “limit the things I need to code review before using”, there might be a way to have crates automatically (and thus trustedly) marked “low-risk”, for some simplistic definition. For example, if something only uses core and other low-risk crates and doesn’t use unsafe, it’s “low-risk”. (Locally there’d probably want to be some allow-list of crates that are manually “trusted” that they could also use, or something, but that wouldn’t give them the badge on shared registries.)

That’s not at all a 99% solution, but it might be more understandable and still very helpful as only a 75% solution.

Preemptively adding the usual complaint: This demonizes unsafe, which we don’t want to do.


#9

I like the idea of being able to label a function as “pure” = operates only on the inputs, uses no unsafe, and calls only pure functions.

A version update would not be considered compatible with the previous version’s API if it lost the “pure” mark.

This has problems too, as it might make useful modules seem like second-class citizens if they couldn’t be marked pure…

But a part that might make this work would be “pure(trusts: Rc)”, or, if your code uses unsafe: “pure(trusts: self)”.

It still seems dodgy, but if the trusted library/module is one you already use, that would enable some safety.

This may be slow to implement/get implemented in crates but it wouldn’t be a breaking change, only something people voluntarily sign up for.


#10

demonizes unsafe, which we don’t want to do

This is something I disagree with, but I’m both visibly biased (and hence undermined) by my feature request and not a core contributor. I think unsafe is dangerous and should be discouraged. I think it’s a rare developer that can use it safely, and it bypasses the runtime guarantees that Rust seems designed for. I support the ongoing efforts to reduce its usage by identifying places where it is used unnecessarily.


#11

That’s not really relevant to the issue of trust. If you trust the code, you don’t need to review it anyway. If you don’t trust it, then if there’s any possibility of an escape hatch, then you have to review it anyway. A best effort doesn’t mean anything when you’re trying to build a system to protect yourself from malicious actors. If all you want is convenience, just do a grep on the files and see if they call any file access routines.

Memory safety is not meant to absolve you from doing code reviews. It’s meant to make it easier to review the code.


#12

For what you want, your best bet is probably to port the Rust code you want to use so that it compiles to WebAssembly, and then run it within a well-constrained WebAssembly runtime. WebAssembly was designed for efficient operation within browser sandboxes, and given the kinds of things that are expected to run within browsers, that’s a high standard.


#13

That’s a high price to pay just to make sure some random library I decided to use does not steal my private keys. There must be a cheaper solution.


#14

I’ve now read the other threads, where they talk about similar ideas. There are these extra confusions (beyond my earlier list, which still applies):

  1. Some suggest grepping the source. This clearly will not catch even mildly obfuscated access to the system libraries and unsafe code.

  2. Purity. While pure functions, if they can be known to be such, can indeed not affect the system, they are also more restrictive than we need. Particular kinds of system access, like logging, would be desirable.

  3. Anti-pattern spotting (including crev). While anti-patterns can be problematic, and spotting them can help fix things, a malicious party would simply check their code isn’t impacted by it before release.

  4. Some imply that a solution that doesn’t cover soundness bugs in the compiler isn’t worth using. I’m lost for words on that one.

  5. Some say that since logic errors can have security implications, pointing at unsafe code as a problem is misleading. But logic errors can’t break the type system and other language guarantees in the same way unsafe can.

  6. We can’t just use WebAssembly. It’s significantly slower, doesn’t interface cleanly with other runtimes and is distinctly focused on particular kinds of constrained processing. I want this language to be secure in and of itself.

There are these criticisms that require further discussion:

  1. Demonizing unsafe. There seem to be two schools of thought on unsafe. Some think use of it is not at all a problem and shouldn’t be discouraged. Some think use of it is a problem and should be discouraged. As unsafe breaks all type system guarantees, I’m firmly in the latter camp. Could it be useful to open a new thread on this? I’m not “demonizing” unsafe, and feel a discussion as to what the risks and benefits are could help us all.

  2. Fragmentation of crates. Some believe we’ll end up splitting the Rust crate world in two: on one side, those who care about this kind of security and are overly selective; on the other, those who are happy with the status quo and ignore the new features. This could be a significant issue. I do believe the community would actually standardise on the secure-by-default side, in the end, but it could cause pain in the short to medium term.

  3. What about those who do use unsafe for performance reasons? We can either explicitly trust them on import or we can ask them to include a compile-time selection between fast and not-unsafe code. I think it can benefit code understandability to have a working non-unsafe version to peruse, anyway. But this is indeed a good open question.

And I’m honestly generally getting a feeling that any proposer has a lot of emotional labour to do (less so in this thread, yet, but more so in the others). The standard for discussion on these topics seems to be to shoot them down with mantras “no demonizing unsafe”, “false sense of security”, “not the real problem”, “static analysis is the answer”, “trust audits are the answer” and “it will cause extra churn”. And the proposer gives up. It’s just not a facilitative conversation.

The end result will be suffering the same malicious crates and inadvertent security flaws that many other languages suffer, when we could avoid them without unnecessary cost. It seems like an especially odd approach, given that a large part of Rust’s value proposition is that it’s more secure than C/C++.

These solutions that are being offered aren’t just out of nowhere. They have worked for other languages and runtimes, to varying extent, and they did not hit these issues. If you can get static assurances from the type and build system, you don’t need trust audits and further static analysis (for this purpose). If you know that some code doesn’t use any system-access modules or unsafe code, you know it simply cannot access the system. Logic errors are irrelevant if all you’re doing is passing byte arrays back and forth (and don’t mind risk of infinite loops or memory exhaustion).


#15

Sorry, it seems I shouldn’t have used “demonize”. I don’t actually agree with the complaint; I just wanted to include it since I know it’s a common response (though if argued for real probably done with a better word).

I think there’s a good middle ground where unsafe is encouraged to be separated into a small crate for separate, easier review and crates are not themselves penalized for using a crate that exposes those abstractions.


#16

Rust has one very simple reason for not including an effect system: the strangeness budget. Getting people over the borrowck hump is hard enough; teaching them to use an effect system is also very hard; overcoming both of them would be exponentially more difficult, since they’d have to learn both of them at once.

The thing that always annoys me about discussions like this, as you’ve already said, is that it’s very easy to fire off shots in it without very much investment, without necessarily arguing in good faith, and while refusing to admit to their real motive. So I’ll put it out there right now:

I think Rust should sacrifice infosec in this case in the name of expediency. The concept of the “weirdness budget” articulates why I think it’s expedient; not having an effect system means that nobody has to be taught how to use it. Not having an effect system is essentially catering to the lowest common denominator, since it still leaves open language-agnostic opportunities for supply chain management for those who really need it.


#17

The original request in this thread was about a PNG library. Speed of PNG decoding and encoding is almost entirely dependent on the gzip implementation, which in turn greatly benefits from optimized memory copies (unsafe) and SIMD (unsafe).

I’ve ported lodepng from C to Rust, and its slice-ified deflate is 2 to 10 times slower than best (unsafe) deflate implementations.

So “just don’t do funny stuff” type of import would be very unsatisfying. You’d probably want to allow the PNG library to use unsafe zlib.


#18

Rust has one very simple reason for not including an effect system: the strangeness budget

A fair point. I should note that I think my proposal is a lot simpler than an effect system, both in effort and strangeness, but the strangeness may still be higher than you’d like. And the performance of safe Rust for file parsing is much worse than I expected, making the cost higher.

I think Rust should sacrifice infosec in this case in the name of expediency.

I can appreciate this point of view; there are all sorts of budget in language design.

The simplest variants of this could be voluntarily followed (as scottmcm noted) and ignored by the greater community (as it involves flagged imports and compile time conditionals and could use an existing no_unsafe feature), which I think makes it less of a sacrifice.

It still feels a little like batting it down without due consideration, especially as you describe it as an effect system, which it is only superficially similar to. But I do understand my bias as the proposer might always make it seem that way.

Would a complete worked example help, do you think? (as a github fork of a crate and some documentation)


#19

That is worse than I expected and may indeed be too much of a performance drop for safety for most people. (Edited in:) One could trust a gzip library to use unsafe code without trusting a png library that uses it to use unsafe code, but that’s a big ask.

Yeah, it would need to be finer-grained to avoid this problem. You’d then hypothetically want to use a flag to only allow particular non-core imports (or none) but still allow unsafe code. It would still protect against programmer error, to some extent, but not against malicious intent.

(Meta: I only now discover how to quote someone properly. The quote button in the toolbar should perhaps give a helpful hint, in the preview pane, about selecting text in the page.)


#21

It has been pointed out that Effects keep coming up because my description may lead people in that direction. What I’m actually talking about is more like module-scoped capabilities. It’s perhaps best described in sequence.

Imagine you’ve got some Rust code in which you cannot import any libraries at all and cannot use unsafe. With such code, as long as Rust itself is not broken, you cannot do anything but compute (and allocate supporting memory). There’s no file access, network access or system access of any kind. But there’s also no vector classes, no mathematical operations beyond the basic operators, and a number of other missing features.

There is a subset of the standard library that provides vector classes, mathematical operations and other things. These provisions still don’t allow file, network and runtime-bypass abilities and so are safe for anyone to use, even if some users are malicious.

Then there are parts of the standard library (and unsafe) that do provide malicious or exploited actors with a lot, like the aforementioned abilities to access files, access the network and bypass runtime safety.

If you could specify that some library only had access to the safe subset that could work in isolation from an OS, and whatever additional libraries you whitelist, then you would know it could not either drain your bitcoin wallets or accidentally allow others to do so.

Note that allowing access to some library that can access files (like a timezone library accessing the tzinfo file or a logging library outputting to a known location) is fine. As long as Rust’s type system is working, that file access can still work, even while the caller has no other way to access files. Similarly, you can provide a png library with callbacks that grab data from a file, without it being able to access files itself.

All of the access to libraries and similar is through mod and therefore, I’m suggesting restriction attributes on mod that would restrict what further sub-modules a module can load, with an easy way of saying ‘only the known-safe ones’. So your main module could import what it likes, but it could restrict sub-modules to importing only specific things. The current Rust type system would take care of the rest.
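As a purely hypothetical sketch of that surface syntax (the restrict attribute and the safe_core name are both invented here; nothing like this exists today):

```
// Hypothetical syntax -- none of this compiles today.
#[restrict(imports = "safe_core")]           // only the known-safe std subset
mod rustpng;

#[restrict(imports = ["safe_core", "zlib"])] // plus one explicitly trusted crate
mod rustpng_fast;
```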

It’s worth noting that #![no_std] actually goes some way towards this, but eliminates a number of useful base libraries. Splitting std exports just a little could facilitate both better embedded programming and this proposed new feature.

Those who don’t care about any of this could just continue importing all of std and carry on as normal. It only restricts those who choose to have the extra safety.