Crate capability lists

You need #[no_mangle], but if I add that…

[1]    39916 segmentation fault  cargo run
Process 39946 stopped
* thread #1, queue = '', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)

…fun times.
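For context, a minimal sketch of what #[no_mangle] does (this is not the original crashing code, which defined a clashing static): it exports the item under exactly the name written in the source, skipping Rust's usual symbol mangling, so two crates exporting the same name can collide at link time or silently resolve to whichever symbol the linker picks first.

```rust
// Minimal illustration (not the original crash): #[no_mangle] exports
// this function under exactly the symbol name `init`, with no mangling.
// A second crate also exporting `init` could then clash with it.
#[no_mangle]
pub extern "C" fn init() -> i32 {
    42
}

fn main() {
    // From Rust's point of view it is still an ordinary function call.
    println!("{}", init());
}
```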


Anyone happen to know why the symbol clash above doesn’t cause a linker error?

One way to deal with that in the context of my unsafe-features proposal: require #[cfg_attr(unsafe_feature = "x", no_mangle)] in order to disable name mangling (i.e. all no_mangles must be tied to a corresponding unsafe-feature)
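A sketch of that gating, using today's ordinary Cargo feature cfg as a stand-in for the proposed (and hypothetical) unsafe_feature predicate; the feature name here is made up:

```rust
// Hypothetical gating sketch: `no_mangle` is only applied when the
// (made-up) "raw-symbols" feature is enabled. Without it, the symbol
// stays mangled and cannot clash with symbols from other crates.
#[cfg_attr(feature = "raw-symbols", no_mangle)]
pub extern "C" fn crate_init() -> u32 {
    1
}

fn main() {
    // The Rust-level call is unaffected by whether the attribute fires.
    println!("{}", crate_init());
}
```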

It’s not an error to have repeated symbols; the linker will just take the first one it finds, at least on Unix.

Isn’t that static in a no-exec mapping though? I expect that’s the reason for your segfault, even if you did put valid machine code in the byte array.

there’s probably an attribute to make it executable

I think that depends on a specific symbol resolution strategy, as I’ve certainly dealt with my fair share of “ld: duplicate symbol […]” errors in the past

I suspect this doesn’t work in practice. My impression is that a very large portion of unsafe code is either for FFI (where turning it off isn’t even an option) or for optimization (which you would typically want on by default).

In particular, there’s a very strong risk of any audit framework/tooling like this unintentionally leading to crate authors being discouraged from using any unsafe code for optimization, even when the soundness of that unsafe code is uncontroversial. In past discussions this concern has sometimes been expressed as “demonising unsafe code”. Unfortunately I have no good ideas on how to prevent this, but I do think that we’d be causing more harm than good if we introduced a system that did have this problem in practice (if nothing else, it risks encouraging the idea that security is at odds with performance).

Pedantic but important: We’d want to check the entire module for changes, for every module containing at least one unsafe.

Yes, I think you are right, but that approach has a granularity that is not practical in my opinion.

If the user has to whitelist all unsafe blocks in a transitive way, which means that everything that uses Vec or String or similar safe abstractions is affected, then I think that would affect far too much code that cannot really cause harm. I think my approach is better because we only consider the top-level public interface of std and assume that std’s internal structure is irrelevant, safe and harmless.

If there is then we can check for that attribute in the checker. I think that’s okay.

Care to elaborate on why you think the module is the right boundary? I thought about this and I think it depends on the unsafe block. Each unsafe block is supposed to be wrapped in a safe abstraction, which can be a module or a single function. I think the tool is supposed to check whether the code that provides the safe abstraction has changed. The safe abstraction is supposed to be written in a way that its users cannot use it incorrectly, so the initial audit has to check that this assumption is true; the change checker then ‘only’ has to check that all the code that provides the safe abstraction is still unchanged. I don’t think it is possible to figure out the boundary of a safe abstraction automatically, so the initial auditor should probably provide it as part of the audit result.

It is true that const fn cannot do IO (at the moment) but the same is true for a function that does not call any external function or IO functions from std. I don’t see what we gain with the const fn restriction in this situation.

Oh, I thought this was an uncontroversial point the community already had a clear consensus on.

Let’s make it a bit more precise. The soundness of an unsafe code block is at most affected by the entire module it appears in, and there’s no way to automatically determine when something smaller like a single function happens to be sufficient.
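A small illustration of that point, with made-up types: the unsafe block below is sound only because of a module-private invariant, and a perfectly safe edit anywhere else in the module can silently break it.

```rust
// Sketch: a tiny "safe abstraction" whose soundness rests on a
// module-private invariant, showing why the audit boundary is the
// whole module rather than just the unsafe block.
mod even_len {
    pub struct Pairs {
        data: Vec<u32>, // invariant: data.len() is always even
    }

    impl Pairs {
        pub fn new() -> Self {
            Pairs { data: Vec::new() }
        }

        pub fn push_pair(&mut self, a: u32, b: u32) {
            self.data.push(a);
            self.data.push(b); // a safe change here could break the invariant
        }

        pub fn first_of_last_pair(&self) -> Option<u32> {
            let n = self.data.len();
            if n == 0 {
                return None;
            }
            // SAFETY: the module invariant says n is even, so n != 0
            // implies n >= 2 and n - 2 is in bounds. Adding any safe
            // method that pushes a single element would make this UB.
            Some(unsafe { *self.data.get_unchecked(n - 2) })
        }
    }
}

fn main() {
    let mut p = even_len::Pairs::new();
    p.push_pair(7, 8);
    println!("{:?}", p.first_of_last_pair());
}
```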

So yes, we could have a feature where the initial auditor states there’s no need to re-audit the whole module. But I doubt it’d be worthwhile in practice because

  • unsafe code authors probably are (and should be) keeping the modules with unsafe code as small as feasible to make this stuff easier to reason about anyway,
  • the kind of unsafe code that is clearly sound without checking the whole module is typically a good candidate for moving into a separate crate like std or rayon or pin_utils or bytes or whatever, and
  • auditing a module with unsafe code against all possible future changes to the safe parts is a much bigger ask than auditing the current state of the module (is that even possible for any real-world examples?)

Forbidding IO by requiring const fn is a more bullet-proof and tractable analysis that won’t go wrong, compared to the complexity of forbidding IO by other means. This is about as conservative as you can get (save for preventing non-termination attacks), and lots of algorithms shouldn’t require more than const fn.

compile-time IO should be a thing.

Yes, it is more bulletproof and simpler, but I think it is still too big a restriction. If we cannot implement the more complex version of the checker then that’s a big problem. The way I see it, either it is possible to implement this checker and we should do it, or it’s literally not possible (for reasons) and we are doomed to build a system on human auditors and trust networks. I’ll be honest though: I am not perfectly familiar with the exact limitations of const fns (aren’t they constantly changing?), but there is no way that most crates can be implemented in const fns. Do we have statistics about this? I don’t know what the average crate is using. Are most of them wrappers around unsafe stuff? Maybe we should start by gathering statistics. Maybe this idea is infeasible because the average crate uses at least one unsafe block and a C library, so there’s nothing to gain from an automatic checker.

Basically, everything that isn’t a) unsafe, b) random, or c) environment-dependent should eventually be able to be const.

const is roughly a conservative determinism guarantee. If it can be computed at compile time and be guaranteed to have the exact same result as runtime, it can (theoretically) be const. (This eliminates anything that touches the file system, @Soni, as the file system can be different at compile and run time.)

const is similar to unsafe in that way; the restriction of const is that you can only call other const things. Even allocation is theoretically const-safe, at least if only immutable references “escape”.

For now it’s very conservative. But hopefully, if you can do it in pure Haskell, you can do it in const Rust.
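For what it’s worth, plenty of pure algorithms already fit into const fn today (assuming a reasonably recent compiler, since the set of allowed constructs keeps growing); a minimal example:

```rust
// A pure, deterministic algorithm expressed as a const fn: it can be
// evaluated at compile time, where no IO or environment access is
// possible by construction.
const fn gcd(mut a: u64, mut b: u64) -> u64 {
    while b != 0 {
        let t = a % b;
        a = b;
        b = t;
    }
    a
}

// Forces compile-time evaluation.
const G: u64 = gcd(48, 18);

fn main() {
    println!("{}", G);
}
```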

I think if this proposal were fleshed out a little more, we could reconcile these concerns. Namely, I suggested unsafe-features would be off-by-default in my proposal. What if that were only the case if allow-unsafe = false or unsafe-features were present for that crate, and if they weren’t, all unsafe-features were on-by-default? (or perhaps default unsafe-features would only be honored in the event that allow-unsafe = false or unsafe-features weren’t in use)

Furthermore, since this proposal is built on conditional compilation, crate authors could even include a safe fallback for users of allow-unsafe, with the option to opt in to the additional performance using unsafe-features. I think this approach could give consumers of crates choices around security vs. performance:

  • Nothing would change for projects which don’t choose to make use of the allow-unsafe or unsafe-features attributes. They’d automatically get opted into the unsafe performance optimizations.
  • Users of allow-unsafe could get a less performant, safe fallback.
  • If the additional performance is desired, unsafe-features can be used instead of allow-unsafe to opt-in to the unsafe optimization.

I imagine things like [patch] could even be used to globally enable certain unsafe-features for all crates.
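A sketch of that fallback pattern, again using an ordinary Cargo feature (the name "fast-sum" is made up) in place of the proposed unsafe-features mechanism:

```rust
// Unsafe fast path behind a feature flag, with a safe fallback that
// computes the same result when the flag is off.

#[cfg(feature = "fast-sum")]
fn sum(xs: &[u32]) -> u32 {
    let mut total = 0u32;
    let mut i = 0;
    while i < xs.len() {
        // SAFETY: i < xs.len() is guaranteed by the loop condition.
        total = total.wrapping_add(unsafe { *xs.get_unchecked(i) });
        i += 1;
    }
    total
}

#[cfg(not(feature = "fast-sum"))]
fn sum(xs: &[u32]) -> u32 {
    // Safe fallback: same result, bounds checks left in place.
    xs.iter().copied().fold(0u32, u32::wrapping_add)
}

fn main() {
    println!("{}", sum(&[1, 2, 3, 4]));
}
```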

Overall I would expect a feature like this isn’t commonly used, and would require transitive adoption and support on a crate-by-crate basis in a dependency hierarchy. In that regard I think it’s a bit like #![no_std]: a feature which is extremely useful to a subset of the community (people using Rust in any sort of “high assurance” capacity) which can be largely ignored by the rest.

I discussed this in the “Tough questions” section of my proposal, and suggested that std could be allowed to “bless” certain unsafe code which does not provide access to ambient authority or cause potentially security-critical side effects.

Though I didn’t explicitly call it out as such, I think allocators would belong to this category.

I don’t think the allow-unsafe = false approach or lists of crate capabilities will be practical on their own, mostly because of their transitive nature. I believe we need a review infrastructure, so that we can configure our builds with conditions like: “allow unsafe only in crate versions reviewed by group X”, “allow network/file IO only for crate versions reviewed by Y or whitelisted by me”, etc. This configuration should probably be local, i.e. it would not influence the build process of downstream crates.
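To make that concrete, something like the following hypothetical manifest section could express such conditions locally (none of these keys exist in Cargo today; the syntax is purely illustrative):

```toml
# Hypothetical, illustration only: no such [trust] section exists in Cargo.
[trust]
# Allow unsafe code only in crate versions reviewed by group X.
allow-unsafe = ["reviewed-by:X"]
# Allow network/file IO only for versions reviewed by Y or whitelisted here.
allow-network-io = ["reviewed-by:Y", "whitelist:local"]
allow-file-io = ["reviewed-by:Y", "whitelist:local"]
```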


Doesn’t this essentially mean that group X is the maintainer of the project? If I simply do not use code produced by the original maintainer then group X is the maintainer, at least for me. I don’t think this approach scales either.

No, X can be an organization that reviews crates across the whole ecosystem (e.g. by selecting the most important crates which use unsafe). The main task for people in this organization would be reviewing published crates, not writing code or fixing issues, and they would not have any crate publishing rights, so it’s a bit strange to call them “maintainers”. If, for example, a maintainer publishes a patch update, cargo update will not switch to it until it has been reviewed by X.

The only solution that will actually work must be tool-driven and automatic; otherwise someone has to manually check every single version of every crate. I think it is the same deal as the memory safety issues Rust is supposed to solve: we thought for a while that we could solve them by simply asking a large group of people to review everything, and it didn’t seem to work. Of course the automatic tool cannot possibly tell which change is harmless and which is tricky, but it should be able to recognize the trivially, provably safe ones so that only the tricky changes must be manually reviewed, similar to the case of the borrow checker.

This idea (in broad strokes) came up on the kickoff calls we had for the Secure Code WG. I opened an issue about it here:


I don’t want to hijack, but since you’re considering tracking a bunch of package metadata, it seems like one of the ‘capabilities’ you might want to track is the software license.

In effect, if you use a package with a more restrictive (viral) license than you presently have, or a dependency switches to a more restrictive license, people who use Rust for production software would probably want to know that.

It’s possible this should be a separate item, it just triggered my memory as I was reading the proposal and the notion of ‘trust’ above.