RFC: use `--cfg reduce_unsafe` to signal a preference for safe code over performance

Yeah, it's a fine idea. I've added them in csv 1.1.3.

2 Likes

Also, you said "613ns to 893ns slower", but relative to what? Is that like 5% slower or 100% slower?

A better, more specific way to phrase it would be:

"TinyVec had runtimes for that set of benchmark functions ranging from 613ns to 893ns, while staticvec, smallvec, and arrayvec averaged ~14ns for all but one of the functions (which averaged ~230ns)."

So depending on exactly how you calculate it, and in what terms, it was anywhere from roughly 300% to more than 6000% slower.

2 Likes

@SlightlyOutOfPhase Can you post more about that benchmark work over here? https://github.com/Lokathor/tinyvec/issues/39

This post is going to be rather long, because I'll explain how I view and treat unsafe in Rust (both in my code and in third-party crates that I use).

These are my general principles (which I've always used); they don't refer to any particular crate, author, or event.

When I am writing Rust code, my first instinct is to just always write it in Safe Rust, completely avoiding unsafe.

This tends to result in clean, readable, maintainable, correct, sound, robust, fast code.

The idea that unsafe == fast is often simply incorrect. I suspect people believe it because they come from other languages, where unsafe does usually result in faster code. But Rust is different.

Rust and LLVM are amazing. Safe Rust is usually optimal in speed, so using unsafe does not actually give any speed benefits.

In the rare cases where I must use unsafe (such as with FFI or Pin), I try to use unsafe as little as possible, and I very carefully verify that the code is sound.

In the even rarer cases where the Safe Rust code isn't fast enough, 99% of the time the solution is to use a different algorithm or data structure, not to use unsafe.

In the extremely extremely rare situations where Safe Rust code isn't fast enough, and there aren't any other algorithms or data structures I can use, and unsafe actually gives a speed boost, then I selectively put in unsafe to make it faster, and carefully verify that it is sound.

I assume that the third-party crates I use also follow this philosophy. I have no problem with crates using unsafe, because I assume that they have verified that the unsafe code is sound. It is completely 100% safe to use unsafe, as long as it has been verified to be sound.

And if I ever find unsound unsafe code in a third-party library, then I will not use that library until the unsoundness has been fixed. This is because undefined behavior is "infectious": it can invalidate all of your code, not just the crate that contains the unsoundness. And undefined behavior can even break your code at compile-time! It's a lot more serious than just buffer overflows (though those can happen too).

So how does that relate to this proposal?

  • If a crate is using unsafe because they must use unsafe (e.g. FFI), then this proposal does not help at all.

  • If a crate is using unsafe for performance reasons, presumably it's because unsafe actually does give a performance boost and they have verified the soundness of the code (which is mandatory when using unsafe, since undefined behavior is not allowed in Safe Rust).

    So in that case this proposal doesn't help at all, because the unsafe code is already proven sound, so the safe code isn't actually any safer, it's just slower for no reason.

  • If a crate author accidentally writes unsound unsafe code, then the solution is to politely point it out to them so that they can fix the unsoundness.

    This proposal doesn't help with that, since the unsafe branch would still be unsound. The solution is to just fix the unsoundness.

  • If a crate author intentionally writes unsound unsafe code, and refuses to fix it, then this proposal won't help, since the proposal requires opt-in from the author.

    If an author doesn't care about unsoundness, why would they care about using this proposal?

The only way I can see this proposal helping is in the following situation: a crate author intentionally puts in unsound unsafe code which uses undefined behavior in Safe Rust to increase performance (even though this causes catastrophic problems), but the author is willing to have a compile-time switch which enables the "safe but slower" code path.

I don't think we should be catering to that situation, because undefined behavior is simply not allowed in Safe Rust, under any circumstances, because of how catastrophic it is.

Disallowing undefined behavior is the core reason for Rust's very existence. It is the thing that separates Rust from other languages. If we encourage undefined behavior, why even bother with Rust? Just use C++ instead.

And in any case, the author can already do that (without this proposal) by simply using a Cargo feature to enable the "slow" path.
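For instance, a consumer can already express that opt-in today in its `Cargo.toml` (a hedged sketch only; `some_crate` and its `slow_safe` feature are made-up names, not from any real library):

```toml
# Hypothetical consumer Cargo.toml: disable the default (possibly unsafe)
# backend and opt into the "safe but slower" one via a plain Cargo feature.
[dependencies]
some_crate = { version = "1", default-features = false, features = ["slow_safe"] }
```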

I'm all in favor of giving tools to authors to help them prove that unsafe code is sound. And I'm all in favor of giving people tools to ban unsafe from their projects. But I don't think this proposal really does either of those things. Instead, something like crev or geiger seems more useful.

9 Likes

This proposal is not for all cases of unsafe code. It is intended only for cases where both a safe and an unsafe implementation can exist, and the unsafe code is believed to be sound and is measurably faster.

Any claims above that "unsafe is faster" are post-analysis: the code has been demonstrated to be faster for practical inputs.

SIMD and other hardware-accelerated backends are usually gated behind Cargo features or CPU targets, with safe-only backends as fallbacks. This is already good.

FFI may be necessary and may not have a safe fallback without rewriting it in Rust (RIIR). This proposal doesn't affect that either.

Offering or asking for safer fallbacks does not imply the unsafe code is unsound. It only allows consumers to decide if the unsafety adds value for their application.

If we ignore the proposed `--cfg` flag, the two primary points I want to keep are:

  1. Use `debug_assert!` to check invariants when they can be checked. This is done, but not enough IMO.

  2. If both a safe and an unsafe implementation can exist with the same interface, use a feature flag to decide between them, as we do with SIMD backends (which may also use `#[cfg(target_feature)]`). The unsafe backend may be enabled by default and disabled with `default-features = false`.

(1) promotes testing and (2) promotes benchmarking and choice.

Yes, I am aware of that. But the question is: when is that actually useful?

If the goal is to make it possible to use code without unsafe, what is the motivation for that?

Even if your dependencies don't have any unsafe at all, you must still do full code audits of the entire crate to verify that they don't contain any malicious code or serious bugs (because malicious code can be done entirely in Safe Rust without unsafe). So in that case the solution is something like crev.

If your goal is to ban unsafe (completely or selectively), that's what things like geiger are for.

If your goal is to make it easier for the library author to verify that their unsafe code is sound by comparing it to equivalent safe code (e.g. in unit tests or fuzzers), I think that's a great idea, but this proposal doesn't do that. I think a feature like that would be really cool, but it should be fleshed out and probably made more general.

So could you please explain the motivation for this? What do you gain from this feature?

Also, I noticed you said this:

However, I don't really understand that. If a library crate has a slow feature, and the bin crate enables that feature, then it will be completely enabled for that library, no matter how deep the library is in the dependency graph. So it really does behave exactly the same as `--cfg reduce_unsafe`.

2 Likes

when is that actually useful?

When the unsafe variant has value, but that value may not apply to all consumers.

If the goal is to make it possible to use code without unsafe, what is the motivation for that?

Less or no unsafe code is easier to review. If the code is all safe, then I don't need to be one of the tiny subset of auditors who understand the nuanced details that determine soundness. Neither I nor every consumer of each crate should need to be in that set. Yes, there are cases where unsafe code is necessary. Yes, I may need to trust others. Yes, crev does help.

I recognize unsafety isn't the only issue, but it is the most prominent accidental one. Safe Rust restricts the options for faulty code, whether that's simple off-by-one errors, subtle arithmetic faults, actively malicious uses of `std::{os,net,fs}`, or a `#[no_mangle] extern "C" fn some_libc_function()`. The former are the hardest to notice, but depending on the application they can usually be caught by decent property tests, fuzzing, and coverage analysis. Malicious code, though, may actively evade tests.

If a library crate has a slow feature, and the bin crate enables that feature, then it will be completely enabled for that library, no matter how deep the library is in the dependency graph. So it really does behave exactly the same as `--cfg reduce_unsafe`.

The motivation for `--cfg reduce_unsafe` was to enable non-additive, global control regardless of what other crates decide. Cargo features are additive only. Using Cargo features works only if everyone also uses `default-features = false` and does not set or mandate `features = ["use_faster_unsafe_variant"]`.

In case it wasn't clear in my last comment, I don't care so much about the `--cfg` specifically. I care about more usage of `debug_assert!` and about promoting choice between unsafe and safe backends.

I think it would be a good idea for somebody (not me) to do a casual review of the most popular unsafe crates and see how many of them could replace their unsafe usage with a safe alternative. That would give some perspective on whether this proposal will actually be useful in practice or not.

Reviews of this sort are already being coordinated at:

And I'd like to highlight:

Everyone is invited to participate!

You do not have to be an unsafe expert to help out. There's a lot of work to do just picking crates (ones with a lot of reverse-dependencies are best), and then sorting out where they use unsafe and why. If you think something isn't right just post it in the tracking issue and others can have a look and talk it out.

6 Likes

@Lokathor:

Sorry I didn't get back to this thread for a while. I'll take a deeper look at that GitHub issue, but after a brief once-over I can say that, interestingly, the benchmark mbrubeck already posted there is one I've had an adapted version of in the staticvec repo for a while now (comparing StaticVec against instances of Vec created solely via with_capacity, to be as fair as possible).
