Step-by-step, developer-friendly `#[no_panic]` / `#[warn(panic)]`

Motivation

The topic of panic-free code has come up a few times. Some want a subset of the std-lib that cannot panic; others want compilation to fail if a function contains a panic path (see the no_panic crate); others want to incorporate it into the type system and/or have generics that specify when a function may panic depending on other functions (see effect systems).

They're good ideas but all have problems:

  • Nobody is working on a subset of the std-lib and a lot of things in there can panic
  • You always have the issue of panics only getting eliminated when optimizations are enabled
  • The no_panic crate works, but cannot tell which panic path still exists, requiring manual analysis.
  • The error message of no_panic isn't really nice (probably because of how panics are detected)
  • Specifying that a function will never panic is (as many suggested) a Semver hazard.
  • Having the panic or no panic info in the type system (or an entire effect system) is a big task, where we likely won't get many benefits for a long time (especially on stable).

Suggestion

Instead, I'd like to propose a multi-step plan for improving the workflow for writing panic-free code. Each step can bring a useful benefit and may lead to or simplify other, more complex solutions like an effect system, hopefully without becoming redundant at some point.

Step 1

Add a custom LLVM pass that runs after LLVM optimizations and detects panics that are considered reachable by LLVM. This could also be done by analyzing the output of LLVM, but doing it this way is likely easier. The output would ideally be formatted like rustc error/warning messages or sent through the rustc error reporting system.

The no_panic crate already achieves this (without a custom LLVM pass, through different means), but this likely is not sufficient for Steps 2 and beyond, otherwise the error message would already contain more information.

As many have mentioned, doing this before LLVM optimizations is not useful in many cases.

I do not know how difficult this step is.

Benefit of this step: None (as the same can be done using the no_panic crate), but it lays the foundation for the next steps.

Step 2

Add a way to tell rustc which functions it should warn/error on if a panic is considered reachable by LLVM:

#[warn(panic)]
fn foo() {}
#[deny(panic)]
fn bar() {}
#[cfg_attr(not(debug_assertions), warn(panic))] // release builds only
fn baz() {}

Maybe have an easier way to specify warn(panic) only if optimizations are enabled.

Without step 3, this is probably hard to implement, as it effectively requires knowing which functions can call this piece of code. Step 3 won't make it much easier, but at least you have additional information during the LLVM pass.

Benefit of this step: Users can opt-in to warnings (or even errors) for functions they don't want to panic. At this point it's still rather basic and only outputs a yes/no, but it's still useful to get notified if anything changes in regards to panic reachability and even to get an overview of all functions that can still panic.

Step 3

(All?) Panic messages in Rust contain information that points to the source line that caused this panic. This information should be read/extracted by the LLVM pass and displayed as part of the warning/error message.
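To illustrate the location data that panics carry, here is a minimal runtime sketch. It is not the proposed LLVM pass, which would have to read the same information statically from the compiled code. The helper name `panic_location` is hypothetical; the calls used (`std::panic::set_hook`, `take_hook`, `catch_unwind`, and `location()` on the panic info) are from the standard library.

```rust
use std::panic;
use std::sync::{Arc, Mutex};

/// Runs `f` and, if it panics, returns the "file:line" recorded in the panic info.
/// Hypothetical helper for illustration only.
fn panic_location(f: impl FnOnce() + panic::UnwindSafe) -> Option<String> {
    let slot: Arc<Mutex<Option<String>>> = Arc::new(Mutex::new(None));
    let writer = Arc::clone(&slot);

    // Temporarily replace the global panic hook so we can capture the location.
    let previous = panic::take_hook();
    panic::set_hook(Box::new(move |info| {
        let loc = info.location().map(|l| format!("{}:{}", l.file(), l.line()));
        *writer.lock().unwrap() = loc;
    }));

    // Requires the default `panic = "unwind"` strategy.
    let _ = panic::catch_unwind(f);

    // Restore whatever hook was installed before.
    panic::set_hook(previous);

    let result = slot.lock().unwrap().take();
    result
}
```

Calling `panic_location(|| { panic!("boom"); })` yields the file and line of the `panic!` invocation, while a closure that returns normally yields `None`.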

Note that this does not make the above mentioned semver hazard worse, since a dependency cannot act on this information. If you add a new panic path without a breaking change that's likely still a problem, but the worst that can happen is that a crate that depends on you will get a warning that informs them about something they might want to have a look at.

At this point there may still be some issues. For example, the function below can be called with a function that can panic or with one that can't. All approaches I've seen so far that add the panic information to the type system need to deal with this. In our case we don't, though ideally the warning/error message could show that foo<function_a> panics because function_a panics. For that we need to know which part of the LLVM IR is called from which function, but that shouldn't be too big of an issue.

fn foo(f: impl Fn()) {
    f()
}
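A small sketch of the situation described here, using the `Fn` trait bound: the same generic `foo` is instantiated once with a callback that panics and once with one that does not, so only one of the two instantiations has a reachable panic. The names `function_a`, `function_b` and `foo_panics_with` are illustrative assumptions, and the check is done at runtime with `std::panic::catch_unwind` rather than by any compiler pass.

```rust
use std::panic;

// Generic over the callback: each distinct `f` gets its own instantiation.
fn foo(f: impl Fn()) {
    f()
}

// Hypothetical callback that always panics.
fn function_a() {
    panic!("function_a always panics");
}

// Hypothetical callback that never panics.
fn function_b() {}

/// Returns true if calling `foo` with the given callback unwinds.
/// Requires the default `panic = "unwind"` strategy.
fn foo_panics_with(f: impl Fn() + panic::UnwindSafe) -> bool {
    panic::catch_unwind(|| foo(f)).is_err()
}
```

Here `foo_panics_with(function_a)` is true and `foo_panics_with(function_b)` is false, which is the per-instantiation distinction a useful diagnostic would need to report.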

Benefit of this step: It makes the error messages a lot more useful and easier to act on. Ideally the hints would suggest adding the attribute only in release mode, checking whether this is a new panic path that needs to be considered, or removing the attribute.

Step 4

Stabilization. I think this is something that can be stabilized way faster than anything related to the type system, an effect system or something that may cause semver hazards.

Benefit of this step: More people can make use of this when trying to write panic-free code, as it's effectively just a linter warning that is emitted at a later time during compilation (and might not be emitted in cargo check).

Step 5 and beyond

We now have the infrastructure for panic detection, even if they depend on panics. People can more easily experiment with panic-free code (though it's still not trivial to write it) and have better warning/error messages related to it. This infrastructure will likely also be useful for the type system changes, effect system and attempts to avoid semver hazards.

Doing these or a subset of them is likely something we want, but as said above: Those are large topics with some unsolved/unknown issues and likely require something like this LLVM pass, too.

Open questions

  • I do not know how complex it is to write such a panic detection, get the line number and send that information through the error reporting system of Rust.
  • I do not know how complex it is to know from which functions the function that can panic is reachable.
  • I do not know whether there should be a #[no_panic] attribute (or part of the function signature) indicating that adding a panic case will be considered a breaking change. A #[deny(panic)] could already be seen as such, but I think there is still a difference (#[no_panic] should imply #[deny(panic)]). That question can be answered later, as long as #[warn(panic)] does not imply that a function won't panic in the future.

I think the semver risk is much bigger than your post makes it appear. Yes, from a purely technical perspective it's no worse than the no_panic crate. But socially there is a big difference between "relies on an unsupported crate to expose an implementation detail" and "relies on a feature built into the compiler for this purpose".


True. We also have #[forbid(unsafe_code)] already, which is/can be seen as a promise to not be removed in the future without being a breaking change, though it technically only affects a single crate [1].

Granted, with the exception of maybe unsafe_code I can't think of another lint that has such a significant impact in regards to semver.

I'd personally draw the distinction based on the level used (unless there is a separate distinction from guarantees), though as far as I know it isn't specified at what point something is considered a breaking change:

  • deny/forbid: (Maybe) Promise to not introduce it in the future [2]
  • warn: No such promise

I get what you mean. With this it is possible to change the code in a way that doesn't alter its behavior (the panic is unreachable but LLVM doesn't know it), which would then cause the dependent crate to print a warning, or in the case of deny(panic) to fail the compilation. I'd caution against using deny(panic), especially in libraries, as that is effectively an opt-in to your dependents' code not even compiling after a non-behavior-changing change.

But if the goal is to never panic in a specific piece of code, then this is better than the current situation (in my opinion), since you now at least know that there is something you might want to look into. It might be necessary to have an opt-out, though, if your dependency unexpectedly adds a panic path (if it is unreachable but LLVM doesn't know that).

The docs should probably clearly state to what extent this attribute is/should be considered semver relevant and that you do opt in to your code not compiling when using deny(panic), similar to what happens when you deny all warnings. But for a warning I don't really see the issue. Rustc does/can already (at any time) add new linter warnings and thus cause new warnings after updating the rustc version. I'd argue this is the same here. And if there is a new panic path that is actually reachable, it already is a breaking change. The only difference is that you now know about it thanks to the warnings.

"This RFC is largely focused on API changes which may, in particular, cause downstream code to stop compiling. But in some sense it is even more pernicious to make a change that allows downstream code to continue compiling, but causes its runtime behavior to break." (1105-api-evolution, The Rust RFC Book)


  1. It can have unsafe code in its dependencies.

  2. Strictly speaking this can only be applied to functions that only depend on functions that also have this in stable dependencies.

There are two levels of "no panics", one which may be annoyingly restrictive, and one that may be hard to support.

The language can conservatively guarantee that there are no panics by only allowing calls to functions that are trivially obviously panic-free and all functions they might call are also obviously trivially panic-free. This can be implemented easily, and I wouldn't even be surprised if the compiler already had it for some internal reason.

This means that array[i] would be forbidden. Arc::clone() would also be forbidden, because after 18 quintillion copies, it might panic. println! panics, dbg! too. assert! obviously. Lots of things will get blocked unintentionally until fixed, e.g. thread::spawn shouldn't need to panic, but has an .unwrap() somewhere internally.
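As a concrete sketch of the indexing case: `values[i]` contains a panic path for out-of-bounds indices, while `slice::get` reports the same condition through its return value. The function names are illustrative only. Whether `get` would itself count as "trivially panic-free" under such a conservative rule is an assumption here.

```rust
/// Indexing: panics if `i` is out of bounds, so a conservative
/// no-panic rule would reject this function.
fn nth_indexing(values: &[u32], i: usize) -> u32 {
    values[i]
}

/// `get` returns `None` for an out-of-bounds index instead of panicking,
/// pushing the decision to the caller.
fn nth_checked(values: &[u32], i: usize) -> Option<u32> {
    values.get(i).copied()
}
```

The cost of the second form is that every caller now has to handle the `None` case, which is part of why such a restriction is annoying in practice.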

Code like if 1 == 2 { panic!() } would be forbidden too. Even though this expression is trivial, anything more starts requiring analysis that becomes complex, expensive, or outright impossible, so the easy and consistent solution is not to analyze any dynamic conditions.

Annoyingly, the static panic detection would also forbid calling any dyn functions. You could not use dyn Error or dyn Read, because dyn can be anything, including panics. It would require dyn Error + NoPanic, which probably won't happen, because it's going to require spamming every API out there, and someone will say it needs to be a part of a generalized effect system.

This kind of enforcement may still be useful in limited scenarios (e.g. an unsafe function that doesn't want to deal with the extra danger of something panicking), but I don't imagine anybody writing a #![no_panic] crate.


The second interpretation of "no panic" means that there are no panics after compiler optimizations, so for i in 0..arr.len() { arr[i] } is still allowed, because the compiler can figure out it doesn't need to panic.
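For illustration, here are two ways to write such a loop. In the indexed version the bounds check exists in the source, and its removal depends on the optimizer proving `i < arr.len()`; the iterator version has no index and therefore no bounds check to remove. The function names are hypothetical, and `wrapping_add` is used so that integer overflow does not introduce a second panic path.

```rust
/// Index-based loop: every `arr[i]` has a panic path in the source.
/// Optimized builds can typically prove it unreachable, but that is
/// not guaranteed by the language.
fn sum_indexed(arr: &[u64]) -> u64 {
    let mut total = 0u64;
    for i in 0..arr.len() {
        total = total.wrapping_add(arr[i]);
    }
    total
}

/// Iterator-based loop: no indexing, so nothing relies on the optimizer
/// to eliminate a bounds check.
fn sum_iter(arr: &[u64]) -> u64 {
    arr.iter().fold(0u64, |acc, x| acc.wrapping_add(*x))
}
```

Both functions compute the same result; they differ only in whether panic-freedom is visible in the source or has to be established by optimization.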

This is more useful, and probably what most developers mean when they want no-panic checks.

Unfortunately, this depends on the quality of compiler optimizations. Rust can't guarantee that the optimizations won't change between versions. It can't even guarantee they won't get worse! This sometimes happens by accident, sometimes intentionally when incorrect or unreasonably expensive optimization algorithms get removed.

So even though rustc could have an LLVM pass that looks for calls to panic functions, connecting that to anything more than a warning would be too fragile. Rust code that compiled with one version of the compiler could stop compiling with another, either because LLVM tweaked its optimizations, or because libstd added an extra assert! somewhere.


There are a couple of big downsides I don't see mentioned in the post:

  • this won't work with cargo check, since it does not actually run codegen;

  • this feature would be tied to LLVM and not supported on other codegen backends, and even if it were to be supported it would yield different results;

  • it ties the compilation of your program to LLVM optimizations: these are very hard to reason about and their results can change between versions, hence increasing the chances that some code won't compile after a rustc upgrade.


And at the moment you don't even know when that happens. At least outside of the stdlib, which might get some kind of analysis as part of the impact of those changes if we're lucky. But as of now there isn't good tooling to detect that (unless you manually analyze the resulting binary, which may have hundreds of valid/expected panic calls).

Yes :+1:

When using #[deny(panic)] or something like #[no_panic]: yes. But not with #[warn(panic)], unless you treat all warnings as errors.

Granted: There may be social pressure on dependencies, the stdlib or rustc when/after adding a panic path that isn't optimized away.

Don't forget about integer arithmetic as a potential source of panics (depending on compiler flags).
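A short sketch of what that looks like in practice. Plain `+` panics on overflow when overflow checks are enabled (the default in debug builds), and integer division by zero panics regardless of flags. The explicit `checked_*` and `wrapping_*` methods from the standard library have no panic path. The function names are illustrative.

```rust
/// Plain addition: panics on overflow if overflow checks are enabled.
fn add_default(a: u8, b: u8) -> u8 {
    a + b
}

/// Reports overflow as `None` instead of panicking.
fn add_checked(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

/// Wraps around on overflow, independent of compiler flags.
fn add_wrapping(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}

/// Division by zero yields `None` rather than a panic.
fn div_checked(a: u32, b: u32) -> Option<u32> {
    a.checked_div(b)
}
```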


Generally I think ensuring that code doesn't panic is a better fit for external verifiers, like Kani or Verus.