Motivation
The topic of panic-free code has come up a few times. Some want a subset of the std-lib that cannot panic, others want their build to fail if a function contains a panic path (see the no_panic crate), and others want to incorporate it into the type system and/or have generics to specify when a function may panic depending on other functions (see effect system).
These are good ideas, but all of them have problems:
- Nobody is working on a panic-free subset of the std-lib, and a lot of things in there can panic.
- Many panic paths are only eliminated when optimizations are enabled, so the result depends on the build profile.
- The no_panic crate works, but cannot tell you which panic path still exists, so manual analysis is required.
- The error message of no_panic isn't very helpful (probably because of how panics are detected).
- Specifying that a function will never panic is (as many suggested) a semver hazard.
- Having the panic or no panic info in the type system (or an entire effect system) is a big task, where we likely won't get many benefits for a long time (especially on stable).
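To make the first two points concrete, here is a minimal sketch of how easily a panic path sneaks into ordinary code. The function names are made up for illustration. The first version has two hidden panic paths (slice indexing and integer division); the second expresses the same operation through std methods that return an Option instead of panicking.

```rust
/// Hypothetical example: looks harmless, but contains two panic paths.
pub fn ratio_at(values: &[u32], index: usize, divisor: u32) -> u32 {
    // `values[index]` panics if `index` is out of bounds,
    // and `/` panics if `divisor` is zero.
    values[index] / divisor
}

/// Same operation without panic paths: both failure cases become `None`.
pub fn checked_ratio_at(values: &[u32], index: usize, divisor: u32) -> Option<u32> {
    // `get` returns `None` when out of bounds, `checked_div` returns
    // `None` on division by zero.
    values.get(index)?.checked_div(divisor)
}
```

Whether the panic branches in ratio_at survive in the final binary depends on what the optimizer can prove at each call site, which is why a purely source-level check is not enough.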
Suggestion
Instead, I'd like to propose a multi-step plan for improving the workflow for writing panic-free code. Each step can bring a useful benefit and may lead to or simplify other, more complex solutions like an effect system, hopefully without becoming redundant at some point.
Step 1
Add a custom LLVM pass that runs after LLVM optimizations and detects panics that are considered reachable by LLVM. This could also be done by analyzing the output of LLVM, but doing it this way is likely easier. The output would ideally be formatted like rustc error/warning messages or sent through the rustc error reporting system.
The no_panic crate already achieves this (without a custom LLVM pass, through different means), but its approach is likely not sufficient for Step 2 and beyond; otherwise its error message would already contain more information.
As many have mentioned, doing this before LLVM optimizations is not useful in many cases, because many panic paths are only removed by the optimizer.
I do not know how difficult this step is.
Benefit of this step: None (as the same can be done using the no_panic crate), but it lays the foundation for the next steps.
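As a rough illustration of the "analyze the output of LLVM" alternative mentioned above, the following sketch scans textual LLVM IR (as produced by something like --emit=llvm-ir) for functions that still contain a call to a panic entry point. Everything here is an assumption made for illustration: the function names are hypothetical, the IR parsing is deliberately naive, and the marker relies on legacy-mangled symbols from core::panicking containing the substring 4core9panicking. A real pass would work on LLVM's in-memory representation instead of text.

```rust
/// Assumed marker: substring found in mangled names of `core::panicking::*`.
const PANIC_MARKER: &str = "4core9panicking";

/// Returns the names of functions in textual LLVM IR whose bodies still
/// contain a call to a symbol that looks like a core panic entry point.
pub fn functions_with_panic_calls(ir: &str) -> Vec<String> {
    let mut current: Option<String> = None; // function we are currently inside
    let mut flagged = Vec::new();
    for line in ir.lines() {
        let line = line.trim();
        if line.starts_with("define ") {
            current = function_name(line);
        } else if line == "}" {
            current = None;
        } else if let Some(name) = &current {
            let is_call = line.contains("call ") || line.contains("invoke ");
            if is_call && line.contains(PANIC_MARKER) && !flagged.contains(name) {
                flagged.push(name.clone());
            }
        }
    }
    flagged
}

/// Extracts `name` from a line like `define i32 @name(i64 %i) {`.
/// Naive: does not handle quoted symbol names.
fn function_name(define_line: &str) -> Option<String> {
    let start = define_line.find('@')? + 1;
    let rest = &define_line[start..];
    let end = rest.find('(')?;
    Some(rest[..end].to_string())
}
```

Run on optimized IR, this gives exactly the yes/no answer described for this step: a function is listed only if LLVM could not prove its panic path unreachable.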
Step 2
Add a way to tell rustc which functions it should warn/error on if a panic is considered reachable by LLVM:
#[warn(panic)]
fn foo() {}
#[deny(panic)]
fn bar() {}
#[cfg_attr(not(debug_assertions), warn(panic))] // release builds only
fn baz() {}
Maybe have an easier way to specify warn(panic) only if optimizations are enabled.
Without step 3, this is probably hard to implement, as it effectively requires knowing which functions can call this piece of code. Step 3 won't make it much easier, but at least you have additional information during the LLVM pass.
Benefit of this step: Users can opt in to warnings (or even errors) for functions they don't want to panic. At this point it's still rather basic and only outputs a yes/no, but it's still useful to get notified if anything changes with regard to panic reachability, and even to get an overview of all functions that can still panic.
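The "which functions can call this piece of code" part of this step is essentially a reachability question on the call graph. Here is a minimal sketch, assuming the pass can somehow produce a caller-to-callees map and a set of functions that contain a direct panic call; both inputs and the function name may_panic are hypothetical.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Given a call graph (caller -> callees) and the set of functions that
/// contain a direct panic call, returns every function from which a panic
/// is reachable. Uses simple fixed-point iteration.
pub fn may_panic<'a>(
    calls: &BTreeMap<&'a str, Vec<&'a str>>,
    direct: &BTreeSet<&'a str>,
) -> BTreeSet<&'a str> {
    let mut result = direct.clone();
    loop {
        let mut changed = false;
        for (caller, callees) in calls {
            // A caller may panic if any of its callees may panic.
            if !result.contains(caller) && callees.iter().any(|c| result.contains(c)) {
                result.insert(*caller);
                changed = true;
            }
        }
        if !changed {
            return result;
        }
    }
}
```

A function annotated with the proposed attribute would then produce a warning or error exactly when it ends up in the returned set.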
Step 3
(All?) Panic messages in Rust contain information that points to the source line that caused this panic. This information should be read/extracted by the LLVM pass and displayed as part of the warning/error message.
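The location information is already observable at runtime today, which suggests it is present in the compiled output for the pass to read. This is a small sketch that captures it through a panic hook; the helper name panic_location is made up for this example, and how the same data would be read from LLVM IR is not shown here.

```rust
use std::panic::{self, UnwindSafe};
use std::sync::{Arc, Mutex};

/// Runs `f` and, if it panics, returns the (file, line) the panic reported.
/// Note: temporarily replaces the process-wide panic hook.
pub fn panic_location<F: FnOnce() + UnwindSafe>(f: F) -> Option<(String, u32)> {
    let slot: Arc<Mutex<Option<(String, u32)>>> = Arc::new(Mutex::new(None));
    let writer = Arc::clone(&slot);

    let previous = panic::take_hook();
    panic::set_hook(Box::new(move |info| {
        // `location()` is the source position attached to the panic.
        if let Some(loc) = info.location() {
            *writer.lock().unwrap() = Some((loc.file().to_string(), loc.line()));
        }
    }));
    let _ = panic::catch_unwind(f);
    panic::set_hook(previous);

    let result = slot.lock().unwrap().take();
    result
}
```

For example, panic_location(|| panic!("boom")) yields the file and line of the panic! invocation, which is the kind of detail the warning/error message in this step would surface at compile time.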
Note that this does not make the above-mentioned semver hazard worse, since a dependency cannot act on this information. If you add a new panic path without a breaking change, that's likely still a problem, but the worst that can happen is that a crate that depends on you will get a warning that informs them about something they might want to have a look at.
At this point there may still be some issues. For example, the following function can be called with a function that can panic and with one that can't. All approaches I've seen so far that add the panic information to the type system need to deal with this. In our case we don't, though ideally the warning/error message could show that foo<function_a> panics because function_a panics. For that we need to know which part of the LLVM IR is called from which function, but that shouldn't be too big of an issue.
fn foo(f: impl Fn()) {
    f()
}
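To illustrate why the answer depends on the instantiation, here is a small runnable sketch. The names function_a, function_b and the helper panics are assumptions added for this example; the helper only observes panics at runtime and is not part of the proposal.

```rust
use std::panic::{self, UnwindSafe};

fn foo(f: impl Fn()) {
    f()
}

/// Always panics, so `foo::<function_a>` has a reachable panic path.
fn function_a() {
    panic!("function_a always panics");
}

/// Never panics, so `foo::<function_b>` has no panic path.
fn function_b() {}

/// Runtime helper: reports whether running `f` ended in a panic.
pub fn panics<F: FnOnce() + UnwindSafe>(f: F) -> bool {
    panic::catch_unwind(f).is_err()
}
```

Here panics(|| foo(function_a)) is true and panics(|| foo(function_b)) is false. Since generics are monomorphized before LLVM sees the code, a post-optimization pass naturally analyzes each instantiation separately.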
Benefit of this step: It makes the error messages a lot more useful and easier to act on. Ideally the hints would suggest applying the attribute only in release mode, checking whether this is a new panic path that needs to be considered, or removing the attribute.
Step 4
Stabilization. I think this is something that can be stabilized way faster than anything related to the type system, an effect system or something that may cause semver hazards.
Benefit of this step: More people can make use of this when trying to write panic-free code, as it's effectively just a linter warning that is emitted at a later time during compilation (and might not be emitted in cargo check).
Step 5 and beyond
We now have the infrastructure for panic detection, even if its results depend on optimizations. People can more easily experiment with panic-free code (though it's still not trivial to write it) and have better warning/error messages related to it. This infrastructure will likely also be useful for the type system changes, effect system and attempts to avoid semver hazards.
Doing these or a subset of them is likely something we want, but as said above: Those are large topics with some unsolved/unknown issues and likely require something like this LLVM pass, too.
Open questions
- I do not know how complex it is to write such a panic detection, get the line number and send that information through the error reporting system of Rust.
- I do not know how complex it is to know from which functions the function that can panic is reachable.
- I do not know whether there should be a #[no_panic] attribute (or part of the function signature) indicating that adding a panic case will be considered a breaking change. A #[deny(panic)] could already be seen as such, but I think there is still a difference (#[no_panic] should imply #[deny(panic)]), but that's a question that can be answered later if #[warn(panic)] does not imply that a function won't panic in the future.