Prohibit the use of unwrap() and increase Panics privacy!

rxdiscovery · February 5, 2024, 5:52pm

Hello,

is it possible to add a directive prohibiting the use of unwrap() without exception control? like : #![forbid(unsafe_code)]

for example, have something like this: #![forbid(unwrap)]

I think unwrap should only be used for prototyping or test creation.

Problems with "unwrap()" :

Risk of instability, due to uncontrolled exceptions
Heavier final executable, since it contains file path strings for exceptions (panic)
Exposure of sensitive data, developer identity or project file structure; unwrap() adds code containing file paths, so it records in the final executable all data that could determine the origin of the exception, including the complete path of the source files, which gives the developer's name: ex: /home/{user_name}/project/XXXX/file.rs

more details -> here <-

To verify what I'm saying, you can display all the strings contained in the executable like this: (linux cmd)

cargo build --release
strings ./target/release/{{  project(.exe)  }}  >  str.txt
vi str.txt

I also posted a proposal for cargo clippy here :

github.com/rust-lang/rust-clippy

prohibit the use of unwrap()

opened 05:32PM - 05 Feb 24 UTC

closed 08:49PM - 05 Feb 24 UTC

rxdiscovery

A-lint

### What it does using unwrap() in code exposes it to two risks: - uncontr…olled exceptions (application crash) - exposure of confidential data (such as the project path, which generally contains the OS user's name), since unwrap() integrates the complete path of the source file into the final executable. ### Advantage 1. more stable code 2. improved confidentiality 3. the final executable will be smaller, since it won't include the path strings for panic. ### Drawbacks _No response_ ### Example ```rust let str = "1000".to_string(); let number = str.parse::<i32>().unwrap(); //<------ risk of uncontrolled exception if str does not contain a number ``` Could be written as: ```rust let number = match str.parse::<i32>(){ Ok(n) => n, Err(_) => todo!(), }; ``` thank you in advance :

I think it will allow good practices, such as the use of Match or unwrap_or()...

Thanks to the whole community and to the people who take care of the development of this great language

kpreid · February 5, 2024, 6:09pm

All three of the issues you describe are issues with all panics, not just unwraps. Being able to ensure code is panic-free is desirable and just barely possible, but you may be surprised by how difficult it is to actually write nontrivial code that truly has no panics.

dlight · February 5, 2024, 6:12pm

Unwrap isn't the only or perhaps even the main source of panics. If I had to unscientifically guess, I would say that array, slice and vec subscription x[i] may be the main source of panics in Rust programs (but also that sometimes llvm can remove the panic from the binary; this is also true for unwrap). If you want to prevent panics, probably replacing all indexing x[i] with either x.get(i) or x.get_mut(i) (or better yet, replace it with iterators if possible) is probably also required. At this point, you would also need a lint against slice indexing as well, which is problematic, since sometimes coding without slice indexing makes code very verbose and hard to follow.

So from the point of view of preventing panics specifically, it's nice to either use some tooling to verify your code doesn't panic (either some static checker like prusti, mirai or kani, or the amazing linker hack no_panic or something else), or even better, do this but also compile with panic=abort, which can greatly simplify programs by removing seldom taken error paths (that sometimes aren't heavily tested).

That seems like a big problem. I thought this was already fixed? I'm now wondering if this path leak can be disabled. I've seen some things to automatically replace such paths with standard values, but I can't readily find the thread.

Anyway, apart from bailing out the program in case of errors (which is similar to using ? across the program, and returning Result from main), unwrap has another use: to assert that you know some Option is never None, or some Result is never Err, so that the unwrap will never actually panic, even though the compiler doesn't know that. This is very useful in some kinds of code (which sometimes can be hard to refactor), but unfortunately will lead to a panic code path that is codegen'ed but is never taken. In such cases, if someone is willing to use unsafe, x.unwrap_unckecked() is an option, but it's generally a poor tradeoff (it's much better to take the trouble of redesigning the application so that the unwrap is not needed).

Maybe what Rust needed is two methods: one for "I know this may panic, but I want to close the application abruptly in this case" and another for "this never panics, but the compiler doesn't know that", and we are stuck with a single method for the two cases. (This is a similar predicament to what unreachable!() had - it was either some code path that literally never happens, but also some code path that may happen but in this case you want to bail - and eventually, for the second case, todo!() was added)

burntsushi · February 5, 2024, 6:26pm

This is not considered a general best practice in the Rust ecosystem. There may be cases where outright banning unwrap (or more generally panicking branches) is appropriate, but it is uncommon. See: Using unwrap() in Rust is Okay - Andrew Gallant's Blog

rxdiscovery · February 5, 2024, 6:33pm

I totally agree with you, but unwrap represents a large part of the causes of Panics. If you reduce the number of unwraps, you indirectly increase the stability of applications built with Rust.

dlight:

Unwrap isn't the only or perhaps even the main source of panics. If I had to unscientifically guess, I would say that array, slice and vec subscription x[i] may be the main source of panics in Rust programs (but also that sometimes llvm can remove the panic from the binary; this is also true for unwrap). If you want to prevent panics, probably replacing all indexing x[i] with either x.get(i) or x.get_mut(i) (or better yet, replace it with iterators if possible) is probably also required. At this point, you would also need a lint against slice indexing as well, which is problematic, since sometimes coding without slice indexing makes code very verbose and hard to follow.

So from the point of view of preventing panics specifically, it's nice to either use some tooling to verify your code doesn't panic (either some static checker like prusti, mirai or kani, or the amazing linker hack no_panic or something else), or even better, do this but also compile with panic=abort, which can greatly simplify programs by removing seldom taken error paths (that sometimes aren't heavily tested).

Thank you for your detailed reply, I'm going to test those static_checkers you mentioned

I think you're talking about trim-paths :

https://rust-lang.github.io/rfcs/3127-trim-paths.html

the problem is that many crates use unwrap, so indirectly Panic! you easily end up with lots of potential Panics, and lots of debugging data included in the final code.

I agree with everything you've written, but giving developers the possibility of writing Strict code like javascript with its Strict Mode:

or at least add a warning with Clippy.

pitaj · February 5, 2024, 7:20pm

Clippy has a lint that covers this:

clippy::unwrap_used

rxdiscovery · February 5, 2024, 8:14pm

thank you for this information, @kpreid also told me this on github

kpreid · February 5, 2024, 10:21pm

If you make programmers not write panics, some of them will instead decide to leave out assertions that could have been made, or substitute placeholder values in a situation that “can't happen” (like number_in_an_option_that_should_be_present.unwrap_or(0) even when zero makes no sense).

Under most circumstances, stability should come secondary to correctness. Part of the point of making it easy to panic is to make it easy to stop when evidence of a bug has been discovered.

(Sometimes the right answer is to catch and log the problem and continue or explicitly start over from some point, but passing Result for everything is generally both unergonomic and doesn't provide much information about the failure unless you sprinkle error transformations everywhere, so in that case, using panics and catching them is often the right choice.)

quinedot · February 6, 2024, 1:50am

I don't have a problem with the plan to sanitize compiler-generated paths by default, or otherwise improve the baseline scenario.

However, whenever this comes up I see people frantically hammering boards across a window with a barn door wide open behind them. In a typical Rust development environment, cargo is downloading code off the internet, compiling the code including any build.rs, and running the build.rs executables. Don't do that with your personal account; create a development account.

Even if you're content to fully trust upstream,^[1] and we get sanitation by default, it won't fully solve this problem. It's easy enough to end up with the build path or username in the artifacts with no ill-intent in other ways, for example by including some environment variables (env!^[2]). This could be done non-maliciously by yourself or by any of your dependencies.

So if it's truly a serious concern, you need something like a CI check on the generated artifacts.^[3]

or your build paths are just inherently sensitive for some reason ↩︎
The example for which uses PATH, and if you installed using rustup... ↩︎
In addition to trusting upstream, who aren't limited to doing anything so obvious. ↩︎

farnz · February 6, 2024, 11:52am

Worse, because people are doing this, instead of the system panicking because the "impossible" (according to the original programmer) has happened, the system will appear to function, but will give erroneous outputs.

Sure, it's nice that your missile guidance system hasn't panicked; but it's rather bad if it's avoided panicking by guiding all the missiles to your national government's place of work (Parliament, Senate building, whatever you have).

One of the lessons of the last 50 years of software development is that we only fix bugs that we notice, and we tend not to notice bugs via code review, but rather during testing or when the code misbehaves in production. Panicking is a really good way to change "this can't happen - so if it does, the code does something unpredictable and unreasonable" into "this can't happen - so if it does, the code does something predictably bad". And something predictably bad is easier to debug than something unpredictable.

JeffBurdges · February 6, 2024, 4:27pm

As mentioned above, you should expect panics from array indexing, integer overflows, divide by zero, etc. much more than from unwrap because at least some human thought about their unwrap. In particular, if you want panic free code, then you must avoid binary arithmatic operators too, and instead write checked_add or whatever.

If unwrap concerns you, then instead ask folks write .expect("..."), .checked_add(y).expect("..."), assert!(cond, ...), etc where ... gives some argument the code never panics. In required, you enforce this during code review, since some human must read those arguments anyways.

bascule · February 6, 2024, 5:00pm

FWIW I would love if Rust had some sort of first-class no-panic support, especially for use on e.g. bare metal embedded platforms where you always want to handle every error rather than crashing the program.

I'm not sure a lint is the best way to achieve that, however. Ideally I think the compiler would in some way support a full no-panic profile where features that require panics simply aren't available and would cause a compile error with a reasonable error message explaining the situation.

One way this could be achieved is via std aware cargo, where you could imagine core and std work like crates and have features just like other crates. One of those features could be panic (enabled by default, of course), and disabling that would remove all functionality from the language that could potentially panic.

This would obviously be very hard, with some tough corner cases, e.g. array[n] could still potentially be used, but only in cases where the compiler can prove to itself that the array access would always be in-bounds.

quaternic · February 6, 2024, 10:01pm

I suppose the following is somewhat orthogonal to your main points (that the language should support non-panicking contexts), and rather "if it did support those, how could that be more ergonomic".

There has been some work towards pattern types which would allow expressing at least some common preconditions in the type system (such as u32 is 1.. for a nonzero value of type u32). (Recently active Pre-RFC : Pattern types RFC · GitHub , discussions in zulip afaik)

Many existing APIs which have some preconditions and panic if those don't hold. If those conditions become expressible in the types, it would be desirable for the existing APIs to adapt their signatures, but backwards compatibility must be kept. (The following is my personal views/ideas, not representing any consensus.) For example,

// OLD
/// panics if d == 0
fn div(n: u32, d: u32) -> u32

// NEW
#[panic_if_arg_no_match_bikeshed]
#[cannot_unwind_bikeshed]
fn div(n: u32, d: u32 is 1..) -> u32

Where the latter essentially refers to two functions with different signatures:

#[inline(always)]
fn div_check_patterns(n: u32, d: u32) -> u32 {
    match (n,d) {
        (n, d @ 1..) => div_refined(n, d),
        _ => panic!("...")
    }
}
#[cannot_unwind_bikeshed]
fn div_refined(n: u32, d: u32 is 1..) -> u32 {
    // this code can do the division assuming d != 0
    ...
}

Or alternatively you could see it as: "The caller must check the arguments match, and panic if not. But a wrapper function is provided for when it is cast to a function pointer fn(u32, u32) -> u32".

Code calling div(n,d) where d: u32 is 1.. is already statically known could directly call the refined version, which would also be available in a #[cannot_unwind_bikeshed] context.

This could provide for such a non-panicking subset to still be able to use many standard APIs, but only in the restricted form where the caller must have already refined the inputs to fit the required conditions.

// array indexing is natural
fn index<const N: usize>(this: &[T; N], idx: usize is 0..N) -> &T
// slice indexing would require some fancier features
//(something like being able to substituting "N = dyn usize" into the above? )

Oh, and the thread topic was unwrap, so:

impl Option<T> {
    #[panic_if_arg_no_match_bikeshed]
    #[cannot_unwind_bikeshed]
    fn unwrap(Some(x): Self) -> T { x }
}

The syntax is a bit handwavy, but I think that would be rather neat. ^^ (even if somewhat redundant compared to just destructuring)

rxdiscovery · February 7, 2024, 5:27pm

That's another problem too, all it takes is for someone to commit malicious code on a "crate's" build.rs to wreak havoc.

You'd have to add the option of disabling build.rs execution (Cargo parameter), or simply build in docker containers, but with the slow compilation time and the need to add virtualization to each build, I doubt it would be practical outside the context of a Release build.

mathstuf · February 7, 2024, 5:38pm

I feel like the panic!-using constructs would still be there and instead rely upon the optimizer to remove the actual calls. At that point, is the improvement (for the proposed complexity of hiding potentially-panicking APIs) worth it over no_panic?

rxdiscovery · February 7, 2024, 6:09pm

Interesting information, I did some tests with rust 1.75, apparently it removes the sensitive part of the paths by default in release mode (only if no libraries(crate) are used).

Code :

fn main() {
    println!("Hello, world!");

    let str = "AAAA1000".to_string();
    let number = str.parse::<i32>().unwrap();

    print!("{}", number);
}

I'll use the linux command strings ./xxxx > str.txt to extract the strings contained in the final executable.

Cargo build --release :

I find this:

Cargo build :

I find this:

but, for example, if I add crates, this is what the release mode looks like (a more realistic project) :

use std::convert::Infallible;
use std::net::SocketAddr;

use bytes::Bytes;
use http_body_util::Full;
use hyper::server::conn::http1;
use hyper::service::service_fn;
use hyper::{Request, Response};
use hyper_util::rt::TokioIo;
use tokio::net::TcpListener;

async fn hello(_: Request<impl hyper::body::Body>) -> Result<Response<Full<Bytes>>, Infallible> {
    Ok(Response::new(Full::new(Bytes::from("Hello World!"))))
}

#[tokio::main]
pub async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    pretty_env_logger::init();

    let addr: SocketAddr = ([127, 0, 0, 1], 3000).into();

    let listener = TcpListener::bind(addr).await?;
    println!("Listening on http://{}", addr);
    loop {
        let (tcp, _) = listener.accept().await?;

        let io = TokioIo::new(tcp);

        tokio::task::spawn(async move {
            if let Err(err) = http1::Builder::new()
                .serve_connection(io, service_fn(hello))
                .await
            {
                println!("Error serving connection: {:?}", err);
            }
        });
    }
}

Cargo build --release :

release2-1

You really can know almost everything about the code,

Project file structure
OS used and its version
Language used and compiler version
LLVM version
Developer's name
Crates used
Crates version (and easily deduce vulnerable crates)
The type of exception the application may be subject to
etc..

but there's another problem: there's too much useless data in release mode :

For embedded developers, with little memory available, it can still be optimized.

We need to review the management of error messages in Panics! in release mode, I think it's very dangerous to expose so much information.

and remove unnecessary information for a release ( eventually keep the Rust version for advertising purposes )

bjorn3 · February 7, 2024, 6:50pm

Those are in the .comment section, which is emitted even when there are no panics anywhere. GCC, Clang and the linker also put their version string in there. If you don't want it, you have to strip it yourself. The used rustc version can be derived from the symbol names too.

--remap-path-prefix exists for removing sensitive information about the paths. And a cargo feature to help with this is being worked on.

The used crates and versions can be deduced from the symbol names too.

Use panic_immediate_abort if you don't want panic messages.

rxdiscovery · February 7, 2024, 7:06pm

Thank you for your valuable advice,

I've already specified "abort" in Cargo.toml

[profile.release]
strip = true # Automatically strip symbols from the binary.
panic = "abort"
lto = true
opt-level = 3
debug = false
codegen-units = 1

the compiler still adds the panic strings!

--remap-path-prefix does it work for crates too?

bjorn3 · February 7, 2024, 8:35pm

panic_immediate_abort is a feature you can use when recompiling the standard library itself with -Zbuild-std. It eliminates all calls to the panic handler (which is called even with panic=abort, but in that case aborts after printing a panic message rather than unwinds) and replaces them with aborts. See GitHub - johnthagen/min-sized-rust: 🦀 How to minimize Rust binary size 📦 for an explanation on how to use it. That page also lists -Zlocation-detail=none which doesn't require the standard library and only eliminates the panic location, but not panic message.

You can pass it multiple times to remap multiple paths.

kornel · February 8, 2024, 11:49am

For private paths, the nightly option -Z location-detail=none is great. It removes all the paths from all panics and debug strings.

Topic		Replies	Views
Better panic location reporting for `unwrap()` and friends compiler	13	4191	March 25, 2019
Context on unwrap() libs	7	1277	July 24, 2019
Line info for `unwrap`/`expect`	8	4509	June 12, 2020
Suggestion unwrap_safe for Result<T, !>, or `impl From<Result<T, !>> for T` libs	9	1225	May 26, 2019
RFRFC: Better Option/Result error messages libs	8	3074	March 25, 2019

Prohibit the use of unwrap() and increase Panics privacy!

Related topics