Result<T, E> and Option<T> optimization

On the Rust-lang users forum I've found out that the compiler will actually optimize this kind of code, so that processing time is not wasted:

fn process(data: SomeDataType) -> Result<(), Error> {
    // a lot of processing and/or allocations here
    data.something()?;
    // a lot of processing and/or allocations here
    Ok(())
}

However, it happens only when the compiler "can tell that your other work has no side-effects, but there are going to be many cases where it is unable to make the reordering because it can't tell that it doesn't have side-effects, or maybe the code does something that you don't consider to be a side-effect, but which the compiler does count (e.g. increment, then decrement a ref-count)."
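As a rough illustration of the ref-count case mentioned in that quote, here is a minimal sketch. The function name `touch` is hypothetical. The clone and the drop cancel out as far as the final count is concerned, but both are real operations, and whether the optimizer can prove they are removable is up to the compiler.

```rust
use std::rc::Rc;

// Hypothetical helper: temporarily takes an extra reference to `shared`.
fn touch(shared: &Rc<String>) -> usize {
    // Increments the strong count.
    let extra = Rc::clone(shared);
    let len = extra.len();
    // Decrements the strong count again; the net change is zero.
    drop(extra);
    len
}
```

From the caller's point of view the count is unchanged afterwards, which is why one might not think of this as a side-effect at all.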

So wouldn't it be better to check the result at its first appearance, instead of checking for side-effects and reordering, whenever ? is used?

For example, the code I've listed above would be compiled to something like this:

fn process(data: SomeDataType) -> Result<(), Error> {
    // hoisted check (the proposed transformation)
    if let Err(e) = data.something() {
        return Err(e);
    }
    // a lot of processing and/or allocations here
    data.something()?;
    // a lot of processing and/or allocations here
    Ok(())
}

And to something like this when the data is not available at the beginning of the function:

// Before:
fn process() -> Result<(), Error> {
    // a lot of processing and/or allocations here
    let something: Result<SomeDataType, Error> = get_some_data_type();
    // a lot of processing and/or allocations here
    something?.get_something();
    Ok(())
}

// After (the proposed transformation):
fn process() -> Result<(), Error> {
    // a lot of processing and/or allocations here
    let something: Result<SomeDataType, Error> = get_some_data_type();
    if something.is_err() {
        return Err(something.err().unwrap());
    }
    // a lot of processing and/or allocations here
    something?.get_something();
    Ok(())
}

This would let code be optimized and free of boilerplate at the same time.

Example of such code:

fn process(data: SomeDataType) -> Result<(), Error> {
    let something = data.something()?;
    let something1 = data.something1()?;
    let something2 = data.something2()?;
    // a lot of processing and/or allocations here
    something;
    // a lot of processing and/or allocations here
    something1;
    // a lot of processing and/or allocations here
    something2;
    // a lot of processing and/or allocations here
    Ok(())
}

Link to the original post on users forum:

No, that wouldn’t be better. In particular, it would also change the behavior of existing programs in observable ways. Any change that breaks Rust’s stability guarantees is an immediate no-go.

The first “processing and/or allocations here” step could, e.g., print some stuff to the terminal; we want predictable program behavior! There’s going to be lots of code out there that relies on the fact that everything before a ? that can return early is in fact executed and not skipped.
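To make this concrete, here is a minimal sketch with hypothetical names (`validate`, `process`, and a `log` vector standing in for any observable effect such as printing to the terminal). The first step runs even when the later ? returns early, and callers can see that.

```rust
// Hypothetical fallible check.
fn validate(input: i32) -> Result<i32, String> {
    if input < 0 {
        Err(format!("negative input: {}", input))
    } else {
        Ok(input)
    }
}

// `log` stands in for any observable side effect.
fn process(input: i32, log: &mut Vec<String>) -> Result<(), String> {
    // Step 1: runs before the check, so its effect is visible even on failure.
    log.push("step 1 done".to_string());
    // On error, the function returns here, after step 1 has already run.
    let value = validate(input)?;
    // Step 2: only reached on success.
    log.push(format!("step 2 done with {}", value));
    Ok(())
}
```

If the check were silently hoisted above step 1, a failing call would leave `log` empty instead of containing the step 1 entry, which is exactly the kind of observable change described above.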

The only time when this optimization is okay is when the compiler can tell that not executing the code in question doesn’t make any observable difference (beyond perhaps the improved performance, which can be kind-of visible).

There’s certainly opportunity in enabling the compiler to catch more cases of side-effect free code; in particular allocations are quite often not optimized away even if they could.

If you’re writing some code that can benefit from the kind of transformation you’re describing without changing the code’s behavior in an undesirable way (i.e. in your concrete use-case it’s okay to skip the first “processing and/or allocations here” step), then you’ll have to write your code this way yourself.
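Writing it this way yourself could look roughly like the sketch below. All names here (`SomeDataType`, `Error`, `expensive_step`, the `work_done` counter) are hypothetical stand-ins, and the sketch assumes it really is acceptable to skip the expensive work when the check fails.

```rust
#[derive(Debug, PartialEq)]
struct Error(String);

struct SomeDataType {
    valid: bool,
}

impl SomeDataType {
    // Hypothetical fallible accessor.
    fn something(&self) -> Result<u32, Error> {
        if self.valid {
            Ok(42)
        } else {
            Err(Error("invalid data".into()))
        }
    }
}

// Stand-in for "a lot of processing and/or allocations".
fn expensive_step(work_done: &mut u32) {
    *work_done += 1;
}

fn process(data: &SomeDataType, work_done: &mut u32) -> Result<u32, Error> {
    // Do the fallible check first, by hand, so a failure skips the work below.
    let something = data.something()?;
    expensive_step(work_done);
    let result = something + 1;
    expensive_step(work_done);
    Ok(result)
}
```

Here the reordering is an explicit decision in the source, so the program's behavior stays predictable.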

Also note that ? is nothing special, it’s just syntactic sugar for something mostly equivalent to match EXPR { Ok(value) => value, Err(e) => return Err(e) }.
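A small sketch of that equivalence, using hypothetical function names: both functions below behave the same for every input. The hand-written version also applies `From::from` to the error, which is the main way the real expansion differs from the simplified match.

```rust
use std::num::ParseIntError;

// Uses the `?` operator.
fn parse_with_question_mark(s: &str) -> Result<i32, ParseIntError> {
    let value = s.parse::<i32>()?;
    Ok(value * 2)
}

// Spells out roughly what `?` does for `Result`.
fn parse_with_match(s: &str) -> Result<i32, ParseIntError> {
    let value = match s.parse::<i32>() {
        Ok(value) => value,
        // `?` also converts the error type via `From`; here it is a no-op.
        Err(e) => return Err(From::from(e)),
    };
    Ok(value * 2)
}
```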

IIRC the compiler cannot optimize those away because allocations are currently considered observable.

They're not. For example, here's a test from Rust's suite showing that this allocation is optimized out:

Hm, then it's weird that Rust is unable to optimize trivial cases like this: Compiler Explorer

It could have something to do with potential panics, but IIUC Box::new may panic as well.

It's not completely trivial because it needs to understand the "reallocation" in the push.

If you use with_capacity(1) instead of new(), then it does optimize out: https://rust.godbolt.org/z/xnGzET7c7
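For readers without the links at hand, the two variants being compared might look something like this. These are hypothetical stand-ins, not the code behind the links. Both return the same value; they differ only in how the vector gets its storage, and whether the allocation is actually removed depends on the compiler version and optimization level.

```rust
// Starts empty, so the first push has to go through the growth path.
pub fn push_into_new() -> i32 {
    let mut v = Vec::new();
    v.push(1);
    v[0]
}

// Allocates room for one element up front, so the push does not need to grow.
pub fn push_into_with_capacity() -> i32 {
    let mut v = Vec::with_capacity(1);
    v.push(1);
    v[0]
}
```

Comparing the optimized assembly of the two functions is one way to see whether the allocation was elided in each case.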

So it looks like realloc does not get the same treatment as alloc? Otherwise the reallocation logic itself should be easy for the compiler to analyze purity-wise. Or is it something else?

UPD: The reason could be that do_reserve_and_handle is marked as #[cold], which prevents its inlining and deeper compiler analysis.