A new intrinsic: `can_panic_unwind`

The implementation of sort in the standard library is peculiar. Even though Rust guarantees that moves themselves run no user code, the sort must keep the mutable slice in a state where it is always a permutation of the original data. Why? Because if a comparison panics, those are exactly the elements that will be dropped. It seems very likely that a specialized implementation that can never observe a panic could be optimized further.

The idea would be to have a new intrinsic that allows this optimization even in the presence of user-supplied function type parameters. Note that panic = abort would allow the same optimization in this regard. The intrinsic could also be allowed to always fall back to true if it cannot determine a more concrete answer, similar to needs_drop.

// The form of the new intrinsic (declared in core::intrinsics; the bare
// `Fn` bound is only usable internally):
fn can_panic_unwind<T: Fn>() -> bool;

fn merge_sort<T, F>(v: &mut [T], mut is_less: F)
    where F: FnMut(&T, &T) -> bool
{
    if !intrinsics::can_panic_unwind::<F>() {
        // `is_less` provably cannot unwind, so skip the panic-safe bookkeeping.
        panicfree_merge_sort(v, is_less);
        return;
    }

    // current code
}

Change to FnOnce instead of Fn for more generality.

Can we also add a fn can_panic_unwind_for<A, F: FnOnce<A>>(f: &F) -> bool that takes the callable by reference?
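A sketch of what that by-reference helper might look like, assuming the proposed intrinsic existed (the name follows the signature above; the FnOnce<Args> bound needs the nightly unboxed_closures feature):

// Requires #![feature(unboxed_closures)] for the `FnOnce<Args>` bound.
fn can_panic_unwind_for<Args, F: FnOnce<Args>>(_f: &F) -> bool {
    // Only the type matters; the value is merely borrowed so `F` can be
    // inferred at the call site without consuming the callable.
    intrinsics::can_panic_unwind::<F>()
}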


So how would this interact with custom panic impls, say, a panic implementation that raises a non-maskable interrupt on an embedded processor?

Why would that matter? I think the only thing this intrinsic is intended to say is "can the compiler prove that calling this function cannot cause start_panic or whatever to be called". At the end of the day, this is about control flowing in some exceptional fashion (be it by unwinding or NMI).


It matters because those additional code paths may require tracking extra state in order to be implemented at all, and that extra state can mean upfront costs on every code path. Consider some theoretical method and its current implementation:

fn map_box<T: Clone>(b: &mut Box<T>, map: impl FnOnce(T) -> T) {
    **b = map((**b).clone());
}

However, we can opportunistically avoid the cost of .clone() if we know that map can never return via a panic. The panic path is a second, implicit return path that is conceptually typed () and therefore cannot provide the vital new T. With T = Vec<u8>, this optimization potentially saves many cycles.

use std::ptr;

fn map_box<T, F>(b: &mut Box<T>, map: F)
    where T: Clone, F: FnOnce(T) -> T
{
    if !can_panic_unwind::<F>() {
        // No one can observe the emptied box; there is no other return path.
        unsafe {
            let ptr = (&mut **b) as *mut T;
            let new_val = map(ptr::read(ptr));
            ptr::write(ptr, new_val);
        }
    } else {
        **b = map((**b).clone());
    }
}

I brought this up with regard to sorting because the current merge sort keeps a guard with a Drop impl that restores a permutation of the input. However, this seems to require an allocation for a working area of size len() / 2. That is constant overhead, and most of the pointers used for this tracking apparently coincide with pointers we are interested in anyway, but fancier in-place merge strategies might require a non-constant amount of tracking state to support such a reset. I'm not claiming to have found a concrete optimization that relies on this; I'm just pointing out that accounting for panics is not zero-cost and could prevent finding one.
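For reference, here is a minimal sketch (not the actual libcore code) of the kind of guard described above: if is_less panics mid-merge, the guard's destructor copies the elements still owned by the scratch buffer back into the slice, so the slice is once again a permutation of the input.

struct CopyBackOnDrop<T> {
    // Elements currently held only in the scratch buffer.
    src: *const T,
    // Where they must be written back in the original slice.
    dest: *mut T,
    len: usize,
}

impl<T> Drop for CopyBackOnDrop<T> {
    fn drop(&mut self) {
        // Runs on the normal path and on the unwinding path alike; on
        // unwind, this copy is what restores a permutation of the input.
        unsafe { std::ptr::copy_nonoverlapping(self.src, self.dest, self.len) }
    }
}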


Since whether a function unwinds or not is not part of the function type, this intrinsic would return true for all functions.

"can the compiler prove that calling this function cannot cause start_panic or whatever to be called"

At least by looking only at the function type, the answer is no: it cannot prove this for any function (except perhaps for some other internal rustc intrinsics that come with stronger guarantees).


I'm not saying that solving this problem isn't useful, but that we probably want to impose some constraints on the solutions to this problem:

  1. The rust compiler should be able to tell this by looking at the type of a function, without having to do any interprocedural analysis.
  2. The results of the intrinsic should not depend on the optimization level IMO

With these constraints, a language feature to solve this problem would look very differently from just adding a simple intrinsic.

I think that, ideally, we would have a way to query whether executing an expression would unwind, e.g., because users might want to query whether x + 1 can unwind or similar. So instead of an intrinsic, it would probably need to be a new construct, e.g., unwinds $expr -> bool, used as const X: bool = unwinds f(a, b, c + d);. This is problematic, because it would require accessing local bindings from a const item declaration, which is not allowed. So we would probably also need a way to construct values of arbitrary types inside such a construct for type-checking purposes, and you'd end up writing const X: bool = unwinds f(A!, B!, C! + D!);, where I've used the bang to instantiate a value of those types.

Notice how the design complexity has exploded. There are probably many different ways of designing such a feature.

A very different, more narrowly scoped alternative would be to add a cfg!(panic = "abort") feature that lets you query whether the panic strategy is abort, because in that case you know that unwinding is not possible in general. That's not as powerful as the feature being discussed above, but it would be relatively simple to implement while delivering quite a bit of value.
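For illustration, the sort from the opening post could branch on such a query; a minimal sketch, assuming the proposed cfg!(panic = "abort") existed and reusing the hypothetical panicfree_merge_sort from above:

fn merge_sort<T, F>(v: &mut [T], is_less: F)
    where F: FnMut(&T, &T) -> bool
{
    if cfg!(panic = "abort") {
        // Unwinding out of `is_less` is impossible under this strategy,
        // so the permutation-preserving bookkeeping can be skipped.
        panicfree_merge_sort(v, is_less);
    } else {
        // current, panic-safe code
    }
}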


I don't think that is necessarily true. All functions have a (hidden, zero-sized) internal type that can be coerced to an fn pointer, and closures explicitly expose the existence of their type. With type deduction it should be possible to write a closure for the purpose of representing an arbitrary expression. Note that the closure solution also cleans up the type-level plumbing (we don't want to replicate C++ here).

// That bound is not allowed on stable; intrinsics don't care.
fn can_panic_unwind_ref<F, Args>(_: F) -> bool where F: FnOnce<Args> {
    intrinsics::can_panic_unwind::<F>()
}

// Note: closure arguments are a way to pretend having a value of arbitrary
// types for type checking purposes. No need for `std::declval` nonsense.
assert_eq!(can_panic_unwind_ref(|x: u32| x + 1), false);

That's a good idea.

Usually, saying that a function does not unwind is a contract between the function author and the caller, and breaking this contract is an API breaking change. So I'd be wary of inferring the "unwinds" part of a fn item type from its body.

For closures, I think that inferring the "unwind" part of the closure type is fine, so your proposal sounds good.

This doesn't seem unreasonable as an optimization opportunity, honestly. The worst that will happen is that someone will write fragile code that depends on an intrinsic. After all, the intrinsic is best effort: if it encounters slice indexing, most arithmetic, or calls to dyn functions, it will (probably) need to return true (modulo fancier deductions).

For compatibility to even be a concern, you'd need to expose this intrinsic in a stable way. I think giving the standard library access to this information is far, far less of a concern.

I think the main benefit of this is that obviously non-panicking functions, like <usize as Ord>::cmp, can safely be treated as such.


It would mean that the type of your function would change depending on the optimization level, so code like

fn foo(x: i32) -> i32 { if false { return x + x; } x }
let x: #[nounwind] fn(i32) -> i32 = foo;

might compile in release mode, but fail to compile in debug mode, because on release we would be able to say that foo does not unwind, but in debug we might not be able to say that because optimizations that remove the if false might be disabled.

Exposing whether a function panics as part of the function type, as spelled out in source code, is a very, very bad idea, and not really the purpose of this intrinsic. This intrinsic is intended to allow unsafe library authors to ask "can I perform this optimization that holds only if none of the operations that follow panic". This is easy to do in non-generic code, because you know all the function calls a priori. This cannot be done for generic functions, because you do not know what code those functions will call.

The results of this intrinsic function necessarily depend on optimization level, because they are an optimization hint. The comparison to needs_drop is apt: types do not record their need for drop glue anywhere on their user-visible surface, but a large class of manual optimizations depend upon best-effort analysis of drop requirements.
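For context, a typical needs_drop-based manual optimization looks roughly like this sketch: skip the per-element drop loop entirely when the compiler reports that T has no drop glue.

use std::mem::needs_drop;
use std::ptr;

/// Drops `len` initialized elements starting at `start`.
unsafe fn drop_elements<T>(start: *mut T, len: usize) {
    if needs_drop::<T>() {
        for i in 0..len {
            ptr::drop_in_place(start.add(i));
        }
    }
    // else: T has no drop glue, so there is nothing to run.
}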


Exposing whether a function panics as part of the function type, as spelled out in source code, is a very, very bad idea, and not really the purpose of this intrinsic.

Considering how a language feature proposal interacts with other potential language extensions is what every language proposal should do.

The results of this intrinsic function necessarily depend on optimization level, because they are an optimization hint.

How would this be implemented? Right now, the deduction of whether a function can panic happens in an optimization pass within the LLVM pipeline, that is, after we have already "compiled" the Rust code and sent it to LLVM for code generation. There is no simple way of querying this at this stage and propagating that back.

The comparison to needs_drop is apt: types do not record their need for drop glue anywhere on their user-visible surface

While we do need to precisely track drop glue to be able to properly generate drop calls, we do not need to track whether a function panics nor propagate this information through optimizations because LLVM deduces this for us. Also, the result of needs_drop does not depend on the optimization level.

I don't understand what point you're trying to refute, since your proposed language extensions are orthogonal to such an intrinsic. Being able to query the compiler's (and here I mean rustc, not LLVM) belief about panics on a best-effort basis does not preclude attaching visible panicking information to function types (as opposed to the compiler metadata this intrinsic would query), which would provide coarser information that is nevertheless consistent with the intrinsic's results. Conflating these two features is incorrect.

You can do best-effort deduction during your favorite lowering step by recursively checking that every expression within does not panic; the only difference across optimization levels would depend on where and how rustc chooses to do constant folding. It's the same as how you'd implement checking a "non-unwinding function" type, except that there one might choose to ignore constant folding in the name of not depending on implementation-defined behavior.
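A toy sketch of that recursive, best-effort walk over a made-up miniature expression tree (none of these types exist in rustc; they only illustrate the shape of the check):

enum Expr {
    Literal(i64),
    // Checked arithmetic can overflow-panic, at least in debug builds.
    Add(Box<Expr>, Box<Expr>),
    // Indexing can panic on an out-of-bounds access.
    Index(Box<Expr>, Box<Expr>),
    // A call through a `dyn` object: the target is unknown, so give up.
    CallDyn(Vec<Expr>),
    // A call to a function already known to be panic-free: only the
    // argument expressions still need to be checked.
    CallPanicFree(Vec<Expr>),
}

fn may_unwind(e: &Expr) -> bool {
    match e {
        Expr::Literal(_) => false,
        // Conservative: these operations can themselves panic, so we
        // answer "maybe" (true) without looking further.
        Expr::Add(..) | Expr::Index(..) | Expr::CallDyn(_) => true,
        Expr::CallPanicFree(args) => args.iter().any(may_unwind),
    }
}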

That LLVM computes this is irrelevant; we don't need LLVM to deduce it for us. Like, if we wanted to replace all instances of intrinsic::can_unwind() with the value LLVM deduces, we could probably patch LLVM to add that to the IR... but that's utter overkill past what users of this intrinsic need.

The result of needs_drop is provided best effort; that it currently does not depend on the optimization level is, ostensibly, subject to change (maybe a future compiler could somehow devirtualize calls to needs_drop::<dyn T>(), who knows). Enough other implementation-defined, user-visible behavior (like debug asserts and overflow behavior) depend on optimization level that I feel like this is a red herring.


@mcy suggested, in the comment I was replying to here, that the "hidden" part of a function type that encodes whether a function is allowed to unwind or not shall be modifiable by compiler optimizations.

I mentioned that it is a bad idea to do so without considering how this interacts with the proposed language features that make this part of the function type public (e.g. the Effects RFC, the "C unwind" rfcs, etc.), since that would mean that, in practice, whether programs using these features typeck or not would depend on the optimization level.

You apparently disagree that doing so is a bad idea and seem to think that what's a bad idea is to expose, as part of the function type, whether a function unwinds or not (we already do that, though, as part of the function ABI, which is part of the function type), but I don't see any rationale in your comments for why you think that. IIUC, all the rationale you are giving is for a different language feature that does not encode whether a function unwinds as part of its type, but as metadata or similar, and then requires a heroic inter-procedural analysis instead.

(maybe a future compiler could somehow devirtualize calls to needs_drop::<dyn T>() , who knows).

Do you think that needs_drop should return false for a Foo with impl Drop for Foo { fn drop(&mut self) { if false { () } } } ? Whether a compiler can devirtualize a trait object, realize that the destructor does nothing, and remove calls to it, does not change the fact that, at the language level, all trait objects have destructors, and optimizations do not change that. What optimizations can change is which machine code we generate for some destructors, which in some cases is none. I don't think the purpose of needs_drop is to be able to observe in Rust whether such optimizations happened, but to query whether a type has destructors.

Instead of your complex alternative, this also works:

fn map_box<T: Clone>(b: &mut Box<T>, map: impl FnOnce(T) -> T) {
    **b = map(**b);
}

And if it didn't work, instead of a clone it's probably better to use ManuallyDrop.

But I realize there are probably better motivating examples.

You likely misread: the first parameter is a mutable reference. It would work if the box were taken by value (I did not know that before), but it does not work behind a reference.

error[E0507]: cannot move out of `**b` which is behind a mutable reference
 --> src/lib.rs:2:14
  |
2 |     **b = map(**b);
  |               ^^^ move occurs because `**b` has type `T`, which does not implement the `Copy` trait

error: aborting due to previous error

Ah yes, sorry for that. Looks like you are basically writing an (oddly restricted to Box) form of take_mut -- and indeed that needs to take special care of panics caused by the closure.

cfg for panic = abort seems quite promising, as it's simple to understand and use. Hopefully easy to implement, except libstd is compiled separately and doesn't react to cfg changes…

With deduction of panicking from closure body I'd be worried that the analysis will give up quickly, e.g. on Deref, so it'd say "maybe" for almost all functions.

To quote the docs:

Returns true if dropping values of type T matters.

I'm reading this as "returns whether it is possible for code to observe a difference between mem::drop(x) and mem::forget(x)". So, yes, an aggressively-optimizing rustc could choose to make a T: Drop type have needs_drop return false, if it can deduce that calling the destructor cannot be observed. If this reading is incorrect, we have a documentation bug.

My rationale for resisting the addition of non-unwinding function types comes down to "noexcept is a massive pain in C++", as evidenced by the piles and piles of noexcept overloads in the STL; Java's throws is also a pretty good example. I am not a fan of exceptional control flow, and I don't think we should make the mistake of making panics part of the type system. I'm sorry for not being explicit about that.

I'm not sure why you think this is a corollary of such a feature. I suspect the confusion here is that, much like how you believe T: Drop must imply needs_drop (something the documentation does not promise), you believe that having !can_panic_unwind(f) must imply that f: #[nounwind] fn() must typecheck.

In a world where the information tracked by Column A and Column B is the same literal value in a compiler data structure, then yes, such implications hold. But such an implementation restriction should never be specified, because it constrains compilers willing to put in more AOT work to prove stronger statements about the programs they are compiling, so as to optimize them further.

I'll leave with an elaboration of my devirtualization example: suppose you have some sealed (sealed with the usual privacy tricks) trait T. For sealed traits, the compiler knows the total implementations of T, and can, ostensibly, determine that no implementor of T requires drop glue; hence, needs_drop::<dyn T>()[1] could return false and be within the parameters specified by documentation (since it is not possible to observe the destructor not being run: all the potential dtors in the vtable are no-ops). This is kind of a dumb niche optimization, but an example of why you might want this (though your example of a dtor that is a pure function is also pretty solid).

[1] Ok, yes, needs_drop requires the type to be Sized. For sake of argument, assume that it does not have this implicit bound.
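A minimal sketch of the sealed-trait setup being described (module layout and names invented for illustration):

mod private {
    pub trait Sealed {}
}

// Downstream crates can name `T` but cannot implement it, because they
// cannot implement the private `Sealed` supertrait.
pub trait T: private::Sealed {}

impl private::Sealed for u32 {}
impl T for u32 {}

impl private::Sealed for bool {}
impl T for bool {}

// Every implementor is known here and none has drop glue, so a
// sufficiently clever compiler could in principle answer `false` for
// needs_drop::<dyn T>() (setting aside the Sized bound, per footnote [1]).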

In any case, this argument does not feel productive. I have made my point, you disagree, I have given you my rebuttal, and I'm going to let other people decide if they think it's you or I who is in the right.


@mcy You are right, I misunderstood what needs_drop does.
