"Tootsie Pop" model for unsafe code

The idea that calling my optimized assembly language function is going to, by default, de-optimize my Rust code makes zero sense in that scenario.

I don't know how to feel about calling this "de-optimization". Rust doesn't want to repeat the insanity of C compilers wrt optimizations and UB, so obviously it can only make aliasing optimizations where it can actually prove that aliasing does not occur. If you use unsafe code, then Rust can no longer necessarily prove that pointers don't alias. This is not the fault of the proposal being discussed here, it's just the state of things on the ground here today.

This hasn't mattered much until now because we've leveraged aliasing relatively little for optimization, but it's going to matter increasingly going forward. If you disagree with the proposal here for how to handle these future optimizations, then please propose an alternative, although I'm having a hard time thinking of one that preserves correctness, retains backwards compatibility, and permits raw pointers to be assumed unaliased by default.

Most of the other uses I have for unsafe are all "I have to do something unsafe because that's faster than the safe way" (e.g. using core::ptr::copy_nonoverlapping until clone_from_slice performance is fixed.) The final set of uses is "I'm actually trying to make things safer but the language won't let me without unsafe," e.g. coercing a slice to an array reference.
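For concreteness, a minimal sketch of that last case (the function is my own illustration, not from the standard library); the length check makes it safe in effect, but the cast itself still needs unsafe:

fn as_array_32(slice: &[u8]) -> &[u8; 32] {
    assert_eq!(slice.len(), 32);
    // The length was just checked, but the language offers no safe coercion
    // from &[u8] to &[u8; 32], so we go through a raw pointer cast.
    unsafe { &*(slice.as_ptr() as *const [u8; 32]) }
}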

I don't find these examples particularly compelling in this context. It's perfectly justifiable if one has to resort to unsafe code to work around performance bugs in the standard library, but when it comes to discussing future language features we should assume that the performance bugs will be fixed, rather than hamstringing our future designs to account for past deficiencies.

(This is especially relevant in this discussion, since enabling greater aliasing optimizations in safe code will make it even less likely that your C code will be observably faster than your Rust code).

The problem is that unsafe is overloaded to mean lots of different things. There seem to be at least two kinds of unsafety: one that makes aliasing analysis go wrong and one that doesn't. If the Rust compiler cannot figure out which is which, then there should be two kinds of unsafe marker to differentiate them for the compiler.

In my experience, most of the code outside of libcore/libstd that uses unsafe seems to be the kind that doesn't cause trouble for alias analysis, so having a way to tell the compiler that so that the compiler can continue to optimize aggressively seems important.

BTW, it really would be de-optimizing. Imagine that the Rust compiler's optimizer improves as is planned. You write your code in "safe" Rust, which gets 100% optimized with the improved optimizer. Then you add your unsafe block in an attempt to implement a very localized optimization, which then causes your safe Rust code to be de-optimized.

Most of the code in my project is actually written in assembly language (~50,000 lines of it), not C. (There is lots of C, but none of it is used for performance; I just haven't replaced it with Rust code yet.) It would be many, many years before rustc could even hope to optimize the equivalent Rust code to the same level as that assembly language code, in general. (The best C compilers get nowhere close.) Thus, the idea that we won't need to use the FFI (or asm!, which is also unsafe) for improving performance in the foreseeable future is wholly unrealistic.

Out of curiosity how does all this compare to Fortran?

I’m not versed in all the low-level details of this topic, but AFAIK pointer aliasing was one of C’s ā€œmistakesā€ compared to Fortran, which does not allow it. Fortran compilers are supposed to produce more optimal code because of this. Fortran integrates with assembly code, so how does it ensure its invariants? Does it also have UB like C?

If the Rust compiler cannot figure out which is which, then there should be two kinds of unsafe marker to differentiate them for the compiler.

I believe this is exactly the sort of thing that Niko is proposing at the end of his post, with ways to opt back in to aliasing optimizations if you are certain they apply.

You write your code in "safe" Rust, which gets 100% optimized with the improved optimizer. Then you add your unsafe block in an attempt to implement a very localized optimization, which then causes your safe Rust code to be de-optimized.

I still don't consider this a de-optimization. You're assuming that unsafe Rust should be expected to be uniformly faster than safe Rust, but I don't see how that's founded. Optimizers in general have always been able to perform better in the presence of restrictions, and removing those restrictions removes avenues for optimization. That unsafe Rust, which allows strictly more operations than safe Rust, should be inherently slower than safe Rust is both natural and intuitive to me.

I agree with that too. In fact, I am a huge believer in that. But, consider:

unsafe {
    asm_function_that_is_4x_as_fast_as_rust_eqiv(a.as_mut_ptr(), a.len(), c);
}

My point is simply that code like this shouldn't reduce the optimizations that are done elsewhere, just by its presence.

(Note that I'm not assuming asm_function_that_is_4x_as_fast_as_rust_eqiv is faster; I've actually measured it to be so.)

If you pass a bad index to unchecked_get, then, yes, you are breaking Rust's invariants. However, when people call unchecked_get, they at least expect to only pass a valid index.

If that expectation holds, all the invariants that hold in safe Rust hold in Rust code with unchecked_get, and so the optimizer does not need to take any extra care. This is different from e.g. creating multiple mutable pointers to the same value, which must not be UB by itself, but still can't be done in safe Rust (and therefore requires the optimizer to take extra care).

FFI is the same - if your FFI is only interacting with Rust code through scalars and raw pointers, the optimizer does not need to know about it.

If you just stick that code in the middle of a function, the compiler would be forced to assume that your assembly code can possibly stash the pointer you gave it somewhere and access it at will, and therefore will be unable to perform some optimizations.

If you put the call behind a private, safe function, that problem would not exist.

[quote="arielb1, post:49, topic:3522"] If you just stick that code in the middle of a function, the compiler would be forced to assume that your assembly code can possibly stash the pointer you gave it somewhere and access it at will, and therefore will be unable to perform some optimizations.[/quote]

We should find some way to annotate the code to indicate it doesn't do that. Because, the vast majority of the time, it doesn't.

This call is in a safe function already. That's why the unsafe is required. I guess you're saying that if I create an otherwise-useless wrapper function that just calls the unsafe function, the compiler can somehow assume that the asm code doesn't "stash the pointer." But, surely it still can stash it, and the wrapper function doesn't do anything to make that impossible.

Regardless, it would be better to find a simpler and more convenient way to annotate unsafe blocks as being aliasing-friendly than by creating these wrapper functions/modules.

The problem is that pretty much every call that uses the result of as_mut_ptr requires the compiler to make additional assumptions about aliasing - in safe code, the compiler would be allowed to move accesses to the slice to just after the call to as_mut_ptr, as the call to your asm function could not possibly change it.

Because you want to forbid that while permitting other optimizations, you should wrap an interface rustc can understand around your assembly function:

fn asm_is_fast_on_this_one(a: &mut [u32], c: usize) {
    // The raw pointer never escapes this safe wrapper, so callers keep the
    // full aliasing guarantees of &mut [u32].
    unsafe {
        asm_function_that_is_4x_as_fast_as_rust_eqiv(a.as_mut_ptr(), a.len(), c);
    }
}

OTOH, if we don’t want to pessimize functions that don’t use raw pointers, like callers of unchecked_get, we would need an additional strategy.

Why not something simpler like this?:

    unsafe noalias {
        asm_function_that_is_4x_as_fast_as_rust_eqiv(a.as_mut_ptr(), a.len(), c);
    }

Where noalias (open to bike-shedding on the name) would mean that there's no aliasing happening in the unsafe block.

I think we see the same problem but want to solve it in different ways. I want the unsafe boundary to be the scope of the unsafe block, while you want it to be the module. However, most of your points seem to be in favour of my approach.

...people who don't know what they're doing may very well be lured into believing that if they don't modify code that is directly in an unsafe block then they can't cause any unsafety

This is my point. I want to constrain the unsafety to as small a point as possible.

but this is provably untrue if they're modifying a module that contains an unsafe block elsewhere.

I don't think that this is true at all. Try to prove that adding an unsafe {} to a module will make modifying other areas unsafe.

I see that there are cases where this can be true, but I think the correct approach is to attack those areas rather than giving up and allowing off-screen code to affect what is being worked on right now.

I would also argue that if you ensure that all invariants hold whenever you exit an unsafe block then you won't be able to do unsafe things outside of unsafe blocks.

The only hairy bit is calling safe code from unsafe blocks. But I don't think that is solved by either proposal. However since the root of the problem is in an unsafe block I don't think that is the worst problem to have. But to get back on topic...

I think that restraining unsafety to unsafe blocks is not a lost cause. While ensuring that this happens can't be verified by the compiler I think that it fits perfectly with the "you better know what you are doing" nature of unsafe blocks.

Furthermore keeping this boundary ensures that people not working inside unsafe don't need to concern themselves with these problems even without searching the entire module for the unsafe keyword. I also think that this encourages keeping unsafe blocks to small regions of unsafety that need to be understood as a whole, not often requiring tying that together with nearby unsafe regions.

This is my point. I want to constrain the unsafety to as small a point as possible.

This is already impossible - as can be shown with examples around Vec<T>::capacity and Vec<T>::len.
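To make that concrete, here is a pared-down sketch (MyVec and its methods are hypothetical, modeled loosely on Vec): safe code elsewhere in the module can break the invariant an unsafe block relies on, without containing any unsafe itself.

pub struct MyVec {
    ptr: *mut u8,
    cap: usize,
    len: usize, // invariant relied on below: len <= cap
}

impl MyVec {
    // Entirely safe code, no unsafe keyword anywhere...
    pub fn set_len_badly(&mut self, n: usize) {
        self.len = n; // ...yet this can silently break the invariant
    }

    pub fn get(&self, i: usize) -> Option<u8> {
        if i < self.len {
            // Sound only if len <= cap still holds.
            Some(unsafe { *self.ptr.add(i) })
        } else {
            None
        }
    }
}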

In addition, I feel you're conflating three very distinct concerns:

  1. The zone in which operations may violate the type system (this is and will remain the unsafe block)
  2. The zone in which prior violations of the type system may cause unsafe behavior to arise from safe operations (this is, and will remain, up to the programmer to enforce - but the tool they have to do so is generally privacy, and so is almost always the module)
    • It's worth noting that if aliasing is not violated, this can be as small as the unsafe block - but that would require a new annotation that disclaims any aliasing violations, as @briansmith suggests
  3. The zone in which the compiler may not assume aliasing invariants hold (this is currently unspecified, and Niko's post is all about what the boundary of this zone should be)
    • This is closely connected to (2) - specifically, the compiler needs to not introduce UB to sensibly-written safe operations during optimization, which means that (say) reordering updates of capacity and len needs to be looked at very carefully.

I think that having unsafe fields would make this plausible – this should be researched more before deeming it impossible.

When unsafe code depends on safe code, we can use privacy to protect us and prevent outsiders from fumbling with the critical state that the unsafe code depends on. However, privacy can't protect us from the optimizer, and invariants might be broken by a series of inlinings and rearrangements. But if you think about the "state" unsafe code can depend on, there are two kinds: external state (passed arguments, globals, values returned by safe function calls) and internal state (the private fields of the type itself and values returned by unsafe, trusted function calls).

We can't trust external state anyway, and it originates from outside of unsafe blocks. It needs checking, and if it isn't checked, the function should be marked unsafe. Then there's the internal state – I think that marking not only code but also the fields and variables themselves unsafe (so that an unsafe block is needed for access) should be sufficient for the compiler to understand that accesses to such fields and variables can't be rearranged. In this model, every time unsafe code depends on internal state, there is an unsafe block (either because of an unsafe function call or because of an unsafe field access) that suppresses some optimizations.
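One way to approximate that in today's Rust (a real unsafe-field feature would be a language extension; UnsafeField here is my own sketch) is a wrapper whose accessors are unsafe fns, so every read or write of the protected state needs an unsafe block:

pub struct UnsafeField<T>(T);

impl<T> UnsafeField<T> {
    pub fn new(value: T) -> Self {
        UnsafeField(value)
    }

    // unsafe: the caller promises the type's invariants still hold.
    pub unsafe fn get(&self) -> &T {
        &self.0
    }

    // unsafe: the caller promises the new value preserves the invariants.
    pub unsafe fn set(&mut self, value: T) {
        self.0 = value;
    }
}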

Thus, I think it's plausible to have a system where only the unsafe blocks form the unsafe boundary. I think that the strong point of this system is that it's easy to understand because it's a simple lexical boundary.

Ah, right - I agree with you that unsafe fields are likely a very, very good way forward on that count. The problem is that not only are those a language extension (rather than simply a promise that the compiler will be conservative under X conditions), but we’d still need something like this thread proposes unless we’re willing to break swathes of unsafe code until they’re updated for the new mechanism.

Moreover, such breakage - as it would occur in optimization - would be very hard to test for: It’d result in functionally incorrect code, rather than failed compiles.

Why can code be reordered around drop? Shouldn’t drop act like a barrier for the memory it touches?

Unsafe fields will not help here, as the issue we are discussing is basically the opposite of what unsafe fields are trying to prevent: unsafe fields are typically fields that have extra invariants (e.g. Vec.len must be in sync with Vec.data), while optimizer bubbles are trying to make the optimizer understand values that don't even maintain the type-system invariants (e.g. aliasing &mut, overlong &).
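For clarity, a minimal sketch of the kind of value meant here. This deliberately violates &mut uniqueness; whether the optimizer must tolerate it inside unsafe blocks is exactly what's being debated:

fn aliasing_muts(x: &mut u32) {
    let p = x as *mut u32;
    unsafe {
        let a = &mut *p;
        let b = &mut *p; // aliases a: not expressible in safe Rust
        *a += 1;
        *b += 1; // the optimizer must not assume a and b are disjoint
    }
}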

In any case, Ref should be using a raw pointer like Shared, for the same reason that Rc<T> can't use a &'static T - it is just stupidly violating the type system for no reason.

I stand corrected. In that case, I think I agree that the refcell-ref case is just buggy - the lifetime annotation on the stored reference is just wrong, and that leads to unsafe optimizations.

This is again one of my motivations for supporting a notion of aliasing based on read/write or write/write conflicts - having rules about when values are live and what that means for aliasing seems incredibly complicated, while it is comparatively easy to check for conflicting actions to the same piece of memory.