Add rustc flag to disable mutable no-aliasing optimizations?

Let's see some examples of Rust code so that we have something concrete to work with.

14 Likes

I still think this is actually pretty bad. Note that other raw pointers don't get the same benefit of being able to be used again after the reference is gone.

Consider the internals of std::collections::LinkedList. I don't see how it isn't in violation here if it ever holds a &mut Node<T> to a node thats currently in the list. This would invalidate the pointers to that node that exist in other nodes. If you look at std::collections::LinkedList's implementation, it frequently makes mutable references to nodes, as mutable references are easier to work with than pointers.

Is it possible to write a correct implementation of std::collections::LinkedList? Of course, so long as you always stay in the domain of raw pointers. Unfortunately, Rust's ergonomics try very hard to force you to use references everywhere, which is quite dangerous for unsafe code.

If I'm going to be completely honest... I think we've barely begun to deal with how much existing code isn't correct under these rules, and are getting by because they haven't been enabled, and because the optimizations that make this incorrect rarely apply.

1 Like

I haven’t read that much of its implementation, but it isn’t in violation anywhere that I’ve looked so far. Feel free to point out any particular point in its code that you think might be problematic, but let me first repeat the second part of my own previous post

If you add a Node to a linked list, it gets created as a Box<Node<T>>. That box is turned into a single raw pointer (more precisely, a NonNull). That raw pointer is then copied into other nodes when those are added. In this logic, there is only multiple copies of a single pointer. If you create a reference to the Node from such a pointer, none of its copies are invalidated.

3 Likes

For reference, the rules @steffahn is describing come from Stacked Borrows (see also the links here), the current work-in-progress candidate for Rust's aliasing rules. The paper includes proofs of correctness for several kinds of optimizations rustc would like to perform. (Stacked borrows is even implemented by Miri, so it can dynamically check that you're following the rules!)

But this goes well beyond compiler optimization. User-written code also needs to be able to rely on these rules. If you write code that relies on a different ruleset (a hypothetical -fno-strict-aliasing equivalent) then that code becomes completely incompatible with the rest of the ecosystem, and vice versa. For the concept of "libraries" to continue working, all Rust code must agree on the ruleset for the various kinds of references and pointers.

Fortunately, this does not mean Rust is incapable of expressing the programs you want to write. It just means you have to write them with different types! This is the whole point of types- use different types to mean different things, don't use a compiler flag to change their meaning. No amount of discussion is going to change this.

So to echo burntsushi, if the existing set of types is insufficient for your use cases, you'll need to be more specific about the problems you're trying solve. While Rust fundamentally cannot offer a compiler flag like you suggest, it's still totally possible that the single aliasing model can be tweaked or extended in new ways (as Stacked Borrows has been several times, and will continue to be in the future). But only if we know what exactly you want to do with it!

13 Likes

cc @RalfJung

Sorry in advance for the length. I've wanted to request something like this for a while, for completely separate motivations than the initial request. I also believe what I want is subtle enough that I should explain it precisely.


In particular, my belief is that IMO, &T and &mut T should only impact what raw pointers are allowed to do for the duration of the lifetime of the &T or &mut T. This follows the previous "common knowledge", as described by the the part of the UnsafeCell where it describes the aliasing rules (to be clear, it also notes that are in flux)

If you create a safe reference with lifetime 'a (either a &T or &mut T reference) that is accessible by safe code (for example, because you returned it), then you must not access the data in any way that contradicts that reference for the remainder of 'a .

(from UnsafeCell in std::cell - Rust)

In truth, I'd like this to be more-or-less the extent of Rust's aliasing rules. Unfortunately it likely won't be, because it's weaker than (and thus incompatible with) what's assumed by our use of LLVM's noalias attributes on &T and &mut T, which we use and presumably would like to continue using. And so I'd like a way to opt out of that.

Specifically to @InfernoDeity's question about precise semantics, I don't know enough about formal models of programming languages to explain it in terms of the abstract machine. What I want is for the aliasing required implied by &T and &mut T to not have no impact after those references do not exist.

  • On compilers using LLVM, this wouldn't be sufficient to use noalias on &T or &mut T the way we do now.
    • However, if the compiler can prove a raw pointer to the type was never created, it's fine if it adds noalias.
    • ... With possible debate around "never created" vs "currently exists" — I'd like this to work for ptr-to-int too (which probably requires "never created"), but I'll concede that is entirely separate and based on the fact that I want ptr-to-int stuff to continue working...
  • This would also be required to disable any hypothetical MIR optimizations that were based on this, although IDK what they would look like.
  • However, this continues to allows libraries and code to still rely on the &T and &mut T rules they expect (to @josh's point).
    • That is, (I believe) this doesn't change the semantics of any documented aliasing rules, it changes the semantics of undocumented ones.
    • It also changes ones that would be very hard for libraries to sensibly rely on in the first place, and brings them more in line with the rules they are relying on (for example, I'm unsure how useful stable_deref_trait is under the strict LLVM-noalias-compatible version of alising rules, but it's completely fine here.
  • Doesn't allow users to create aliasing a &mut T that aliases any other reference, or a &T that gets mutable.
    • All code that does this is incorrect, will be incorrect under this flag, and has been known to be incorrect since before 1.0
    • Allowing it via a flag would fragment the language in a big way and if we're going to do that, it should be via the flag we really want: the one that disables the borrow checker (kidding, kidding).

From @steffahn's post:

I'm not convinced here, but I think the details of this will cause us to get too far into the weeds, so I'm gonna delete the bit I wrote about it, in an effort to keep this already very long reply at least stay focused. I'll try and bring up a thread on the UCG or a github issue if I find concrete problems.

I also do think that the distinction you're making is sufficiently subtle that a lot of code is liable to get it wrong, and has in the past, though.


From @yigal100's post:

@tcsc 's point above falls under this category - Rust's unsafe story is being actively worked on, e.g. @RalfJung and others have been on it for quite a while now with good progress.

While the UCG (which I actively participate in) does help here, largely it's interested in new language features to add workarounds for these cases. Consider addr_of!, one of the use cases for addr_of! is to allow code to create a pointer to a derived field without going through a reference, because that reference will invalidate other raw pointers to the type.

The problem with an approach like this is that addr_of! only was added in the latest edition. The vast majority of code that needs to use it isn't doing so, since it didn't exist. If I pull down a 3 year old data structure crate via a transitive dependency, it won't use it, and it probably never will.

For example, the stdlib had this bug for many years slice::swap violates the aliasing rules · Issue #80682 · rust-lang/rust · GitHub (which, to be clear, is an exact issue of the problem I'm discussing), and it was fixed by using addr_of_mut, a tool that was only added recently. IMO, until fairly recently, we had believed this pattern to be correct. Prior to participating in UCG, I certainly had thought so (after all, it's a raw pointer).

Whatever the final design we'll arrive at simply cannot be "let's just forgo Rust's principles of safety".

I think this is either a complete misinterpretation of my point, or not directed at me. Hard to tell.

My motivation is improved safety. That said, maybe this isn't directed at me, and I do completely disagree about wanting the ability to shoot myself in the foot.


From @burntsushi's post

That is, show some Rust code today that you would like to write that has non-ideal codegen.

I think this is a little confused about the point. A flag like this isn't for improving codegen, it's to disable a set of optimizations that were previously unexploited, but now are, and that are particularly hard to reason about.

For concrete examples, I'd like the code in slice::swap violates the aliasing rules · Issue #80682 · rust-lang/rust · GitHub to be correct as it was, without needing addr_of_mut!, as discussed above

Specifically, because while I can live with addr_of!, but I'd like a flag that allows old, more naive unsafe that I might be pulling in via a transitive dependency (written before we knew as much about this as we know now, and before there were even tools like addr_of that could have been used to avoid the problem), to not be miscompiled because the compiler started exploiting a particular kind of UB.

(Honestly, I feel like it should have been from the other side. The people who wanted to turn on a dangerous UB-exploiting optimization should have had to justify it by showing the bad codegen (this feels especialy true given that it needed a hack/heuristic to allow things like !Unpin to continue working)... But such is life).

3 Likes

Yes, I would strongly suggest that. And let's be real. Rust doesn't even have a reliable picture of its abstract machine now. ^^

Okay. Well take for example the skeleton support for Rust that was just added to the Linux kernel. Let me ask you something. Do you think that every mutable pointer passed into Rust code there will be unaliased? Or is it C, and it's going to have pointers to that chunk flying around everywhere? Even if they go through manually and ensure that none of them alias, some dude is going to introduce a commit on the C side that breaks this, and it will be a subtle bug that will torment users for a year and a half until someone figures it out. Do you expect them to use raw pointers for everything in kernel drivers? If so, most of the point of Rust is lost.

I'm sorry, but I disagree. It can and should. Let's step for a moment back into the void, back into C-land.

Say you compile some library with aggressive strict aliasing optimizations, *restrict pointers are everywhere, but your code uses -fno-strict-aliasing.

You have a sea of objects with properties and data that can be pointing almost anywhere. Because you're not watching the control flow (it's just too complicated, who knows where that pointer came from?), you don't notice that you're passing the same pointer twice, to a function that takes only 2*restrict pointers.

What do you think will happen? Nothing good. Maybe nothing, depending on the code. Maybe a subtle bug that tortures you for months.

It's the same here. Unsafe code already comes with a responsibility and an intrinsic risk. You can blow your whole leg off. If you're passing aliased &mut pointers into Rust crates, you deserve whatever you get.

But there's another, rather easy solution here that you're not considering.

The problem stems from crates interacting badly with different aliasing rules.

So, just patch cargo too. Require an explicit option in Cargo.toml, only valid for the top-level crate (error out if it's only detected in a dependency), and which enables -fno-strict-aliasing on every dependency for that crate, whether they want it or not. Problem solved!

Oh, you're worried about std and core? If it's a problem, maybe just unconditionally build them with -fno-strict-aliasing. I doubt any performance lost there will be enough to really care about. Or, or, just have cargo download and build std and core like every other crate if that option is detected. There's probably other solutions there.

What I'm trying to do is remove undefined behavior in unsafe code. I'm not trying to create more UB or confusion or bugs, I'm asking for a compiler switch that will prevent them. That is what Rust's real purpose is.

What's the point of the borrow checker? To prevent accidental human error. What's the point of this flag? To prevent accidental human error, by disabling the rule that makes it an error in the first place.

Even if the compiler doesn't optimizes based on it, a large chunk of all code depends on &mut never aliasing. Take for example memcpy. Even with -fno-strict-aliasing it is UB to pass aliasing pointers as the memcpy implementation is free to copy bytes or larger chunks in whatever order it wants. If the source and destination alias, copyinh a chunk to the destination may overwrite the source in arbitrary ways.

Without this rule inside the compiler it is still an error. Crates depend on it and as such you would still have to follow the rule. Having this flag increases the risk of accidental human error by making it more likely that someone will just ignore the aliasing under the assumption of it being fine.

Disabling mutable no-alias optimizations does not and can not fix your code most of the time. It will only hide UB for the current version of the compiler, standard library and your dependencies. A change in any of these may break your code at any point.

7 Likes

I'm sure some crates do depend on that. And again, if so, that's your bad for passing in an aliased mutable reference. That code will fail regardless of the option. But in your own code, where most of this stuff actually occurs, it could alleviate a source of worry. Maybe you could hold it wrong and it ends up getting into a dependency, but that's your fault, and it's part of the risks of unsafe code that you should have expected anyways. For the vast majority of cases where you would actually use it, -fno-strict-aliasing would prevent bugs, not create them.

You're misinterpreting what the actual rules are.

Initially, every pointer passed into Rust code from C starts out raw, and at the top of the borrow stack for its target memory location. Rust can then freely derive a &T or &mut T reference from it without invalidating anything, regardless of how mutable and/or aliased it is on the C side.

The only way creating such a reference can invalidate anything (and the subtle point that people are getting upset about) is when there are already other things above the pointer on the borrow stack- and those are what gets invalidated. To hit this problem, you have to:

  1. Create a reference from a raw pointer, pushing a new entry onto the borrow stack.
  2. Keep that reference around (or other references or raw pointers derived from it).
  3. Create another, conflicting reference from the first raw pointer (or another one equivalent to it), invalidating the stuff from (2).

But this is nothing more than a special case of the normal &T/&mut T rules - that is, creating the new conflicting reference can invalidate raw pointers... but only if they were derived from a reference that just got invalidated!

3 Likes

You know, I've never been able to get a straight answer anywhere as to what the rules actually are. Everyone I talk to has a completely different idea of what the rules are. I have no idea who to believe. If that's true, that's good, but it still opens up a whole can of worms of potential UB. You can remove the problem from your own crate entirely by disabling the optimizations. But, currently, you can't. Rust doesn't have the option. It doesn't mean people want to create a ton of aliasing &mut everywhere. It's a safeguard in case you do, and don't notice. And frankly, with the rules the mess they are, I don't think it's a bad idea to have that insurance.

Yes you have, they have been stated several times above. @steffahn has laid out the RustBelt rules quite clearly.

You mean RustBelt, the competitor to Stacked Borrows? I thought Stacked Borrows was the one Rust used. You see the problem.

While I don't think Rust should have a -fno-aliasing flag, I think the rules aren't that clear and they're far, far too subtle. It's way too easy to shoot yourself in the foot with Rust, as demonstrated by issue #80682: slice::swap violates the aliasing rules, which tcsc linked to above. I think a lot of people will write code that accidentally hits the exact same issue. Unfortunately I don't have proof of this, as I haven't done a wide audit or investigation. But it's certainly something I've done in the past (and being honest, it's probably something I'll accidentally do again in the future).

3 Likes

RustBelt produced stacked borrows.

Sure, they should be written down somewhere more official if they haven't been already. But that doesn't mean that they haven't been clearly laid out in this thread.

Well, you're not wrong that the rules aren't fully defined yet. Rust is still young, so the process of defining them is still ongoing.

But on the other hand, steffahn has given pretty reasonable descriptions, and I linked you to a bunch of Stacked Borrows stuff, which is where that work is happening. In fact one of those links was to the Miri tool, which will automatically check your program for you! It's not hopeless.

I also don't really see any conflicting claims in answer to your questions, just people talking about different aspects of the same general set of rules. The confusion about RustBelt vs Stacked Borrows is an example of this- they're not competitors, they're just different research projects done by basically the same groups of people at different points in time.

I mean, this is the XY problem. The OP's proposed solution is wrong, but that doesn't mean they haven't identified a flaw in Rust as it stands today.

-fno-strict-aliasing exists because C's aliasing rules suck. There are all sorts of useful things that are literally impossible to do without violating them, such as making a local byte array and reinterpreting it as a struct type. Vendor extensions like __attribute__((may_alias)) lessen the problem, but they're not universally available and are somewhat broken to boot. And even if it's possible to follow the rules, they're very hard to understand, and there are no tools to validate that you're doing it correctly. No wonder people just want to just opt out. Even at the cost of fragmenting the C ecosystem; but then, the C ecosystem doesn't practice code reuse nearly as much as Rust anyway.

Rust is in a better situation in some ways. Type-based alias rules don't exist at all, unless you count borrows only giving permission to a certain number of bytes depending on the type. And for the aliasing rules that do exist, there is an automatic checker.

But in other ways our situation is as bad as C:

  1. Miri doesn't catch all Stacked Borrows violations.
  2. Stacked Borrows is as hard or harder to understand than C strict aliasing.
  3. Stacked Borrows makes some code patterns fairly difficult. Generally speaking, it requires increased use of raw pointers where they wouldn't be needed just to get the code to compile.

Combining 1 and 2, just today, even as a semi-expert, I made a mistake and thought something illegal was legal because Miri allowed it. Or maybe I didn't make a mistake. Who knows?

Here are corresponding things we can do to improve the situation (none of which are new ideas, some of which are probably being worked on already):

  1. Keep improving Miri, with the goal of making it able to catch all Stacked Borrows violations (possibly only in a slower-than-default mode).

  2. Document Stacked Borrows better, with lots of examples.

  3. Add UnsafeAliasedCell or some way to mark specific types as allowing concurrent &mut references. Similarly, add a way to mark a struct as being able to have container_of soundly used on it.

    After all, this should give the OP effectively what they want, a way to opt out of certain aliasing assumptions for their code, without needing to create an ecosystem-splitting language dialect.

13 Likes

I must admit I still think that -fno-strict-aliasing is a good idea, but I'm clearly outnumbered.

It seems, however, that -Cllvm-args='--enable-scoped-noalias=false' does disable most of the problematic optimizations. I still have to make sure not to pass questionable references to anything, but this will have to do. Of course, I'll use UnsafeCell if I am deliberately trying to have shared mutability.

@tcsc You might find this flag useful.

Here are corresponding things we can do to improve the situation (none of which are new ideas, some of which are probably being worked on already):

I don't disagree with these, but I think there's a fourth option: requiring a simpler aliasing model.

I suggested a very rough one above where references only invalidate the provenance of raw pointers while the reference exists (this should still allow a great amount of alias-based optimization, but probably not full llvm-noalias), and is essentially equivalent to the requirement we've historically told unsafe code it must uphold, e.g. "if you have a &mut T or &T you must not access the data in any way that contradicts it during it's lifetime" guidelines that we've documented in the past.

Particularly I think the fact that -Zmiri-track-raw-pointers has to be disabled by default because too much code is broken under it, should be a big red flag that the way miri and SB handles raw pointers is far too restrictive.

I also think it should be considered a breaking change in the language to decide that widespread patterns are UB (and especially to optimize against that UB), this happened once in the past with things like mem::uninitialized::<[u8; N]>() (which used to be considered well-defined so long as you didn't read from it, and now is considered UB), and I worry it will happen for whatever aliasing model we choose, since it's likely to be stronger than what historically we've used.

Similarly, add a way to mark a struct as being able to have container_of soundly used on it.

This one in particular probably just has to be fixed by adjusting the model, since the most common offender here is stuff like allocators, which have to work on all types. Also, if you could modify the type, you could just put the other stuff at the start anyway. There's a lot of other stuff that's broken by forbidding this too.

Thankfully, AIUI even for a very strict aliasing model, it's likely not fundamental, you can see some discussion about this here and on: Storing an object as &Header, but reading the data past the end of the header · Issue #256 · rust-lang/unsafe-code-guidelines · GitHub