Sorry in advance for the length. I've wanted to request something like this for a while, for motivations completely separate from the initial request. I also believe what I want is subtle enough that I should explain it precisely.
In particular, my belief is that &T and &mut T should only impact what raw pointers are allowed to do for the duration of the lifetime of the &T or &mut T. This follows the previous "common knowledge", as described by the part of the UnsafeCell docs that covers the aliasing rules (to be clear, it also notes that they are in flux):
If you create a safe reference with lifetime 'a (either a &T or &mut T reference) that is accessible by safe code (for example, because you returned it), then you must not access the data in any way that contradicts that reference for the remainder of 'a.
(from UnsafeCell in std::cell - Rust)
In truth, I'd like this to be more-or-less the extent of Rust's aliasing rules. Unfortunately it likely won't be, because it's weaker than (and thus incompatible with) what's assumed by our use of LLVM's noalias attributes on &T and &mut T, which we presumably would like to continue using. And so I'd like a way to opt out of that.
Specifically to @InfernoDeity's question about precise semantics, I don't know enough about formal models of programming languages to explain it in terms of the abstract machine. What I want is for the aliasing requirements implied by &T and &mut T to have no impact once those references no longer exist.
- On compilers using LLVM, this wouldn't be sufficient to use noalias on &T or &mut T the way we do now.
- However, if the compiler can prove a raw pointer to the type was never created, it's fine if it adds noalias.
- ... With possible debate around "never created" vs "currently exists" — I'd like this to work for ptr-to-int too (which probably requires "never created"), but I'll concede that is entirely separate and based on the fact that I want ptr-to-int stuff to continue working...
- This would also be required to disable any hypothetical MIR optimizations that were based on this, although IDK what they would look like.
- However, this continues to allow libraries and code to rely on the &T and &mut T rules they expect (to @josh's point).
- That is, (I believe) this doesn't change the semantics of any documented aliasing rules, it changes the semantics of undocumented ones.
- It also changes ones that would be very hard for libraries to sensibly rely on in the first place, and brings them more in line with the rules they are relying on (for example, I'm unsure how useful stable_deref_trait is under the strict LLVM-noalias-compatible version of the aliasing rules, but it's completely fine here).
- It doesn't allow users to create a &mut T that aliases any other reference, or a &T whose target gets mutated.
- All code that does this is incorrect, will be incorrect under this flag, and has been known to be incorrect since before 1.0
- Allowing it via a flag would fragment the language in a big way, and if we're going to do that, it should be via the flag we really want: the one that disables the borrow checker (kidding, kidding).
From @steffahn's post:
I'm not convinced here, but I think the details of this will cause us to get too far into the weeds, so I'm gonna delete the bit I wrote about it, in an effort to keep this already very long reply at least stay focused. I'll try and bring up a thread on the UCG or a github issue if I find concrete problems.
I also do think that the distinction you're making is sufficiently subtle that a lot of code is liable to get it wrong, and has in the past, though.
From @yigal100's post:
@tcsc's point above falls under this category - Rust's unsafe story is being actively worked on, e.g. @RalfJung and others have been on it for quite a while now with good progress.
While the UCG (which I actively participate in) does help here, it's largely interested in adding new language features as workarounds for these cases. Consider addr_of!: one of its use cases is to allow code to create a pointer to a field without going through a reference, because that reference would invalidate other raw pointers to the type.
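As a rough illustration of that use case, here is a minimal sketch. The Pair type and the bump_fields function are hypothetical names made up for this example. It derives a field pointer from a raw pointer with addr_of_mut!, so no intermediate &mut to the field is ever created and the original raw pointer stays usable alongside it.

```rust
use std::ptr;

// Hypothetical type, used only for illustration.
struct Pair {
    left: u32,
    right: u32,
}

// Hypothetical helper: mutates both fields using only raw pointers
// once `base` has been created.
fn bump_fields(pair: &mut Pair) {
    // One raw pointer to the whole struct; every later access goes through it.
    let base: *mut Pair = pair;
    unsafe {
        // Field pointer derived without ever creating `&mut (*base).left`.
        let left = ptr::addr_of_mut!((*base).left);
        *left += 1;
        // Write through the original raw pointer in between.
        (*base).right += 10;
        // The field pointer is used again after the write through `base`.
        *left += 1;
    }
}

fn main() {
    let mut p = Pair { left: 1, right: 2 };
    bump_fields(&mut p);
    println!("{} {}", p.left, p.right);
}
```

The older spelling of the same thing would typically be something like `&mut (*base).left as *mut u32`, which briefly creates a reference, and that is the step the stricter aliasing models care about.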
The problem with an approach like this is that addr_of! was only added recently (it was stabilized in Rust 1.51). The vast majority of code that needs to use it isn't doing so, since it didn't exist. If I pull down a 3 year old data structure crate via a transitive dependency, it won't use it, and it probably never will.
For example, the stdlib had this bug for many years: "slice::swap violates the aliasing rules" (Issue #80682, rust-lang/rust), which, to be clear, is an exact instance of the problem I'm discussing. It was fixed by using addr_of_mut!, a tool that was only added recently. IMO, until fairly recently, we had believed this pattern to be correct. Prior to participating in UCG, I certainly had thought so (after all, it's a raw pointer).
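To make the pattern concrete, here is a sketch of the general shape of such a fix. It is written as a hypothetical free function swap_elems, not copied from the stdlib source. The older style is shown only in a comment; the live code takes both element pointers with addr_of_mut! so that no second &mut is created in between.

```rust
use std::ptr;

// Hypothetical free-function version of a slice swap, for illustration only.
fn swap_elems<T>(s: &mut [T], a: usize, b: usize) {
    // The older, "naive" style would look roughly like:
    //     let pa: *mut T = &mut s[a];
    //     let pb: *mut T = &mut s[b]; // creating this reference is the contested step
    // Here both pointers are taken without an intermediate reference.
    // Indexing is still bounds-checked and panics when out of range.
    let pa = ptr::addr_of_mut!(s[a]);
    let pb = ptr::addr_of_mut!(s[b]);
    // SAFETY: both pointers point at in-bounds elements of `s`;
    // `ptr::swap` permits the two locations to overlap (a == b).
    unsafe { ptr::swap(pa, pb) }
}

fn main() {
    let mut v = [1, 2, 3];
    swap_elems(&mut v, 0, 2);
    println!("{:?}", v);
}
```

The two versions look almost identical at the call site, which is part of why I think old code is so unlikely to have gotten this right.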
Whatever the final design we'll arrive at simply cannot be "let's just forgo Rust's principles of safety".
I think this is either a complete misinterpretation of my point, or not directed at me. Hard to tell.
My motivation is improved safety. That said, maybe this isn't directed at me, and I do completely disagree about wanting the ability to shoot myself in the foot.
From @burntsushi's post:
That is, show some Rust code today that you would like to write that has non-ideal codegen.
I think this is a little confused about the point. A flag like this isn't for improving codegen, it's to disable a set of optimizations that were previously unexploited, but now are, and that are particularly hard to reason about.
For a concrete example, I'd like the code in "slice::swap violates the aliasing rules" (Issue #80682, rust-lang/rust) to be correct as it was, without needing addr_of_mut!, as discussed above.
Specifically, while I can live with addr_of!, I'd like a flag that allows old, more naive unsafe code that I might be pulling in via a transitive dependency (written before we knew as much about this as we know now, and before there were even tools like addr_of! that could have been used to avoid the problem) to not be miscompiled because the compiler started exploiting a particular kind of UB.
(Honestly, I feel like it should have been from the other side. The people who wanted to turn on a dangerous UB-exploiting optimization should have had to justify it by showing the bad codegen (this feels especially true given that it needed a hack/heuristic to allow things like !Unpin to continue working)... But such is life).