Can `Pin::map_unchecked_mut` actually be used safely at all?

HeroicKatora · May 22, 2019, 3:23pm

Consider the following code:

let mut pinned: Pin<&mut T> = ...;
unsafe {
    pinned.map_unchecked_mut(|as_mut| {
        // Do something to prevent moving the member again.
        as_mut.mark_pinned();
        &mut as_mut.member
    })
}

As the safety requirements clearly lay out:

You must guarantee that the data you return will not move so long as the argument value does not move (for example, because it is one of the fields of that value), and also that you do not move out of the argument you receive to the interior function.

What I'm asking is: what stops the compiler from invalidating any assumption I could make to prevent that, locally within the closure that sees merely a &mut _? Basically the compiler equivalent of this microarchitectural CPU bug.

Say I behave perfectly fine within mark_pinned and don't even panic! Within the inner closure we have a &mut _, so what's stopping the compiler from temporarily reading through the mutable reference into a temporary, calling the function on that, and then storing the temporary back? Is address stability a guarantee I can rely upon? I have my doubts; similar techniques seem to be used in ordinary optimizations. If the struct is small enough to be passed inline in a register, that even sounds like an actual speedup! Basically, since ordinary Rust code must not rely on addresses, the usual compilation would allow this, and would only spill to a stack temporary when some part is addressed individually, as far as I can tell.

Does that analysis have a technical flaw? Is there another reason why the compiler can not do this?
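
To make the hypothetical concrete, here is a minimal sketch of that "copy out, call, copy back" rewrite done by hand. The type `Small` and its `address` method are illustrative assumptions, not part of the code above. The sketch suggests that the callee can tell the two versions apart as soon as it looks at its own address, so the rewrite could only be invisible where the address is never observed.

```rust
#[derive(Clone, Copy)]
struct Small {
    value: u32,
}

impl Small {
    // Hypothetical method: reports the address it was called at.
    fn address(&mut self) -> usize {
        self as *mut Small as usize
    }
}

/// Calls the method directly through the reference.
fn call_in_place(s: &mut Small) -> usize {
    s.address()
}

/// Hand-written "read into a temporary, call, store back".
fn call_via_temporary(s: &mut Small) -> usize {
    let mut temp = *s; // copy out
    let seen = temp.address(); // the callee sees the temporary's address
    *s = temp; // copy back
    seen
}

fn main() {
    let mut s = Small { value: 7 };
    let real = &mut s as *mut Small as usize;

    // In place, the callee sees the address of `s` itself.
    assert_eq!(call_in_place(&mut s), real);
    // Via the temporary, it sees a different live object.
    assert_ne!(call_via_temporary(&mut s), real);
    // The stored value is unchanged either way.
    assert_eq!(s.value, 7);
}
```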

Nemo157 · May 22, 2019, 8:54pm

I’m going to try and think about this more tomorrow, but my first impression is that, to actually use the invariants that Pin has provided, at some point you are going to have to create a raw pointer from one of the references. At that point your code does rely on the address behind the reference, and it would be invalid for the optimizer to pass you a reference to some other temporary than the one actually shown in the source.

(If you are provided invariants that you provably never use, is the optimizer within its rights to just ignore those invariants?)
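
As a rough illustration of that point, the sketch below stores a raw pointer to one of its own fields while pinned and reads through it later. The type `SelfRef` and its methods are hypothetical names chosen for this example. The only reason the later read is sound is that the address captured in `init` is still the address of `data`, which is exactly the invariant Pin is meant to provide.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::ptr;

struct SelfRef {
    data: u32,
    // Raw pointer into `data`; null until `init` has run.
    data_ptr: *const u32,
    // Opts out of `Unpin`, so safe code cannot move a pinned value.
    _pin: PhantomPinned,
}

impl SelfRef {
    fn new(data: u32) -> Self {
        SelfRef { data, data_ptr: ptr::null(), _pin: PhantomPinned }
    }

    fn init(self: Pin<&mut Self>) {
        // SAFETY: nothing is moved out of `this`; we only record the
        // address of one of its fields.
        let this = unsafe { self.get_unchecked_mut() };
        this.data_ptr = &this.data;
    }

    fn read_through_ptr(self: Pin<&Self>) -> u32 {
        assert!(!self.data_ptr.is_null(), "call init first");
        // SAFETY: `data_ptr` was taken while pinned, so it still points
        // at `self.data`.
        unsafe { *self.data_ptr }
    }
}

fn main() {
    let mut boxed = Box::pin(SelfRef::new(42));
    boxed.as_mut().init();
    assert_eq!(boxed.as_ref().read_through_ptr(), 42);
    // The stored pointer and the field address agree.
    assert!(ptr::eq(boxed.data_ptr, &boxed.data));
}
```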

HeroicKatora · May 22, 2019, 9:18pm

Thanks for considering it, let me briefly comment on this statement.

I don't see if/how such a guarantee must be provided to a method on T that is defined as

fn mark_pinned(&mut self)

Precisely which semantics of Rust forbid the following transformation? I thought you were not supposed to be able to rely on the value of addresses as usize; e.g. Miri even panics when you do. Rust even goes so far as introducing pointer::align_offset as a separate intrinsic so that pointer alignment checks never have to be done manually in the surface language. In C++ this may even be UB, I think? It wasn't resolved by the standard committee last time I checked. That may be a reason why LLVM (and thus rustc) will never exploit it even if technically allowed.

So, is the compiler allowed to equate:

fn something_weird(&mut self) {
    let mut temp = Self::new();
    core::mem::swap(self, &mut temp);
    temp.do_something_that_doesnt_panic();
    core::mem::swap(self, &mut temp);
}

// to:
fn something_weird(&mut self) {
    let temp = Self::new(); // Just for the Drop::drop.
    temp.do_something_that_doesnt_panic();
}

RalfJung · May 24, 2019, 5:24pm

I don't really understand the question. The compiler doesn't just insert moves when you didn't tell it to move stuff. Rust guarantees some basic address stability, otherwise raw pointers would be rather useless.

That's not how transformations work. A transformation is wrong until it is shown that it does not change program behavior, not the other way around. The standard does not explicitly forbid optimizations, it describes the behavior of a program and then the compiler has to make sure that that's what happens.

In your case, do_something_that_doesnt_panic gets called on a totally different object, how is that supposed to make any sense? Like, if Self is a newtype around i32 and new returns 0 and do_something_that_doesnt_panic prints, then your first program will print self and your second program will print 0.
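
That counterexample can be written out roughly as follows. `Wrapper` and `describe` are stand-ins chosen for this sketch (the latter for do_something_that_doesnt_panic, returning the text it would print so the difference can be checked).

```rust
struct Wrapper(i32);

impl Wrapper {
    fn new() -> Self {
        Wrapper(0)
    }

    // Stand-in for `do_something_that_doesnt_panic`.
    fn describe(&self) -> String {
        format!("value = {}", self.0)
    }

    // First version: swap `self` into `temp`, call, swap back.
    fn with_swaps(&mut self) -> String {
        let mut temp = Self::new();
        std::mem::swap(self, &mut temp);
        let out = temp.describe(); // `temp` now holds the old `self`
        std::mem::swap(self, &mut temp);
        out
    }

    // Second version: the swaps are gone, so `temp` is a fresh object.
    fn without_swaps(&mut self) -> String {
        let temp = Self::new();
        temp.describe()
    }
}

fn main() {
    let mut w = Wrapper(7);
    println!("{}", w.with_swaps()); // describes the original value
    println!("{}", w.without_swaps()); // describes the value from `new`
    assert_ne!(w.with_swaps(), w.without_swaps());
}
```

The two versions produce different output for the same input, so one cannot be substituted for the other.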

This is a limitation of Miri, not a form of UB in Rust.

This is to facilitate compile-time code evaluation. Doing the same thing "manually" (as one would in C++) is still allowed.

What is UB? Observing the alignment? No that is certainly not UB.
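
For reference, both ways of observing alignment might look like the sketch below. The helper names are made up for this example. The `usize::MAX` check reflects the documented possibility that `align_offset` cannot compute an offset.

```rust
use std::mem::align_of;

/// Manual check: turn the pointer into an integer and test the remainder.
fn is_aligned_manual<T>(ptr: *const T, align: usize) -> bool {
    assert!(align.is_power_of_two());
    (ptr as usize) % align == 0
}

/// The same question via the standard library: how many bytes must be
/// skipped to reach the next `align`-aligned address.
fn bytes_to_alignment(ptr: *const u8, align: usize) -> usize {
    ptr.align_offset(align)
}

fn main() {
    // A local `u64` is always aligned for `u64`.
    let x: u64 = 0;
    assert!(is_aligned_manual(&x as *const u64, align_of::<u64>()));

    let bytes = [0u8; 16];
    let p = bytes.as_ptr();
    let off = bytes_to_alignment(p, 8);
    // `usize::MAX` means "could not compute an offset"; skip the check then.
    if off != usize::MAX {
        assert!(is_aligned_manual(p.wrapping_add(off), 8));
    }
}
```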

HeroicKatora · May 24, 2019, 5:59pm

I can observe a memory location by cache effects, side channels, etc. yet no compiler or standard would forbid an optimization based on this afaik. No, the machine model defined by the language differs from the actual ISA machine model. In C++, two pointers that compare equal when cast to an integer type may compare unequal as pointers when they refer to different objects, for example two objects not alive at the same time.

I was sadly not very precise on the UB part. It is, though, not entirely clear whether one is allowed to cast a pointer to an integer, perform any arithmetic on that integer (even + 1 - 1), and expect the result cast back to a pointer to still be valid. Semantics are, afaik, only defined for adding ptrdiff_t to a pointer iff that does not result in a pointer outside the allocated region. The same goes for casting the pointer to a different pointer type, except for pointers to POD-types and const char*. Thus, I'm also unsure if one could grab a pointer and align it to something other than a multiple of the type's native alignment in that language without potentially incurring UB.

Ok, so that brings us back to observing. The compiler only breaks my code when it moves an object to a new address if I can expect the pointer to be equal. This seems like it should be true, to keep the language reasonable, for allocations of which the called code is the owner and for values which I have currently borrowed. What you're saying seems to suggest that the identity guarantee holds while any borrow is alive on that value, not only the locally observable borrows, and thus the identity observed is the same as the one observed by any other borrower. That would (likely) fully answer my question. To which extent is it true for other values such as locals?

It would be good if the panic got that across better, because I hit it and certainly didn't know.

RalfJung · May 24, 2019, 6:44pm

Indeed they wouldn't. Rust programs don't run on hardware with caches or so. Rust programs run on an abstract machine specified by the Rust standard (once we have one). The standard also defines what is considered observable. Basically, syscalls and volatile memory accesses are observable and not much else.

This is sadly not clear in C either -- some of the hardest open questions revolve around integer-pointer-casts.

But my stance is yes, if you guarantee that the integer you are casting back is the same as the one you got, then you will get a pointer that you may use to access this memory. It's not the same pointer (it may have a different provenance), but it points to the same object.
If the integer you are casting back is offset o away from the one you got originally, and that offset in bytes could be added to the original pointer while still staying inside the same allocation, then the pointer you are getting back points to that place inside the allocation. While not mandated by the C standard, this is relied upon by many programs, and it matches what compilers (intend to) implement.
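
A small sketch of both cases, assuming the semantics described above hold for Rust's `as` casts. The function names are invented for this example.

```rust
/// Cast to an integer and straight back, with no arithmetic.
fn round_trip(ptr: *mut u8) -> *mut u8 {
    let addr = ptr as usize;
    addr as *mut u8
}

/// Do the offset on the integer, then cast back.
/// The caller must keep `offset` inside the original allocation.
fn offset_via_integer(base: *mut u8, offset: usize) -> *mut u8 {
    let addr = base as usize;
    (addr + offset) as *mut u8
}

fn main() {
    let mut buf = [10u8, 20, 30, 40];
    let base = buf.as_mut_ptr();

    let same = round_trip(base);
    let third = offset_via_integer(base, 2);

    // SAFETY (under the assumption stated above): both pointers refer to
    // bytes inside `buf`, which is still alive and not otherwise borrowed.
    unsafe {
        assert_eq!(*same, 10);
        assert_eq!(*third, 30);
        *third = 99;
    }
    assert_eq!(buf[2], 99);
}
```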

I remain puzzled what ptr-int-casts have to do with your original question about map_unchecked_mut... I see no connection.

I don't think I understand what you are saying here. But generally, when you create a pointer to that object, that pointer remains valid until either the object gets deallocated (free for heap objects, StorageDead or stack frame pop for locals), or Stacked Borrows says that there was a conflicting access invalidating your pointer.

Would be good indeed, we have a long-standing open issue for that. Sadly, we also only have finite amounts of time. This one is very slowly making it to the top of my personal Miri priority list -- this kind of feedback helps to evaluate such priorities, and you are not the first to run into this.

Oh, and to answer the question in the thread title: yes, it is possible to use map_unchecked_mut correctly. For example:

use std::pin::Pin;

struct Foo<T> { n: usize, x: T }

impl<T> Foo<T> {
  fn get_pin(self: Pin<&mut Self>) -> Pin<&mut T> {
    // Projects to a field in the struct, and follows all the rules
    // for projections. Hence safe.
    unsafe { self.map_unchecked_mut(|foo| &mut foo.x) }
  }
}

Also see the rules for projections.
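
A possible way to exercise such a projection is sketched below. The `Box::pin` setup and the `String` field are choices made for this example, not part of the post. It checks that the projected reference points at the same address as the field inside the pinned struct.

```rust
use std::pin::Pin;

struct Foo<T> {
    n: usize,
    x: T,
}

impl<T> Foo<T> {
    fn get_pin(self: Pin<&mut Self>) -> Pin<&mut T> {
        // SAFETY: `x` is a field of the pinned struct, so it stays put as
        // long as the struct does, and nothing is moved out of `foo`.
        unsafe { self.map_unchecked_mut(|foo| &mut foo.x) }
    }
}

fn main() {
    let mut foo = Box::pin(Foo { n: 1, x: String::from("pinned") });

    // Address of the field before projecting (never dereferenced).
    let before = &foo.x as *const String;

    let mut field: Pin<&mut String> = foo.as_mut().get_pin();
    // `String` is `Unpin`, so the pinned reference can be mutated directly.
    field.push_str(" field");
    let after = &*field as *const String;

    assert_eq!(before, after);
    assert_eq!(foo.x, "pinned field");
    assert_eq!(foo.n, 1);
}
```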

HeroicKatora · May 24, 2019, 8:11pm

A guarantee as strong as the one given implies that comparison remains fully valid through arbitrary integer arithmetic, which surely complicates escape analysis and affects optimization. Since that was the simplest allowed operation that would make the pointer's origin ambiguous, my question was essentially aimed at finding out whether it indeed pessimizes optimization. That probably wasn't very clear.

But overall the situation is pretty much as expected, and not broken. Not that a more formal specification of the semantics could hurt, but that is after all what the unresolved issues in unsafe-code-guidelines are about. unsafe for a good reason.

PS: both links are seemingly the same by mistake

system · August 22, 2019, 8:13pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
