Constructing a case where it wouldn't be the case is difficult, as moving the future between threads necessitates at least an AcqRel synchronization on the future itself. As such, the store and load should have a strongly happens-before relationship: a sequenced-before from the store to the await; an inter-thread happens-before from the await to the resume, coming from the synchronizes-with operation for crossing the thread boundary; and a sequenced-before from the resume to the load, collapsing to that strongly happens-before from the store to the load.
Most of the complexity comes from Consume ordering, which Rust doesn't expose. Theoretically, the future could be moved cross-thread by a Consume ordering (if and only if it's small enough to be directly operated on atomically), which would break the happens-before relationship between your relaxed store and load. However, this is essentially a non-issue, since Rust doesn't expose the ordering and no compiler implements it as anything other than Acquire anyway.
I'm not completely confident on the strongly; the C++20 wording requires the synchronizes-with operations to be sequentially consistent in order to be strongly happens-before rather than just simply happens-before; I believe that's just for that synchronization operation, and strongly happens-before allows crossing over a simply happens-before step. The difference between the two happens-before is irrelevant for most purposes, though; I believe it only matters for the global total SeqCst order, and was added in C++20 to prevent cases where operation A happens-before B, but B precedes A in the SeqCst total order. ↩︎
For soundness, an executor must guarantee that poll operations on the same task are in the happens-before relation. For the sake of argument assume otherwise and consider a variant of the original example: