Generic `Atomic<T>`

We've made NonZero generic instead of having multiple distinct NonZero* types. We could do a similar thing to unify the Atomic* types. I drafted out what that would look like; does this seem like a reasonable API shape for std to refactor towards? [playground]

```rust
trait Sealed {}

#[allow(private_bounds)]
#[unstable(feature = "atomic_internals", issue = "none")]
pub unsafe trait AtomicPrimitive: Sized + Copy + Sealed {
    type AlignedForAtomic;
}

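// Implements `Sealed` and `AtomicPrimitive` for a single primitive,
// declaring a more-aligned wrapper type for targets where the atomic
// instruction requires greater alignment than the primitive itself.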
macro impl_atomic_primitive(
    $Inner:ident $(<$T:ident>)? ($Primitive:ty),
    size($size:literal),
    align($align:literal) $(,)?
) {
    impl $(<$T>)? Sealed for $Primitive {}

    #[cfg(target_has_atomic_load_store = $size)]
    const _: () = {
        unsafe impl $(<$T>)? AtomicPrimitive for $Primitive {
            #[cfg(target_has_atomic_equal_alignment = $size)]
            type AlignedForAtomic = $Primitive;
            #[cfg(not(target_has_atomic_equal_alignment = $size))]
            type AlignedForAtomic = $Inner $(<$T>)?;
        }

        #[repr(C, align($align))]
        #[cfg(not(target_has_atomic_equal_alignment = $size))]
        pub struct $Inner $(<$T>)? ($Primitive);
    };
}

impl_atomic_primitive!(AlignedBool(bool), size("8"), align(1));
impl_atomic_primitive!(AlignedI32(i32), size("32"), align(4));
impl_atomic_primitive!(AlignedU32(u32), size("32"), align(4));
// ... snip ...
#[cfg(target_pointer_width = "64")]
impl_atomic_primitive!(AlignedPtr<T>(*mut T), size("ptr"), align(8));

#[repr(C)]
pub struct Atomic<T: AtomicPrimitive>(UnsafeCell<T::AlignedForAtomic>);

macro impl_atomic_fmt($Trait:ident) {
    impl<T> fmt::$Trait for Atomic<T>
    where T: AtomicPrimitive + fmt::$Trait;
}

impl_atomic_fmt!(Debug);
impl_atomic_fmt!(Pointer);

macro impl_atomic_auto_trait {
    (unsafe $Trait:ident) => {
        unsafe impl<T> $Trait for Atomic<T> where T: AtomicPrimitive {}
    },
    ($Trait:ident) => {
        impl<T> $Trait for Atomic<T> where T: AtomicPrimitive {}
    },
}

// Implement auto-traits manually based on `T` to avoid docs showing
// the `AtomicPrimitive::AlignedForAtomic` implementation detail.
impl_atomic_auto_trait!(unsafe Send);
impl_atomic_auto_trait!(unsafe Sync);
impl_atomic_auto_trait!(Unpin);
impl_atomic_auto_trait!(UnwindSafe);
impl_atomic_auto_trait!(RefUnwindSafe);

impl<T> Atomic<T>
where
    T: AtomicPrimitive,
{
    pub const fn new(val: T) -> Self;
    pub const fn into_inner(self) -> T;

    pub const fn as_ptr(&self) -> *mut T;
    pub const unsafe fn from_ptr<'a>(ptr: *mut T) -> &'a Self;

    pub fn get_mut(&mut self) -> &mut T;
    pub fn get_mut_slice(this: &mut [Self]) -> &mut [T];

    pub fn load(&self, order: Ordering) -> T;
    pub fn store(&self, val: T, order: Ordering);
}

impl<T> Atomic<T>
where
    T: AtomicPrimitive<AlignedForAtomic = T>,
{
    pub fn from_mut(val: &mut T) -> &mut Self;
    pub fn from_mut_slice(val: &mut [T]) -> &mut [Self];
}

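// Per-size aliases: targets with only load/store support get the alias
// itself, while the RMW operations (swap, fetch_add, ...) are gated on
// full atomic support for that size.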
macro atomic_int($Atom:ident($Int:ty), size($size:literal)) {
    #[cfg(target_has_atomic_load_store = $size)]
    pub type $Atom = Atomic<$Int>;

    #[cfg(target_has_atomic = $size)]
    impl $Atom {
        pub fn swap(&self, val: $Int, order: Ordering) -> $Int;

        pub fn fetch_add(&self, val: $Int, order: Ordering) -> $Int;
        pub fn fetch_sub(&self, val: $Int, order: Ordering) -> $Int;

        // et al
    }
}

atomic_int!(AtomicI32(i32), size("32"));
atomic_int!(AtomicU32(u32), size("32"));

#[cfg(target_has_atomic_load_store = "8")]
pub type AtomicBool = Atomic<bool>;

#[cfg(target_has_atomic = "8")]
impl AtomicBool {
    pub fn swap(&self, val: bool, order: Ordering) -> bool;

    // et al
}

#[cfg(target_has_atomic_load_store = "ptr")]
pub type AtomicPtr<T> = Atomic<*mut T>;

#[cfg(target_has_atomic = "ptr")]
impl<T> AtomicPtr<T> {
    pub fn swap(&self, val: *mut T, order: Ordering) -> *mut T;

    pub fn fetch_ptr_add(&self, val: usize, order: Ordering) -> *mut T;
    pub fn fetch_ptr_sub(&self, val: usize, order: Ordering) -> *mut T;

    pub fn fetch_byte_add(&self, val: usize, order: Ordering) -> *mut T;
    pub fn fetch_byte_sub(&self, val: usize, order: Ordering) -> *mut T;

    // et al
}
```

I think this sketch moves everything that's currently just duplicated between all of the Atomic* types onto the generic type. fetch_update might be possible to provide only once on the generic, but I wasn't able to figure out a good way to both keep it gated the way it is today and delegate to a non-generic compare_exchange (required because AtomicBool is a special case).
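For illustration, here's a minimal sketch of the ungated version, building on the Atomic<T> above and assuming a hypothetical perma-unstable AtomicOps trait that exposes compare_exchange_weak per primitive; the per-size gating is exactly what this glosses over:

```rust
// Sketch only: `AtomicOps` is a hypothetical helper trait providing
// `compare_exchange_weak` for each supported primitive.
impl<T: AtomicPrimitive + AtomicOps> Atomic<T> {
    pub fn fetch_update<F>(
        &self,
        set_order: Ordering,
        fetch_order: Ordering,
        mut f: F,
    ) -> Result<T, T>
    where
        F: FnMut(T) -> Option<T>,
    {
        // Standard CAS loop: on failure, retry with the observed value.
        let mut prev = self.load(fetch_order);
        while let Some(next) = f(prev) {
            match self.compare_exchange_weak(prev, next, set_order, fetch_order) {
                Ok(x) => return Ok(x),
                Err(next_prev) => prev = next_prev,
            }
        }
        Err(prev)
    }
}
```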

15 Likes

Comparison: the Swift folks have spent a while hammering out what they think the relevant traits/protocols are, though the final form is only just now going to be released in Swift 6.0.

12 Likes

Since you brought up NonZero: How do/would/should the combination of them work?

```rust
type A = NonZero<Atomic<u32>>; // Wrong/Problematic?
type B = Atomic<NonZero<u32>>;
```

Unless I'm mistaken, your example does not allow Atomic<NonZero<u32>> (unless such types are explicitly listed/added), while the current ZeroablePrimitive wouldn't allow NonZero<Atomic<u32>> (which would also be wrong, unless UnsafeCell cannot be 0x00).

Granted, I'm not sure if an atomic, non-zero integer is useful (besides having the compiler ensure it is not zero), but something like Atomic<Option<NonZero<u32>>> could be, and would (as far as I can tell) be sound.
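For reference, a sketch of how that last combination can already be emulated today, relying on the guaranteed niche of Option<NonZero<u32>> (same layout as u32, with 0 meaning None); the wrapper type and method set here are my own invention:

```rust
use core::num::NonZeroU32;
use core::sync::atomic::{AtomicU32, Ordering};

// Option<NonZeroU32> round-trips losslessly through u32: None <-> 0.
pub struct AtomicOptionNonZeroU32(AtomicU32);

impl AtomicOptionNonZeroU32 {
    pub const fn new(v: Option<NonZeroU32>) -> Self {
        Self(AtomicU32::new(match v {
            Some(n) => n.get(),
            None => 0,
        }))
    }

    pub fn load(&self, order: Ordering) -> Option<NonZeroU32> {
        // NonZeroU32::new maps 0 back to None.
        NonZeroU32::new(self.0.load(order))
    }

    pub fn store(&self, v: Option<NonZeroU32>, order: Ordering) {
        self.0.store(v.map_or(0, NonZeroU32::get), order);
    }
}
```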

2 Likes

If this API were adopted and merged, would the intent be to expose it as a public API in stdlib? Or is it intended more as a nice, debugged way to maintain consistency between the implementations of the various Atomic* types, i.e. is it intended primarily as an implementation detail within the bowels of stdlib?

A relevant opsem issue is Packing pointers into double-word width atomics · Issue #517 · rust-lang/unsafe-code-guidelines · GitHub

TLDR: some people want Atomic<(u64, u64)> (or pointers instead of integers, for provenance). Some platforms (such as x86-64) have double-width compare-and-exchange. That led to a discussion of the general case of tuple atomics. Mixed-size atomic accesses are also related and relevant.

I would like to see a design that doesn't preclude such extensions in the future (fancy concurrent algorithms are not just exciting and cool, but quite useful in low-level systems programming). I would personally be fine with not supporting the case where there is padding in the tuple, but other people want that as well.

10 Likes

Ideally it would be possible to use any type small enough to fit the atomic compare-and-exchange of the target platform, though I don't think we can currently say "generic over all types where size <= 64" (or other sizes).
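The closest stable approximation I know of is a post-monomorphization const assertion; a sketch with made-up names:

```rust
use core::marker::PhantomData;
use core::mem::size_of;

// Hypothetical helper: compilation fails (at monomorphization time)
// if `T` is wider than 8 bytes, approximating "size <= 64 bits".
struct AssertFits<T>(PhantomData<T>);

impl<T> AssertFits<T> {
    const OK: () = assert!(size_of::<T>() <= 8, "T is too wide for the target CAS");
}

fn requires_cas_width<T>(_val: T) {
    // Referencing the constant forces its evaluation for this `T`.
    let () = AssertFits::<T>::OK;
}

fn main() {
    requires_cas_width(0u64); // fine
    // requires_cas_width([0u8; 16]); // would fail to compile
}
```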

Regarding Atomic<(u64, u64)>: are you talking about something that compares and exchanges both u64s at once, or a way to compare and exchange them individually (as if it were a (Atomic<u64>, Atomic<u64>))? I assume you mean the former.

4 Likes

What would be useful is an Atomic<(u64, u64)> where the compiler is guaranteed to use the platform equivalent of x86-64's CMPXCHG16B, which does both compares and exchanges in a single atomic operation. This should be equivalent to Atomic<u128>, but with the compiler packing 2x u64 into a u128 for you. The same applies to Atomic<(usize, usize)> on platforms where there's support for a double-usize atomic access.
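That packing is mechanical enough to hand-roll today at half the width (two u32s through the stable AtomicU64), which is roughly what the compiler would be doing for you at full width; a sketch:

```rust
use core::sync::atomic::{AtomicU64, Ordering};

fn pack(a: u32, b: u32) -> u64 {
    ((a as u64) << 32) | b as u64
}

fn unpack(v: u64) -> (u32, u32) {
    ((v >> 32) as u32, v as u32)
}

/// Compare-and-exchange a pair of u32s as one atomic operation.
fn cas_pair(
    atom: &AtomicU64,
    current: (u32, u32),
    new: (u32, u32),
) -> Result<(u32, u32), (u32, u32)> {
    atom.compare_exchange(
        pack(current.0, current.1),
        pack(new.0, new.1),
        Ordering::AcqRel,
        Ordering::Acquire,
    )
    .map(unpack)
    .map_err(unpack)
}
```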

The latter (Atomic<(u64, u64)> being the same as (Atomic<u64>, Atomic<u64>)) doesn't seem particularly useful, because it's trivial to construct given that Atomic<u64> exists.

5 Likes

It would probably need to be Copy as well. Plus, padding being LLVM poison causes issues.

The former (see the GitHub issue that I linked). The other doesn't seem useful at all.

1 Like

The real issue arises when you have a pair of pointers. If you round-trip via integers you lose provenance. With pointers it is no longer a question of convenience but of actual missing functionality.

7 Likes

The ideal end-state for me would be that if Atomic<T> exists and Atomic<TwoTs> exists such that core::mem::size_of::<TwoTs>() >= core::mem::size_of::<T>() * 2, then we also implement at least the load, store, and compare-exchange functionality for Atomic<(T, T)>.

That way, if pointers are 32-bit, and the platform supports both Atomic<&mut T> and Atomic<u64>, we also get atomic compare-exchange for (&mut T, &mut T) (plus other pointer types), and the same for 64-bit pointers and Atomic<u128>. But if the largest atomic on a platform is Atomic<u32>, and pointers are 32 bits or larger, you don't get Atomic<(*mut T, *mut T)> or Atomic<(&mut T, &mut T)>, because the prerequisites aren't met.

Now, there are almost certainly compiler implications for this that I've not thought about - especially in the area of pointer provenance - that make this hard to do well, but having all possible pairs of atomics for at least compare-exchange, load, and store, as well as for single items, would be useful for some algorithms.

5 Likes

Can we generalize it a tiny bit more, and not require the two types to be the same? I have personally written locking primitives (in x86 assembly language :-P) that used cmpxchg16b on, in Rust terms, (&mut WaiterChain, u64).

Atomic<(T, U)> where T and U are both primitive and size_of::<T>() + size_of::<U>() <= size_of::<BiggestSupportedAtomic>() shouldn't be notably harder than Atomic<(T, T)> as you describe.

2 Likes

How does that interact with pointer provenance? It's technically no harder than what I've described, but you're also casting provenance around in the process, which I suspect is a bigger ask than you might expect once you take Strict Provenance into account.

As far as pointer provenance is concerned, we're "just" copying (&T, usize) values around, aren't we? The fact that this particular copy is tagged to emit a hardware atomic operation instead of an ordinary one should not affect the pointer-ness of the &T field of the tuple.

2 Likes

Good point - I managed to confuse myself into thinking you meant being able to swap (&T, usize) with (usize, &T), not swapping two (&T, usize) values over.

If they are not the same size you immediately run into issues with padding. How does compare-and-exchange work when there is padding? Currently, not very well! According to LLVM, padding is poison (completely undefined, not any of the well-known 255 values a byte can take). This allows LLVM to assume those bytes can be anything for the purpose of optimisation, and it doesn't need to be consistent.

2 Likes

Maybe this is a bit crazy, but couldn't we have a way to zero out padding bytes when needed so that they are no longer uninitialized?

In particular, Atomic::from_mut could zero out padding bytes, and unsafely transmuting from *mut T/&mut T to &Atomic<T> would be considered unsound if T's padding bytes (if any) are not zeroed out. Any method that writes to an Atomic (e.g. Atomic::store/Atomic::swap/Atomic::compare_exchange) would also zero out padding bytes before doing so. This would guarantee that any Atomic<T> has the same value for the padding bytes, and thus compare_exchange would work correctly.
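To make that concrete, a sketch of what zeroing the padding could mean for one specific layout (the struct below is my own illustration; note that a later typed copy of the value may make the padding undefined again, which is why this is ultimately an opsem question):

```rust
#[repr(C)]
#[derive(Clone, Copy)]
struct Pair {
    a: u8,  // offset 0
    // one padding byte at offset 1
    b: u16, // offset 2; total size 4, align 2
}

fn zero_padding(p: &mut Pair) {
    let bytes = p as *mut Pair as *mut u8;
    // SAFETY: offset 1 is in bounds of `Pair` and is padding, so
    // writing it cannot corrupt either field.
    unsafe { bytes.add(1).write(0) };
}
```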

FYI a byte can take 256 values (257 if we count undefined!)

3 Likes

Freeze (as in the LLVM sense, not Freeze the unstable trait) was discussed in the opsem issue I linked above. I don't know how viable it would be to say that padding in atomics (and in values being compared with atomics) has to be zeroed. This is really a question for the opsem and compiler people I think.

My bad, it's been a long day. My point was that poison / undef are additional values in the LLVM abstract machine.

1 Like

This is intended to work like NonZero does: consolidate the docs some, and enable use with generic syntax, e.g. with the FFI integer aliases.

This shouldn't. AtomicPrimitive is intended to be perma-unstable, like ZeroablePrimitive is for NonZero.

At the least, it would need to block access to Atomic::get_mut.

The real “fun” is pointer provenance versus the ABA problem.

1 Like

What about Atomic<Enum>?

This is probably off-topic relative to the original proposition of this thread, but it's sometimes needed to store a finite state in an atomic value. The current way of doing this is using an AtomicUsize and some const values. The advantage of enums is that it's impossible to set an invalid value. Plain-old-data enums (enums with no fields) are basically an integer with some values forbidden. So just like AtomicBool is just a fancy AtomicU8, it should be technically possible to have an enum marked with repr(N) stored internally as an Atomic<N>.
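For comparison, the current workaround, sketched with a made-up State enum over AtomicU8 and a checked decode:

```rust
use core::sync::atomic::{AtomicU8, Ordering};

#[repr(u8)]
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum State {
    Idle = 0,
    Running = 1,
    Done = 2,
}

struct AtomicState(AtomicU8);

impl AtomicState {
    const fn new(s: State) -> Self {
        Self(AtomicU8::new(s as u8))
    }

    fn load(&self, order: Ordering) -> State {
        match self.0.load(order) {
            0 => State::Idle,
            1 => State::Running,
            2 => State::Done,
            // Unreachable as long as `store` below is the only writer.
            _ => unreachable!("invalid State discriminant"),
        }
    }

    fn store(&self, s: State, order: Ordering) {
        self.0.store(s as u8, order);
    }
}
```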

9 Likes

For the use case I'm thinking of -- implementing a mutex, where a double-width CAS increments a waiter count and updates a "tail of the wait queue" pointer as a single atomic operation -- it would be fine if the two types were required to be the same size, and indeed for them to be required to be exactly half of the size of a larger type for which a hardware atomic operation exists. I imagine people would eventually want to be able to do more, but the feature (of allowing the types not to be the same) is still useful with these restrictions.

Concretely, assuming hardware compare-and-swap for 32, 64, and 128 bit quantities, and usize <= 64 bits, support for Atomic<T> for T = (u16, i16), (u32, i32), (u64, i64), and (&U, usize), but not (u32, u16) or (u8, i8), would be sufficient for me.

2 Likes