Pre-RFC: More operations on shared mutable memory

Motivation

Rust provides a type for shared mutable memory ranges: &[Cell<T>]. A &[Cell<T>] can be constructed from a &mut [T] via Cell::from_mut(x).as_slice_of_cells(). While slices of Cell are !Sync, they allow for infallible shared mutation at very low cost.

However, there's a gap in functionality in Rust for operations on slices of cells. There's no copy_from_slice that works with cells, and ptr::copy is unsafe to use. While dst.iter().zip(src).for_each(|(d, s)| d.set(s.get())) can copy from one slice to another, it optimizes poorly and has worse semantics than a safe ptr::copy.

There's also an API hole in core::ptr: there's no possibly-aliasing swap for dynamic sizes. ptr::copy accepts possibly-aliasing memory and a dynamic size. ptr::copy_nonoverlapping is its non-aliasing equivalent. ptr::swap_nonoverlapping takes two pointers and a dynamic size. In contrast, ptr::swap is more like mem::swap, because it only takes x and y to swap, and no count: usize. As far as I can tell, this is because the ptr::swap implementation has O(N) memory complexity, and so directly adapting it for a dynamic-size aliasing swap would require dynamic allocation. However, this can be resolved by using a O(1) space algorithm.

Proposal

Add these functions to the standard library:

// In core::ptr
// Same safety docs and semantics as ptr::swap
#[unstable(feature = "ptr_swap_many")]
pub unsafe fn swap_many<T>(x: *mut T, y: *mut T, count: usize);

// In core::slice
impl<T> [T] {
    // Does a `ptr::copy_nonoverlapping` since the slices cannot alias.
    #[unstable(feature = "copy_cells")]
    pub fn copy_from_cells(&mut self, src: &[Cell<T>]) where T: Copy;
    
    // Does a `ptr::swap_nonoverlapping` since the slices cannot alias.
    #[unstable(feature = "swap_cells")]
    pub fn swap_with_cells(&mut self, other: &[Cell<T>]);
}

// Note these methods all accept `&self`: they manipulate shared mutable memory.
impl<T> [Cell<T>] {
    // Uses `ptr::copy_nonoverlapping`, but if Rust ever changes to allow
    // `UnsafeCell` in `Copy` types, then this can adapt to use `ptr::copy`
    // when `T` contains `UnsafeCell`.
    #[unstable(feature = "copy_cells")]
    pub fn copy_to_cells_from_slice(&self, src: &[T]) where T: Copy;
    
    // Does a `ptr::copy` since the slices may alias.
    #[unstable(feature = "copy_cells")]
    pub fn copy_to_cells_from_cells(&self, src: &[Cell<T>]) where T: Copy;
    
    // Does a `ptr::swap_many` since the slices may alias.
    // OLD: If `self` aliases `other`, prioritize memory in `self`.
    // NEW: If `self` aliases other, panic. This is the behavior of `Cell::swap`.
    #[unstable(feature = "swap_cells")]
    pub fn swap_cells_with_cells(&self, other: &[Cell<T>]);
    
    // `fn swap_cells_with_slice(&self, other: &mut [T])` is unneeded:
    // write `other.swap_with_cells(self)` instead. Argument order
    // is irrelevant since the slices cannot alias.
}

Naming

The cell methods above aim to balance stdlib consistency and brevity in their naming. They specify the destination in the function name before the source, since that is the method argument order. They replace the term "slice of cells" with just "cells" for brevity since the context involves slices, unlike Cell::as_slice_of_cells. An alternative naming scheme could keep the term at the cost of frustratingly long names (and even these keep to_cells):

  • fn copy_from_slice_of_cells(&mut self, src: &[Cell<T>])
  • fn copy_to_cells_from_slice(&self, src: &[T]) where T: Copy
  • fn copy_to_cells_from_slice_of_cells(&self, src: &[Cell<T>]) where T: Copy
  • fn swap_with_slice_of_cells(&mut self, other: &[Cell<T>])
  • fn swap_cells_with_slice_of_cells(&self, other: &[Cell<T>])
7 Likes

The biggest reason I made ptr::swap_nonoverlapping but not ptr::swap_many is that it's very non-obvious to me what a swap should do when things are overlapping.

Even ptr::swap itself is weird for this -- can you tell me from first principles, what's supposed to happen if you swap two overlapping [u8; 4]s, say? I don't know that I could give a good justification for what I'd even want it to be.

2 Likes

The discussion in #80778 that led to Cell::swap panicking in face of overlaps is probably relevant.

4 Likes

The goal of a memory swap is to take two spans X and Y with contents A and B, and to swap them so that reading X gives B and reading Y gives A. However, if these memory locations alias, there is a conflict - it's impossible to guarantee A and B contents are preserved when reading from Y and X. When this is detected, an aliasing-aware swap algorithm can take any number of choices:

  • Refuse to swap any elements and abort (panic).
  • Don't touch the overlapping region, resulting in neither X or Y containing A or B.
  • Preserve the contents of one of the regions. This could be either A or B. ptr::swap chooses to preserve the original contents of A now located at Y, and leaves X with a combination of A and B with some elements lost and some duplicated.

The duplicated elements as a result of the last option means that swap_cells_with_cells in its current form cannot follow the logic of ptr::swap. The elements must either be Copy (not present in Cell::swap) or it must follow a different algorithm like panicking or only swapping nonoverlapping regions.

As an embedded developer, I would prefer a predictable algorithm over an additional panic path. While it's trivial to optimize out a bounds check, adding aliasing checks before this method to avoid a panic path is much more painful for embedded code. This leads me to want to keep the signature without Copy and to only swap non-overlapping regions.

I also would prefer to limit inconsistency in the stdlib, and the only two aliasing swap algorithms present in the stdlib are to panic or to dupe elements. So, no matter what, I'd be a little sad about any slice-of-cell-swapping solution but happy that at least we have one.

This is great background, thank you! Given that we know that these are [T] and we always swap whole T, we don't have to worry about corruption of individual T like Cell<T> must.

I'm not sure what you mean about "whole T" here. Why couldn't I do exactly the same things that give partially-overlapping Cells just as slices? slice::from_refing each side of a Cell::swap would uplift the scalar problem to a slice problem, and that's a safe function.

1 Like

&Cell<T> are also references to "whole" T. The issue is e.g. if you have &[Cell<[V; 2]>], those can be at an offset of odd V, thus your T=[V;2] partially overlap.

1 Like

Yes, in retrospect it is obvious - swap_cells_with_cells has the same questions regarding the soundness of mutating aliasing Cell<T>s where T has safety invariants. Cell::swap is conservative here in refusing to do an aliasing swap, leaving it open whether constructing those references is sound in the first place, since Cell::replace can wreak similar havoc on types with invariants.

So, regrettably, swap_cells_with_cells definitely should panic when the inputs overlap. Relatedly, I'd love for there to be fallible versions of copy_from_slice and kin for tools like redpen::dont_panic to give fewer false positives.

And presumably, the same should apply to copy_to_cells_from_slice - if/when UnsafeCell types can be Copy, the method should check whether the input aliases.

Would the opposite of as_slice_of_cells be sound? I guess not or it would be in std already.

I personally think it should be as sound as any other type punning of stable-layout types - it must have the same layout and has the same mutability properties, so I'm not sure what sort of invariant any library would be trying to uphold by keeping the conversion one-way. The stdlib doesn't have safe versions for many sound operations, and IMO Cell could use more love from the libs team

I also tried to go down the path of as_cell_of_slice or similar, but it doesn’t help you very much because all of Cell’s methods (correctly!) require T: Sized. So while some of these new methods could live on Cell<[T]> rather than on slices or as module-level functions, they’re still necessary.

1 Like

The reason I chose to put these on &[Cell<T>] was:

  1. Because as_slice_of_cells is currently stable and so &[Cell<T>] is the more general form,
  2. &[Cell<T>] has access to [T] methods and so is more useful (as you said), and
  3. Placing the &mut [T] <- &[T] function (copy_from_slice) nearby to &mut [T] <- &[Cell<T>] is good for discoverability. &[Cell<T>] <- &[Cell<T>] and &[Cell<T>] <- &[T] would benefit from being placed near these two methods and similar in shape.
  4. Subslicing and then performing bulk operations is much more ergonomic.

Side thought: impl<T> DerefMut<Target = [Cell<T>]> for Cell<[T]> - has anyone discussed this before? Haven't had much luck searching. An alternative would be impl Index for Cell<[T]>, but keeping most of the useful methods in one specific form seems valuable.