Another alternative to allocators for Vec and other collections

mangelats · October 1, 2023, 8:25pm

For quite some time I've been working on a proof-of-concept of an abstraction for memory management for Vec and other collections to allow SoA and inline memory for fun on my free time. After many iterations I settled on this one. I think it may be useful as it is a different implementation of the ones I've seen.

Background

When I was helping with soa_derive I noticed that we were duplicating the entire Vec functionality for each struct which it should be possible to avoid. We were also managing multiple len, which shouldn't be needed because it should be the same for each array. I also think that it would be useful to be able to have a Vec on the stack.

Mental model and basic trait

After trying different solutions I realized that the main concern is where the values will reside once saved. Which begs the question: could we do an abstraction for it? This is similar to the effords in another post titled "Is custom allocators the right abstraction?" which I didn't know existed until recently (I started before it was posted). I settled with a different abstraction which I called Buffer (ideas for better names are welcomed). You can find it in the proof-of-concept repo.

A buffer is where the data will be saved. Its positions are numbered sequentially (from 0 to capacity - 1). This means that it excludes collections where the underlying memory is held by pointers (like a linked list). The collection is the one keeping track which positions are full since it depends on the collection behaviour. For isntance Vec only needs a len, but a VecDeque also needs a head. The buffer fulfills the RawVec's job, making it not necessary.

The basic trait has the following methods:

capacity: how many values can the buffer hold.
take: extracts a value from the selected position (emptying it).
put: saves the value in the selected position (filling it).
try_grow and try_shrink: tries to change the capacity.

Other methods –like getting a reference– are in other traits, but it is possible that a buffer does not implement them or not returning the same types (eg. &T isn't correct when using a SoA). Another trait is ContiguousMemoryBuffer, it tags buffers where the underlying data is in a single well-aligned contiguous memory slice.

This interface turned out to be really flexible and allows for a lot of different implementations: inline, on the heap, specifically for ZSTs, using an allocator, etc. More exotic buffers like a buffer on the GPU or on a file should be possible but haven't tried to implement them yet.

Composability

It's possible to make a buffer that depends on another one. This makes it follow the composite pattern. Using this fact I made the optimizations and other behaviour changes in different composite buffers, which allows the user to build a stack of desired behaviours. Zero-sized type optimization, small vector optimization, how it grows (eg. exponentially), and structure of arrays are examples of such buffers. Here is an example on what it looks like in my PoC (I called Vec Vector to avoid collisions with std):

type ExampleBuffer<T> = ZstoBuffer< // Optimization for types where T is a Zero-Sized Type
  ExponentialGrowthBuffer<          // Make the buffer to grow exponentially
    SvoBuffer<                      // Add Small Vector Optimization
      128,
      HeapBuffer<T>,                // Save the values on the heap (base buffer)
    >
  >
>;

let example_vector: Vector<u32, ExampleBuffer<_>> = Vector::new();

Any DerefMut also is a buffer that forwards everything to the derefed one. So Box<InlineBuffer<T, 100>> would be an fixed-sized buffer on the heap. It also enables using mutable references to buffers to be used as temporary buffers and even to reuse them.

Resizing

The collection may use try_grow and try_shrink to try to resize the buffer. They return a result type with an enum –ResizeError– for the error type. The possible errors are:

UnsupportedOperation: for example trying to resize InlineBuffer or ZstBuffer
TheoreticalLimitSurpassed: for example trying to grow more than isize::MAX on the heap.
OutOfMemory
UndistinguishableError: buffer has no information on why it failed (eg. allocators have no information on why they failed).

Performance

The PoC is unnoptimized. I tried but I was struggling to properly measure performance. I also saw some layout shifts seemingly overpowering any changes I made. I expect a lot of inlining should be added (since there are a lot of indirections) and maybe even make a composite that its only job is to make the resizing functions cold. But for this I need reliable measurements. Layout randomization seems a possible path, but that means making a new tool and that's a lot of work.

That being said, it allows the final user to tune colelctions exactly as needed and enables non-contiguous memory layouts (like SoA), so my guess is that it should be able to enable performance gains.

Due to the ammount of types the PoC has, I would expect the compilation times to be larger, but I don't even know how to benchmark it.

SkiFire13 · October 2, 2023, 9:06am

I feel like this is a non-starter for a general-purpose allocator API. It would exclude not only LinkedList, but also BTreeMap/BTreeSet and Rc/Arc.

mangelats · October 2, 2023, 12:59pm

Sorry, I think my title may be prone to missunderstanding. What I'm sharing/proposing as an alternative is the change Vec<T, A: Allocator> to Vec<T, B: Buffer>.

This API would not be a replacement to the Allocator API, but something that builts on top of it, working together. You can have a buffer that uses a custom allocator:

let vector_using_allocator = Vector::<u32, AllocatorBuffer<_, MyCustomAllocator>>::new();

AllocatorBuffer<T, A: Allocator> can be found on the proof-of-concept (source code).

In other words: allocators defines how to dynamically aquire and release parts of system memory, and buffers would allow to define where a container like Vec saves its data (which makes them different but related). LinkedList & co. would still use the Allocator API.

matthieum · October 2, 2023, 3:53pm

The composite organization is reminiscent of Alexandrescu Modern C++ Design's Allocator, which is pretty nice.

Since you referenced an old topic of mind, it's not clear if you've seen the RFC that I made (earlier this year) on the idea of replacing the Allocator abstraction with a slightly more powerful abstraction instead: Introduce the Store API for great good.

The Store API seems to solve a number of problems that Buffer could also solve, though in a more generic fashion. In particular:

The Store API allows inline memory, and hybrid inline/heap memory, for all collection types.
The Store API allows more exotic backing storage, such as in shared memory, on file, or behind network calls.

It should be straightforward to port Buffer to use the Store API instead of the Allocator API, and a number of buffers would become obsolete in the process -- such as SvoBuffer. I can only invite you to check its companion repository and see if you can easily build a Buffer on top.

What the Store API doesn't provide, and where Buffer could kick in, is storage transformation, such as SoA.

It is not clear to me, however, how ArrayBuffer fits in a Vec. The API of a Vec<T> expects to manipulate Ts (by values and references), so it's not clear how a Vec<[T; 3]> could suddenly manipulates the "columns" independently from one another.

The commit which introduced the ArrayBuffer doesn't seem to use it, in any way. Did I miss something?

mangelats · October 2, 2023, 9:16pm

I really enjoyed that Alexandrescu's talk! I'm flattered with the comparison.

Your RFC was out of my radar. I'll read it and the accompanying code but at a first glance it seems to fit nicelly. I'll definitely try to make an implementation soon

ArrayBuffer does not modify each column seperatelly but saves the underlying data as such. It's a bit of an outlier because it was part of a bigger effort to try to make a simpler SoA code. The thought was to start with an array (single type), move to a tuple (which support multiple types), and finally a structure (full SoA). I also don't know why there aren't any tests since I seem to remember to have written them, so thank you for the heads up!

system · December 31, 2023, 9:16pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Combining the allocator and storages APIs libs	17	1658	August 17, 2021
Is custom allocators the right abstraction? libs	91	25214	July 8, 2021
Pre-RFC: Storage API libs	21	2691	September 17, 2023
A size type parameter for collections	1	834	March 25, 2019
Collection Traits, Take 2	29	7320	May 8, 2019