As someone working in high-performance systems, I am all too keen on the ability to avoid heap allocations wherever possible, whether for cache-locality or lower-overhead.
The news that custom allocators where being rolled-out, in nightly, was therefore of great interest. See Box and Vec.
Unfortunately, the design of custom allocators prevents inline storage -- that is, a Vec<T, ???>
cannot be self-contained (only pointing within itself) because if you move it its internal pointer points to the previous location.
This is quite a stark limitation.
As mentioned I am working in high-performance systems, in C++. The custom allocators in C++ suffer from the same fundamental issue, and therefore over the last few years I have implemented a variety of Vec-like structures:
-
InlineVec<T, N>
: a Vec-like structure with inline storage and a constant capacity of N. -
SmallVec<T, N>
: a Vec-like structure with either inline storage (up to N elements) or heap-allocated storage. -
ScratchVec<T>
: a Vec-like structure receiving a slice of uninitialized memory as storage.
And of course I also use std::vector<T>
as provided by the standard library.
Up until recently, this was seen as an unfortunate, though inevitable, situation. After all custom allocators do not allow inline storage.
Recently, however, when a colleague complained that one of the Vec-like we had did not support the full breadth of functionality that ScratchVec
did, rather than duplicate the functionality, we decided to take a step back.
Out of our brainstorming RawVec<T, Storage>
was born, where Storage
defines the raw memory storage backing RawVec
, and our 3 previous implementations were entirely redefined in terms of RawVec
:
-
InlineVec<T, N>
is essentiallyRawVec<T, [MaybeUninit<T>; N]>
. -
SmallVec<T, N>
is essentiallyRawVec<T, union { [MaybeUninit<T>; N], Box<[MaybeUninit<T>> }>
. -
ScratchVec<T, N>
is essentiallyRawVec<T, &mut [MaybeUninit<T>]>
.
And there was much rejoicing.
This massively cut down on duplication, and the code for the Storage parameter is relatively simple -- though there's still trimming to do. In fact, it's so good that as we speak I am in the process of applying the same pattern to a RawVecDeque
implementation.
And so, as I was looking at roll-out of custom allocators in Box
, and Vec
, and realized that they would suffer from the same harm-stringing limitation that they have in C++, I realized that maybe Rust could avoid making the same mistake as C++, and go for a more flexible abstraction instead.
You're welcome to have a look at the PoC or to have a look at @RustyYato's similar crate generic-vec
or at the latest release of coca (0.2) which directly uses a ContiguousStorage
trait for its various containers and the explorations of its author.
I do insist on potentially more flexible abstraction. I believe it would work well for any collection with contiguous and homogeneous storage (Box
, Vec
, VecDeque
), and probably for any collection with contiguous but heterogeneous storage (HashMap
?).
It may possibly be adapted for BTreeMap
(non-contiguous) by handing out custom handles that could be converted to pointers (Storage::ref(&self, handle) -> &T
?), which is necessary for the nodes to be able to store the handles without them being invalidated on moves. The Global
allocator -- or any heap allocator -- could simply use *mut T
as handle and not suffer from any conversion drawback.
@TimDiekmann : Triggered by your comment on reddit about the impossibility to inline the allocator within Vec
due to pointer invalidation on moves.