Is custom allocators the right abstraction?

matthieum · February 21, 2021, 4:06pm

After a week-end of work storage-poc now contains generic implementations of:

alternative storage, which uses either the first or the second storage, one at a time.
fallback storage, which uses either the first or the second storage, both at the same time.

I took the opportunity the clean-up the implementation of the small storage, it's now defined as an alternative of inline and allocator storage, or in the code:

type Inner<S, A> =
    alternative::SingleElement<
        inline::SingleElement<S>,
        allocator::SingleElement<A>,
        DefaultBuilder,
        AllocatorBuilder<A>
    >;

I think the crate is in a pretty good shape, and therefore that it's a good time to summarize where it stands, which I am going to do here:

Usecases unlocked

This crates demonstrates that a number of usecases are unlocked by the usage of Storages, rather than the currently proposed Allocator API.

There are essentially 2 features of the crate that unlock usecases:

Inline storage.
Custom handles.

Inline Storage unlocks:

Inline collections:
- InlineString<63> (InlineVec<u8, 63>): a String of up to 63 bytes, entirely stored inlined in 64 bytes. Guaranteed never to allocate, good cache locality.
- InlineBox<T, [usize; 4]>: a Sized type for !Sized types. Allows passing dyn Fn(...), or dyn Future<...> around without allocation, without waiting for unsized_locals.
Small collections, such as SmallString<N>.
const collections: since InlineVec is non-allocating, it should be feasible to store it in a const item, and extending, there's no reason an InlineHashMap couldn't be stored in a const item either.

Allocators cannot allow inline storage, as then when the collection moves the pointer its stores to its elements is now dangling. Storages can, as demonstrated, by relying on custom handles.

Custom Handles, themselves, unlock at least one usecase:

Using Box, Vec, ... in shared memory. Storing pointers in shared memory is only possible if the shared memory is mapped at the same address in every process, which is a big constraint. Using a SharedMemoryStorage which resolves the custom handle to a pointer relative to its own address, however, this problem is solved.

Remaining Work

Unstable Features

The crate requires a few unstable language features:

specialization is inherited: as it uses rfc2580 for meta-data.
coerce_unsized and unsize: to manipulate unsized elements.
untagged_unions: for alternative's handles, maybe?
And the biggest: generic_associated_types which is critical to the whole type Handle<T> = ...; allowing collections not to expose their internal nodes.

It is intended to be part of standard library, however some features will be necessary for any user to implement the traits themselves:

generic_associated_types is always necessary.
coerce_unsized and unsize are necessary for the ElementStorage family of traits -- see below.

CoerceUnsized for Box

The RawBox implementation of the crate does not manage to implement CoerceUnsized. As a work-around, the ElementStorage requires implementing a coerce function to coerce a Handle<T> into a Handle<U>.

If the Handle<T> = NonNull<T>, then this is not a problem. The problem occurs when attempting to define a custom handle embedding the pointer meta-data instead of the pointer itself.

I've left a comment on the tracking issue of RFC2580; I believe the best solution would be for <T as Pointee>::Metadata to be coercible to <U as Pointee>::Metadata if T: Unsize<U>. Since the intent is for the Metadata types to be strongly tied to the compiler, I would expect it is technically feasible.

To move forward or not to move forward?

storage-poc was always intended as Proof Of Concept to:

Demonstrate the technical feasibility.
Showcase collections for each usecase.
Sketch out a potential API.

It has met its goals. It's pretty clearly demonstrated the feasibility, the collections are there for anyone to see, and the resulting API is pretty lean¹ yet enabling all of that -- though I hold no illusion that it's perfect.

¹ The first drafts were much more crowded, I even wondered if each collection would end-up requiring a specialized trait. By contrast, the current API has essentially 4 traits, in a matrix: [Multi|Single][Element|Range]Storage, and each trait has only a handful of functions, with no duplication in sight.

Now is a good time, then, to take a step back and evaluate whether to move forward or not.

I love this quote, from Antoine the Saint Exupéry:

Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away

I believe that the usecases unlocked by the use of Storages over Allocators are compelling enough, but then since they solve problems that I have, I am more than a little biased.

On the one hand, there are strong benefits:

Obsoletes many crates, among which coca, by allowing Box, BTreeMap, Vec, etc... in non-allocating contexts.
Offers an alternative solution to unsized_locals and co: you could pass RawBox<dyn Future, [usize; 4]> as function parameter, or return it; you could implement a non-allocating task-queue as containing RawBox<FnOnce(), [usize; 4]>.
Potentially offers a way to store BTreeMap, or HashMap as const items.

On the other hand, there are clear costs:

Impact on RFC2580: I expect that it requires Metadata to be coercible, which first requires them to be strongly typed.
Impact on Collections: the collections code can be made core, but in exchange it has to be fully overhauled to use handles rather than pointers, and to convert handles to pointers any time it actually needs the pointer.
Impact on Compile-Times: mostly likely, the additional layer of generics will lead to a degradation of compile-times.

Also, it is important to remember that as long as RFC2580 implements coercible metadata, a userspace crate could fork all the std collections to rebase them on storages, and only the people who care would pay the cost. I find it distasteful (duplication), but pragmatically it could work rather well.

So, do we think that a sufficient number of users, and usecases, would benefit from the usage of storages to justify going forward, or not?

@TimDiekmann @RustyYato @CAD97

Topic		Replies	Views
Another alternative to allocators for Vec and other collections libs	5	1005	December 31, 2023
Combining the allocator and storages APIs libs	17	1837	August 17, 2021
[Pre-RFC] Two new structures for two different types of memory libs	4	962	November 9, 2020
Should we unsize custom handles? language design	5	1378	March 23, 2022
Retrofitting legacy code with custom allocator libs	4	320	November 3, 2024