Are custom allocators the right abstraction?

Preface: I renamed acquire/release to allocate/deallocate, to be closer to the Allocator API.


@CAD97 I tried adding support for converting from Box<T, StorageA> to Box<T, StorageB>, and given that T can be !Sized at that point, this required me to add an allocate method to ElementStorage that only takes the metadata, not the T, so you were spot on regarding this comment.

With regard to the exact hierarchy of traits; I am not sure.

At some point in the discussion I was afraid that each data structure would require a unique trait API. I'm very happy that I managed to distill the requirements down to the 2x2 matrix (Single vs Multi and Element vs Range). It's entirely possible that further simplification is available... but I am not sure whether it is, or even whether it's desirable:

  • I don't think that Single vs Multi should be erased:
    • There's a strong semantic difference since Single doesn't keep track of whether it's "occupied" or not. I am slightly uncomfortable smoothing it out.
    • This difference has repercussions on the implementation: Multi requires extra tracking which is just overhead for Single, so a given storage is generally specialized for one or the other anyway -- the only exception being the allocators.
  • Unification of Element vs Range is even more complicated. Differences are:
    • T: ?Sized + Pointee vs plain T; yet, if the range is considered as a single element, this should work.
    • MaybeUninit: in the case of Range storage. If we change the signature of resolve (gonna steal that name...) to go from Handle<T> to NonNull<MaybeUninit<T>>, then it would be smoothed over.
    • type Capacity. The latter is critical: it's how a Vec<u8, inline::SingleRangeStorage<u8, [u8; 31]>> can take only 32 bytes. At the same time, there's no Capacity for Element storage; it's meaningless there.

I can see building a hierarchy like:

  • Storage: Handle<T>, deallocate, and resolve.
    • ElementStorage: destroy convenience method.
      • SingleElementStorage: allocate, and create convenience method.
      • MultiElementStorage: allocate, and create convenience method.
    • RangeStorage: Capacity, try_grow, and try_shrink.
      • SingleRangeStorage: allocate.
      • MultiRangeStorage: allocate.

However, I find the Storage trait rather... pointless on its own? I don't have any use case that would require it right now, though at a guess resolve may be useful on its own?

Imagining that we paper over the difference between Single and Multi, as uncomfortable as this makes me:

  • Storage: Handle<T>, deallocate, and resolve.
    • ElementStorage: allocate, and for convenience create and destroy.
    • RangeStorage: Capacity, allocate, try_grow, and try_shrink.

And imagining that we're okay asking the user to synthesize a SliceMeta<T> out of thin air just to call allocate:

  • Storage: Handle<T>, allocate, deallocate, and resolve.
    • ElementStorage: convenience create and destroy.
    • RangeStorage: Capacity, try_grow, and try_shrink.

But to reiterate, this seems like shoehorning to me considering that:

  • A given container has very specific requirements on the Single/Multi and Element/Range axes, and only requires one combination.
  • A given storage is tailored to a very specific case on the Single/Multi axis.

So I could see an advantage in carving out a Storage with Handle<T> and resolve. But any further attempt at simplification seems rather artificial for now.


@RustyYato I don't see how to provide Drop:

  1. The storage doesn't keep track of which element is initialized, or not, so doesn't know what to Drop.
  2. The handles would need a mutable reference to the storage to be able to drop, which we can't have if we have multiple handles.
  3. In the case of ranges, only the user knows which elements in the range are initialized or not.

So, I don't see any way to call the destructor of elements because of (3), hence the user would be responsible for that regardless. And I don't see any way to release the memory without extra tracking.

I would say that the Drop wrapper you ask for is going to be called Box, Vec, ... I am not sure there's a good opportunity for an intermediate layer.


It doesn't need to Drop the elements, just deallocate the storage if necessary. See RawVec in std for an example.


I am glad we agree that calling Drop on the elements is not possible.

How do you plan to solve the issue that MultiStorage does not track its multiple allocations?

I only mentioned using Drop for Single*, not Multi*. But given @CAD97's comments I think it would be fine to not use Drop in this case either.


As mentioned, there's currently one unsolved issue in storage-poc: RawBox is not CoerceUnsized.

I opened a separate discussion to track this particular issue at Should Pointee Metadata be CoerceUnsized? and would appreciate help in figuring out the best way to resolve it.

I was experimenting with my own PoC for the storage API and I found some things I wanted to share, mention and discuss here.

As I'm not a native English speaker, feel free to ask me if you can't understand some parts! :wink:

So these are what I have thought about:

  • The distinction between Single* and Multi* storage isn't really needed. In general, Storage stores information (or an inline buffer) shared across all allocations, and Handle stores information about one specific allocation. For Single* storages this distinction is unnecessary, because there is no shared information (or shared inline buffer) at all. We could actually store everything in the handle, which would allow many live handles at any time - and in turn erase the need for Single* storage.

    For better understanding: Inline Single* storages are implemented like this:

    Storage: inline storage
    Handle: some metadata
    

    And there is no reason why this cannot be implemented this way:

    Storage: ()
    Handle: inline storage + some metadata
    

    This way, the storage can allocate many handles and there is no additional overhead compared to the original.

    You can see an example of this here (Ignore that the trait definition is somewhat different from the traits in storage-poc).

  • There are some problems with the current typed Storage API:

    • compatibility with future custom DST proposals: To allocate a DST using the typed API, one of the following conditions must be true:

      • There must be a Sized counterpart of the DST type (which you can CoerceUnsized).
      • You must be able to get both a valid pointer metadata (before initialising the memory) and the layout from the pointer metadata.

      Both conditions can be violated once custom DSTs come in.

    • dynamic allocations: There are some cases where you want to allocate runtime-sized memory, for example in language interpreters, game engines, data-driven systems, etc. This is not possible when the API is typed.

    So the underlying problem is that the current typed Storage does not allow runtime-sized allocations. Using Allocator instead of Storage is not a real solution either, because Allocator is less powerful than Storage (you can't implement things like shared-memory allocations, auto-defragmentation, or inline allocations with the current Allocator).

    Currently, I can see two options to solve this problem:

    • Using Layout for the Storage API instead of type parameter T

      This way, you can allocate runtime-sized memory at the cost of a more error-prone API (because it is untyped) and slightly worse performance (because you have to pass the Layout around every time).

    • Building the Allocator API also around custom handles

      This way, the Allocator trait becomes as powerful as the Storage trait at the cost of a less ergonomic API.

    Edit: When I think about it now, there are actually no real differences between these two options, as an "untyped Storage" is essentially the same as a "handle-based Allocator". I'm not even sure whether there must be two separate traits for this.

    Indeed, we could just have one untyped, handle-based allocation trait, and maybe additionally a fully-typed, more ergonomic and less error-prone API implemented on top of it if we want to.

  • Currently, we can do nothing with an allocated handle; we first need to resolve the underlying pointer (which may change at any time) to operate on the memory itself.

    One problem arises with this approach: We can't use storages in const context.

    We can't use pointers in const contexts and it is unlikely that this will change. This means we can't do things like const MAP: HashMap<String, u32, InlineStorage> = { ... }, even though it could be computed at compile time.

    (This point isn't really problematic right now, as "const collections" are not the main focus of this proposal. But we will have to decide whether to use pointers or some other approach before stabilisation, because after that we cannot change it anymore.)

Maybe these questions aren't needed right now, because the current storage proposal is only at the PoC stage; but I think the earlier we raise unresolved questions, the better we can answer them.

@matthieum: What do you think about this?


As a side note, thanks for this awesome proposal and for working on this kind of stuff!


Some interesting points!

I'd love to erase it, I'm just not sure how.

At the moment, the only difference implementation-wise between Single and Multi is that Single doesn't keep track of whether anything is stored, and Multi does.

This implementation-specific difference, however, is somewhat reflected in the API:

  • Single => it's up to the caller to remember whether something is stored or not.
  • Multi => the caller can keep allocating and deallocating (or at least trying to); if there's not enough room, they'll get an error.

So the problem is not storing state in the Handle -- that's already the case for MultiHandles, which generally store either an index or a pointer -- but deciding whether the allocation should succeed or fail.

Requiring that Single keep track of whether its storage is occupied means requiring at least one bit of state for it, and that one bit is rounded up to (at least) the alignment of the storage, so it gets rather costly. In effect, it'll often be an 8-byte overhead.

Unless I'm mistaken in my reasoning, and if I am please point it out to me. I'd be very happy to erase that distinction if there's no runtime cost.

For now, all those proposals have flunked out. It's hard enough to design an API for a known set of use cases; I'd rather not venture into speculation about an uncertain future.

Does it?

If you want to get a raw slice of memory, it seems to me you just need a loose enough type. If you have a MultiRangeStorage, you can ask for a large slice of [u8] and be on your way.

Well, it is currently missing the ability to pass a runtime alignment. I'm not sure whether that's a common requirement; if necessary, though, the RangeStorage::allocate call could take a complete Layout, rather than just a size, to enable such a use case.

Of course, if you use raw memory then you're on the hook for destroying whatever you place there yourself. This seems fair enough to me.

It does, that's what RangeStorage is all about. It may even support resizing existing allocations -- at the cost of potentially invalidating all current handles.

(Note: RangeStorage is enough for Vec, and you never know the size of a Vec in advance.)

I would expect this to be a temporary limitation of const contexts; I'm not too worried about it.

Thank you, I'm glad to see that people find the idea useful!


At the risk of stating the obvious here, the typical way to manage exclusive access at no runtime cost would be with reference lifetimes - a Handle that holds a &mut Storage. However, we require that Handle be Copy so that's not going to work.

This doesn't work for the more esoteric single-element Storage types like PosixShmemStorage<T>.

Let's look at inline storages for example:

Currently, the inline buffer is stored in the type implementing Storage. This way, you cannot have multiple handles without additional tracking, because sharing one buffer across many handles requires some bookkeeping.

But if the type implementing Storage stores nothing and the actual inline buffer is stored in the Handle, you can have multiple handles without additional tracking, because different handles do not share the same buffer. Storage doesn't need to track whether it has already allocated - the allocation will always succeed regardless.

This will require some changes to the API, like removing the Clone + Copy bounds from Handle and passing handles by reference instead of by value. But this should be an acceptable trade-off.

(I know that I'm not very good at explaining something so please ask me again if you still can't see the point. Then I'll try to make a more understandable comment with examples of the implementation.)

As said above, it will always succeed regardless of how many handles are allocated - at least for single inline storages. I suppose this will be the case for all other single storages, but there may be other cases too.

Oh, I had overlooked that RangeStorage allows that. :sweat_smile: Then I no longer see anything that would block Storage from being typed.

To be honest, I'm not sure whether this is a temporary limitation, because of the problems pointers could cause in const contexts. But given that the Rust lang devs have already done many things that looked impossible to me, maybe I shouldn't worry about it either. Const collections are not that important anyway.

I suppose you're talking about shared-memory storages (if not, please correct me). I don't think this approach fails for shared-memory allocators:

You can create different shared memory regions for different handles (with random names). Or you can set permissions in a way that prevents creating them twice. Either way, you don't need the Single* traits to avoid overhead, because you don't need extra tracking of whether you have already allocated.

But I know almost nothing about shared memory and I might be overlooking something. Tell me if this is the case.

I had completely missed that in your earlier explanation, and you are correct that this solves the issue of tracking in Storage.

I'm not sure, however, that it is applicable to as many use cases as Storage (or Allocator) currently is.

Think of a linked list: the Storage returns a Handle<Node>, now what?

  • Where do you store it? You can't store it in a Node (infinite size).
  • How do nodes refer to it?

I am afraid it isn't.

Handle<T> is Copy because it seeks to replace NonNull<T> which is Copy. It's a crucial property for Handle<T> to be able to be used wherever NonNull<T> is used today; otherwise many collections -- starting from LinkedList -- cannot be easily ported.


A piece of advice: try using your interface.

I didn't realize all the constraints of a good Storage API from the get-go; they appeared as I tried to use the API in a variety of use cases.

This is the main reason why, in storage-poc, there's almost as much code in collections as in the actual API + implementations => it's to ensure that the API actually suits the use cases.

The Storage API needs to accommodate a range of usecases:

  • Box<T> => single element, possibly !Sized.
  • Vec<T> => single range, resizable, Sized elements.
  • LinkedList<T> => multiple elements, aliased handles, Sized elements.
  • SkipList<T> => multiple elements, aliased handles, the elements are Sized, but the nodes shouldn't be.

I dearly hope I haven't overlooked a collection type which would impose new requirements/constraints on the storage.

(Note: SkipList is not implemented; hopefully, given that it's just linking Boxes, it should be possible, but once again, maybe I've overlooked something critical.)


You're right. I naively assumed that Single* storages are not needed, with only one implementation for Vec - which obviously cannot prove its capability for other use cases. I should definitely implement other collections too, to find out whether discarding Single* storages is possible and ideal. I'll share my experience here if I have some new results.
