Are custom allocators the right abstraction?

Element vs Range is essentially whether the storage allocates a T (single element) or a [MaybeUninit<T>] (contiguous range of elements).

In BTreeSet and BTreeMap, the storage just needs to allocate one Node at a time, and that doesn't require ranges -- even if said nodes contain 4 or 6 pointers.

On the other hand, a JaggedVec could be something like:

struct JaggedVec<T> {
    length: usize,
    capacity: usize,
    elements: [NonNull<[MaybeUninit<T>]>; 32],
}

In which case it asks the storage for a range of values each time.

And bluss' CompactHashMap is implemented as:

struct CompactHashMap<K, V> {
    lookup: HashMap<K, usize>,
    values: Vec<(K, V)>,
}

So that even if HashMap and Vec only allocate a single (contiguous) range each, the CompactHashMap itself needs to allocate 2 different ranges -- although it also poses additional difficulties w.r.t. sharing the storage between the 2 collections; self-referential types are a rabbit hole I'd rather not go down here.

1 Like

After a weekend of work, storage-poc now contains generic implementations of:

  • alternative storage, which uses either the first or the second storage, one at a time.
  • fallback storage, which uses the first or the second storage, possibly both at the same time.

I took the opportunity to clean up the implementation of the small storage; it's now defined as an alternative of inline and allocator storage, or in code:

type Inner<S, A> =
    alternative::SingleElement<
        inline::SingleElement<S>,
        allocator::SingleElement<A>,
        DefaultBuilder,
        AllocatorBuilder<A>
    >;

I think the crate is in pretty good shape, and therefore that it's a good time to summarize where it stands, which I am going to do here:

Usecases unlocked

This crate demonstrates that a number of usecases are unlocked by the use of Storages, rather than the currently proposed Allocator API.

There are essentially 2 features of the crate that unlock usecases:

  1. Inline storage.
  2. Custom handles.

Inline Storage unlocks:

  • Inline collections:
    • InlineString<63> (InlineVec<u8, 63>): a String of up to 63 bytes, entirely stored inline in 64 bytes. Guaranteed never to allocate, good cache locality.
    • InlineBox<T, [usize; 4]>: a Sized type for !Sized types. Allows passing dyn Fn(...), or dyn Future<...> around without allocation, without waiting for unsized_locals.
  • Small collections, such as SmallString<N>.
  • const collections: since InlineVec is non-allocating, it should be feasible to store it in a const item, and by extension, there's no reason an InlineHashMap couldn't be stored in a const item either.

Allocators cannot allow inline storage: when the collection moves, the pointer it stores to its elements is left dangling. Storages can, as demonstrated, by relying on custom handles.
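
To illustrate the point, here is a minimal sketch (not the storage-poc definitions; names are made up): an inline single-element storage whose handle is (), and whose resolve recomputes the pointer from the storage's current address, so nothing dangles when the collection moves.

use core::mem::MaybeUninit;
use core::ptr::NonNull;

struct InlineSingleElement<T> {
    slot: MaybeUninit<T>,
}

impl<T> InlineSingleElement<T> {
    // The handle is `()`: there is nothing to remember, the slot is always there.
    fn acquire(&mut self) {}

    // The pointer is recomputed from `&mut self` on every access, so it is
    // valid only until the storage (and thus the collection) moves again.
    fn resolve(&mut self, _handle: ()) -> NonNull<T> {
        NonNull::from(&mut self.slot).cast::<T>()
    }
}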

Custom Handles, themselves, unlock at least one usecase:

  • Using Box, Vec, ... in shared memory. Storing pointers in shared memory is only possible if the shared memory is mapped at the same address in every process, which is a big constraint. With a SharedMemoryStorage which resolves the custom handle to a pointer relative to its own address, however, this problem is solved (see the sketch below).
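
A rough sketch of that resolution scheme (hypothetical names, not storage-poc code): the handle stores an offset from the base of the mapping, and each process resolves it against wherever it happened to map the segment.

use core::ptr::NonNull;

#[derive(Copy, Clone)]
struct OffsetHandle(usize);

struct SharedMemoryStorage {
    // Base address of the shared segment *in this process*.
    base: NonNull<u8>,
}

impl SharedMemoryStorage {
    // Safety: `handle` must designate a live, suitably aligned allocation
    // within the mapped segment.
    unsafe fn resolve<T>(&self, handle: OffsetHandle) -> NonNull<T> {
        unsafe { NonNull::new_unchecked(self.base.as_ptr().add(handle.0).cast::<T>()) }
    }
}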

Remaining Work

Unstable Features

The crate requires a few unstable language features:

  • specialization: inherited, as it uses rfc2580 for meta-data.
  • coerce_unsized and unsize: to manipulate unsized elements.
  • untagged_unions: for alternative's handles, maybe?
  • And the biggest: generic_associated_types which is critical to the whole type Handle<T> = ...; allowing collections not to expose their internal nodes.

It is intended to be part of the standard library; however, some features will be necessary for any user to implement the traits themselves:

  • generic_associated_types is always necessary.
  • coerce_unsized and unsize are necessary for the ElementStorage family of traits -- see below.

CoerceUnsized for Box

The RawBox implementation of the crate does not manage to implement CoerceUnsized. As a work-around, the ElementStorage requires implementing a coerce function to coerce a Handle<T> into a Handle<U>.

If the Handle<T> = NonNull<T>, then this is not a problem. The problem occurs when attempting to define a custom handle embedding the pointer meta-data instead of the pointer itself.

I've left a comment on the tracking issue of RFC2580; I believe the best solution would be for <T as Pointee>::Metadata to be coercible to <U as Pointee>::Metadata if T: Unsize<U>. Since the intent is for the Metadata types to be strongly tied to the compiler, I would expect it is technically feasible.
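
To make the issue concrete, here is a hedged illustration (nightly only, ptr_metadata feature; MetadataHandle is a made-up name): a handle that embeds only the metadata cannot be coerced from a sized type to a slice, the length has to be re-synthesized by hand -- which is exactly what a Metadata-level coercion would give us for free.

#![feature(ptr_metadata)]
use core::ptr::Pointee;

// A handle that stores the pointer metadata instead of a pointer.
struct MetadataHandle<T: ?Sized + Pointee> {
    metadata: <T as Pointee>::Metadata,
}

fn coerce(handle: MetadataHandle<[u8; 4]>) -> MetadataHandle<[u8]> {
    let () = handle.metadata;       // metadata of a Sized type is `()`
    MetadataHandle { metadata: 4 }  // the slice length must be conjured up manually
}

This only works because the length (4) is known statically at the call site; a generic coerce over T: Unsize<U> cannot do it today.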

To move forward or not to move forward?

storage-poc was always intended as Proof Of Concept to:

  1. Demonstrate the technical feasibility.
  2. Showcase collections for each usecase.
  3. Sketch out a potential API.

It has met its goals. It's pretty clearly demonstrated the feasibility, the collections are there for anyone to see, and the resulting API is pretty lean[1] yet enables all of that -- though I hold no illusion that it's perfect.

[1] The first drafts were much more crowded; I even wondered if each collection would end up requiring a specialized trait. By contrast, the current API has essentially 4 traits, in a matrix: [Multi|Single][Element|Range]Storage, and each trait has only a handful of functions, with no duplication in sight.

Now is a good time, then, to take a step back and evaluate whether to move forward or not.

I love this quote, from Antoine de Saint-Exupéry:

Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away

I believe that the usecases unlocked by the use of Storages over Allocators are compelling enough, but then since they solve problems that I have, I am more than a little biased.

On the one hand, there are strong benefits:

  1. Obsoletes many crates, among which coca, by allowing Box, BTreeMap, Vec, etc... in non-allocating contexts.
  2. Offers an alternative solution to unsized_locals and co: you could pass RawBox<dyn Future, [usize; 4]> as function parameter, or return it; you could implement a non-allocating task-queue as containing RawBox<FnOnce(), [usize; 4]>.
  3. Potentially offers a way to store BTreeMap, or HashMap as const items.

On the other hand, there are clear costs:

  1. Impact on RFC2580: I expect that it requires Metadata to be coercible, which first requires them to be strongly typed.
  2. Impact on Collections: the collections code can be moved to core, but in exchange it has to be fully overhauled to use handles rather than pointers, and to convert handles to pointers any time it actually needs the pointer.
  3. Impact on Compile-Times: most likely, the additional layer of generics will lead to a degradation of compile-times.

Also, it is important to remember that as long as RFC2580 implements coercible metadata, a userspace crate could fork all the std collections to rebase them on storages, and only the people who care would pay the cost. I find it distasteful (duplication), but pragmatically it could work rather well.

So, do we think that a sufficient number of users, and usecases, would benefit from the usage of storages to justify going forward, or not?

@TimDiekmann @RustyYato @CAD97

22 Likes

It's another nightly feature to wait on, but -Zshare-generics (or similar) should be able to mitigate some of the cost of more complicated Storage traits, by effectively providing MIR for e.g. Vec<T, AllocatorStorage<Global>>.

I'd also like to see a comparison between the compilation cost of Allocator versus Storage; I'm not completely convinced that they'd be significantly different.

Given that custom allocators are nightly only, have been for a good while, and there's been no real "coming soon" pressure on custom allocators (global alloc is "good enough" to unlock initial use cases), I think the benefit of getting it "right" with Storage outweighs the cost of waiting.

This is something we can do better than everyone else, so we should do our best to try to do so. We already have mandatory size on dealloc; we can push further for Storage.

There's also a thin tightrope we could potentially walk to stabilize Allocator without Storage. Basically, stabilize AllocatorStorage, but the only stable use is to pass it along to Vec or another collection that requires the unstable Storage trait. If it's later decided to yeet Storage, AllocatorStorage becomes a deprecated item that's just a noöp Allocator wrapper.

11 Likes

This is a great summary of your findings so far. All applicable portions are more or less in line (pun not intended) with what I found when implementing generic-vec. I think RawBox is sufficient motivation for Storage over Allocator, because there is no way to get non-allocating trait objects without it. The fact that smallbox uses a similar approach seems to back this up (specialized for Box).

I think @CAD97 already pointed this out. This is no different from adding an allocator type parameter. So there isn't any difference over the alternative.

I think it would be best to separate SingleRangeStorage and MultiRangeStorage because with SingleRangeStorage, the Storage could deallocate the memory when it drops, making RangeStorage::release unnecessary. It would also simplify the Drop code for data structures like storage_poc's RawVec<_> and make them panic-safe for free. This might also be applicable to *Storage in general, but I'm unsure about that (If it is generally applicable, then RangeStorage could stay).

This is... surprisingly reasonable, though a little roundabout. This applies equivalently to SingleElementStorage, and is easier to explain, so I'm going to walk through with that.

Originally I thought this would be unreasonable to support, as SingleElementAllocatorStorage<Alloc> obviously would have a Handle of ptr::NonNull<T>, so that Box<T, SEAS<A>> would decay back to just being { ptr: NonNull<T>, alloc: SEAS(A) }, which is basically just the current allocator-generic box. However, the storage itself could store the pointer, and provide a Handle of (), so Box<T, SEAS<S>> would be { handle: (), storage: NonNull<T> }.
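
A rough rendering of the two layouts described above (field names illustrative, not real definitions):

use core::ptr::NonNull;

// Handle = NonNull<T>: the storage is just the allocator, as with today's Box<T, A>.
struct BoxViaPointerHandle<T, A> {
    handle: NonNull<T>,
    storage: A,
}

// Handle = (): the storage itself owns the pointer.
struct BoxViaUnitHandle<T, A> {
    handle: (),
    storage: (NonNull<T>, A),
}

Both end up with essentially the same layout; the difference is purely in which half of the pair remembers the address.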

Basically, SingleElementStorage acts like Box<T>, and MultiElementStorage acts like Box<[T]> (except maybe uninit (maybe), so RawBox).


HOWEVER, I don't think this is the correct way to handle (no pun intended) it. The Storage traits should solely care about acquiring/releasing memory when asked. Let (Raw)Box<T, S>/Box<[T], S> be those types that do the dealloc on drop. This simplifies the Storages' job immensely, and reduces the cost of providing/implementing Storage. Additionally, if SingleStorage isn't in charge of releasing handles and assumes the user does, this allows MultiStorage to be a simple marker trait extension of SingleStorage that lifts the "only one live handle" restriction.

Specifically, (assuming there are no further impl restrictions, and I think this is just a slight reorganization of the existing POC traits), I think the right API is something along the lines of (modulo naming bikeshed)

// NB: Blank lines removed for compactness, also I just use usize for simplicity

/// A storage capable of storing single elements.
unsafe trait Storage {
    /// The handle used to access stored elements.
    type Handle<T: ?Sized + Pointee>: Copy;
    /// Acquire a handle managed by this storage.
    /// # Safety
    /// Only one handle may be live unless this type is `MultiStorage`.
    unsafe fn acquire<T: ?Sized + Pointee>(&mut self, meta: T::Metadata) -> Result<Self::Handle<T>, Error>;
    /// Release a handle managed by this storage.
    /// # Safety
    /// This is an unreleased handle acquired from this storage.
    /// Invalidates the handle.
    unsafe fn release<T: ?Sized + Pointee>(&mut self, handle: Self::Handle<T>);
    /// Resolve a handle managed by this storage.
    /// # Safety
    /// This is an unreleased handle acquired from this storage. The pointer is only valid
    /// until the storage is moved or `acquire`/`release` is called (for any handle).
    unsafe fn resolve<T: ?Sized + Pointee>(&self, handle: Self::Handle<T>) -> ptr::NonNull<T>;
    // helpers and coerce things
}

/// This storage supports multiple live handles.
unsafe trait MultiStorage: Storage {}

/// A storage capable of storing contiguous ranges of elements.
unsafe trait RangeStorage {
    /// The handle used to access stored elements.
    /// Knows the provided capacity.
    type Handle<T>: Copy;
    /// Acquire a handle managed by this storage, capable of holding at least `capacity` elements.
    /// # Safety
    /// Only one handle may be live unless this type is `MultiRangeStorage`.
    unsafe fn acquire<T>(&mut self, capacity: usize) -> Result<Self::Handle<T>, Error>;
    /// Release a handle managed by this storage.
    /// # Safety
    /// This is an unreleased handle acquired from this storage.
    /// Invalidates the handle.
    unsafe fn release<T>(&mut self, handle: Self::Handle<T>);
    /// Resolve a handle managed by this storage.
    /// # Safety
    /// This is an unreleased handle acquired from this storage. The pointer is only valid
    /// until the storage is moved or `acquire`/`release` is called (for any handle).
    unsafe fn resolve<T>(&self, handle: Self::Handle<T>) -> ptr::NonNull<[T]>;
    // helpers, try_grow, try_shrink, max capacity
}

/// This storage supports multiple live handles.
unsafe trait MultiRangeStorage: RangeStorage {}

(Traits need to be unsafe. Otherwise a valid impl is no-op acquire/release and ptr::null for resolve.)

I'm unsure about providing the Ts upfront to be honest, and could go either way. (The current POC requires the T upfront, my sketch just requires the pointer metadata.) Not requiring the T is probably better, as you maintain the ability to emplace dynamically sized types (for arbitrary storage). [std::alloc example]

fn resolve_mut(&mut self, handle: Self::Handle<T>) is probably required (under current stacked borrows rules), to give mutable provenance to the returned pointer even for inline storages. Either that, or any inline storage that wants to be mutable must use UnsafeCell on its internals. Everywhere I marked pointers as invalidated I am being conservative around SB; I'm not sure if it's conservative enough, tbh. Ultimately, I'm not super confident how inline storages, pointer provenance, and stacked borrows interact, and would need to do further study to gain confidence. RustBelt proved our allocation primitives sound (with std::alloc); we definitely don't want to accidentally lose that without a very clear way to gain it back.

One thing I'd be curious to know is if it's possible to collapse RangeStorage of T into just being a Storage of [T]. I'm not sure; this would require more design experimentation to see if it puts undue restrictions on storages to support both single and range allocation simultaneously (rather than, say, providing an impl of RangeStorage based on {Storage of [T]}). IIUC, RangeStorage handles are currently required to remember the capacity they provide (acquire returns ptr::NonNull<[T]>; I took this directly from the POC), rather than passing that responsibility on to the user (which would be a clear reason to split the traits; acquire would just return ptr::NonNull<T>).

Maybe it would look as simple as something like...

/// A storage that can more efficiently manage contiguous ranges of elements.
unsafe trait RangeStorage: Storage {
    /// Attempt to grow the handle to cover at least `capacity` elements.
    /// # Safety
    /// Requested capacity is >= current capacity.
    /// Invalidates resolved handle pointers.
    /// Invalidates the input handle only on success.
    unsafe fn try_grow<T>(&mut self, handle: Self::Handle<[T]>, capacity: usize) -> Result<Self::Handle<[T]>, Error>;
    /// Shrink the handle to cover at least `capacity` elements.
    /// # Safety
    /// Requested capacity is <= current capacity.
    /// Invalidates resolved handle pointers and the input handle.
    /// Output handle may have any capacity between requested and prior capacity.
    unsafe fn shrink<T>(&mut self, handle: Self::Handle<[T]>, capacity: usize) -> Self::Handle<[T]>;
}

This extension-style RangeStorage sketch makes me think that an independent RangeStorage that resolves to ptr::NonNull<T> and requires (lets) the user remember the capacity of each handle separately is probably better. (E.g. [MaybeUninit<T>; N] would just use Handle = () and resolve() => self.arr.as_mut_ptr(); a sketch follows.)
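
A minimal sketch of that example, under the assumption that the caller tracks capacity (here implicitly N):

use core::mem::MaybeUninit;
use core::ptr::NonNull;

struct InlineRange<T, const N: usize> {
    arr: [MaybeUninit<T>; N],
}

impl<T, const N: usize> InlineRange<T, N> {
    // Handle = (); the capacity (N) is the caller's responsibility to remember.
    fn resolve(&mut self, _handle: ()) -> NonNull<T> {
        NonNull::new(self.arr.as_mut_ptr().cast::<T>()).unwrap()
    }
}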

Now I should probably stop discussing this in depth, since IP ownership of stuff I do is murky at best right now. (I will go to SMU legal and get an exception for OSS if I need to but... confrontation,, and the Guildhall people seem to think the agreement doesn't apply to non-coursework anyway,,)

Ok, I haven't got time today to read that in depth, but I should have mentioned why I said this. Currently this is how Vec is implemented. Vec has a "Storage" RawVec that represents the allocation. Vec is a minimal wrapper that keeps track of how many elements are initialized. Dropping RawVec deallocates the allocation.

The Storage already hands out the Handle and provides a release mechanism, so why should it matter (for the Single* variants) if the release is done in drop or elsewhere? (There doesn't need to be a handle at all for Single* Storage)

TL;DR of the last post: doing so is basically making the single storage into a trait version of RawBox (for SingleElementStorage) / RawVec (for SingleRangeStorage). While this is a possible design path, I think it would be more useful to have the Storage traits be manually managed, make the Multi versions just an extra capability for the Storages (having multiple concurrently live handles), and let RawBox/RawVec be the RAII-ifiers wrapping the raw Storage.

In other words, I think keeping the difference between the Single/Multi versions small (to the point of it being the same acquire implementation) is more useful for the abstraction layer than automatic release of memory without another wrapping type that adds that on top.

Plus, this way the RAII is implemented once[1] for all Storages, whereas if you bake it into the storage trait contract, then all Storages have to implement the cleanup-in-Drop logic. Minor, but meaningful.
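
For concreteness, a hedged sketch of such a once-for-all RAII-ifier, written against the Storage trait sketched earlier in this thread (so it inherits the same Pointee/GAT assumptions):

struct RawAlloc<T: ?Sized + Pointee, S: Storage> {
    handle: S::Handle<T>,
    storage: S,
}

impl<T: ?Sized + Pointee, S: Storage> Drop for RawAlloc<T, S> {
    fn drop(&mut self) {
        // Safety: the handle was acquired from `storage` and is released exactly
        // once here; dropping the element itself (if initialized) is still the
        // user's responsibility before this wrapper is dropped.
        unsafe { self.storage.release(self.handle) };
    }
}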

(There might be extra caveats around Box, though, due to its current use of Unique rather than NonNull and extra magic properties...)

2 Likes

@CAD97 After reading through this new proposal, I think it does outline a better way to model these traits. Especially once the new RangeStorage is factored in.

Ok, I can see that. Maybe alongside these traits it would be possible to provide a RAII guard that cleans up on Drop for ease of use (maybe limited to Handle<()> = () or something similar).

@matthieum, what do you think about the revised proposal?

Preface: I renamed acquire/release to allocate/deallocate, to be closer to the Allocator API.


@CAD97 I tried adding support for converting from Box<T, StorageA> to Box<T, StorageB>, and given that T can be !Sized at that point, this required me to add an allocate method to ElementStorage that only takes the meta-data, not the T, so you were spot on regarding this comment.

With regard to the exact hierarchy of traits; I am not sure.

At some point in the discussion I was afraid that each data-structure would require a unique trait API. I'm very happy that I managed to distill down the requirements to end up with the 2x2 matrix (Single vs Multi and Element vs Range). It's entirely possible that further simplification is still available... but I am not sure if it's possible, or even desirable:

  • I don't think that Single vs Multi should be erased:
    • There's a strong semantic difference since Single doesn't keep track of whether it's "occupied" or not. I am slightly uncomfortable smoothing it out.
    • This difference has repercussions on the implementation: Multi requires extra tracking which is just overhead for Single, so a given storage is generally specialized for one or the other anyway -- the only exception being the allocators.
  • Unification of Element vs Range is even more complicated. Differences are:
    • T: ?Sized + Pointee vs T (Sized): yet, if considering the range as a single Element, this should work.
    • MaybeUninit: in the case of Range storage. If we change the signature of resolve (gonna steal that name...) to go from Handle<T> to NonNull<MaybeUninit<T>>, then it would be smoothed over.
    • type Capacity. The latter is critical: it's how a Vec<u8, inline::SingleRangeStorage<u8, [u8; 31]>> can take only 32 bytes. At the same time, there's no Capacity for Element Storage; it's meaningless there.

I can see building a hierarchy like:

  • Storage: Handle<T>, deallocate, and resolve.
    • ElementStorage: destroy convenience method.
      • SingleElementStorage: allocate, and create convenience method.
      • MultiElementStorage: allocate, and create convenience method.
    • RangeStorage: Capacity, try_grow, and try_shrink.
      • SingleRangeStorage: allocate.
      • MultiRangeStorage: allocate.
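
A rough rendering of that hierarchy, with abbreviated signatures (not the storage-poc definitions; AllocError stands in for whatever error type is chosen):

struct AllocError;

unsafe trait Storage {
    type Handle<T: ?Sized>: Copy;
    unsafe fn deallocate<T: ?Sized>(&mut self, handle: Self::Handle<T>);
    unsafe fn resolve<T: ?Sized>(&self, handle: Self::Handle<T>) -> core::ptr::NonNull<T>;
}

unsafe trait ElementStorage: Storage {
    // Convenience: drop the value in place, then deallocate.
    unsafe fn destroy<T: ?Sized>(&mut self, handle: Self::Handle<T>);
}

unsafe trait SingleElementStorage: ElementStorage {
    fn allocate<T>(&mut self) -> Result<Self::Handle<T>, AllocError>;
    fn create<T>(&mut self, value: T) -> Result<Self::Handle<T>, T>; // convenience
}

// MultiElementStorage mirrors SingleElementStorage; RangeStorage adds
// `type Capacity`, `try_grow` and `try_shrink`; Single/MultiRangeStorage add `allocate`.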

However I find the Storage trait rather... pointless, on its own? I don't have any usecase that would require it right now, though at a guess resolve may be useful on its own?

Imagining that we paper over the difference between Single and Multi, as uncomfortable as this makes me:

  • Storage: Handle<T>, deallocate, and resolve.
    • ElementStorage: allocate, and for convenience create and destroy.
    • RangeStorage: Capacity, allocate, try_grow, and try_shrink.

And imagining that we're okay asking the user to synthesize a SliceMeta<T> out of thin air just to call allocate:

  • Storage: Handle<T>, allocate, deallocate, and resolve.
    • ElementStorage: convenience create and destroy.
    • RangeStorage: Capacity, try_grow, and try_shrink.

But to reiterate, this seems like shoehorning to me considering that:

  • A given container has very specific requirements on the Single/Multi and Element/Range axes, and only requires one combination.
  • A given storage is tailored to a very specific case on the Single/Multi axis.

So I could see an advantage in carving out a Storage with Handle<T> and resolve. But any further attempt at simplification seems rather artificial for now.


@RustyYato I don't see how to provide Drop:

  1. The storage doesn't keep track of which element is initialized, or not, so doesn't know what to Drop.
  2. The handles would need a mutable reference to the storage to be able to drop, which we can't have if we have multiple handles.
  3. In the case of ranges, only the user knows which elements in the range are initialized or not.

So, I don't see any way to call the destructor of elements because of (3), hence the user would be responsible for that regardless. And I don't see any way to release the memory without extra tracking.

I would say that the Drop wrapper you ask for is going to be called Box, Vec, ... I am not sure there's a good opportunity for an intermediate layer.

2 Likes

It doesn't need to Drop the elements, just deallocate the storage if necessary. See RawVec in std for an example.

2 Likes

I am glad we agree that calling Drop on the elements is not possible.

How do you plan on solving the MultiStorage issue that it does not track the multiple allocations?

I only mentioned using Drop for Single*, not Multi*. But given @CAD97's comments I think it would be fine to not use Drop in this case either.

1 Like

As mentioned, there's currently one unsolved issue in storage-poc: RawBox is not CoerceUnsized.

I opened a separate discussion to track this particular issue at Should Pointee Metadata be CoerceUnsized? and would appreciate help in figuring out the best way to resolve it.

I was experimenting with my own PoC for the storage API and I found some things I wanted to share, mention and discuss here.

As I'm not a native English speaker, feel free to ask me if you can't understand some parts! :wink:

So here is what I have thought about:

  • The distinction between Single* and Multi* storage isn't really needed. In general, Storage stores information (or inline storage) shared across all allocations and Handle stores information about one specific allocation. But for Single* storages, this distinction is unnecessary because there is no shared information (or shared inline storage) at all. We can actually store everything in the handle, which would allow many live handles at any time - which would in turn erase the need for Single* storages.

    For better understanding: Inline Single* storages are implemented like this:

    Storage: inline storage
    Handle: some metadata
    

    And there is no reason why this cannot be implemented this way:

    Storage: ()
    Handle: inline storage + some metadata
    

    This way, the storage can allocate many handles and there is no additional overhead when compared to the original.

    You can see an example of this here (Ignore that the trait definition is somewhat different from the traits in storage-poc).

  • There are some problems with the current typed Storage API:

    • compatibility with future custom DST proposals: To allocate a DST using the typed API, one of the following conditions must be true:

      • There must be a Sized counterpart of the DST type (which you can CoerceUnsized).
      • You must be able to get both a valid pointer metadata (before initialising the memory) and the layout from the pointer metadata.

      Both conditions can be violated when custom DSTs come in.

    • dynamic allocations: There are some cases where you want to allocate runtime-sized memory, for example in language interpreters, game engines, data-driven systems, etc. This is not possible when the API is typed.

    So the underlying problem is that the current typed Storage does not allow runtime-sized allocations. Using Allocator instead of Storage is also not a real solution, because Allocator is less powerful than Storage (you can't implement things like shared-memory allocations, auto-defragmentation or inlined allocations with the current Allocator).

    Currently, I can see two options to solve this problem:

    • Using Layout for the Storage API instead of type parameter T

      This way, you can allocate runtime-sized memory at the cost of a more error-prone API (because it is untyped) and slightly worse performance (because you have to pass the Layout around every time).

    • Building the Allocator API also around custom handles

      This way, the Allocator trait becomes as powerful as the Storage trait, at the cost of a less ergonomic API.

    Edit: When I think about it now, there are actually no real differences between these two options, as "untyped Storage" is essentially the same as "handle-based Allocator". I'm not even sure whether there must be two separate traits for this.

    Indeed, we could just have one untyped, handle-based allocation trait, and maybe additionally a fully-typed, more ergonomic and less error-prone API implemented on top of that if we want to.

  • Currently, we can do nothing with an allocated handle; we need to first acquire the underlying pointer (which may change at any time) to do operations with the memory itself.

    One problem arises with this approach: We can't use storages in const context.

    We can't use pointers in const context and it is unlikely that this will change. This means that we can't do things like const MAP: HashMap<String, u32, InlineStorage> = { ... } although it can be done in compile-time.

    (This point isn't really problematic right now as "const collections" are not the main focus of this proposal. But we will have to decide whether to use pointers or some other approach before stabilisation; because after that, we cannot change it anymore.)

Maybe these questions are not needed right now, because the current storage proposal is only at the PoC stage; but I think the earlier we raise unresolved questions, the better we can answer them.

@matthieum: What do you think about this?


As a side note, thanks for this awesome proposal and for working on this kind of stuff!

3 Likes

Some interesting points!

I'd love to erase it, I'm just not sure how.

At the moment, the only difference implementation-wise between Single and Multi is that Single doesn't keep track of whether anything is stored, and Multi does.

This implementation-specific difference, however, is somewhat reflected in the API:

  • Single => it's up to the caller to remember whether something is stored or not.
  • Multi => the caller can keep allocating and deallocating (or trying to); if there's not enough room, they'll get an error.

So the problem is not storing state in the Handle -- that's already the case for the Multi handles, which generally store either an index or a pointer -- but deciding whether the allocation should succeed or fail.

Requiring that Single keep track of whether its storage is occupied or not means requiring that at least one bit of state be available for it, and that one bit is rounded up to the alignment of the storage (at least), so it gets rather costly. In effect, it'll often be an 8-byte overhead.
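
A concrete illustration of that overhead, assuming the storage did track occupancy itself: the extra bool gets padded up to the buffer's alignment.

use core::mem::{size_of, MaybeUninit};

struct TrackedInline<T> {
    occupied: bool,
    slot: MaybeUninit<T>,
}

fn main() {
    assert_eq!(size_of::<MaybeUninit<u64>>(), 8);
    assert_eq!(size_of::<TrackedInline<u64>>(), 16); // 8 bytes of overhead for 1 bit of state
}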

Unless I'm mistaken in my reasoning, and if I am please point it out to me. I'd be very happy to erase that distinction if there's no runtime cost.

For now, all those proposals have flunked out. It's hard enough to design an API for a known set of usecases, I'd rather not venture in speculation about an uncertain future.

Does it?

If you want to get a raw slice of memory, it seems to me you just need a loose enough type. If you have a MultiRangeStorage, you can ask for a large slice of [u8] and be on your way.

Well, it is currently missing the ability to pass a runtime alignment. I'm not sure if that's a common requirement; if necessary though the RangeStorage::allocate call could take a complete layout, rather than just a size, to enable such a usecase.

Of course, if you use raw-memory then you're on the hook for destructing whatever you place there yourself. This seems fair enough to me.

It does, that's what RangeStorage is all about. It may even support resizing existing allocations -- at the cost of potentially invalidating all current handles.

(Note: RangeStorage is enough for Vec, and you never know the size of a Vec in advance.)

I would expect this to be a temporary limitation of const contexts; I'm not too worried about it.

Thank you, I'm glad to see that people find the idea useful!

2 Likes

At the risk of stating the obvious here, the typical way to manage exclusive access at no runtime cost would be with reference lifetimes - a Handle that holds a &mut Storage. However, we require that Handle be Copy so that's not going to work.

This doesn't work for the more esoteric single-element Storage types like PosixShmemStorage<T>.

Let's look at inline storages for example:

Currently, the inline buffer is stored in the type implementing Storage. This way, you cannot have multiple handles without additional tracking, because sharing one buffer between many handles requires some bookkeeping.

But if the type implementing Storage stores nothing and the actual inline buffer is stored in the Handle, you can have multiple handles without additional tracking because different handles do not share the same buffer. Storage doesn't need to track whether it has already allocated - because the allocation will always succeed regardless of that.

This will require some changes to the API, like removing the Clone + Copy bounds from Handle and passing handles by reference instead of by value. But this should be an acceptable trade-off.

(I know that I'm not very good at explaining something so please ask me again if you still can't see the point. Then I'll try to make a more understandable comment with examples of the implementation.)

As said above, it will always succeed regardless of how many handles are allocated - at least for single inline storages. I suppose that this will be the case for all other single storages but there may be other cases too.

Oh, I had overlooked that RangeStorage allows that. :sweat_smile: Then I no longer see anything which would block Storage from being typed.

To be honest, I'm not sure whether this is a temporary limitation because of the problems pointers could cause in const contexts. But given that rust lang devs have already done many things which looked impossible for me, maybe I should not worry about it either. Const collections are not that important anyway.

I suppose you're talking about shared memory storages (if not, please correct me). I don't think this is actually a problem for shared memory allocators:

You can create different shared memory regions for different handles (with random names). Or you can set permissions in a way which will prevent creating them twice. Either way, you don't need the Single* traits to avoid overhead, because you don't need to additionally track whether you have already allocated or not.

But I know almost nothing about shared memory and I might be overlooking something. Tell me if this is the case.

I had completely missed that in your earlier explanation, and you are correct that this solves the issue of tracking in Storage.

I'm not sure, however, that it is applicable to as many usecases as Storage (or Allocator) currently is.

Think of a Linked List: so, the Storage returns a Handle<Node>, now what?

  • Where do you store it? You can't store it in a Node (infinite size).
  • How do nodes refer to it?

I am afraid it isn't.

Handle<T> is Copy because it seeks to replace NonNull<T> which is Copy. It's a crucial property for Handle<T> to be able to be used wherever NonNull<T> is used today; otherwise many collections -- starting from LinkedList -- cannot be easily ported.
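
A small sketch of why: a doubly-linked-list node stores handles to its neighbours and copies them freely while traversing, exactly as it copies NonNull<Node<T>> today (H standing for some storage's Handle type).

struct Node<T, H: Copy> {
    value: T,
    prev: Option<H>,
    next: Option<H>,
}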


A piece of advice: try using your interface.

I didn't realize all the constraints of a good Storage API from the get go, they appeared as I tried to use the API in a variety of usecases.

This is the main reason why, in storage-poc, there's almost as much code in collections as in the actual API + implementations: it's to ensure that the API actually suits the usecases.

The Storage API needs to accommodate a range of usecases:

  • Box<T> => single element, possibly !Sized.
  • Vec<T> => single range, resizable, Sized elements.
  • LinkedList<T> => multiple elements, aliased handles, Sized elements.
  • SkipList<T> => multiple elements, aliased handles, the elements are Sized, but the nodes shouldn't be.

I dearly hope I haven't overlooked a collection type which would impose new requirements/constraints on the storage.

(Note: SkipList is not implemented; hopefully, since it's just linking Boxes, it should be possible, but once again, maybe I've overlooked something critical.)

1 Like

You're right. I assumed pretty naively that Single* storages are not needed - with only one implementation, for Vec, which obviously cannot prove its suitability for other use cases. I should definitely implement other collections too, to find out whether discarding Single* storages is possible and ideal. I'll share my experience here if I have some new results.

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.