Pre-RFC: Storage API

matthieum · May 13, 2023, 3:05pm

Quick Introduction

If you already know about the Storage API, and its motivation, feel free to skip this section.

The key idea of the Storage API is that the Allocator API is just not good enough for a variety of situations:

Allocators cannot allocate memory in-line, so ad-hoc solutions such as ArrayVec are necessary instead... which is good and all, but who here wants to re-implement hashbrown?
Allocators cannot be used in const contexts, so we can't have non-empty collections in const or static contexts. Not directly, anyway.
Allocators cannot be used with shared memory, and possibly not with segmented memory (GPUs? I don't know much about those...).

So, what if we had an Allocator-like API, which didn't suffer from these limitations? This is what the Storage API hopes to be.

Brief History

Once upon a time (2 years ago) I publicized a repository called storage-poc in order to present an alternative, or a supplement, to the Allocator API. This repository was not quite suitable for inclusion in the standard library, but it did get the conversation started... though nothing much came out of it.

A year ago, give or take, @CAD97 took it upon himself to have a serious go at creating an actually usable Storage API in his storages repository. They most notably demonstrated that my idea of typed handles was just... wrong. Simpler is better. Unfortunately, they also ran out of steam on this.

With the arrival of Spring this year I got the urge to revisit his work, and opened a PR detailing a number of decisions that I thought were flaws in his take. A discussion ensued, but... it's pretty hard to gauge ideas in the abstract.

Thus I resolved to tackle the problem again, drawing from the insights that @CAD97 had, from their work, and from our conversations.

And I think that we are, finally, close to our goal.

Pre-RFC time

So... I've got something, and I really think it looks great... but I don't know about GPUs, and I probably forget a thousands potential usecases. So I need your help in order to polish this... or throw it away.

You can find the 3rd version of a Storage API in the storage repository:

The RFC text itself is in etc/rfc.md.
The traits are in src/interface.rs.
The bridge implementation for Allocator is in src/storage/allocator_storage.rs.
And there's a few extensions, and collections, presented.

I would need help:

Evaluating whether the API is suitable in a variety of usecases. If you have a usecase in mind for which Allocator proved short, please give it a shot and report.
Improving the RFC.
Improving the API documentation (not the collections, those are tests, so to speak).

And of course I am interested in any constructive feedback.

Feel free to reply to this topic, or open issues/PRs on the associated repository as appropriate.

I'll try to reply on this topic within a day, and to issues/PRs within a week, depending on the complexity.

Jules-Bertholet · May 13, 2023, 4:08pm

It's great to see progress on this!

One issue that this draft doesn't address is dynamic dispatch. dyn Allocator is a major use-case of the Allocator trait, but Storage has a Handle associated type so dyn Storage won't work directly.

fdasfe · May 13, 2023, 4:26pm

An idea for a dyn-safe api:

pub unsafe trait SingleStorage{
    fn get_ptr(&self)->NonNull<u8>;
    unsafe fn resize(
        &mut self,
        old_layout: Layout,
        new_layout: Layout,
    ) -> Result<(Self::Handle, usize), AllocError>;
}

This avoids the need for an associated type by folding the handle into the storage.

matthieum · May 13, 2023, 7:42pm

One issue that this draft doesn't address is dynamic dispatch. dyn Allocator is a major use-case of the Allocator trait, but Storage has a Handle associated type so dyn Storage won't work directly.

It won't, indeed. I am not sure it's necessary that it does -- but please do try and convince me if you think otherwise.

First of all, dyn Storage<Handle = NonNull<u8>> + MultipleStorage + PinningStorage works, and will accommodate any usecase covered by dyn Allocator, so there's no loss of functionality in that sense. And a different handle type can be specified, for example a u32 for pointer compression in sub 4GB heaps... though at that point things are getting fairly concrete already.

The only usecase that seems hard to fulfill would be a dyn Storage with dynamic handle type. No concrete usecase comes to mind, and in the absence of those, I find myself hard-pressed to think up about potential solutions -- it's hard to see whether they'd be a good fit, or not, without a "benchmark" against which to evaluate them.

If you have any such usecase to mind, please do elaborate.

I would note that it is possible to combine concrete Storage with dyn Allocator too. If you have an algorithm taking a dyn Allocator, you can craft a "small" Storage that uses in-line storage for few elements, or defers to the dyn Allocator for larger collections.

It may be a slightly different usecase from what you have in mind, though.

CAD97 · May 13, 2023, 7:54pm

(I haven't looked at the other replies yet, just the OP, and have been drafting this for a bit.)

Interesting: StableStorage provides a different way to present some of the guarantees covered by my SharedMutabilityStorage. Without resolve_mut, though, any inline storages MUST contain shared mutability.

(Uh, as currently written, your Storage doesn't support inline storages at all, because resolve says the pointer remains valid while the handle is valid. This means that moving the storage mustn't invalidate the pointer. Similarly, &mut to the storage also must invalidate any previously resolved interior pointers. The complexity of this is a significant part of why I chose to return a proper reference, since the lifetime validity captures this accurately and in a way already understood. I'm assuming this is just an oversight for the rest of this post.)

(I'm also not certain your documented requirement for PinningStorage is sufficient; I didn't notice any documentation of invalidation caused by dropping the Storage itself anywhere. And the pinning guarantee additionally requires that memory isn't repurposed without dropping the previously pinned value, which I also see no notice of, and is absolutely a valid potential scenario when a Storage borrows its backing memory.)

I also want to note that Storage is not particularly intended to be a dyn-compatible trait. The dynamic layer would remain Allocator and/or GlobalAlloc, so optimizing out e.g. retrieval of a dynamic layout that goes unused should be fairly easy on the compiler.

To minimize the need for click through, here are overviews of the two APIs (with shortened docs, click through for more details):

storages (2022, cad97)

type Memory = [MaybeUninit<u8>];

unsafe trait Storage {
    type Handle: Copy + Ord + Hash + Unpin + Send + Sync;

    // invalidates extant handles
    fn allocate(&self, layout: Layout) -> Result<Self::Handle, AllocError>;
    unsafe fn deallocate(&self, handle: Self::Handle, layout: Layout);

    unsafe fn resolve(&self, handle: Self::Handle, layout: Layout) -> &Memory;
    unsafe fn resolve_mut(&mut self, handle: Self::Handle, layout: Layout) -> &mut Memory;

    unsafe fn grow(
        &mut self, 
        handle: Self::Handle, 
        old_layout: Layout, 
        new_layout: Layout
    ) -> Result<Self::Handle, AllocError>;
    unsafe fn shrink(
        &mut self, 
        handle: Self::Handle, 
        old_layout: Layout, 
        new_layout: Layout
    ) -> Result<Self::Handle, AllocError>;
}

// allocate no longer invalidates handles
unsafe trait MultipleStorage: Storage {
    unsafe fn resolve_many_mut<const N: usize>(
        &mut self, 
        handles: [(Self::Handle, Layout); N]
    ) -> [&mut Memory; N];
}

unsafe trait SharedMutabilityStorage: Storage {
    unsafe fn resolve_raw(
        &self, 
        handle: Self::Handle, 
        layout: Layout
    ) -> &mut Memory;
}

// memory backing a handle has a stable address
unsafe trait PinningStorage: Storage {}

storages (2023, matthieum)

unsafe trait Storage {
    type Handle: Copy;

    fn dangling() -> Self::Handle;

    // invalidates extant handles and pointers
    fn allocate(&self, layout: Layout) -> Result<(Self::Handle, usize), AllocError>;

    // invalidates extant pointers
    // valid until self is moved or mutably reborrowed
    unsafe fn resolve(&self, handle: Self::Handle) -> NonNull<u8>;
    // invalidates extant handles and pointers
    unsafe fn deallocate(&self, handle: Self::Handle, layout: Layout);

    // invalidates extant handles and pointers
    unsafe fn grow(
        &self,
        handle: Self::Handle,
        old_layout: Layout,
        new_layout: Layout,
    ) -> Result<(Self::Handle, usize), AllocError>;
    // invalidates extant handles and pointers
    unsafe fn shrink(
        &self,
        handle: Self::Handle,
        old_layout: Layout,
        new_layout: Layout,
    ) -> Result<(Self::Handle, usize), AllocError>;

    // as allocate
    fn allocate_zeroed(&self, layout: Layout) -> Result<(Self::Handle, usize), AllocError> { .. }
    // as grow
    unsafe fn grow_zeroed(
        &self,
        handle: Self::Handle,
        old_layout: Layout,
        new_layout: Layout,
    ) -> Result<(Self::Handle, usize), AllocError> { .. }
}

// methods do not invalidate extant handles
unsafe trait MultipleStorage: Storage {}

// methods do not invalidate extant pointers
// thus: handles' backing  memory is pinned if the storage is pinned
unsafe trait StableStorage: Storage {}

// moving and mutably reborrowing do not invalidate extant pointers
// thus: handles' backing memory is pinned
unsafe trait PinningStorage: StableStorage {}

The main differences:

Added _zeroed versions of allocate/grow.
Removed SharedMutabilityStorage; this is now implied by MultipleStorage + StableStorage.
Added StableStorage as a less strict PinningStorage and alternative to SharedMutabilityStorage.

Pain points I'm not super happy with:

fn dangling() makes the trait no longer dyn-safe. It's not specifically designed to be, but it's important to note the choice, at least.
Always resolving to NonNull instead of &Memory means the validity conditions need to be spelled out instead of being able to rely on lifetime invalidation to communicate it.
- But it's probably worth it for other API simplifications.
- Maybe resolvable with an extern type Memory<'a>?
No choice except to provide shared mutability.
- It's looking likely that the opsem will not track UnsafeCell precisely, so this doesn't just mean that the storage loses noalias optimization, but everything "next to" it as well, since the absence of an auto trait (Freeze) is infectious.
- I think I'd be okay with MultipleStorage requiring shared mutability, to avoid the need for
It's probably fine to drop Send + Sync as bounds on the handle, but Ord + Hash should probably stay, or at least Eq.
Since resolve doesn't get the memory layout, a "small box" allocator which chooses between inline or outlined allocation based on whether the layout fits needs to store an extra bit to remember if the allocation fits inline or not.
- You might expect a handle of Option<NonNull<u8>> to handle this fine, since the outline allocation needs to remember the heap pointer anyway, but that can be put in the inline storage space, since it's not being used.
- The vtable indirection to get dyn object size in this case is desired. The vtable is probably about to get accessed anyway, to use the object, as well, so the potential cache miss isn't too bad, since it would've occured just following anyway.
- The vtable indirection to get dyn object size for storages which don't need it should be able to optimize out as dead code fairly simply. If it doesn't, that's a fixable optimization problem, either by the optimizer learning to remember unused function parameters as meaningless even when not inlining, and/or by restructuring the storage impl to a thin inlinable impl which drops the layout and forwards to a fat noninlined part which doesn't take the layout.
- Yes, allocation is a very hot path and one that gets monomorphized and recompiled a lot, and thus small changes can have outsized impact on compile times. But Box is special-cased in the compiler already, so it's not out of the realm of possibility that the compiler could special case Box<_, AllocStorage<Global>> to bypass some machinery for the purpose of compile times. (And this special casing may have to occur even without a dead layout_from_ptr_metadata.)
There's still no way to have ArrayVec omit storing a capacity for a fixed-size storage. (I don't think this one is resolvable, unfortunately, at least not with this-decade language features^[1], and it's kinda killed most of my motivation to drive the storages API.)
We probably want to make fn dangling() const. (I'm assuming we'll allow const fn in traits for requried-const.) I don't see any potential need for it to be non-const, since it can basically be any type-valid value. (Perhaps randomizing dangling handles could be a debugging and/or hardening tool?)
(Silly) A stupid idea I had was a storage which stored the allocated objects in the handle itself. It's certainly amusing but likely impractical.

So here's a draft of a potential middle ground (also experimenting with Store instead of Storage since it kinda might make for better compound names):

compromise?

extern type Extern;
// or: type Extern = u8;
pub struct Memory<'a>(PhantomData<&'a [MaybeUninit<u8>]>, Extern);

// dropping the store invalidates handles and resolved pointers
pub unsafe trait Store {
    type Handle: Copy + Ord + Hash;

    const fn dangling() -> Self::Handle;

    // invalidates extant handles
    // invalidates extant resolved pointers
    fn allocate(&self, layout: Layout) -> Result<Self::Handle, AllocError>;

    // invalidates the passed handle
    // invalidates extant resolved pointers
    unsafe fn deallocate(&self, handle: Self::Handle, layout: Layout);

    // pointer is read valid for lifetime 'a
    unsafe fn resolve<'a>(&'a self, handle: Self::Handle, layout: Layout) -> NonNull<Memory<'a>>;

    // pointer is read/write valid for lifetime 'a
    unsafe fn resolve_mut<'a>(&a mut self, handle: Self::Handle, layout: Layout) -> NonNull<Memory<'a>>;

    // invalidates extant resolved pointers
    // if Ok, invalidates the passed handle
    unsafe fn grow(
        &self,
        handle: Self::Handle,
        old_layout: Layout,
        new_layout: Layout,
    ) -> Result<Self::Handle, AllocError>;

    // invalidates extant resolved pointers
    // if Ok, invalidates the passed handle
    unsafe fn shrink(
        &self,
        handle: Self::Handle,
        old_layout: Layout,
        new_layout: Layout,
    ) -> Result<Self::Handle, AllocError>;

    // see allocate
    fn allocate_zeroed(&self, ...) -> ... { ... }

    // see grow
    unsafe fn grow(&self, ...) -> ... { ... }
}

// resolve returns a read/write valid pointer
pub unsafe trait StoreMut: Store {}

// allocate/grow no longer invalidate extant handles
pub unsafe trait StoreMany: StoreMut {}
// NB: requiring StoreMut does eliminate some potential uses
//     but it's probably worth it to avoid resolve_many_mut

// resolve returns a pointer valid until the handle is invalidated
// (i.e. allocate/deallocate/grow/shrink don't invalidate extant pointers)
pub unsafe trait StoreStable: Store {}
// NB: requiring StoreMut would prevent &mut-based storage
// Q: what's the benefit, without StorePin? Being able to use storage-unaware
//     APIs that want to use raw pointers but don't require full pinning?

// dropping the store does not invalidate resolved pointers
// (i.e. the stored memory is pinned)
pub unsafe trait StorePin: StoreStable {}

The only of my tracked potential pain points I think not mostly addressed are ArrayVec, performance (i.e. of requiring layout for resolve), and that this is creeping up in complexity again. (But since the refinement traits are all pure marker traits, perhaps that helps rein in the complexity some?) This would still be the most complex API added to std, I think^[2], so it's going to still be a hard sell that the benefit outweighs that.

I've just graduated from grad school, so I should have a bit more time to spend on Rust again, at least for a bit. (A sync discussion on the design space could be useful.)

Pattern restricted types allowing disjoint impls to become a full covering impl set might make it possible:
- Add const FIXED_CAPACITY: Option<NonZeroUsize> = None;
- type Capacity<S: Storage<FIXED_CAPACITY: is None>> = usize;
- `type Capacity<S: Storage<FIXED_CAPACITY: is Some(_)>> = Const<usize, {S::FIXED_CAPACITY.unwrap()}>;
↩︎
Pin is a competitor, but the API itself is fairly straightforward, and it's the soundness implications of the API which are complex. I'm deliberately trying to avoid that here, though, and to present an API relatively simple in implications. ↩︎

matthieum · May 14, 2023, 11:22am

@Jules-Bertholet

Thinking more deeply on the dyn Storage: if worst-case handle alignment/size is not an issue, then it is relatively simple to create an adapter to "fit" multiple handle types into a single type.

pub struct ErasedHandle<T>(MaybeUninit<T>);

pub struct ErasedStorage<T, S>(S);

impl<T, S: Storage> ErasedStorage<T, S> {
    pub fn new(storage: S) -> Self {
         assert!(mem::align_of::<S::Handle>() <= mem::align_of::<T>());
         assert!(mem::size_of::<S::Handle>() <= mem::size_of::<T>());

         Self(storage)
    }

    fn to_erased(handle: S::Handle) -> ErasedHandle<T> {
        let mut result: ErasedHandle<T>;

        unsafe { ptr::write(&mut result as *mut _ as *mut S::Handle, handle) };

        result
    }

    fn to_stored(handle: ErasedHandle<T>) -> S::Handle {
        let mut result: S::Handle;

        unsafe { ptr::write(&mut result as *mut _ as *mut ErasedHandle<T>, handle) };

        result
    }
}

impl<T, S: Storage> Storage for ErasedStorage<T, S> {
     type Handle = ErasedHandle<T>;

     //  One-to-one mapping, erasing/recovering as necessary.
}

This allows passing a Storage with a (), u32, usize or NonNull<u8> handle to an algorithm parameterized on dyn Storage<Handle = ErasedHandle<usize>> without fuss, at the small cost of a slightly bigger handle than necessary in some cases.

matthieum · May 14, 2023, 11:53am

Definitely an oversight, I'm afraid there's a lot of word-smithing to be done.

Similarly for PinningStorage, I am not sure I am aware of all the requirements.

Indeed, though if dyn compatibility can be achieved for cheap -- even partially -- then I see no issue worming it in.

I'll add an unresolved question about whether dangling should take &self or not. It's definitely worth discussing the implications of the trade-off.

Hum... that's interesting. I hadn't thought about this issue since when writing a memory allocator it's exactly what happens -- you slice and dice a single chunk of memory -- and yet compilers are happy to consider each allocation to be independent from one another.

Would noalias not be recovered when the user creates a &mut?

Maybe? If an algorithm/container needs Eq, it's easy enough to add in the where clause so I'm not sure why it'd be so important to have it by default. On the other hand, I can't say either which usecases would be prevented by requiring Eq.

As noted in the RFC, it's an unresolved question.

For a single storage, I'd expect reserving a bit in the storage is best, indeed. I'd expect a small single storage basic implementation to be an enum.

For a multiple storage, it depends whether you want memory stability or not:

If memory should be stable, then the storage will have both in-line and off-line portions, and the handle will indicate which to fetch from.
If memory need not be stable, then there are more options.

I expect various situations will lead to various implementations.

I think this one requires a TemplateVec<T, L: Length, C: Capacity, S: Storage> with the user specifying a const capacity for in-line storage, and possibly a reduced length as well (a u8 or u16 could very well fit the bill). I am not sure whether that's a good fit for the standard library; it starts getting fringe.

I actually tried, but the compiler didn't like it

I hadn't thought about randomization for debugging/hardening. I'll add an unresolved question.

There are issues with this proposed middle ground, which I addressed in the alternatives section of the RFC; in particular:

&mut self means that you can't ever use a Storage in concurrent situations (multi-threading) without an external lock.
Requiring the layout as argument to resolve prevents thin handles -- when the metadata is not attached to the handle, but stored within the memory associated with the handle instead.
Requiring the storage to be borrowed prevents moves. It does save some potential source of trouble, but also seems restrictive; a PinningStorage guarantees that resolved pointers remain valid across moves, after all.

Is that an actual requirement for pinning? This seems overbearing, as it essentially requires that the memory be globally allocated, precluding any kind of PoolAllocator or StackAllocator!

If that's necessary, not much can be done, but I must admit I had never considered that the memory may have to remain valid past the lifetime of the Allocator/Storage which allocated it.

Thank you so much for the detailed feedback.

the8472 · May 14, 2023, 2:25pm

Would it be possible to stabilize Allocator, including Vec<T,A> now and retrofit Storages later? E.g. by then relaxing a trait bound or by adding a dummy indirection type now that could be extended later.

matthieum · May 14, 2023, 2:36pm

The goal of having all Allocator automatically implement Storage is precisely the ability to seamless switch the bounds on Vec<T, A> from A: Allocator to A: Storage without issues.

Whether this is actually possible requires an in-depth investigation, however.

For example, just this afternoon I noticed that Box::leak is built upon the assumption that as long as it doesn't drop its Allocator, then the memory remains allocated. This obviously flies right out of the window with InlineSingleStorage, so a Box<T, S: Storage> cannot provide leak without further guarantees on S. I don't think it's a deal breaker -- there's already a A: 'a bound, it would just be another bound -- but it's illustrative of the kind of assumptions that were made and that need to be revisited.

Jules-Bertholet · May 14, 2023, 4:27pm

Not with that attitude, it isn't!

How about this:

Add another associated type, AllocationSize, to the Storage trait
allocate() returns (Self::Handle, Self::AllocationSize) on successful allocation
The method Storage::get_size(size: Self::AllocationSize) -> usize allows you to get the actual size in bytes from the Self::AllocationSize
For most Storages, AllocationSize is usize and get_size is the identity function
For InlineSingleStorage<const BYTES: usize>, AllocationSize is () and get_size just returns BYTES
Vec stores an AllocationSize to keep track of capacity
ArrayVec now just stores an () in its capacity field. Profit!

matthieum · May 14, 2023, 5:21pm

That's one possibility, but it's actually not that great.

The problem that the length and capacity fields of a vector are expressed in number of elements and not expressed in number of bytes. In that sense, it means that Storage::AllocationSize would of course fit the number of elements, but may very well be overkill, and thus an implementation of Vec that truly means to minimize its footprint may wish to use a different type anyway.

And on the other hand, I could perfectly use a fixed-capacity heap-allocated vector, off the Global storage.

At this point, I believe that this is clearly an orthogonal problem to that of Storage. Whoever would parameterize Vec with appropriate Length/Capacity types would be best suited to ensure the combination is sensible.

Jules-Bertholet · May 14, 2023, 6:51pm

Just make people use a type alias: type ArrayVec<T, const N: usize> = Vec<T, InlineSingleStorage<{N * mem::size_of<T>}>.

This can be accomplished with a wrapper Storage around Global.

Abstracting over ArrayVec was the original motivation for Storage, as far as I can tell. I would hardly call it "orthogonal".

CAD97 · May 15, 2023, 3:59am

One more unrelated thought: there's two kinds of Cloneing which could be desired: one which produces an independent storage (e.g. for cloning Box^[1]) and one which produces an interchangable storage managing the same handle(s) (e.g. for cloning Arc). Unfortunately I think we'll need to have both options (i.e. so both Box and Arc can use storages).

Box<T, S>: Clone is kinda weird, and precisely specifying what it needs from cloning a storage. It's satisfied by an independent clone, but that isn't necessary, just sufficient. It's instead just that cloning the allocator must produce a storage which is capable of allocating a new handle independent of the validity of the first handle.

Drafting what I think is sufficient:

Storage: if the storage implements Clone, cloning produces an independent new storage where using either storage does not impact the validity of handles or resolved pointers from the other storage.
SharedStorage: Clone + MultipleStorage + SharedMutabilityStorage: handles from this storage and any of its clones can be used with any of this set of storages. Handles are not invalidated by dropping any of the set until the final clone is dropped.

I think in my original draft with Send + Sync handles, the idea was that <Handle as Default>::default would be the dangling()-equivalent.

Which is why SharedMutabilityStorage exists, so that those storages can. Alloc/dealloc is still &self.

This is only (and necessarily) the case for (potentially) inline storages. Stable plus shared-mutable storages lift the requirement that resolved pointers be considered invalid after moving (or uniquely reborrowing) the storage.

The intent isn't that the actual borrow lifetime must necessarily actually be carried. I fully expect/intend that the lexical lifetime to be transmuted/pointer-cast away most of the time; it's the logical lifetime which (potentially) matters, and "this lifetime would necessarily end" is the most straightforward way to communicate when the pointer validity ends (because the former is a strict subset of the latter, and that it isn't just method usage which is included, but moves and unique reborrows of the storage as well).

Mmm, that's a case I hadn't actually thought about in this context. This looks like two incompatible goals here. Both dynamically-inline storage and outline-metadata handles are desirable, and both are "solved" by the same solution of putting more data inline^[2] with the handle. (For the former, a boolean flag for inline/outline; for the latter, the pointer metadata.)

Basically, Box<T, S> requires either that T: ?Sized + MetaSized and can provide layout to S::resolve, or that T: ?Sized + ?MetaSized + DynSized and that S::resolve works without a layout. (T: ?DynSized can't be de/allocated in Rust, since their layout is unknowable to Rust.)

There might be a partial solution: the ability to ask the storage to do Layout::for_value_raw on a handle and pointer metadata.

unsafe trait Storage {
    // ...

    unsafe fn layout_for<T: ?Sized + ?MetaSized + DynSized>(&self, handle: Self::Handle, meta: <T as Pointee>::Metadata) -> Layout;
}

It's certainly far from a perfect solution — it would require the value's layout to be the originally allocated layout and requires DynSized to be a thing, plus a careful documentation of borrow/access implications — but it should be functional. (The storage would also probably prefer to only be required to provide a layout usable for resolving and deallocating the handle^[3], so fixed-size storages could just return that constant.)

Nearly; pinning requires that

for pinned data you have to maintain the invariant that its memory will not get invalidated or repurposed from the moment it gets pinned until when drop is called. Only once drop returns or panics, the memory may be reused. [...] Notice that this guarantee does not mean that memory does not leak! It is still completely okay to not ever call drop on a pinned element (e.g., you can still call mem::forget on a Pin<Box<T>>). In the example of the [intrusive] doubly-linked list, that element would just stay in the list. However you must not free or reuse the storage without calling drop.

IIRC, there's an open issue on wg-alloc about the need for a clarification about pinning for Allocator (and the potential need for a PinningAllocator refinement trait). Specifically, for Box::<T, A>::pin to be sound, dropping the (last clone of the) allocator is actually allowed to invalidate any allocated pointers, but forgetting (or leaking) the allocator must result in the remaining valid forever (since the allocator was never dropped), and importantly, this is a 'static requirement, even if the allocator itself captures a non-'static lifetime which expires.

I did write a slightly stronger than necessary requirement in that quick draft, because it's a useful sufficient bound. And actually, the StoreStable there is unsound to be implemented for a borrowing storage, since it's never mentioned that handles (and thus the resolved pointers) are invalidated when the storage is dropped, let alone when the storage is leaked and itself invalidated by lifetime expiry.

It's at least not completely orthogonal, since the ability to use a zero-sized capacity is a property of the storage (only giving out a single size of object).

While yes a perfectly size-optimal ArrayVec would use a smaller sized integer for length, capacity isn't a matter of using a smaller integer, it's a matter of just not storing it at all since it's statically known. It's a trivial size optimization to be missing. And since most types tend to be pointer-aligned anyway, using a smaller integer for length isn't even going to be a size improvement. Plus, the std Vec could also potentially use some sort of bounded integer type in the future if that exists to pick the smallest sufficiently sized integer type for the maximum capacity.

You're stuck back in the position where containers which want to be maximally generic aren't able to just be generic over S: Storage still, and need to instead be generic over some VecLike trait instead. If that's still the case, I don't think we can claim the Storage API fully succeeded at its goals.

On the other hand, the primary goal of ArrayVec is to eliminate the allocation, and perhaps a small bit of suboptimal space usage can just be acceptable. I've never seen anyone upset that Vec<ZST> is still three usize big despite being a single usize of meaningful state. (But back on the initial hand, ArrayVec<ZST> has infinite capacity and is only the length integer big.)

For clarity purposes, this'd look roughly as

struct FixedVec<T, S: Storage, const CAP: usize> {
    ptr: S::Handle,
    len: usize,
    store: A,
    _: PhantomData<T>,
}

impl<T, S: Storage, const CAP: usize> FixedVec<T, S, CAP> {
    fn try_new_in(store: S) -> Result<Self, AllocError> {
        Self {
            ptr: store.allocate(Layout::new::<[T; CAP]>())?,
            len: 0,
            store,
        }
    }

    fn try_push(&mut self, value: T) -> Result<(), T> {
        if self.len == CAP { Err(value) } else unsafe {
            self.spare_capacity_mut()[0].write(value);
            self.len += 1;
        }
    }

    fn Drop::drop(&mut self) {
        self.truncate(0);
        self.store.deallocate(self.ptr, Layout::new::<[T; CAP]>());
    }
}

... And, writing this, I made a new realization I hadn't yet: because the vector capacity also determines if there's been an allocation yet, eliminating the capacity means you have to also allocate up front in new, reserve a different marker (e.g. len == MAX or make ptr: Option), or deallocate every time you hit zero length, instead of the natural base case of a zero-sized non-allocation needing "re"allocation to fit more elements. That starts to push me towards thinking FixedVec is a distinct type from Vec.

Plus because of the const fn new, ArrayVec storage is kinda a further special case of not just Storage::Handle = () but also that de/allocate is a no-op.

That would look roughly like

struct FixedStorage<S, const LAYOUT: Layout>(S, PhantomData<T>);

impl<S, const LAYOUT: Layout> FixedStorage<S, LAYOUT> {
    fn fits(&self, layout: Layout) -> bool {
        layout.size() <= LAYOUT.size() && layout.align() <= LAYOUT.align()
    }
}

impl<S: Storage, const LAYOUT: Layout> Storage for FixedStorage {
    fn allocate(&self, layout: Layout) -> _ {
        if !self.fits(layout) { return Err(AllocError); }
        self.0.allocate(layout);
    }

    fn deallocate(&self, handle: S::Handle, _: Layout) {
        self.0.deallocate(handle, LAYOUT);
    }

    // other items elided...
}

// other forwarding impls elided...

Alright, fine...

unsafe trait Storage {
    type Capacity<T>: From<usize> + Into<usize> = usize;

    // other items elided...
}

// for resizable storage handles, just Capacity = usize
// for fixed size allocation,
struct ConstCapacity<T, const LAYOUT: Layout>(PhantomData<T>);

impl<T, const LAYOUT: Layout> From<usize> for ConstCapacity<T, LAYOUT> {
    fn from(_: usize) -> Self { Self(PhantomData) }
}

impl<T, const LAYOUT: Layout> Into<usize> for ConstCapacity<T, LAYOUT> {
    fn into(self) -> usize {
        LAYOUT.size().checked_div(size_of::<T>()).unwrap_or(usize::MAX)
    }
}

Plus you need wording to allow using any layout that "fits" the handle instead of just the layout used for allocation, to permit a layout calculated from a Capacity. And probably this should be a nonzero type.

Box can (and currently is with Allocator) gate clone on the sharing-clone and MultipleStorage instead, but it would be very unfortunate if Box<T, InlineStorage<T>> can't be cloned. ↩︎
To me at least, resolve not taking a layout feels "morally similar" to deallocate not taking a layout — the allocator/storage could just remember the layout if it needs it, but the downstream consumer probably still knows the layout and can provide it, and the allocator/storage can do its job more effectively if it gets provided the layout up front. ↩︎
E.g. "The returned layout is not necessarily that of the T stored behind this handle, but is at least that of the T and is valid to provide with the handle to any of this storage's methods." ↩︎

Jules-Bertholet · May 15, 2023, 4:36am

Ah, good catch! But I think paragraph 2 provides the solution to paragraph 1. Add an extra requirement that, if get_size(Storage::AllocationSize::default()) returns a non-zero value, than the Storage has to behave as if it contains an allocation of that size with Handle::dangling() handle, whether or not allocate was ever actually called. (Also, require all of AllocationSize::default(), Handle::dangling(), and get_size() to be const fn.)

matthieum · May 15, 2023, 4:24pm

Yes and no.

Sharing the same code for both in-line and off-line memory is indeed the core motivation.

Minimizing the memory-footprint (we're talking about shaving off 8 bytes, here) would be a nice bonus, but I don't see it as a core requirement. If it comes for free -- or cheaply -- it'd be sad to miss the opportunity, but if it requires trade-off -- here, it's directly at odd with dyn Storage -- then it's way less attractive suddenly, and if it can be accommodated with alternatives (TemplateVec) that require no change to Storage (and thus no compromise for dyn Storage) then I'd argue it's better to go for the alternative.

This doesn't mean I'm closing the door. I could see extending the storage API to return a range of the number of handles that can be used concurrently, the maximum capacity of a single allocation, etc... if they can be well-motivated. Though, I'd be inclined to add them as extension later if possible, to streamline the core RFC as much as possible.

matthieum · May 15, 2023, 5:02pm

Oh, that's a nice one.

The pointer stability requirement, independently of StableStorage is a particularly nice touch. It's a bit counter-intuitive, since in a collection in general you expect that if a key/index can be used with the original, it can be used with the clone, but a storage is a bit of a weird beast here.

Which makes me think that we may need to talk about the fungibility of handles: ie, when can a given handle be used with a different instance than the original storage. I am thinking of LinkedList::split_off, for example, which is similar to Arc (except with multiple handles).

Both independence and fungibility touch on the same problem: is the actual memory store shared between the initial instance and its "clone", or not?

If not (in-line storage): then the "clone" is actually a fresh new storage, and Box::clone just works.
If yes (heap): then the handles are fungible, but unless it's both a MultipleStorage and PinningStorage you'll run in trouble with Box::clone and LinkedList::split_off as operations on the "new" container may affect the "old".

In fact, you'll run in trouble with LinkedList::split_off unless you have a specific guarantee that handles and pointers remain valid as long as at least one "clone" of the storage exist.

I am not sure I would use Clone for either a "new" instance or a "shallow clone" instance. I feel in either case it'd be confusing. It seems like it'd be better to just add 2 new methods to Storage:

Storage::spawn(&self) -> Self: produces a new, independent, storage.
Storage::share(&self) -> Self: produces a fungible storage, all storages derived via share from the same storage act as if they were the same instance.

Those two methods do come with further questions though:

Is failure an option? Maybe the result should be Result<Self, SpawnError> and Result<Self, ShareError> instead; with Never being usable to indicate they're infallible.
Those methods are not dyn-safe, and cannot reasonably be. They can be guarded by Self: Sized so that dyn Storage is still available, at the cost of reduced functionality. Storage::share can be restored by implementing Storage for Rc<dyn Storage> and Arc<dyn Storage>, but Storage::spawn will remain out of touch I fear.

Well, well, well. That's a snag I hadn't seen coming.

I'll hold off on any answer about other question, be they dangling vs Default, borrowing of storage, thin-handle support, etc...

Those seem like bike-shedding in comparison to the above issue: any Storage RFC is dead in the water if Box and Arc can't be cloned, or LinkedList::split_off cannot work.

Therefore I'd propose we focus the discussion on this issue for now, and if/when solved, we can then tackle all the other questions.

Jules-Bertholet · May 15, 2023, 8:26pm

Not more than Handle is. It's trivial to write an AllocationSize = usize wrapper over any other AllocationSize, and then you can use that with dynamic dispatch.

matthieum · May 18, 2023, 4:49pm

@CAD97 I've been thinking further about the Box::leak, Box::clone and Rc::clone issues.

With regard to Box::leak, I believe that the PinningStorage trait is sufficient, though it should be clarified. That is, if moving the storage doesn't invalidate the pointers, then forgetting the storage doesn't either -- since it's just moving without dropping.

It could be worth specifying the guarantee in PinningStorage documentation.

With regard to Rc::clone, I think a SharedStorage trait should be created, requiring PinningStorage, and introducing the new fn share(&self) -> Result<Self, Self::ShareError>; method. This requires introducing the concept of sharing storages:

Multiple instances of Storage can be sharing together, in which case all storages belonging to the same sharing set act as a single instance:
- Handles obtained from one member of the set can be modified with another.
- Handles and pointers are only invalidated on drop when dropping the last member of the set -- wordsmithing needed to touch on forgetting/resurrecting from parts.
A freshly created storage belongs to its own dedicated sharing set.
A storage obtained by share belongs to same sharing set as the receiver of share.

Note: I double checked, and this approach meshes well with dyn usecases as one can specify &dyn SharedStorage<Handle = X>.

And that leaves Box::clone... Outside of dyn usecases, it only requires the storage to implement the Default trait. Supporting dyn usecases, however, requires a method such as fn spawn(&self) -> Self, but ironically doing so makes Storage non dyn-safe originally. Adding the where Self: Sized bound helps, but spawn remains uncallable in dyn usecases then.

It would be possible to implement Storage::spawn for some types:

&dyn MultipleStorage<Handle = H> can just return self.
Rc<dyn MultipleStorage<Handle = H>> (and the like) can return a shallow clone.

The reference case seems fairly useful, I'm not as sure about heap-allocated storages, but who knows?

Also, I did think about your idea of using Store rather than Storage, and it's growing on me.

I'll do some mass-renaming over the week-end, as well as tackle the PinningStore and SharedStore. I'll see if I use a Rc or LinkedList for the latter.

I'm quite uncertain about spawn, so I'm not sure whether I'll try to do something about it, or not.

matthieum · May 21, 2023, 1:50pm

Exploring further, I noticed that the Allocator API had completely left the issue of Rc::clone, LinkedList::append and LinkedList::split_off on the floor: these methods are only available when the allocator parameter is Global and for no other allocator.

I added SharingStore to the repository, as well as a fairly complete (minus cursors) implementation of LinkedList using SharingStore for append and split_off.

In the RFC, however, I leave SharingStore as a future extension. It's unnecessary in the first place, since it's not available with Allocator at the moment. The existence of LinkedList demonstrates the forward compatibility of the main proposal, so no rush.

CAD97 · May 21, 2023, 5:23pm

Rc/LinkedList just aren't generic over the allocator at all yet. Allocator always requires that all clones of an allocator manage the same handles (sharing storage).

Memory blocks returned from an allocator must point to valid memory and retain their validity until the instance and all of its clones are dropped, [and]

cloning or moving the allocator must not invalidate memory blocks returned from this allocator. A cloned allocator must behave like the same allocator.

Topic		Replies	Views
[Pre-RFC] Allocators, take II ideas (deprecated)	10	4051	September 17, 2014
Combining the allocator and storages APIs libs	17	1866	August 17, 2021
Gauging Interest for an Allocator Trait Working Group libs	23	2327	July 31, 2019
Notes + Interviews on Fallible Collection Allocations	53	7929	March 25, 2019
Another alternative to allocators for Vec and other collections libs	5	1037	December 31, 2023

Pre-RFC: Storage API

Quick Introduction

Brief History

Pre-RFC time

Related topics