Is custom allocators the right abstraction?

TimDiekmann · December 3, 2020, 5:14pm

The storage have to return a pointer or a reference to the current value. The trait is looking similar to this:

pub trait Storage<T: ?Sized> {
    fn as_ptr(&self) -> *const T;
    fn as_mut_ptr(&mut self) -> *mut T;
}

For clarification, I'll use AllocatorStorage<T: ?Sized, A: AllocRef>. Storage<T> is implemented for AllocatorStorage<T>, so it returns a poitner to T. Box's storage parameter defaults to AllocatorStorage<T, Global>, so calling as_ptr on the storage returns T. When creating a Box with new_uninit, the returned type is Box<MaybeUninit<T>, AllocatorStorage<MaybeUninit<T>, A>> so calling Box::assume_init requires to change the storage type. For AllocatorStorage this is basically casting the underlying pointer, but for other storages like [MaybeUninit<T>; N] a mem::transmute (actually a workaround is required) is needed. Generally, this requires memcpy: https://godbolt.org/z/47sxre

I don't know what's the plan on this issue, maybe this case can be optimized out somehow. I'd go that route for now, but I'm not super happy with it.

TimDiekmann · December 3, 2020, 5:19pm

This may be a solution. Currently I'm using associative types instead of generics anyway, so Storage may not need a type at all. For "real" collections, we could use sth. like ContiguousStorage<Value=T>.

Does the new_uninit apply to anything else than Box, Rc and Arc? In other words: does it occur in any place, where contiguous storage is a thing?

CAD97 · December 3, 2020, 5:37pm

Is there a reason that [MaybeUninit<T>; N] can't just be used as the storage for T? In general, my understanding is that storages would deal in MaybeUninit anyway, and the container would be in charge of tracking initialization. This makes working with storages unsafe, but that's not that big of a deal, since they replace working with allocation, which is always unsafe.

TimDiekmann · December 3, 2020, 5:45pm

This is working, you simply pass [MaybeUninit<T>; N] to the box, and it returns Box<T, [MaybeUninit<T>; N]>. However, how the new_uninit API would look like? Let's assume, we call it

fn new_uninit_in(storage: S) -> Box<MaybeUninit<T>, S>

What do you pass?

[MaybeUninit<T>; N] derefs to T, so it can't be used
[MaybeUninit<MaybeUninit<T>>; N] (urgh) derefs to MaybeUninit<T> (fine), but when calling Box::assume_init it requires the conversion from
```
Box<MaybeUninit<T>, [MaybeUninit<MaybeUninit<T>>; N]>
```
to
```
Box<T, [MaybeUninit<T>; N]>
```
which requires the storage to be changed.

Regardless, what is passed to new_uninit, it derefs to one specified type. Changing the type of the Box also requires the Storage to be changed to deref to the new type.

TimDiekmann · December 3, 2020, 5:52pm

I don't know if the storage-API will work for Box (and similar) in general. As you noticed, those types can be used to store dynamically sized types. Since impl Trait in return position, it's probably not strictly needed anymore, but it's still have to be supported. However implementing CoerceUnsized for Box fails, as

The trait CoerceUnsized may only be implemented for a coercion between structures with one field being coerced

but Box<T, S> don't store a field, which needs coercion, the data lives in S. Maybe I get something wrong here, I'm not a compiler expert, but it appears, that this won't work as of now.

RustyYato · December 3, 2020, 6:49pm

TimDiekmann:

What do you pass?

[MaybeUninit<T>; N] derefs to T , so it can't be used

[MaybeUninit<MaybeUninit<T>>; N] (urgh) derefs to MaybeUninit<T> (fine), but when calling Box::assume_init it requires the conversion from
Box<MaybeUninit<T>, [MaybeUninit<MaybeUninit<T>>; N]>
to
Box<T, [MaybeUninit<T>; N]>
which requires the storage to be changed.

Not necessarily, if you look on the main branch for generic-vec, you'll see that impl<T, U, const N: usize> Storage<U> for UninitArray<T, N> { ... } (here) where T and U could have very different sizes, but as long as the alignment of UninitArray<T, N> is larger than U it should be fine. For Box you'll also want the restriction that that the storage size is at least the same as size_of::<U>().

~~(note: currently I have a bug where I don't check the alignments, but that will soon be fixed)~~ now fixed

TimDiekmann · December 4, 2020, 2:38pm

I'll look into your implementation tomorrow. I think I have a solution to this, too but let's see...

Hopefully I'll find enough spare time to push this forward, it's by far the most promising proposal for collections I have seen!

I filed an issue in the WG repository.

Update:

I updated my crate to v0.1.0. It's may worth a look at RawBox and Box. Main features:

typed buffers with only two traits (Buffer and UnmanagedBuffer)
AllocatedBuffer provided
native arrays are supported
Two APIs: raw API with call-side-dependency-injection and non-raw wrapper API, which behaves like the current api
Coercion is working
Unsized types are working

adamnemecek · December 8, 2020, 3:54am

@TimDiekmann looking at the Storage trait, maybe we could consider adding AsPtr and AsMutPtr traits to the std. It feels cleaner than the Storage trait.

TimDiekmann · December 8, 2020, 10:41am

I am not sure if it is worth it. Considering that pointer dereferencing is generally unsafe and Storage (or Buffer as it is called in my crate) is just a backend that the usual user will rarely interact with, I think a Buffer trait is more flexible. Of course, an AsPtr-trait could make sense in other places too, but I don't know how well it is backwards compatible.

I would stick with the Buffer-trait for now.

carbotaniuman · December 8, 2020, 6:47pm

How does this decide whether to use internal storage or the heap storage?

Ericson2314 · December 9, 2020, 2:10am

I do fee likel @matklad's idea is a reason to switch back to &mut self or even self for allocators. With the allocators passed in at the last moment rather than stored, there is no longer the issue of storing allocators for multiple collections in a single threaded context. The dreaded AllocOnce AllocMut Alloc hierarchy can also come up based on whether it's a flat or non-flat data structure.

The main issue is it is also nice to pass in at deallocation time, but how to we prevent dropping without the allocator being passed in? This was my #1 reason for wanting linear types, years ago when I clamored about these things.

matthieum · December 9, 2020, 5:02pm

This is massively invasive, requiring to change every single API, and requiring users to follow a very strict discipline since the same allocator must be passed to every single call -- turning every single function into an unsafe function OR mandating that allocators be able to recognize their own allocations. Honestly, it seems completely impractical.

Note that even in Zig, where there is no global allocator, collections tend to capture the allocator in their constructor and use it throughout, including for destruction.

If you want to start exploring the possibility, I would kindly ask that you open a separate topic as your proposal is at the moonshot stage (not even a mini-draft/prototype, no experience that I can see) which means it'll require significant brainstorming to go anywhere.

matklad · December 9, 2020, 6:44pm

Another interesting point here is that in Zig it's easier to handle "bring your own allocator"-style collections, because it uses value-based defer, rather than type-based destructors. So you, eg, can close over a single allocator local var to drop to local vars holding vectors.

Wodann · December 24, 2020, 10:29am

@TimDiekmann, have you been able to make headway with your reference implementation? It seems like an interesting concept to explore.

Thanks to @matthieum for raising this before we finalised the allocator API. It's always valuable to explore new avenues, at any point in the development process

TimDiekmann · December 27, 2020, 12:53am

I wasn't able to spend much more time recently so the repository is at the latest state. I'll investigate this further as soon as I find some spare time.

matthieum · December 28, 2020, 4:34pm

I've been battling with trait representations.

It's easy enough to get the size and pointer of a slice, store the size and content separately, then reassemble them on the fly, however I could not find a (good) API for traits -- using transmute_copy on the unstable TraitObject representation is rather error-prone -- and more importantly I could not find a generic way to take a ?Sized object, break it down into meta-data + data, and reassemble it.

The closest I've found is the thin crate, and by default it only handles a couple of traits (by implementing a custom trait for them).

I'm thinking we may be lacking a more fundamental API on which RawBox<dyn Trait, 48> can be built.

RustyYato · December 28, 2020, 4:39pm

There isn't one! That's what all these open RFCs are trying to achieve

github.com/rust-lang/rfcs

RFC: Custom DSTs

rust-lang:master ← strega-nil:custom-dst

opened 07:28AM - 11 Nov 18 UTC

strega-nil

+480 -0

This has been a long time coming - similar to #2580, but more general. ~~On t…he unresolved question about `?Sized -> DynamicallySized`, and the need for `?DynamicallySized`. I think this is actually a reasonable change which makes `extern type`s act much nicer around generic functions that already exist, and fixes all the Rust code that has been written already. No longer is, for example, `Box<extern-type>` allowed.~~ I've chosen to make this change. Note that `DynamicallySized` is a new language trait, on the level of `Sized` and `Copy`. [Rendered](https://github.com/ubsan/rust-rfcs/blob/custom-dst/text/0000-custom-dst.md)

github.com/rust-lang/rfcs

RFC: Pointer metadata & VTable

rust-lang:master ← SimonSapin:ptr-meta

opened 06:37PM - 27 Oct 18 UTC

SimonSapin

+397 -0

Add generic APIs that allow manipulating the metadata of fat pointers: * Nami…ng the metadata’s type (as an associated type) * Extracting metadata from a pointer * Reconstructing a pointer from a data pointer and metadata * Representing vtables, the metadata for trait objects, as a type with some limited API This RFC does *not* propose a mechanism for defining custom dynamically-sized types, but tries to stay compatible with future proposals that do. [Rendered](https://github.com/rust-lang/rfcs/blob/master/text/2580-ptr-meta.md)

Even some closed ones

matthieum · December 29, 2020, 12:57pm

I left a comment on RFC 2984 which I think is the most minimal.

A straightforward extension to the RFC is all it takes to be able to break down a pointer into its raw parts (meta-data and pointer to data) and stitch them back together afterwards.

matthieum · January 3, 2021, 2:04pm

I'd like to point to you Coca's author experiments.

Of particular interest is the use of the set_ptr_value feature to reset the data-pointer of a fat-pointer. By using it, @teryror manages to handle Sized, slices, and traits in a generic manner (preserving their meta-data).

It's not ideal as the original pointer is dead-weight, adding up an extra 8 bytes, however it's working right now which is nice to further explore the API design.

Ericson2314 · January 7, 2021, 11:32pm

Oh....shit. I spent ages haranguing people about how we would need linear types for this issue. But linear types that simply require all bindings be defer-ed rather than only working with out unwinding would probably get much less push-back, and are exactly what this needs.

(To be precise, it's the unwinding that requires the defer, not the linear-typed values themselves, but people were and presumably are uncomfortable with no-unwinding changing the language that deeply, so I won't go there.)

I most certainly do not have the bandwidth to write this RFC, but someone else probably should. It will make custom allocator stuff 100x better.

Topic		Replies	Views
Pre-RFC: Storage API libs	21	3538	September 17, 2023
Combining the allocator and storages APIs libs	17	1896	August 17, 2021
[Pre-RFC] Allocators, take II ideas (deprecated)	10	4061	September 17, 2014
Pre-RFC: changing the Alloc trait	14	2002	March 25, 2019
Why "bring your own allocator" APIs are not more popular in Rust?	41	8480	July 31, 2021

Is custom allocators the right abstraction?

Update:

Related topics