(disclaimer: it took me a while to write this, during which I changed my mind multiple times, and I still don’t have a clear proposal, although I’m kind of converging to one, and this essentially presents all the options I’ve considered along the way, so please bear with the ramblings)
The Alloc trait is mostly specified in RFC 1398 (tracking issue) and, afaik, the largest changes to it happened with PR 49669 (tracking issue).
I’m currently reimplementing the Firefox memory allocator (mozjemalloc) in Rust. To that end, I’ve been using the Alloc trait and extracted it into a separate crate. Relatedly, I’ve worked towards making Box<T, A: Alloc> a thing in std (I have a feature branch ready to go as soon as 1.27 hits beta and is used to build nightly, although, for compiler-internals reasons, it only really works for ZST Alloc implementations).
The Alloc trait currently contains essentially three different sets of methods:
- the base allocator methods: alloc, dealloc, realloc, alloc_zeroed, grow_in_place, shrink_in_place, usable_size.
- methods that also return the actual size allocated: alloc_excess, realloc_excess.
- helper methods: alloc_one, dealloc_one, alloc_array, realloc_array, dealloc_array.
That’s both a lot and not enough. A lot, because while a typical implementor only really needs to implement a few of those (there are default implementations for all of them except alloc and dealloc), a generic wrapper has to wrap every single one of them.
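To make the wrapper burden concrete, here is a minimal sketch using the stable GlobalAlloc trait as a stand-in for Alloc (an assumption on my part: GlobalAlloc has only four methods, while Alloc has many more, so the real situation is worse). Counting is a hypothetical wrapper that tracks live allocations.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper that counts live allocations made through `A`.
pub struct Counting<A> {
    inner: A,
    live: AtomicUsize,
}

impl<A> Counting<A> {
    pub const fn new(inner: A) -> Self {
        Counting { inner, live: AtomicUsize::new(0) }
    }

    pub fn live(&self) -> usize {
        self.live.load(Ordering::Relaxed)
    }
}

// Every method is forwarded by hand. Leaving one out would silently fall
// back to the trait's default and bypass any override in `A` (for example
// a cheaper `alloc_zeroed` or an in-place `realloc`).
unsafe impl<A: GlobalAlloc> GlobalAlloc for Counting<A> {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { self.inner.alloc(layout) };
        if !ptr.is_null() {
            self.live.fetch_add(1, Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        self.live.fetch_sub(1, Ordering::Relaxed);
        unsafe { self.inner.dealloc(ptr, layout) }
    }

    unsafe fn alloc_zeroed(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { self.inner.alloc_zeroed(layout) };
        if !ptr.is_null() {
            self.live.fetch_add(1, Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn realloc(&self, ptr: *mut u8, layout: Layout, new_size: usize) -> *mut u8 {
        // Success replaces one live block with another and failure keeps the
        // old block, so the live count does not change either way.
        unsafe { self.inner.realloc(ptr, layout, new_size) }
    }
}
```

With the full Alloc trait, the same wrapper would also need to forward usable_size, grow_in_place, shrink_in_place, the *_excess methods and the helpers.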
As discussed in PR 50436, the helper methods (*_one, *_array) have their own caveats:
- The helpers are not used consistently. Just one example: raw_vec uses a mix of alloc_array and alloc/alloc_zeroed, but only uses dealloc.
- Even with consistent use of e.g. alloc_array/dealloc_array, one can still safely convert a Vec into a Box, which would then use dealloc.
- Then there are some parts of the API that just don’t exist (no zeroed version of alloc_one/alloc_array).
Something not mentioned in the PR is that their behavior for ZSTs is explicitly implementation-defined. That makes the functions essentially unusable for liballoc itself, which needs a specific behavior for ZSTs, one that doesn’t even match the default implementation: alloc_one and alloc_array don’t allow distinguishing between ZSTs (or empty arrays, for that matter) and allocation failures, so the caller would still have to do work that alloc_one/alloc_array already does, seriously limiting their usefulness.
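For illustration, this is roughly the kind of caller-side handling I mean, sketched on stable Rust with the free functions from std::alloc. The names alloc_array, dealloc_array and AllocFailed are hypothetical and only mirror the helpers discussed above; this is not the liballoc code.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::ptr::NonNull;

/// Hypothetical error type: capacity overflow or allocator failure.
#[derive(Debug, PartialEq)]
pub struct AllocFailed;

/// Allocate room for `n` values of `T`.
/// A dangling pointer in `Ok` means "nothing had to be allocated", which
/// stays distinguishable from a real failure reported as `Err`.
pub fn alloc_array<T>(n: usize) -> Result<NonNull<T>, AllocFailed> {
    let layout = Layout::array::<T>(n).map_err(|_| AllocFailed)?;
    if layout.size() == 0 {
        // ZST or empty array: never reaches the allocator.
        return Ok(NonNull::dangling());
    }
    // SAFETY: `layout` has a non-zero size.
    let raw = unsafe { alloc(layout) };
    NonNull::new(raw.cast::<T>()).ok_or(AllocFailed)
}

/// Free memory obtained from `alloc_array::<T>(n)`.
///
/// # Safety
/// `ptr` must come from `alloc_array::<T>(n)` with the same `n`.
pub unsafe fn dealloc_array<T>(ptr: NonNull<T>, n: usize) {
    let layout = Layout::array::<T>(n).expect("layout was valid at allocation time");
    if layout.size() != 0 {
        unsafe { dealloc(ptr.as_ptr().cast::<u8>(), layout) };
    }
}
```

The zero-size branch is exactly the part a caller cannot delegate when the helper's ZST behavior is implementation-defined.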
The PR only moved the methods to a separate trait, but all in all, clients of the API would probably be better off using Box<T, A: Alloc> and Vec<T, A: Alloc> (once they are a thing) rather than those helper methods, and I think the helper methods should be removed at some point.
The methods that also return the actual size allocated are typically useful to things like Vec (RawVec, really), which would set their capacity according to the actual size allocated rather than the size they requested. I actually think RawVec should do that, which immediately shows the problem: to do so, there would also need to be alloc_zeroed_excess, grow_in_place_excess and shrink_in_place_excess.
Then there is the matter of infallibility.
PR 49669 changed the error type of the various allocation methods to a ZST and removed the argument to the oom function. PR 50144 further moved the oom method to a separate language item. In Firefox, we actually need OOM handling to be able to know the size of the allocation, and those changes make this hard if not impossible. But adding the size or layout back as an argument to oom without adding it back to AllocErr (and there were good reasons for this change) requires painful juggling in callers.
In the end, I think it would be better to have variants of the allocation methods that handle OOM conditions on their own. IOW, infallible variants of the allocation methods. One way to do that would be to add alloc_infallible, etc.
So that would be three sets of allocation methods, e.g. alloc, alloc_excess, and alloc_infallible. Except it would also be desirable to have alloc_excess_infallible, and at that point the API is growing out of proportion.
One way to cut the problem in half is to just make the alloc family return an Excess, but it turns out that a) the compiler does a bad job at optimizing it out when it’s not used, and b) it makes it painful to use for callers that don’t care about it.
Another way is to add associated types, and wrapper types. Like:
trait Alloc {
    type Result = Result<NonNull<Opaque>, AllocErr>;
    unsafe fn alloc(&mut self, layout: Layout) -> Self::Result;
    // ...
}
struct Infallible<A: Alloc>(pub A);
impl<A: Alloc> Alloc for Infallible<A> {
    type Result = Result<NonNull<Opaque>, !>;
    // ...
}
but that means Alloc::Result must have a trait bound that provides everything Result can do. Which leads us to separate associated types:
trait Alloc {
    type Ok = NonNull<Opaque>;
    type Err = AllocErr;
    unsafe fn alloc(&mut self, layout: Layout) -> Result<Self::Ok, Self::Err>;
    // ...
}
struct Infallible<A: Alloc>(pub A);
impl<A: Alloc> Alloc for Infallible<A> {
    type Err = !;
    // ...
}
But now we need two trait bounds, and a lot of work for callers, and, surprisingly, implementations too, because they now need:
- to re-define both type Ok and type Err, otherwise we get errors like:
= note: expected type `unsafe fn(&mut alloc::Global, core::alloc::Layout) -> core::result::Result<<alloc::Global as core::alloc::Alloc>::Ok, <alloc::Global as core::alloc::Alloc>::Err>`
found type `unsafe fn(&mut alloc::Global, core::alloc::Layout) -> core::result::Result<core::ptr::NonNull<core::alloc::Opaque>, core::alloc::AllocErr>`
- and to re-implement the methods that have defaults in the trait. For example, this is the error you get in liballoc_system with the associated types re-defined:
error[E0399]: the following trait items need to be reimplemented as `Err` was overridden: `usable_size`, `grow_in_place`, `shrink_in_place`
  --> liballoc_system/lib.rs:53:5
   |
53 |     type Err = AllocErr;
   |     ^^^^^^^^^^^^^^^^^^^^
Another option is to have multiple traits, and wrapper types:
trait Alloc {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<NonNull<Opaque>, AllocErr>;
    // ...
}
trait InfallibleAlloc {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<NonNull<Opaque>, !>;
    // ...
}
struct Infallible<A: Alloc>(pub A);
impl<A: Alloc> InfallibleAlloc for Infallible<A> {
    // ...
}
But now there’s a burden on callers having to import types and traits.
Yet another option is to have one trait, and wrapper types:
// Since we have only one trait, we need it to provide excess *and* fallible allocations.
trait Alloc {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<Excess, AllocErr>;
    // ...
}
struct NoExcessAlloc<A: Alloc>(pub A);
impl<A: Alloc> NoExcessAlloc<A> {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<NonNull<Opaque>, AllocErr> {
        // ...
    }
    // ...
}
I think Alloc is in this position where it needs to be nice to both implementers and clients of the API. And forcing all implementors to return an Excess is not exactly nice, especially when I expect most won’t care to.
One more option is to have one generic trait with default parameters, and wrapper types:
trait Alloc<Ok = NonNull<Opaque>, Err = AllocErr> {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<Ok, Err>;
    // ...
}
struct NoExcessAlloc<A: Alloc<Excess, Err>, Err = AllocErr>(A, PhantomData<Err>);
unsafe impl<A: Alloc<Excess, Err>, Err> Alloc<NonNull<Opaque>, Err> for NoExcessAlloc<A, Err> {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<NonNull<Opaque>, Err> {
        self.0.alloc(layout).map(|Excess(ptr, _)| ptr)
    }
    // ...
}
// ...
with the necessary trait bounds, etc. There could be additional methods on the Alloc trait to transition from one kind to another, like:
fn no_excess(&self) -> impl Alloc<Excess, Err> where Self: Clone;
fn infallible(&self) -> impl Alloc<Ok, !> where Self: Clone;
// ...
I kind of liked this last version, but there are some major drawbacks:
- the infallible variant can’t actually live in the same crate as the Alloc trait, because it needs to call oom, which is in liballoc, while the trait is in libcore.
- while users can get by with only use core::alloc::Alloc, it’s not straightforward which variant they get by default for a given type that implements it.
- I’m not sure it wouldn’t be messy in the rustdoc-generated doc.
Taking a step back, the needs for excess/no_excess, fallible/infallible are not exactly the same:
- An allocator would be expected to implement either the excess variant or the no-excess variant, not even both. Ideally, we’d be able to derive the other variant from the one provided, but it actually doesn’t seem possible to do that in rust, at least not without negative bounds.
- An allocator is not actually expected to implement a fallible and an infallible variant. The infallible variant could be considered as a helper for users of the API, to avoid the hoops required.
Taking one more step back, and looking around on GitHub for users and implementers of alloc_excess, I’m starting to wonder if it’s actually necessary. (And if it’s not, then this restricts the overall problem to just infallibility.)
I found exactly two implementations of alloc_excess:
- in jemallocator
- in CtoPoolAlloc
If you look closely, you’ll note that they’re not even doing something fundamentally different from what the default implementation would do if they just provided usable_size (which they do).
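To spell out why providing usable_size is enough, here is a sketch with a cut-down stand-in for the Alloc trait (u8 pointers instead of Opaque, three methods only). The default alloc_excess is written the way I recall the real default working: allocate, then report the upper bound from usable_size. Pow2Classes is a hypothetical toy allocator with power-of-two size classes on top of the global allocator.

```rust
use std::alloc::Layout;
use std::ptr::NonNull;

#[derive(Debug)]
pub struct AllocErr;

#[derive(Debug)]
pub struct Excess(pub NonNull<u8>, pub usize);

/// Cut-down stand-in for the unstable `Alloc` trait.
pub unsafe trait Alloc {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<NonNull<u8>, AllocErr>;
    unsafe fn dealloc(&mut self, ptr: NonNull<u8>, layout: Layout);

    /// (lower, upper) bounds on the usable size of a block for `layout`.
    fn usable_size(&self, layout: &Layout) -> (usize, usize) {
        (layout.size(), layout.size())
    }

    /// Default: allocate, then report the upper usable-size bound.
    unsafe fn alloc_excess(&mut self, layout: Layout) -> Result<Excess, AllocErr> {
        let usable = self.usable_size(&layout);
        let result = unsafe { self.alloc(layout) };
        result.map(|ptr| Excess(ptr, usable.1))
    }
}

/// Hypothetical allocator rounding every request up to a power of two.
pub struct Pow2Classes;

impl Pow2Classes {
    fn class_layout(layout: &Layout) -> Result<Layout, AllocErr> {
        let size = layout.size().checked_next_power_of_two().ok_or(AllocErr)?;
        Layout::from_size_align(size, layout.align()).map_err(|_| AllocErr)
    }
}

// Only `usable_size` is overridden; `alloc_excess` comes for free.
unsafe impl Alloc for Pow2Classes {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<NonNull<u8>, AllocErr> {
        let class = Self::class_layout(&layout)?;
        NonNull::new(unsafe { std::alloc::alloc(class) }).ok_or(AllocErr)
    }

    unsafe fn dealloc(&mut self, ptr: NonNull<u8>, layout: Layout) {
        let class = Self::class_layout(&layout).expect("layout was valid at allocation time");
        unsafe { std::alloc::dealloc(ptr.as_ptr(), class) };
    }

    fn usable_size(&self, layout: &Layout) -> (usize, usize) {
        let upper = Self::class_layout(layout).map_or(layout.size(), |l| l.size());
        (layout.size(), upper)
    }
}
```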
Now, on the other end of the chain, the only code I found that is actively using the extra value in Excess is… libstd code from old forks of the rust repo, which has since been removed.
So, barely any implementors, and no users currently. So let’s hypothesize what a user of the API would want to do. As already mentioned, a good candidate would be RawVec, and it could do something like:
fn allocate_in(cap: usize, zeroed: bool, mut a: A) -> Self {
    unsafe {
        let mut cap = cap;
        let elem_size = mem::size_of::<T>();
        let alloc_size = cap.checked_mul(elem_size).unwrap_or_else(|| capacity_overflow());
        alloc_guard(alloc_size).unwrap_or_else(|_| capacity_overflow());
        // handles ZSTs and `cap = 0` alike
        let ptr = if alloc_size == 0 {
            NonNull::<T>::dangling().as_opaque()
        } else {
            let align = mem::align_of::<T>();
            let result = if zeroed {
                a.alloc_zeroed_excess(Layout::from_size_align(alloc_size, align).unwrap())
            } else {
                a.alloc_excess(Layout::from_size_align(alloc_size, align).unwrap())
            };
            match result {
                Ok(Excess(ptr, size)) => {
                    cap = size / elem_size;
                    ptr
                }
                Err(_) => oom(),
            }
        };
        RawVec {
            ptr: ptr.cast().into(),
            cap,
            a,
        }
    }
}
As mentioned earlier, alloc_zeroed_excess, which doesn’t exist currently, would be necessary here. Now the question I want to ask is whether there’s much added value in the above, compared to something like the following:
fn allocate_in(cap: usize, zeroed: bool, mut a: A) -> Self {
    unsafe {
        let mut cap = cap;
        let elem_size = mem::size_of::<T>();
        let alloc_size = cap.checked_mul(elem_size).unwrap_or_else(|| capacity_overflow());
        alloc_guard(alloc_size).unwrap_or_else(|_| capacity_overflow());
        // handles ZSTs and `cap = 0` alike
        let ptr = if alloc_size == 0 {
            NonNull::<T>::dangling().as_opaque()
        } else {
            let align = mem::align_of::<T>();
            let layout = Layout::from_size_align(alloc_size, align).unwrap();
            // `usable_size` takes a `&Layout` and returns (lower, upper) bounds
            cap = a.usable_size(&layout).1 / elem_size;
            let layout = Layout::from_size_align(cap * elem_size, align).unwrap();
            let result = if zeroed {
                a.alloc_zeroed(layout)
            } else {
                a.alloc(layout)
            };
            match result {
                Ok(ptr) => ptr,
                Err(_) => oom(),
            }
        };
        RawVec {
            ptr: ptr.cast().into(),
            cap,
            a,
        }
    }
}
Well, the code is a little simpler with alloc_excess, but there could be helpers in Layout instead of duplicating all the allocating methods in the Alloc trait.
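As a rough idea of what such a helper could look like (entirely hypothetical, nothing like it exists in Layout today), here is a free function on stable Rust that takes the requested capacity plus a usable_size-style function and returns the capacity and layout to actually request. Both fit_array and pow2_classes are names I made up for the sketch.

```rust
use std::alloc::Layout;
use std::mem;

/// Hypothetical helper: grow an array request for `cap` elements of `T`
/// to whatever the allocator would hand out anyway, according to
/// `usable_size`. Returns the adjusted capacity and matching layout,
/// or `None` on arithmetic overflow.
pub fn fit_array<T>(
    cap: usize,
    usable_size: impl Fn(&Layout) -> (usize, usize),
) -> Option<(usize, Layout)> {
    let layout = Layout::array::<T>(cap).ok()?;
    if layout.size() == 0 {
        // ZST or `cap == 0`: there is nothing to grow into.
        return Some((cap, layout));
    }
    let (_, upper) = usable_size(&layout);
    // Never shrink below the request, even if `usable_size` misbehaves.
    let fitted_cap = (upper / mem::size_of::<T>()).max(cap);
    let fitted = Layout::array::<T>(fitted_cap).ok()?;
    Some((fitted_cap, fitted))
}

/// Stand-in for an allocator's `usable_size`: power-of-two size classes.
pub fn pow2_classes(layout: &Layout) -> (usize, usize) {
    let size = layout.size();
    (size, size.checked_next_power_of_two().unwrap_or(size))
}
```

With something like this, RawVec would compute the layout once and then call plain alloc or alloc_zeroed, which is the shape of the second allocate_in above.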
So, with all that being said, I would like to propose the following changes:
- remove alloc_one, dealloc_one, alloc_array, realloc_array, dealloc_array, either now or after Box<T, A> and Vec<T, A> are actually a thing.
- remove alloc_excess and realloc_excess.
- add (to be defined) helpers to Layout to help adjust Layouts to better fit usable_size.
- make the oom language item take a Layout argument.
- add a wrapper type to liballoc for infallibility, like the following:
pub struct Infallible<A: Alloc>(A);
impl<A: Alloc> Infallible<A> {
    pub fn new(a: A) -> Self {
        Infallible(a)
    }
    pub fn into_inner(self) -> A {
        self.0
    }
    pub unsafe fn infallible_alloc(&mut self, layout: Layout) -> Result<NonNull<Opaque>, !> {
        self.0.alloc(layout).map_err(|_| oom(layout))
    }
    pub unsafe fn infallible_realloc(
        &mut self,
        ptr: NonNull<Opaque>,
        layout: Layout,
        new_size: usize,
    ) -> Result<NonNull<Opaque>, !> {
        self.0.realloc(ptr, layout, new_size)
            .map_err(|_| oom(Layout::from_size_align_unchecked(new_size, layout.align())))
    }
    pub unsafe fn infallible_alloc_zeroed(&mut self, layout: Layout) -> Result<NonNull<Opaque>, !> {
        self.0.alloc_zeroed(layout).map_err(|_| oom(layout))
    }
}
- In order to allow infallible allocations to be used anywhere that accepts an Alloc, implement the trait:
unsafe impl<A: Alloc> Alloc for Infallible<A> {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<NonNull<Opaque>, AllocErr> {
        self.infallible_alloc(layout).map_err(|_| AllocErr)
    }
    unsafe fn dealloc(&mut self, ptr: NonNull<Opaque>, layout: Layout) {
        self.0.dealloc(ptr, layout)
    }
    unsafe fn realloc(
        &mut self,
        ptr: NonNull<Opaque>,
        layout: Layout,
        new_size: usize,
    ) -> Result<NonNull<Opaque>, AllocErr> {
        self.infallible_realloc(ptr, layout, new_size).map_err(|_| AllocErr)
    }
    unsafe fn alloc_zeroed(&mut self, layout: Layout) -> Result<NonNull<Opaque>, AllocErr> {
        self.infallible_alloc_zeroed(layout).map_err(|_| AllocErr)
    }
    fn usable_size(&self, layout: &Layout) -> (usize, usize) {
        self.0.usable_size(layout)
    }
    unsafe fn grow_in_place(
        &mut self,
        ptr: NonNull<Opaque>,
        layout: Layout,
        new_size: usize,
    ) -> Result<(), CannotReallocInPlace> {
        self.0.grow_in_place(ptr, layout, new_size)
    }
    unsafe fn shrink_in_place(
        &mut self,
        ptr: NonNull<Opaque>,
        layout: Layout,
        new_size: usize,
    ) -> Result<(), CannotReallocInPlace> {
        self.0.shrink_in_place(ptr, layout, new_size)
    }
}
Now, if you’re a type like e.g. RawVec<T, A>, and you want explicitly infallible allocations (which is what it does on most of its allocations, except in the recently added try_reserve* methods), you can’t move self.a to create an Infallible instance. So should methods using Infallible have a bound on Alloc + Clone? Or should there be a blanket unsafe impl<A: Alloc> Alloc for &mut A, so that one can use Infallible(&mut self.a)?
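Here is a small sketch of the second option, to check that it hangs together. It uses a cut-down stand-in for the Alloc trait (two methods, u8 pointers) and the stable handle_alloc_error in place of oom; Counted and Holder are hypothetical types, with Holder playing the role of RawVec<T, A>.

```rust
use std::alloc::{handle_alloc_error, Layout};
use std::ptr::NonNull;

#[derive(Debug)]
pub struct AllocErr;

/// Cut-down stand-in for the unstable `Alloc` trait.
pub unsafe trait Alloc {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<NonNull<u8>, AllocErr>;
    unsafe fn dealloc(&mut self, ptr: NonNull<u8>, layout: Layout);
}

// The blanket impl in question: a mutable borrow of an allocator is
// itself an allocator. The explicit `**self` avoids infinite recursion.
unsafe impl<A: Alloc + ?Sized> Alloc for &mut A {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<NonNull<u8>, AllocErr> {
        unsafe { (**self).alloc(layout) }
    }
    unsafe fn dealloc(&mut self, ptr: NonNull<u8>, layout: Layout) {
        unsafe { (**self).dealloc(ptr, layout) }
    }
}

pub struct Infallible<A: Alloc>(pub A);

impl<A: Alloc> Infallible<A> {
    pub unsafe fn infallible_alloc(&mut self, layout: Layout) -> NonNull<u8> {
        match unsafe { self.0.alloc(layout) } {
            Ok(ptr) => ptr,
            Err(AllocErr) => handle_alloc_error(layout),
        }
    }
}

/// Toy allocator forwarding to the global allocator and counting calls.
#[derive(Default)]
pub struct Counted {
    pub allocs: usize,
}

unsafe impl Alloc for Counted {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<NonNull<u8>, AllocErr> {
        self.allocs += 1;
        NonNull::new(unsafe { std::alloc::alloc(layout) }).ok_or(AllocErr)
    }
    unsafe fn dealloc(&mut self, ptr: NonNull<u8>, layout: Layout) {
        unsafe { std::alloc::dealloc(ptr.as_ptr(), layout) };
    }
}

/// Owns its allocator, like `RawVec<T, A>` does.
pub struct Holder<A: Alloc> {
    pub a: A,
}

impl<A: Alloc> Holder<A> {
    pub unsafe fn alloc_or_abort(&mut self, layout: Layout) -> NonNull<u8> {
        // Borrow the allocator: no move out of `self`, no `Clone` bound.
        unsafe { Infallible(&mut self.a).infallible_alloc(layout) }
    }
}
```

The Alloc + Clone alternative would instead require every allocator used this way to be cheaply clonable, which is a stronger demand on implementors.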
Come to think of it, adding a trait in liballoc for infallible allocations might work better:
trait InfallibleAlloc: Alloc {
    unsafe fn infallible_alloc(&mut self, layout: Layout) -> Result<NonNull<Opaque>, !> {
        self.alloc(layout).map_err(|_| oom(layout))
    }
    unsafe fn infallible_realloc(
        &mut self,
        ptr: NonNull<Opaque>,
        layout: Layout,
        new_size: usize,
    ) -> Result<NonNull<Opaque>, !> {
        self.realloc(ptr, layout, new_size)
            .map_err(|_| oom(Layout::from_size_align_unchecked(new_size, layout.align())))
    }
    unsafe fn infallible_alloc_zeroed(&mut self, layout: Layout) -> Result<NonNull<Opaque>, !> {
        self.alloc_zeroed(layout).map_err(|_| oom(layout))
    }
}
impl<A: Alloc> InfallibleAlloc for A {}
On one hand, that avoids the problem with self.a not being movable. On the other hand, it doesn’t allow using fallible and infallible allocators interchangeably anywhere an Alloc is expected (as in, forcing infallible allocations on a collection type that normally does fallible allocations). For the latter, however, it is still possible to add the same-ish wrapper type:
pub struct Infallible<A: InfallibleAlloc>(A);
impl<A: InfallibleAlloc> Infallible<A> {
    pub fn new(a: A) -> Self {
        Infallible(a)
    }
    pub fn into_inner(self) -> A {
        self.0
    }
}
unsafe impl<A: InfallibleAlloc> Alloc for Infallible<A> {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<NonNull<Opaque>, AllocErr> {
        self.0.infallible_alloc(layout).map_err(|_| AllocErr)
    }
    unsafe fn dealloc(&mut self, ptr: NonNull<Opaque>, layout: Layout) {
        self.0.dealloc(ptr, layout)
    }
    unsafe fn realloc(
        &mut self,
        ptr: NonNull<Opaque>,
        layout: Layout,
        new_size: usize,
    ) -> Result<NonNull<Opaque>, AllocErr> {
        self.0.infallible_realloc(ptr, layout, new_size).map_err(|_| AllocErr)
    }
    unsafe fn alloc_zeroed(&mut self, layout: Layout) -> Result<NonNull<Opaque>, AllocErr> {
        self.0.infallible_alloc_zeroed(layout).map_err(|_| AllocErr)
    }
    fn usable_size(&self, layout: &Layout) -> (usize, usize) {
        self.0.usable_size(layout)
    }
    unsafe fn grow_in_place(
        &mut self,
        ptr: NonNull<Opaque>,
        layout: Layout,
        new_size: usize,
    ) -> Result<(), CannotReallocInPlace> {
        self.0.grow_in_place(ptr, layout, new_size)
    }
    unsafe fn shrink_in_place(
        &mut self,
        ptr: NonNull<Opaque>,
        layout: Layout,
        new_size: usize,
    ) -> Result<(), CannotReallocInPlace> {
        self.0.shrink_in_place(ptr, layout, new_size)
    }
}
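For anyone who wants to play with the extension-trait shape without a nightly compiler, here is an approximation on stable Rust. It is only a sketch under a few assumptions: GlobalAlloc stands in for Alloc (so methods take &self and raw pointers), handle_alloc_error stands in for an oom that takes a Layout, and the methods return NonNull<u8> directly because ! is not usable as a type on stable.

```rust
use std::alloc::{handle_alloc_error, GlobalAlloc, Layout, System};
use std::ptr::NonNull;

/// Extension trait: every `GlobalAlloc` gets aborting variants for free.
pub trait InfallibleAlloc: GlobalAlloc {
    unsafe fn infallible_alloc(&self, layout: Layout) -> NonNull<u8> {
        NonNull::new(unsafe { self.alloc(layout) })
            .unwrap_or_else(|| handle_alloc_error(layout))
    }

    unsafe fn infallible_alloc_zeroed(&self, layout: Layout) -> NonNull<u8> {
        NonNull::new(unsafe { self.alloc_zeroed(layout) })
            .unwrap_or_else(|| handle_alloc_error(layout))
    }

    unsafe fn infallible_realloc(
        &self,
        ptr: NonNull<u8>,
        layout: Layout,
        new_size: usize,
    ) -> NonNull<u8> {
        let new_ptr = unsafe { self.realloc(ptr.as_ptr(), layout, new_size) };
        NonNull::new(new_ptr).unwrap_or_else(|| {
            // Report the layout that was being requested, not the old one.
            // SAFETY: the caller guarantees `new_size` is valid for this alignment.
            let wanted = unsafe { Layout::from_size_align_unchecked(new_size, layout.align()) };
            handle_alloc_error(wanted)
        })
    }
}

// Blanket impl: no allocator has to opt in.
impl<A: GlobalAlloc + ?Sized> InfallibleAlloc for A {}
```

Callers only need the trait in scope, e.g. System.infallible_alloc(layout), and the allocator itself is never moved or wrapped.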
This still leaves open the question of a blanket unsafe impl<A: Alloc> Alloc for &mut A implementation.
While on the subject of the Alloc trait, most methods assume a Layout can’t have a zero size, but Layout does allow it. This has the interesting side effect that this condition can never be resolved at compile time, which makes an allocator-aware Vec used with the system allocator (as opposed to jemalloc) always end up with code that chooses between malloc and posix_memalign (on godbolt).
With alloc_one, alloc_array, etc. gone, I think no functions are left that assume Layout size can be zero… it feels like making it store NonZeroUsize values would help here.
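A minimal sketch of that idea, as a separate hypothetical NonZeroLayout type on stable Rust rather than a change to Layout itself (the type and its methods are my invention for illustration):

```rust
use std::alloc::Layout;
use std::num::NonZeroUsize;

/// Hypothetical layout whose size is non-zero by construction.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct NonZeroLayout {
    size: NonZeroUsize,
    align: NonZeroUsize,
}

impl NonZeroLayout {
    /// The zero-size check happens once, here, instead of being an
    /// unchecked assumption in every allocator method.
    pub fn new(layout: Layout) -> Option<Self> {
        Some(NonZeroLayout {
            size: NonZeroUsize::new(layout.size())?,
            align: NonZeroUsize::new(layout.align())?,
        })
    }

    pub fn for_type<T>() -> Option<Self> {
        Self::new(Layout::new::<T>())
    }

    pub fn size(&self) -> usize {
        self.size.get()
    }

    pub fn align(&self) -> usize {
        self.align.get()
    }

    pub fn to_layout(self) -> Layout {
        // SAFETY: both values were taken from a valid `Layout`.
        unsafe { Layout::from_size_align_unchecked(self.size.get(), self.align.get()) }
    }
}
```

An allocator method taking such a type could rely on size() being at least 1 without a runtime check, and callers like RawVec would handle the None case (ZSTs, empty capacity) before ever reaching the allocator.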