Pre-RFC: changing the Alloc trait

glandium · May 10, 2018, 8:56am

(disclaimer: it took me a while to write this, during which I changed my mind multiple times, and I still don’t have a clear proposal, although I’m kind of converging to one, and this essentially presents all the options I’ve considered along the way, so please bear with the ramblings)

The Alloc trait is mostly specified in RFC 1398 (tracking issue) and afaik, the largest changes to it happened with PR 49669 (tracking issue).

I’m currently reimplementing the Firefox memory allocator (mozjemalloc), in rust. To that end, I’ve been using the Alloc trait and extracted it into a separate crate. Relatedly, I’ve worked towards making Box<T, A: Alloc> a thing in std (I have a feature branch ready to go as soon as 1.27 hits beta and is used to build nightly, although it only really works for ZST Alloc implementations for compiler internals reasons).

The Alloc trait currently essentially contains three different sets of methods:

the base allocator methods: alloc, dealloc, realloc, alloc_zeroed, grow_in_place, shrink_in_place, usable_size.
methods that also return the actual size allocated: alloc_excess, realloc_excess
helper methods: alloc_one, dealloc_one, alloc_array, realloc_array, dealloc_array.

That’s both a lot, and not enough. A lot, because while a typical implementor only really needs to implement a few of those (there are default implementations for all of them except alloc and dealloc), when you need to implement generic wrappers, you need to wrap every single one of them.

As discussed in PR 50436, the helper methods (*_one, *_array) have their own caveats:

The helpers are not used consistently. Just one example, raw_vec uses a mix of alloc_array, alloc/alloc_zeroed, but only uses dealloc.
Even with consistent use of e.g. alloc_array/dealloc_array, one can still safely convert a Vec into a Box, which would then use dealloc.
Then there are some parts of the API that just don’t exist (no zeroed version of alloc_one/alloc_array)

Something not mentioned in the PR is that their behavior for ZSTs is explicitly implementation-defined, which makes the functions essentially unusable for liballoc itself, because it needs a specific behavior for ZSTs, which actually doesn’t even match the default implementation: alloc_one and alloc_array don’t let the caller distinguish between ZSTs (or empty arrays, for that matter) and allocation failures, so the caller would still have to do the work that alloc_one/alloc_array does, seriously limiting their usefulness.
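To make the ZST point concrete, here is a minimal sketch of the special-casing a caller ends up doing itself. It uses the stable std::alloc::alloc and std::alloc::dealloc free functions rather than the unstable Alloc trait, and the helper names alloc_one_or_dangling and dealloc_one_or_dangling are made up for illustration:

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::ptr::NonNull;

/// Hypothetical helper: allocate room for one `T`.
/// `None` means allocation failure only; ZSTs get a dangling pointer.
fn alloc_one_or_dangling<T>() -> Option<NonNull<T>> {
    let layout = Layout::new::<T>();
    if layout.size() == 0 {
        // Zero-sized types never reach the allocator.
        return Some(NonNull::dangling());
    }
    // SAFETY: `layout` has a non-zero size, as `alloc` requires.
    let raw = unsafe { alloc(layout) };
    NonNull::new(raw as *mut T)
}

/// Hypothetical counterpart: it has to apply the same ZST rule.
///
/// # Safety
/// `ptr` must come from `alloc_one_or_dangling::<T>()` and must not
/// have been freed already.
unsafe fn dealloc_one_or_dangling<T>(ptr: NonNull<T>) {
    let layout = Layout::new::<T>();
    if layout.size() != 0 {
        // SAFETY: guaranteed by the caller, see above.
        unsafe { dealloc(ptr.as_ptr() as *mut u8, layout) }
    }
}
```

With this shape, None unambiguously means "allocation failed", which is the distinction the default alloc_one/alloc_array can't express.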

The PR only moved the methods to a separate trait, but all in all, clients of the API would probably be better off using Box<T, A: Alloc> and Vec<T, A: Alloc> (when they’ll be a thing) rather than those helper methods, and I think they should be removed at some point.

The methods that also return the actual size allocated are typically useful to things like Vec (RawVec, really), which would set their capacity according to the actual size allocated rather than the size they requested. I actually think RawVec should do that, which immediately shows the problem: to do so, there would need to be alloc_zeroed_excess, grow_in_place_excess and shrink_in_place_excess.

Then there is the matter of infallibility.

PR 49669 changed the error type of the various allocation methods to a ZST and removed the argument to the oom function. PR 50144 further moved the oom method to a separate language item. In Firefox, we actually need OOM handling to be able to know the size of the allocation, and those changes make this hard if not impossible. But adding the size or layout back as an argument to oom without adding it back to AllocErr (and there were good reasons for this change) requires painful juggling in callers.

In the end, I think it would be better to have variants of the allocation methods that handle OOM conditions on their own. IOW, infallible variants of the allocation methods. One way to do that would be add alloc_infallible, etc.

So that would be three sets of allocation methods, e.g. alloc, alloc_excess, and alloc_infallible. Except it would also be desirable to have alloc_excess_infallible too, which is growing out of proportion.

One way to reduce the problem in half is to just make the alloc family return an Excess, but it turns out that a) the compiler does a bad job at optimizing it out when it’s not used b) it makes it painful to use for callers that don’t care about it.

Another way is to add associated types, and wrapper types. Like:

trait Alloc {
    type Result = Result<NonNull<Opaque>, AllocErr>;

    unsafe fn alloc(&mut self, layout: Layout) -> Self::Result;
    // ...
}
struct Infallible<A: Alloc>(pub A);
impl<A: Alloc> Alloc for Infallible<A> {
    type Result = Result<NonNull<Opaque>, !>;
    // ...
}

but that means Alloc::Result must have a trait bound that provides everything Result can do. Which leads us to separate associated types:

trait Alloc {
    type Ok = NonNull<Opaque>;
    type Err = AllocErr;

    unsafe fn alloc(&mut self, layout: Layout) -> Result<Self::Ok, Self::Err>;
    // ...
}
struct Infallible<A: Alloc>(pub A);
impl<A: Alloc> Alloc for Infallible<A> {
    type Err = !;
    // ...
}

But now we need two trait bounds, and a lot of work for callers, and, surprisingly, implementations too, because they now need:

to re-define both type Ok and type Err, otherwise we get errors like:

    = note: expected type `unsafe fn(&mut alloc::Global, core::alloc::Layout) -> core::result::Result<<alloc::Global as core::alloc::Alloc>::Ok, <alloc::Global as core::alloc::Alloc>::Err>`
               found type `unsafe fn(&mut alloc::Global, core::alloc::Layout) -> core::result::Result<core::ptr::NonNull<core::alloc::Opaque>, core::alloc::AllocErr>`

and to re-implement the methods that have defaults in the trait. For example, this is the error you get in liballoc_system with the associated types re-defined:

error[E0399]: the following trait items need to be reimplemented as `Err` was overridden: `usable_size`, `grow_in_place`, `shrink_in_place`
  --> liballoc_system/lib.rs:53:5
   |
53 |     type Err = AllocErr;
   |     ^^^^^^^^^^^^^^^^^^^^

Another option is to have multiple traits, and wrapper types:

trait Alloc {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<NonNull<Opaque>, AllocErr>;
    // ...
}
trait InfallibleAlloc {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<NonNull<Opaque>, !>;
    // ...
}
struct Infallible<A: Alloc>(pub A);
impl<A: Alloc> InfallibleAlloc for Infallible<A> {
    // ...
}

But now there’s a burden on callers having to import types and traits.

Yet another option is to have one trait, and wrapper types:

// Since we have only one trait, we need it to provide excess *and* fallible allocations.
trait Alloc {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<Excess, AllocErr>;
    // ...
}
struct NoExcessAlloc<A: Alloc>(pub A);
impl<A: Alloc> NoExcessAlloc<A> {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<NonNull<Opaque>, AllocErr> {
        // ...
    }
    // ...
}

I think Alloc is in this position where it needs to be nice to both implementers and clients of the API. And forcing all implementors to return an Excess is not exactly nice, especially when I expect most won’t want to care to.

One more option is to have one generic trait with default parameters, and wrapper types:

trait Alloc<Ok = NonNull<Opaque>, Err = AllocErr> {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<Ok, Err>;
    // ...
}
struct NoExcessAlloc<A: Alloc<Excess, Err>, Err = AllocErr>(A, PhantomData<Err>);
unsafe impl<A: Alloc<Excess, Err>, Err> Alloc<NonNull<Opaque>, Err> for NoExcessAlloc<A, Err> {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<NonNull<Opaque>, Err> {
        self.0.alloc(layout).map(|Excess(ptr, _)| ptr)
    }
    // ...
}
// ...

with the necessary trait bounds, etc. There could be additional methods on the Alloc trait, to transition from one kind to another, like:

fn no_excess(&self) -> impl Alloc<Excess, Err> where Self: Clone;
fn infallible(&self) -> impl Alloc<Ok, !> where Self: Clone;
// ...

I kind of liked this last version, but there are some major drawbacks:

the infallible variant can’t actually live in the same crate as the Alloc trait, because it needs to call oom, which is in liballoc, while the trait is in libcore.
while users can get by with only use core::alloc::Alloc, it’s not straightforward which variant they get by default for a given type that implements it.
I’m not sure it wouldn’t be messy in the rustdoc-generated doc.

Taking a step back, the needs for excess/no_excess, fallible/infallible are not exactly the same:

An allocator would be expected to implement either the excess variant or the no-excess variant, not even both. Ideally, we’d be able to derive the other variant from the one provided, but it actually doesn’t seem possible to do that in rust, at least not without negative bounds.
An allocator is not actually expected to implement a fallible and an infallible variant. The infallible variant could be considered as a helper for users of the API, to avoid the hoops required.

Taking one more step back, and looking around on GitHub for users and implementers of alloc_excess, I’m starting to wonder if it’s actually necessary. (And if it’s not, then this restricts the overall problem to just infallibility.)

I only found exactly two implementations of alloc_excess:

in jemallocator
in CtoPoolAlloc

If you look closely, you’ll note that they’re not even doing something fundamentally different from what the default implementation would do if they just provided usable_size (which they do).
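For reference, here is a sketch of how an alloc_excess default can be derived from usable_size alone. MiniAlloc is a cut-down stand-in for the unstable Alloc trait (using NonNull<u8> instead of Opaque), and Rounding16 is a made-up allocator whose size classes are multiples of 16 bytes; neither is the real API:

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::ptr::NonNull;

#[derive(Debug)]
struct AllocErr;

/// Pointer plus the number of bytes actually usable.
struct Excess(NonNull<u8>, usize);

/// Cut-down stand-in for the unstable `Alloc` trait (assumption).
/// Callers must pass layouts with a non-zero size.
unsafe trait MiniAlloc {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<NonNull<u8>, AllocErr>;
    unsafe fn dealloc(&mut self, ptr: NonNull<u8>, layout: Layout);

    /// (lower, upper) bounds on the usable size for `layout`.
    fn usable_size(&self, layout: &Layout) -> (usize, usize) {
        (layout.size(), layout.size())
    }

    /// Default: allocate normally, then report the upper bound as the excess.
    unsafe fn alloc_excess(&mut self, layout: Layout) -> Result<Excess, AllocErr> {
        let usable = self.usable_size(&layout);
        let result = unsafe { self.alloc(layout) };
        result.map(|ptr| Excess(ptr, usable.1))
    }
}

/// Made-up allocator that rounds every request up to a multiple of 16 bytes.
struct Rounding16;

impl Rounding16 {
    fn rounded(layout: &Layout) -> Layout {
        let size = (layout.size() + 15) / 16 * 16;
        Layout::from_size_align(size, layout.align()).expect("size overflow")
    }
}

unsafe impl MiniAlloc for Rounding16 {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<NonNull<u8>, AllocErr> {
        NonNull::new(unsafe { alloc(Self::rounded(&layout)) }).ok_or(AllocErr)
    }
    unsafe fn dealloc(&mut self, ptr: NonNull<u8>, layout: Layout) {
        // Must free with the same rounded layout that `alloc` used.
        unsafe { dealloc(ptr.as_ptr(), Self::rounded(&layout)) }
    }
    fn usable_size(&self, layout: &Layout) -> (usize, usize) {
        (layout.size(), Self::rounded(layout).size())
    }
}
```

Rounding16 only overrides usable_size and still gets a meaningful alloc_excess from the default, which is the situation described above.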

Now, on the other end of the chain, the only code that I found that is actively using the extra value in Excess is… libstd code from old forks of the rust repo, that was since removed.

So, barely any implementors, and no users currently. So let’s hypothesize what a user of the API would want to do. As already mentioned, a good candidate would be RawVec, and it could do something like:

fn allocate_in(cap: usize, zeroed: bool, mut a: A) -> Self {
    unsafe {
        let mut cap = cap;
        let elem_size = mem::size_of::<T>();

        let alloc_size = cap.checked_mul(elem_size).unwrap_or_else(|| capacity_overflow());
        alloc_guard(alloc_size).unwrap_or_else(|_| capacity_overflow());

        // handles ZSTs and `cap = 0` alike
        let ptr = if alloc_size == 0 {
            NonNull::<T>::dangling().as_opaque()
        } else {
            let align = mem::align_of::<T>();
            let result = if zeroed {
                a.alloc_zeroed_excess(Layout::from_size_align(alloc_size, align).unwrap())
            } else {
                a.alloc_excess(Layout::from_size_align(alloc_size, align).unwrap())
            };
            match result {
                Ok(Excess(ptr, size)) => {
                    cap = size / elem_size;
                    ptr
                }
                Err(_) => oom(),
            }
        };

        RawVec {
            ptr: ptr.cast().into(),
            cap,
            a,
        }
    }
}

As mentioned earlier, alloc_zeroed_excess, which doesn’t exist currently, would be necessary here. Now the question I want to ask is whether there’s much added value in the above, compared to something like the following:

fn allocate_in(cap: usize, zeroed: bool, mut a: A) -> Self {
    unsafe {
        let mut cap = cap;
        let elem_size = mem::size_of::<T>();

        let alloc_size = cap.checked_mul(elem_size).unwrap_or_else(|| capacity_overflow());
        alloc_guard(alloc_size).unwrap_or_else(|_| capacity_overflow());

        // handles ZSTs and `cap = 0` alike
        let ptr = if alloc_size == 0 {
            NonNull::<T>::dangling().as_opaque()
        } else {
            let align = mem::align_of::<T>();
            let layout = Layout::from_size_align(alloc_size, align).unwrap();
            cap = a.usable_size(&layout).1 / elem_size;
            let layout = Layout::from_size_align(cap * elem_size, align).unwrap();
            let result = if zeroed {
                a.alloc_zeroed(layout)
            } else {
                a.alloc(layout)
            };
            match result {
                Ok(ptr) => ptr,
                Err(_) => oom(),
            }
        };

        RawVec {
            ptr: ptr.cast().into(),
            cap,
            a,
        }
    }
}

Well, the code is a little simpler with alloc_excess, but there could be helpers in Layout instead of duplicating all the allocating methods in the Alloc trait.
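No such Layout helper is defined anywhere in this thread, so the following is only a guess at what one could look like. The free function fit_to_usable is hypothetical; it takes the upper bound that usable_size would report and returns the largest layout holding a whole number of elements, plus the resulting capacity:

```rust
use std::alloc::Layout;

/// Hypothetical helper (not part of `Layout`): grow `requested` to the largest
/// size that fits in `usable_upper` bytes while still being a whole number of
/// `elem_size`-byte elements. Returns the adjusted layout and element count.
fn fit_to_usable(requested: Layout, usable_upper: usize, elem_size: usize) -> (Layout, usize) {
    assert!(elem_size != 0, "ZSTs never reach the allocator");
    assert!(usable_upper >= requested.size(), "usable size can't be below the request");
    let cap = usable_upper / elem_size;
    let layout = Layout::from_size_align(cap * elem_size, requested.align())
        .expect("usable size must still form a valid layout");
    (layout, cap)
}
```

For example, a request for 10 eight-byte elements against an allocator reporting 96 usable bytes yields a capacity of 12, while 10 twelve-byte elements against 128 usable bytes stay at 10, because an 11th element would not fit.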

So, with all that being said, I would like to propose the following changes:

remove alloc_one, dealloc_one, alloc_array, realloc_array, dealloc_array, either now or after Box<T, A> and Vec<T, A> are actually a thing.
remove alloc_excess and realloc_excess.
add (to be defined) helpers to Layout to help adjust Layouts to fit better to usable_size.
make the oom language item take a Layout argument.
add a wrapper type to liballoc for infallibility, like the following:

pub struct Infallible<A: Alloc>(A);
impl<A: Alloc> Infallible<A> {
    pub fn new(a: A) -> Self {
        Infallible(a)
    }
    pub fn into_inner(self) -> A {
        self.0
    }
    pub unsafe fn infallible_alloc(&mut self, layout: Layout) -> Result<NonNull<Opaque>, !> {
        self.0.alloc(layout).map_err(|_| oom(layout))
    }
    pub unsafe fn infallible_realloc(
        &mut self,
        ptr: NonNull<Opaque>,
        layout: Layout,
        new_size: usize) -> Result<NonNull<Opaque>, !>
    {
        self.0.realloc(ptr, layout, new_size).map_err(|_| oom(Layout::from_size_align_unchecked(new_size, layout.align())))
    }
    pub unsafe fn infallible_alloc_zeroed(&mut self, layout: Layout) -> Result<NonNull<Opaque>, !> {
        self.0.alloc_zeroed(layout).map_err(|_| oom(layout))
    }
}

In order to allow infallible allocations to be used anywhere that accepts an Alloc, implement the trait:

unsafe impl<A: Alloc> Alloc for Infallible<A> {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<NonNull<Opaque>, AllocErr> {
        self.infallible_alloc(layout).map_err(|_| AllocErr)
    }
    unsafe fn dealloc(&mut self, ptr: NonNull<Opaque>, layout: Layout) {
        self.0.dealloc(ptr, layout)
    }
    unsafe fn realloc(&mut self,
                      ptr: NonNull<Opaque>,
                      layout: Layout,
                      new_size: usize)
                      -> Result<NonNull<Opaque>, AllocErr>
    {
        self.infallible_realloc(ptr, layout, new_size).map_err(|_| AllocErr)
    }
    unsafe fn alloc_zeroed(&mut self, layout: Layout) -> Result<NonNull<Opaque>, AllocErr> {
        self.infallible_alloc_zeroed(layout).map_err(|_| AllocErr)
    }
    fn usable_size(&self, layout: &Layout) -> (usize, usize) {
        self.0.usable_size(layout)
    }
    unsafe fn grow_in_place(&mut self,
                            ptr: NonNull<Opaque>,
                            layout: Layout,
                            new_size: usize) -> Result<(), CannotReallocInPlace> {
        self.0.grow_in_place(ptr, layout, new_size)
    }
    unsafe fn shrink_in_place(&mut self,
                              ptr: NonNull<Opaque>,
                              layout: Layout,
                              new_size: usize) -> Result<(), CannotReallocInPlace> {
        self.0.shrink_in_place(ptr, layout, new_size)
    }
}

Now, if you’re a type like e.g. RawVec<T, A>, and you want explicitly infallible allocations (which is what it does on most its allocations, except in the recently added try_reserve* methods), you can’t move self.a to create an Infallible instance. So should methods using Infallible have a bound on Alloc + Clone? or should there be a blanket unsafe impl<A: Alloc> Alloc for &mut A, so that one can use Infallible(&mut self.a)?
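The second option can be sketched on stable Rust as follows. Everything here is a stand-in: MiniAlloc replaces the unstable Alloc trait, oom is a hypothetical wrapper around std::alloc::handle_alloc_error, the infallible method returns the pointer directly because the ! type isn't usable in Result on stable, and Counting and Owner are made-up types playing the roles of an allocator and of RawVec<T, A>:

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::ptr::NonNull;

#[derive(Debug)]
struct AllocErr;

/// Cut-down stand-in for the unstable `Alloc` trait (assumption).
unsafe trait MiniAlloc {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<NonNull<u8>, AllocErr>;
    unsafe fn dealloc(&mut self, ptr: NonNull<u8>, layout: Layout);
}

/// The blanket impl in question: a mutable borrow of an allocator is an allocator.
unsafe impl<'a, A: MiniAlloc> MiniAlloc for &'a mut A {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<NonNull<u8>, AllocErr> {
        unsafe { (**self).alloc(layout) }
    }
    unsafe fn dealloc(&mut self, ptr: NonNull<u8>, layout: Layout) {
        unsafe { (**self).dealloc(ptr, layout) }
    }
}

/// Hypothetical stand-in for an `oom` lang item that takes a `Layout`.
fn oom(layout: Layout) -> ! {
    std::alloc::handle_alloc_error(layout)
}

struct Infallible<A: MiniAlloc>(A);

impl<A: MiniAlloc> Infallible<A> {
    unsafe fn infallible_alloc(&mut self, layout: Layout) -> NonNull<u8> {
        match unsafe { self.0.alloc(layout) } {
            Ok(ptr) => ptr,
            Err(AllocErr) => oom(layout),
        }
    }
}

/// Made-up allocator backed by the global allocator; counts its calls.
struct Counting {
    allocs: usize,
}

unsafe impl MiniAlloc for Counting {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<NonNull<u8>, AllocErr> {
        self.allocs += 1;
        NonNull::new(unsafe { alloc(layout) }).ok_or(AllocErr)
    }
    unsafe fn dealloc(&mut self, ptr: NonNull<u8>, layout: Layout) {
        unsafe { dealloc(ptr.as_ptr(), layout) }
    }
}

/// Owns its allocator in a field, like `RawVec<T, A>` does.
struct Owner<A: MiniAlloc> {
    a: A,
}

impl<A: MiniAlloc> Owner<A> {
    /// Borrows `self.a` for the wrapper instead of moving or cloning it.
    unsafe fn alloc_or_abort(&mut self, layout: Layout) -> NonNull<u8> {
        unsafe { Infallible(&mut self.a).infallible_alloc(layout) }
    }
}
```

The wrapper is built on the fly from &mut self.a, so no Clone bound is needed and the allocator's state (here the call counter) stays in the owner.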

Come to think of it, adding a trait in liballoc for infallible allocations might work better:

trait InfallibleAlloc: Alloc {
    unsafe fn infallible_alloc(&mut self, layout: Layout) -> Result<NonNull<Opaque>, !> {
        self.alloc(layout).map_err(|_| oom(layout))
    }
    unsafe fn infallible_realloc(
        &mut self,
        ptr: NonNull<Opaque>,
        layout: Layout,
        new_size: usize) -> Result<NonNull<Opaque>, !>
    {
        self.realloc(ptr, layout, new_size).map_err(|_| oom(Layout::from_size_align_unchecked(new_size, layout.align())))
    }
    unsafe fn infallible_alloc_zeroed(&mut self, layout: Layout) -> Result<NonNull<Opaque>, !> {
        self.alloc_zeroed(layout).map_err(|_| oom(layout))
    }
}
impl<A: Alloc> InfallibleAlloc for A {}

On the one hand, that avoids the problem with self.a not being movable. On the other hand, it doesn’t allow fallible and infallible allocators to be used interchangeably anywhere an Alloc is expected (as in, forcing infallible allocations on a collection type that normally does fallible allocations). For the latter, however, it is still possible to add the same-ish wrapper type:

pub struct Infallible<A: InfallibleAlloc>(A);
impl<A: InfallibleAlloc> Infallible<A> {
    pub fn new(a: A) -> Self {
        Infallible(a)
    }
    pub fn into_inner(self) -> A {
        self.0
    }
}
unsafe impl<A: InfallibleAlloc> Alloc for Infallible<A> {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<NonNull<Opaque>, AllocErr> {
        self.0.infallible_alloc(layout).map_err(|_| AllocErr)
    }
    unsafe fn dealloc(&mut self, ptr: NonNull<Opaque>, layout: Layout) {
        self.0.dealloc(ptr, layout)
    }
    unsafe fn realloc(&mut self,
                      ptr: NonNull<Opaque>,
                      layout: Layout,
                      new_size: usize)
                      -> Result<NonNull<Opaque>, AllocErr>
    {
        self.0.infallible_realloc(ptr, layout, new_size).map_err(|_| AllocErr)
    }
    unsafe fn alloc_zeroed(&mut self, layout: Layout) -> Result<NonNull<Opaque>, AllocErr> {
        self.0.infallible_alloc_zeroed(layout).map_err(|_| AllocErr)
    }
    fn usable_size(&self, layout: &Layout) -> (usize, usize) {
        self.0.usable_size(layout)
    }
    unsafe fn grow_in_place(&mut self,
                            ptr: NonNull<Opaque>,
                            layout: Layout,
                            new_size: usize) -> Result<(), CannotReallocInPlace> {
        self.0.grow_in_place(ptr, layout, new_size)
    }
    unsafe fn shrink_in_place(&mut self,
                              ptr: NonNull<Opaque>,
                              layout: Layout,
                              new_size: usize) -> Result<(), CannotReallocInPlace> {
        self.0.shrink_in_place(ptr, layout, new_size)
    }
}

This still leaves the question wrt a blanket unsafe impl<A: Alloc> Alloc for &mut A implementation open.

While on the subject of the Alloc trait, most methods assume Layout can’t have an empty size, but Layout does allow them. This has the interesting side effect that this condition can never be resolved at compile time, which makes using an allocator-aware Vec in conjunction with the system allocator (as opposed to jemalloc) always want to choose between malloc and posix_memalign (on godbolt). With alloc_one, alloc_array, etc. gone, I think no functions are left that assume Layout size can be zero… it feels like making it store NonZeroUsize values would help here.
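As a rough illustration of the NonZeroUsize idea, here is a hypothetical NonZeroLayout type built on top of the stable Layout. It is not a std type, and whether such a type would actually let the compiler resolve the malloc/posix_memalign choice is the conjecture above, not something this sketch demonstrates:

```rust
use std::alloc::Layout;
use std::num::NonZeroUsize;

/// Hypothetical layout type whose size can never be zero (not a std type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct NonZeroLayout {
    size: NonZeroUsize,
    align: usize,
}

impl NonZeroLayout {
    /// Returns `None` for a zero size or an invalid size/alignment pair.
    fn from_size_align(size: usize, align: usize) -> Option<Self> {
        let size = NonZeroUsize::new(size)?;
        // Reuse `Layout`'s own validation of the alignment and size.
        Layout::from_size_align(size.get(), align).ok()?;
        Some(NonZeroLayout { size, align })
    }

    /// Returns `None` for zero-sized types.
    fn for_type<T>() -> Option<Self> {
        Self::from_size_align(std::mem::size_of::<T>(), std::mem::align_of::<T>())
    }

    /// Converts back to a regular `Layout`.
    fn to_layout(self) -> Layout {
        // Already validated in `from_size_align`, so this cannot fail.
        Layout::from_size_align(self.size.get(), self.align).unwrap()
    }
}
```

With a type like this, the zero-size case is rejected once, at construction, instead of being a run-time condition inside every allocation method.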

SimonSapin · May 10, 2018, 11:46am

There’s clearly a lot of design space to explore here. And it is important to do so, but a more immediate concern is that stabilization of GlobalAlloc and Layout is imminent. So:

Can the proposed changes be implemented on top of GlobalAlloc (possibly by adding new default methods), in particular for std::alloc::Global, or would any variation of them require breaking changes?

(For example, I think making Layout::from_size_align return an error for size == 0 is a breaking contract change, even though it doesn’t affect APIs.)

raphaelcohn · May 11, 2018, 6:40am

Hi,

I’m the author of CtoPoolAlloc - for the purpose of your post, you can effectively dismiss it; it was an experimental implementation of an allocator for persistent memory.

All of the c-based allocators I’m interested in wrapping and using with Rust don’t support (or don’t support in any reliable or efficient way) alloc_excess, including those in libnuma, PMDK and DPDK.

What some of them do support, which isn’t mentioned anywhere else, is they take additional arguments, such as a NUMA node / socket identifier… In terms of allocating memory, it’s useful on modern systems to be able to tell the allocator where the memory is most likely to be used (rather than simply relying on the thread’s current core / NUMA node)…

gnzlbg · May 11, 2018, 2:44pm

Since one can implement returning the Excess by just returning the requested size, every single allocator supports this.

Other allocators do support Excess. For example, jemalloc has both functions that return the excess as part of their invocation and separate methods to query the available size for a particular allocation. For the second option, cross-language inlining or appropriate function attributes allow LLVM to remove the call to the functions computing the excess if the excess is not used by the caller. That is, dropping the Excess with jemalloc at least does not incur a run-time cost over not returning the Excess at all.

We already have some attributes specifically for allocators (e.g. rustc_nounwind), so the most reliable option here would be to just add an attribute (e.g. rustc_readonly) to avoid relying on cross-language inlining for this.

What some of them do support, which isn’t mentioned anywhere else, is they take additional arguments, such as a NUMA node / socket identifier…

You should be able to use this by constructing a collection from an allocator that has these parameters set:

let v = Vec::with_capacity_alloc(100, NumaAlloc(node(2),socket(3)));

Extending the collections to "funnel" any possible option that an allocator might want is probably not going to happen any time soon. But the collections should be able to work with stateful allocators without issues, and probably have some utility function that allows obtaining a &mut to the current allocator.

impl<T,A> Vec<T,A> {
    fn get_alloc(&mut self) -> &mut A;
}

That should allow doing anything you want.

gnzlbg · May 14, 2018, 8:40am

One way to reduce the problem in half is to just make the alloc family return an Excess, but it turns out that a) the compiler does a bad job at optimizing it out when it’s not used

Note that this is an open bug, pretty trivial to fix, and the general direction towards the fix already has consensus. It is just that nobody has bothered implementing the fix yet.

Now the question I want to ask is whether there’s much added value in the above, compared to something like the following:

At least for jemalloc, the cost of throwing the Excess away from some of their methods, and recomputing it using usable_size, when compared to not throwing the Excess away was sometimes ~2-10x for some allocation classes.

gnzlbg · May 18, 2018, 9:31am

FWIW http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0901r0.html was discussed with the jemalloc devs in the C++ Jacksonville meeting and they’ve agreed to experimentally try to offer an API that always returns the Excess.

IMO at this point I am back to my initial opinion that all allocator methods should always return the Excess. We can always write a wrapper in a crate that just drops it “for convenience” but that doesn’t need to be part of std and it simplifies the API in std. That would allow us to kill usable_size, and half of the Alloc trait API.

Allocators that don’t know the Excess can just dummily return the requested size.

Allocators that don’t know the Excess when allocating, but have a separate usable_size method internally, can just call it in such a way that the computation is optimized away if the Excess is not used, or just provide two implementations of the Alloc API for their allocator: one doing that, and one just returning the requested size. That is, they can provide two newtypes, MyAlloc and MyAllocAccurate, with two different implementations if they want.

Allocators that do know the Excess when allocating can just return it.
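A sketch of what those cases could look like, under loud assumptions: ExcessAlloc is a hypothetical "always return the excess" trait (not the real Alloc), the 16-byte size_class rule is invented, and NoExcess is the kind of convenience wrapper that could live in a crate outside std:

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::ptr::NonNull;

#[derive(Debug)]
struct AllocErr;

/// Hypothetical trait where allocation always reports the usable size.
/// Callers must pass layouts with a non-zero size.
unsafe trait ExcessAlloc {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<(NonNull<u8>, usize), AllocErr>;
    unsafe fn dealloc(&mut self, ptr: NonNull<u8>, layout: Layout);
}

/// Doesn't know the excess: just reports the requested size.
struct MyAlloc;

unsafe impl ExcessAlloc for MyAlloc {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<(NonNull<u8>, usize), AllocErr> {
        let ptr = NonNull::new(unsafe { alloc(layout) }).ok_or(AllocErr)?;
        Ok((ptr, layout.size()))
    }
    unsafe fn dealloc(&mut self, ptr: NonNull<u8>, layout: Layout) {
        unsafe { dealloc(ptr.as_ptr(), layout) }
    }
}

/// Invented size-class rule: sizes are rounded up to a multiple of 16 bytes.
fn size_class(layout: Layout) -> Layout {
    let size = (layout.size() + 15) / 16 * 16;
    Layout::from_size_align(size, layout.align()).expect("size overflow")
}

/// Knows the excess: allocates a full size class and reports it.
struct MyAllocAccurate;

unsafe impl ExcessAlloc for MyAllocAccurate {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<(NonNull<u8>, usize), AllocErr> {
        let class = size_class(layout);
        let ptr = NonNull::new(unsafe { alloc(class) }).ok_or(AllocErr)?;
        Ok((ptr, class.size()))
    }
    unsafe fn dealloc(&mut self, ptr: NonNull<u8>, layout: Layout) {
        // Free with the same size class that `alloc` used.
        unsafe { dealloc(ptr.as_ptr(), size_class(layout)) }
    }
}

/// Convenience wrapper that drops the excess for callers that don't care.
struct NoExcess<A: ExcessAlloc>(A);

impl<A: ExcessAlloc> NoExcess<A> {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<NonNull<u8>, AllocErr> {
        let result = unsafe { self.0.alloc(layout) };
        result.map(|(ptr, _excess)| ptr)
    }
    unsafe fn dealloc(&mut self, ptr: NonNull<u8>, layout: Layout) {
        unsafe { self.0.dealloc(ptr, layout) }
    }
}
```

Whether the unused excess is really optimized away in NoExcess is exactly the codegen question debated in this thread; the sketch only shows the API shape.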

gnzlbg · May 18, 2018, 9:32am

I would prefer if this was the case for GlobalAlloc as well, but it seems that that ship has sailed @SimonSapin ?

SimonSapin · May 18, 2018, 12:56pm

GlobalAlloc is close to stabilization, but that hasn’t happened yet so in theory everything is possible. However given the initial desire to have a minimal API in order to minimize the number of questions to resolve in it, since this is a significant change/addition, in order to convince everyone at this point you’d have to come to the tracking issue or to an RFC with a very concrete proposal: at least an exact definition of the trait, and perhaps a port of std to it with benchmarks that demonstrate your performance claims.

they’ve agreed to experimentally try to offer an API

This sounds like our Nightly. But in the case of GlobalAlloc shipping something is way past due already, so the bar is high for a proposal that’s “blocking” (as opposed to something that can be added later).

gnzlbg · May 18, 2018, 1:38pm

in order to convince everyone at this point you’d have to come to the tracking issue or to an RFC with a very concrete proposal: at least an exact definition of the trait, and perhaps a port of std to it with benchmarks that demonstrate your performance claims.

That's doable. From my experience with Rust, changing fn foo(x: usize) -> *mut ptr { ... ptr } to fn foo(x: usize) -> (*mut ptr, usize) { ... (ptr, x) } won't have a measurable performance impact. I can send a PR for this if you want, we can run perf, and if everything is ok I can post the definition on the RFC. (We don't even have to merge this.)

But in the case of GlobalAlloc shipping something is way past due already, so the bar is high for a proposal that’s “blocking” (as opposed to something that can be added later).

I agree, but having the _excess, _infallible, ... allocator methods is IMO getting out of hand.

The allocator API should be infallible, it should return the excess (even if it is trivial on most allocators), and if we want it to fail in some cases, or omit the excess in data-structures where we don't care, we should build on top of GlobalAlloc a zero-cost abstraction that unwraps, drops the excess, etc.

C++ has a gazillion allocation functions, and they are still adding more because that's not enough. So to me this is more a matter of future-proofing the API than of raw performance.

glandium · May 18, 2018, 1:56pm

While you can drop excess, you can’t drop infallibility. The lowest level has to be fallible.

glandium · May 18, 2018, 2:06pm

From my experience with Rust, changing fn foo(x: usize) -> *mut ptr { ... ptr } to fn foo(x: usize) -> (*mut ptr, usize) { ... (ptr, x) }

FWIW, from actually trying, before starting this thread, I can tell you that even if the caller doesn't use x and everything is inlined, somehow rust doesn't generate the same code. For some reason the code is larger when the allocation functions return Excess. It might not have a measurable effect in micro benchmarks, but who knows how those codegen mishaps accumulate in larger codebases.

SimonSapin · May 18, 2018, 2:18pm

I was not arguing that this particular feature should be added later. Only that if you want significant changes to GlobalAlloc to happen now before stabilization, you need to do more work coming up with a detailed proposal and convincing more people than just posting a few sentences with foo placeholders in a side discussion with only the three of us.

gnzlbg · May 18, 2018, 2:18pm

@glandium sorry I meant fallibility (I always confuse the negative with the positive version), that's what I meant with writing a wrapper that just unwraps and makes it infallible (did I get this right?).

For some reason the code is larger when the allocation functions return Excess.

They aren't inline, so that might be it?

glandium · May 18, 2018, 10:48pm

Everything was inlined, I was comparing the assembly of one function containing everything.

system · March 25, 2019, 8:30am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
