`GlobalAlloc::dealloc` is too restrictive, provide additional method with weaker requirements

GlobalAlloc::dealloc states that the provided memory layout must be the same as the layout used to allocate the memory block. This implies that the alignment of these two layouts must be identical.

However, doesn't the deallocation method rely only on:

  • the size of the requested and provided memory, and
  • the alignment of the address it provided?

In other words, an allocator may return memory that is aligned more strictly than N, even when only an N-aligned layout was requested.
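
To make that observation concrete, here is a minimal sketch that requests a 1-aligned block from the global allocator and reports how aligned the returned address actually is. The helper names `pointer_alignment` and `observed_alignment` are made up for illustration and are not part of any std API. The observed value depends on the platform allocator, so the only guarantee is that it is a power of two no smaller than the requested alignment.

```rust
use std::alloc::{alloc, dealloc, Layout};

/// Largest power of two that divides `addr`, i.e. its lowest set bit.
/// `addr` must be non-zero, which any non-null pointer guarantees.
pub fn pointer_alignment(addr: usize) -> usize {
    debug_assert!(addr != 0);
    1usize << addr.trailing_zeros()
}

/// Allocates `size` bytes with a requested alignment of 1 and returns
/// the alignment the returned address happens to have.
pub fn observed_alignment(size: usize) -> usize {
    assert!(size > 0, "zero-sized allocations are not allowed here");
    let layout = Layout::from_size_align(size, 1).expect("valid layout");
    unsafe {
        let ptr = alloc(layout);
        assert!(!ptr.is_null(), "allocation failed");
        let align = pointer_alignment(ptr as usize);
        // Deallocate with exactly the layout used for allocation.
        dealloc(ptr, layout);
        align
    }
}
```

Calling `observed_alignment(16)` on a typical desktop allocator tends to report a value well above 1, but that is an implementation detail, not something the `GlobalAlloc` contract promises.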

If the deallocation method relies on the alignment of the pointer it returned, why not provide a wrapper deallocation method without an alignment requirement? Let's call it deallocate_unaligned:

(Edit 2): As user SkiFire13 stated, this implementation is undefined behavior (UB) because some allocators rely on the alignment of the provided layout to function correctly. Additionally, it depends on the Allocator::deallocate method, which requires the same layout that was used during allocation.

trait Allocator {
    // Existing methods ..
    
    // This method should also be added to `GlobalAlloc`
    /// The provided `layout.size()` must fall within the range `min ..= max`, where:
    /// 
    /// * `min` is the size of the layout most recently used to allocate the block, and
    /// * `max` is the latest actual size returned from `allocate`, `grow`, or `shrink`.
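    unsafe fn deallocate_unaligned(&self, ptr: NonNull<u8>, layout: Layout) {
        let address = ptr.as_ptr().addr();

        // `NonNull` guarantees a non-zero address, so a lowest set bit always exists.
        let align = address & address.wrapping_neg();
        // SAFETY: `align` is a power of two; the caller vouches for `layout.size()`.
        let layout_with_recovered_alignment =
            unsafe { Layout::from_size_align_unchecked(layout.size(), align) };

        // UB (see Edit 2): `deallocate` requires the layout used for allocation.
        unsafe { self.deallocate(ptr, layout_with_recovered_alignment) };
    }
}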

Note that allocator implementations that are insensitive to layout alignment during deallocation (e.g., C's free) may simply override this method to call the inner deallocation function, incurring no additional runtime overhead. (The documentation was copied from Memory fitting.)

Such a method would allow memory reuse for types of the same size (in bytes) but with different alignments. For example, it would enable in-place mapping of [T] to [U].[1]

Consider this example:

#[repr(align(2))]
struct Wrapper([u8; 2]);

pub fn eq_aligned(inp: Vec<u16>) -> Vec<Wrapper> {
    assert!(align_of::<u16>() == 2);
    assert!(align_of::<Wrapper>() == 2);

    inp.into_iter()
        .map(|t| Wrapper(t.to_ne_bytes()))
        .collect()
}

pub fn ne_aligned(inp: Vec<u16>) -> Vec<[u8; 2]> {
    assert!(align_of::<u16>() == 2);
    assert!(align_of::<[u8; 2]>() == 1);
    
    inp.into_iter()
        .map(u16::to_ne_bytes)
        .collect()
}

Now, look at the generated assembly in Compiler Explorer. If I'm not mistaken, this size and alignment check is preventing an optimization. This is understandable, as GlobalAlloc requires the layout used for deallocation to match the one used for allocation. However, if a method like the previously mentioned deallocate_unaligned existed, Vec could potentially use it to reuse memory and deallocate it with a layout of a different alignment.
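
The mismatch that blocks the reuse can be seen directly on the layouts involved. The following is only a sketch: `source_layout` and `target_layout` are hypothetical helpers that compute the buffer layout a `Vec<u16>` and a `Vec<[u8; 2]>` of the same capacity would use, and the sketch says nothing about how `Vec` is actually implemented internally.

```rust
use std::alloc::Layout;

/// Layout of a buffer holding `cap` values of type `u16`.
pub fn source_layout(cap: usize) -> Layout {
    Layout::array::<u16>(cap).expect("capacity overflow")
}

/// Layout of a buffer holding `cap` values of type `[u8; 2]`.
pub fn target_layout(cap: usize) -> Layout {
    Layout::array::<[u8; 2]>(cap).expect("capacity overflow")
}

/// True when both buffers occupy the same number of bytes but their
/// layouts still differ, which is the `ne_aligned` situation.
pub fn same_size_different_layout(cap: usize) -> bool {
    let (src, dst) = (source_layout(cap), target_layout(cap));
    src.size() == dst.size() && src != dst
}
```

For any non-zero capacity the two layouts have the same size and differ only in alignment (2 versus 1), so a buffer allocated with the first layout cannot legally be passed to `dealloc` with the second.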

I understand that the size of the memory layout is crucial for deallocation. However, does the deallocation method truly depend on the alignment of the layout used during allocation, or on the alignment of the pointer it returned?

What prevents us from introducing a method like deallocate_unaligned to allow deallocation using a layout with a different alignment?

I haven't been able to find a definitive answer. Libraries like jemalloc and glibc perform some unreadable sorcery.[2]

Edits

  • Edit 1: Changed the title.
  • Edit 2: Noted that the implementation of deallocate_unaligned is UB.

Footnotes


  1. More specifically, it would allow in-place mapping of [T] to [U] when !T::IS_ZST && !U::IS_ZST && size_of::<T>() % size_of::<U>() == 0.

  2. However, for instance, rulloc's Block::from_allocated_pointer, used when deallocating, implicitly trusts the alignment of the provided layout. Yet, if you look at Bucket::allocate, you’ll notice it could compute alignment from the provided pointer instead of relying on the given layout’s alignment.

Yes, asking for overaligned memory often results in a different allocation strategy than for normally-aligned memory, which the deallocator would need to know. Some deallocators can figure that out after the fact, others take advantage of not having to do so.

…hypothetically. I have no citations to show you at the moment.

An allocator may use the alignment provided in the requested layout to e.g. select a different arena to allocate from. The resulting allocation may then be more aligned than requested, but you still need the original requested alignment to select the correct arena to deallocate from.

As such your proposed implementation of deallocate is UB because it may call deallocate with a different alignment than the one originally requested.
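
The failure mode can be modeled without touching real memory. This is a toy sketch, not a description of any particular allocator: the `Arena` enum, the `MIN_ALIGN` threshold of 16, and the functions `arena_for` and `recovered_align` are all assumptions made up for illustration.

```rust
/// Hypothetical threshold above which the toy allocator switches strategy.
pub const MIN_ALIGN: usize = 16;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Arena {
    /// Used for requests with `align <= MIN_ALIGN`.
    Normal,
    /// Used for over-aligned requests.
    Overaligned,
}

/// The toy allocator picks an arena purely from the requested alignment.
pub fn arena_for(requested_align: usize) -> Arena {
    if requested_align <= MIN_ALIGN {
        Arena::Normal
    } else {
        Arena::Overaligned
    }
}

/// Alignment recovered from the address alone (lowest set bit).
pub fn recovered_align(addr: usize) -> usize {
    debug_assert!(addr != 0);
    1usize << addr.trailing_zeros()
}

/// Returns the arena chosen at allocation time and the arena a
/// deallocation based on the recovered alignment would pick.
pub fn arenas(requested_align: usize, addr: usize) -> (Arena, Arena) {
    (arena_for(requested_align), arena_for(recovered_align(addr)))
}
```

For a block requested with alignment 8 that happens to land at address `0x1000_0040`, `arenas(8, 0x1000_0040)` returns `(Arena::Normal, Arena::Overaligned)`: the address looks 64-aligned, so the deallocation would go to the wrong arena.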

Thanks for the feedback. Yes, you are right: this implementation is undefined behavior (UB) because it relies on the Allocator::deallocate method, which requires the same layout that was used during allocation. A better approach would be to use a helper trait, something like this:

/// An allocator that doesn't require a layout for deallocation, growth, or shrinking.
pub trait OpaqueAllocator: Allocator {
    // Required methods
    unsafe fn deallocate(&self, ptr: NonNull<u8>);

    // Provided methods
}

Types that benefit from reusing existing memory could take advantage of specialized implementations for this kind of allocator.

That said, would the Rust-lang team consider adding such a trait if the performance benefits were substantial?

Currently, even the default allocator on Windows requires knowing the alignment in order to properly deallocate memory. See library/std/src/sys/alloc/windows.rs on the master branch of rust-lang/rust on GitHub.

Rust has taken the opportunity to define a more useful allocator interface that unlocks more efficient allocator designs in the future. The requirement to pass the correct alignment is part of unlocking extra efficiency in allocator design.

So yes, the requirement is annoying, but we've collectively decided that it's worth it.

Thanks for pointing me to Rust's default allocation method! I had been searching for it but couldn't find it.