Deallocate Vec without needing capacity

I recently read "Perhaps Rust needs 'defer'" and was a little surprised that the author couldn't find a sound way to drop a Vec with just the pointer and length, and no capacity. After all, the implementation of drop for a Vec doesn't need to know the capacity, just the pointer (to deallocate memory) and the length (to drop items within the Vec).

Playing around with miri, I did find a couple of methods that do currently work without needing the capacity, but I'm not entirely sure they are sound.

The first is to pass the length as the capacity when calling from_raw_parts. However, that explicitly violates the safety condition documented on from_raw_parts that "capacity needs to be the capacity that the pointer was allocated with."

The second is to use ptr::slice_from_raw_parts_mut to get a slice from the pointer and length, then call Box::from_raw on that, and drop the resulting Box. This at least doesn't obviously violate any safety conditions I'm aware of, but it isn't obviously sound either. It isn't entirely clear to me whether the Layout of the memory allocated for a Vec<T> is guaranteed to be the same as for a Box<[T]>.

I think that this could be addressed in a number of ways:

  • Add a new method with a signature like fn drop_in_place(ptr: *mut T, len: usize); on the Vec struct that can be used to drop and deallocate a Vec, without needing to keep track of the capacity.
  • Relax the safety requirements on Vec::from_raw_parts so that capacity must be <= the capacity it was allocated with, and >= the length, so that if you no longer have access to the original capacity, you can still use the length as the capacity to reconstruct the Vec. From what I can tell, the only issue that using a smaller capacity could cause (provided it is at least as large as the length) is that if you add additional items you could end up needing to realloc sooner than necessary.
  • Document using slice_from_raw_parts_mut and Box::from_raw as a way to accomplish this, and guarantee that it is safe to convert between a Vec<T> and Box<[T]> using the raw pointer and length as an intermediate step.

This only works because the specific allocator does its own bookkeeping of the size of the allocations (which is the norm in C and C++). But because the standard dealloc function takes the Layout, it is possible to replace it with an allocator that doesn't do this kind of bookkeeping.


It isn't: the layouts differ whenever the Vec has excess capacity, so Vec::into_boxed_slice must reallocate to get rid of it. (But if length and capacity are equal, the std API for this conversion does not do any reallocation.)

The other direction is <[_]>::into_vec or a From impl.
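As a minimal sketch of the conversion described above, the following goes through Vec::into_boxed_slice first, so that only a pointer and a length need to be kept around afterwards. The helper names into_ptr_len and drop_ptr_len are hypothetical, not std APIs, and the sketch assumes the default global allocator.

```rust
/// Hypothetical helper: discards excess capacity via `into_boxed_slice`,
/// then leaks the box as a (pointer, length) pair.
fn into_ptr_len<T>(v: Vec<T>) -> (*mut T, usize) {
    // May reallocate if the Vec has excess capacity.
    let boxed: Box<[T]> = v.into_boxed_slice();
    let len = boxed.len();
    let raw: *mut [T] = Box::into_raw(boxed);
    (raw as *mut T, len)
}

/// Hypothetical helper: drops the elements and frees the buffer.
///
/// # Safety
/// `ptr` and `len` must come from a single call to `into_ptr_len`,
/// and the buffer must not have been freed already.
unsafe fn drop_ptr_len<T>(ptr: *mut T, len: usize) {
    let raw: *mut [T] = std::ptr::slice_from_raw_parts_mut(ptr, len);
    // Rebuild the Box<[T]> that `into_ptr_len` leaked, then drop it.
    drop(unsafe { Box::from_raw(raw) });
}
```

Because the allocation behind a Box<[T]> is sized for exactly len elements, the length alone is enough to reconstruct it. The cost is the possible reallocation inside into_boxed_slice.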


The capacity is needed to deallocate the memory, since it's required by the Allocator/GlobalAlloc trait.

The layout of the memory allocated with a Vec<T> is the same as an array of Ts with length equal to its capacity. So you still need to know its capacity to do this.
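To illustrate the point, here is a small sketch comparing the layout implied by a Vec's length with the one implied by its capacity. The function names are made up for this example, and it assumes, as the from_raw_parts safety requirements suggest, that the buffer is described by Layout::array::<T>(capacity).

```rust
use std::alloc::Layout;

/// Layout of a buffer holding `n` values of type `T`.
fn buffer_layout<T>(n: usize) -> Layout {
    Layout::array::<T>(n).expect("layout size overflow")
}

/// Returns (layout implied by the length, layout implied by the capacity).
/// Only the second one describes the block that was actually allocated.
fn len_and_capacity_layouts<T>(v: &Vec<T>) -> (Layout, Layout) {
    (
        buffer_layout::<T>(v.len()),
        buffer_layout::<T>(v.capacity()),
    )
}
```

For a Vec<u32> created with Vec::with_capacity(16) and holding three elements, the first layout is 12 bytes while the second is at least 64 bytes, so passing the length-based layout to dealloc would describe a different block than the one that was allocated.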


Rust's allocator API uses Layout to identify allocations, and that includes the capacity (allocated size) in deallocation. Allocators are allowed to take advantage of this, e.g. have separate pools of objects grouped by the size, and find the correct pool based on the size from Layout instead of looking it up from the pointer.
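A rough sketch of what "find the correct pool based on the size from Layout" could look like is below. The size_class function and the MIN_CLASS_BYTES constant are hypothetical and not taken from any real allocator; the sketch uses power-of-two size classes and ignores alignment for brevity.

```rust
use std::alloc::Layout;

/// Smallest pool size in bytes (an assumption made for this sketch).
const MIN_CLASS_BYTES: usize = 16;

/// Hypothetical pool lookup: maps a layout to the index of a
/// power-of-two size class, using only the size stored in the Layout.
fn size_class(layout: Layout) -> usize {
    let rounded = layout.size().max(MIN_CLASS_BYTES).next_power_of_two();
    (rounded / MIN_CLASS_BYTES).trailing_zeros() as usize
}
```

With this scheme, Layout::array::<u32>(16) (64 bytes) lands in class 2, while Layout::array::<u32>(3) (12 bytes) lands in class 0. An allocator built this way would return the block to the wrong pool if dealloc were called with the length-based layout instead of the capacity-based one.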

I hoped that at least the unused size could be optimized out for allocators that don't need it, but unfortunately Rust allocators are not implemented like other types. Instead of being called directly in an inlineable way, they are called through an extern __rust_dealloc function, which can't be inlined. For LLVM's optimizations it must carry a special function attribute marking it as the deallocator, and LLVM, very disappointingly, loses function attributes when it inlines a function.


We could recover this by having allocator attributes: "Const flags describing allocator properties" (Issue #124, rust-lang/wg-allocators on GitHub).


Ok, so at least for some allocators (at least theoretically), it isn't sound. And a library wouldn't even be able to rely on the Global allocator, because it might be a custom allocator.

But what about for allocators that don't need the size for deallocation, such as malloc and jemalloc? If an application knows it is using an allocator that doesn't need that information, is it ok to convert the vec to a box without a realloc?

No, because you can't rely on implementation details. Some code might make decisions based on the capacity, or pass assumptions to LLVM, which makes violating the invariants UB.


No, but it's ok for the implementation of the allocator you're using to check that case and make the realloc always trivially succeed.