At the moment, Rust's std and alloc crates assume the existence of a single global allocator, and that it's impossible to run out of memory. These are great assumptions for user-land processes, but, in principle, a perfectly robust and perfectly reusable library would love to opt out of them.
My understanding is that "perfect" libraries, which do not want to bake this assumption in, would like to either accept a caller-provided allocation, or an allocator. Zig's APIs are a great example here --- every function that can allocate has an a: &Allocator parameter.
However, I actually haven't seen libraries parameterized by the allocator in the wild. Libs like regex or fontdue look like they could, in theory, expose an API which uses a caller-provided allocator, but they are written under the alloc assumption anyway.
Why is this the case? Is it because no one actually needs this additional flexibility? Or is it because Rust's type system makes expressing this pattern significantly more awkward than just using a global allocator?
I don't think that's the case here; it seems this doesn't need any language/library features? The ongoing custom allocator work is orthogonal -- it makes the global infallible allocator model more flexible, but it doesn't fundamentally change it.
In more concrete terms, it seems that we have had all the building blocks here for a long time? I am imagining an API like this could work:
Obviously, there are a lot of details library designers would have to nail down here, but, fundamentally, there's no requirement that the API is provided by std.
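A minimal sketch of the shape this could take, not an actual regex API. `RegexAllocator` is the name used in this thread; its fields, the `Pattern` stand-in type, `AllocError`, and the `global`/`failing` helpers are all assumptions made up for illustration.

```rust
use std::alloc::Layout;
use std::ptr::NonNull;

/// Hypothetical caller-provided allocator: a pair of plain function pointers.
/// Both follow the same contract as `std::alloc::alloc` / `std::alloc::dealloc`.
#[derive(Clone, Copy)]
pub struct RegexAllocator {
    pub alloc: unsafe fn(Layout) -> *mut u8,
    pub dealloc: unsafe fn(*mut u8, Layout),
}

#[derive(Debug, PartialEq)]
pub struct AllocError;

/// Stand-in for a compiled regex: it only owns a copy of the pattern text.
pub struct Pattern {
    ptr: NonNull<u8>,
    len: usize,
    a: RegexAllocator,
}

impl Pattern {
    /// Fallible constructor: allocation failure is reported, not aborted on.
    pub fn new_in(a: RegexAllocator, src: &str) -> Result<Pattern, AllocError> {
        if src.is_empty() {
            // Nothing to allocate; `Drop` skips deallocation when `len == 0`.
            return Ok(Pattern { ptr: NonNull::dangling(), len: 0, a });
        }
        let layout = Layout::array::<u8>(src.len()).map_err(|_| AllocError)?;
        // SAFETY: `layout` has a non-zero size.
        let raw = unsafe { (a.alloc)(layout) };
        let ptr = NonNull::new(raw).ok_or(AllocError)?;
        // SAFETY: `ptr` points to `src.len()` freshly allocated bytes.
        unsafe { std::ptr::copy_nonoverlapping(src.as_ptr(), ptr.as_ptr(), src.len()) };
        Ok(Pattern { ptr, len: src.len(), a })
    }

    pub fn as_str(&self) -> &str {
        // SAFETY: the bytes were copied from a valid `&str` in `new_in`.
        unsafe {
            let bytes = std::slice::from_raw_parts(self.ptr.as_ptr(), self.len);
            std::str::from_utf8_unchecked(bytes)
        }
    }
}

impl Drop for Pattern {
    fn drop(&mut self) {
        if self.len != 0 {
            let layout = Layout::array::<u8>(self.len).unwrap();
            // SAFETY: same allocator and same layout as in `new_in`.
            unsafe { (self.a.dealloc)(self.ptr.as_ptr(), layout) };
        }
    }
}

/// Adapter that forwards to the global allocator.
pub fn global() -> RegexAllocator {
    RegexAllocator { alloc: std::alloc::alloc, dealloc: std::alloc::dealloc }
}

// A hypothetical allocator that always reports failure, to exercise the error path.
unsafe fn refuse(_: Layout) -> *mut u8 { std::ptr::null_mut() }
unsafe fn ignore(_: *mut u8, _: Layout) {}

pub fn failing() -> RegexAllocator {
    RegexAllocator { alloc: refuse, dealloc: ignore }
}
```

The value stores its allocator so that `Drop` can free the memory without outside input, which costs two extra pointers per `Pattern` here.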
A big problem is awareness and lack of examples. If I know nothing more about allocators than that several exist, and the standard library is my reference for good API design, then finding nothing helpful there results in me ignoring custom allocators. There is also the unsafe, even if you try to keep it simple. Less unsafe is better.
I think it wouldn't be a bad idea to make that more flexible by:
Putting the functionality behind a trait, allowing for one of a multitude of potential allocators. This does assume that each allocator can adhere to the same API, which I don't know for a fact to be true.
Is there a way to lift those fns into the safe realm? It's psychologically unsettling to see unsafe sprinkled all over code. These days I kind of see lots of unsafe in a crate as a bit of a red flag, and this would drop buckets of unsafe all over collection-like crates.
I mean, yes, actually making something usable, safe-ish & ergonomic out of RegexAllocator would require a bunch of design work, but that's beside the point.
The point is that the work isn't blocked on anything language-wise, yet I find it surprising that I can't name a single library which tries to do that. So I am wondering why that is the case.
This I can answer from my own subjective experience: it's just easier to rely on the global allocator, and it's seldom an actual goal for me to control allocation strategy. The implicit assumption there is that the global allocator is "good enough".
I suppose the actual design of such a struct would be close to Waker/RawWaker? It'd be unsafe to construct but there is a public representation, and the use should be safe. It's not quite obvious how to do all of this in a user-defined crate outside of std if one wants to potentially make use of the vtable-hacks for a performant integration with trait impls.
That's not quite true. With regard to allocated data structures, there are a lot of paper cuts that the standard library's containers solve by virtue of being integrated with the compiler and using unstable features. For example, coercion of Box<T> -> Box<dyn Trait>, as well as a couple of const methods, and the #[may_dangle] attribute. I suppose a fully custom alloc, independent of its GlobalAlloc but with functionality comparable to it, is slowly getting possible, but currently you have to be quite inventive to work around restrictions. Plus, you need to duplicate a fair amount of code for the actual data structures.
In other regards, without-alloc tries to do something similar but actual integration with non-global allocators was halted on said paper cuts and the unclear direction of the allocator traits.
Still, I consider this path easier than retrofitting full allocator customization into std and there are a lot of lessons we can learn from the manner in which this was done in C++.
The std::alloc::Allocator trait exists... on nightly. I think there are some takeaways from that trait. For example, the API isn't that simple. We need to take alignment into consideration, being able to grow an allocation is common, etc. (If those functionalities aren't given, the burden on every library maintainer is even greater -- allocate and copy every time you need to grow, etc. The needs of the many library maintainers outweigh the implementation burden of the few allocator maintainers.)
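To make the alignment and grow points concrete, here is a rough sketch of a user-defined trait built only on the stable `std::alloc` functions. `SimpleAlloc` and `Global` are hypothetical names for this example and are not the nightly `Allocator` trait; the default `grow` is the allocate-and-copy fallback described above.

```rust
use std::alloc::{self, Layout};
use std::ptr::{self, NonNull};

/// Hypothetical minimal allocator trait (an assumption, not std's API).
/// `Layout` carries both size and alignment, so callers cannot forget either.
pub unsafe trait SimpleAlloc {
    /// Returns `None` on failure instead of aborting.
    fn allocate(&self, layout: Layout) -> Option<NonNull<u8>>;

    /// SAFETY: `ptr` must come from `allocate` on `self` with the same `layout`.
    unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout);

    /// Default grow: allocate a new block, copy, free the old one.
    /// An allocator that can extend in place would override this.
    ///
    /// SAFETY: `ptr` must be live, allocated with `old`, and `new.size() >= old.size()`.
    unsafe fn grow(&self, ptr: NonNull<u8>, old: Layout, new: Layout) -> Option<NonNull<u8>> {
        debug_assert!(new.size() >= old.size());
        let fresh = self.allocate(new)?;
        unsafe {
            ptr::copy_nonoverlapping(ptr.as_ptr(), fresh.as_ptr(), old.size());
            self.deallocate(ptr, old);
        }
        Some(fresh)
    }
}

/// Forwards to the process-wide global allocator.
pub struct Global;

unsafe impl SimpleAlloc for Global {
    fn allocate(&self, layout: Layout) -> Option<NonNull<u8>> {
        if layout.size() == 0 {
            // Sketch simplification: zero-sized requests are simply refused.
            return None;
        }
        // SAFETY: `layout` has a non-zero size.
        NonNull::new(unsafe { alloc::alloc(layout) })
    }

    unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout) {
        unsafe { alloc::dealloc(ptr.as_ptr(), layout) }
    }
}
```

Even this stripped-down version needs an `unsafe trait`, two `unsafe fn`s and a documented contract, which gives a feel for why library authors would rather wait for one shared definition.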
Also, if I'm a library maintainer and I know this is in the pipeline, do I really want to define some custom type for supporting fallible allocation? If there's going to be a standard way to do this in the ecosystem generally, I'd be setting myself up for churn by (re-)creating my own. Additionally, allocators are infectious. What if Regex allocates a Vec? The gains are quite limited until the standard library types support custom allocators. Certainly I'll want to use the same interface that they do.
What if I limit myself to allocating types outside of the standard library? Hopefully they all offer such an interface too... that takes a compatible allocator type. Without a standard and common approach, the risk of an incompatible soup of allocation approaches seems high. Isn't defining this interface a role the standard library should fulfill?
I also suspect demand is low. The common case is to not care about the allocation. If you do decide to support some sort of allocation struct (or trait) and implement it, you'll need your own unsafe allocate, write data, and transmute function for creation... and one for drop, too. You'll also need to add some reference to the non-global allocator to your data structure. Every mutating method that potentially grows your allocation will need a second version that returns Result<_, AllocationError>. If there's not a lot of people clamouring for it, all those costs may be too great to justify.
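The "second version of every mutating method" cost can be illustrated with what stable Rust already offers. This is only a sketch: `Names` is a made-up type, it still uses the global allocator, and the error type is std's `TryReserveError` rather than the `AllocationError` named above.

```rust
use std::collections::TryReserveError;

/// Hypothetical collection-like type, used only to show the API doubling.
#[derive(Default)]
pub struct Names {
    items: Vec<String>,
}

impl Names {
    pub fn new() -> Self {
        Names { items: Vec::new() }
    }

    /// Infallible version: aborts the process if the allocation fails.
    pub fn push(&mut self, name: String) {
        self.items.push(name);
    }

    /// Fallible twin: reserve first, so the `push` below cannot reallocate.
    pub fn try_push(&mut self, name: String) -> Result<(), TryReserveError> {
        self.items.try_reserve(1)?;
        self.items.push(name);
        Ok(())
    }

    /// Fallible twin of a hypothetical `reserve` method.
    pub fn try_reserve(&mut self, additional: usize) -> Result<(), TryReserveError> {
        self.items.try_reserve(additional)
    }

    pub fn len(&self) -> usize {
        self.items.len()
    }
}
```

Note that the `String` argument was itself allocated infallibly by the caller, so the guarantee only covers the outer `Vec`. That is the "allocators are infectious" problem in miniature.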
I think you need to ask why C libraries use custom allocators?
The cases where I've used custom allocators in C are kinda C-specific:
instrumentation to log or check all allocations and deallocations. That's because these are often buggy in C programs, so it's necessary to watch them.
memory pools. Related to the point above - deallocation is hard in C, and freeing an entire pool is easy. Mempools are also used for performance, but modern allocators aren't that slow, and Rust is reasonably good at avoiding allocations in the first place.
soft memory limits. That's still useful in Rust, but a global limit is often good enough. I use cap in every service. The bigger problem is that Rust hard aborts on OOM instead of panicking.
enforcing bigger alignment for SIMD. malloc aligns for double, but not for 4×double. This is not great in Rust (e.g. in case of Vec<u8>), but doable where #[repr(align())] or slice::align_to can be used.
returning "owned" objects from C libraries, which implies that the caller frees them with their free(). In some environments (e.g. Windows DLL) this is not safe, because each library's free may be incompatible. However, Rust has Drop, so it can enforce use of the correct (de)allocator every time.
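For the SIMD alignment item in the list above, a small sketch of the two stable tools it mentions. `Lane`, `aligned_lanes`, and `split_for_f64` are hypothetical names chosen for this example.

```rust
/// A 4 x f64 lane with 32-byte alignment; a `Vec<Lane>` inherits that alignment.
#[repr(align(32))]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Lane(pub [f64; 4]);

/// Allocates `n` zeroed lanes through the ordinary global allocator.
pub fn aligned_lanes(n: usize) -> Vec<Lane> {
    vec![Lane([0.0; 4]); n]
}

/// Splits a byte slice into an unaligned head, an f64-aligned middle, and a tail,
/// returning the three lengths (head and tail in bytes, middle in f64 elements).
pub fn split_for_f64(bytes: &[u8]) -> (usize, usize, usize) {
    // SAFETY: every bit pattern is a valid `f64`.
    let (head, mid, tail) = unsafe { bytes.align_to::<f64>() };
    (head.len(), mid.len(), tail.len())
}
```

How the bytes are divided between head, middle, and tail depends on where the slice happens to sit in memory, so code using `align_to` must handle all three parts.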
AIUI, the biggest reason library types aren't parameterized by allocator is that std isn't. 99% of libraries use std/alloc abstractions to handle allocation, rather than doing so directly, so they can only be as generic as std is.
Also, nobody really wants to add another pointer just to be generic over the dealloc, which will be global dealloc 99.9% of the time.
As for runtime injection rather than type parametrization, the problem is Drop. The type has to know how to dealloc itself at an arbitrary timing, without outside input. Also, providing the wrong de/alloc to an allocating method would be super quick UB.
How does this handle mixing allocators? Can I do the equivalent of two pushes to a vector with two different allocators?
My instinct, here, is that doing it at the type/object level (and not the method level) is going to be right for Rust, and that's what the allocator WG is working on with things like Box<T, A>.
I think it isn't widespread in the ecosystem because the ecosystem follows std's lead, for good or bad. It's the same reason dynamic dispatch isn't used much in the ecosystem, even in cases where it would clearly be the right trade-off (e.g. a huge amount of code getting monomorphized with minimal chance of optimization benefits, not in a hot path in any reasonable application).
They're also waiting for libstd to stabilize an Alloc API which will be standard, so that they don't have to break compatibility to upgrade to it in the future. This project has been ongoing for many years now and hasn't shipped yet, for various reasons.
I recently had an experience where it seemed plausible that we would have benefited from allocating some serde deserialization into a local arena allocator; we just switched our global allocator from malloc to jemalloc and it was Good Enough.
I agree that type parameters are a better approach than fn parameters - for convenience - but there's nothing wrong with mixing allocators if both of their heaps outlive the value being allocated into them (Zig isn't memory safe, so I guess that's the user's responsibility to ensure).
In my case, and since you picked on regex here (:P), I think it is actually exactly the point. In general, I view putting an allocator in the public API as a complication of the public API. That complication needs to either be worth it or it needs to not be a complication. In Zig land, I think it would "not be a complication," because the entire ecosystem is built around a custom allocator design. So I think this probably mirrors the "nobody does it because std doesn't do it" explanation that others have given.
As for whether it would be worth it... It's legitimately unclear to me who exactly would use regex if it had a custom allocator API but is not using it today. I think that goes for a lot of my libraries. The "custom allocator" use cases are things I am very ignorant of, because I don't tend to work in that space. And because Rust doesn't have great support for custom allocators, I think we are probably not seeing at least some set of folks that consider such a thing table stakes. So there's a negative feedback loop here, probably. And I don't know how to measure it.
This. Multimedia crates like lewton would very much like to use a custom allocator with a bounded capacity to prevent OOM conditions, but it's very awkward to implement and use - so much so that it never went past a prototype.
It would not even have to expose this to the user on every function, but merely use it internally, and even that is already too awkward.
How does this handle mixing allocators? Can I do the equivalent of two pushes to a vector with two different allocators?
Zig does both in this case. It has an ArrayList and an ArrayListUnmanaged. The former stores the allocator as a field to be reused, the latter takes an allocator on every method.
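A rough Rust rendering of that split, as a sketch only: `CountingAlloc`, `ListUnmanaged`, and `List` are hypothetical types for this example, and the "allocator" is just the global one with a counter of live blocks attached so the bookkeeping is observable.

```rust
use std::alloc::{self, Layout};
use std::cell::Cell;
use std::ptr::{self, NonNull};

/// Hypothetical allocator: global allocation plus a count of live blocks.
pub struct CountingAlloc {
    pub live: Cell<usize>,
}

impl CountingAlloc {
    pub fn new() -> Self {
        CountingAlloc { live: Cell::new(0) }
    }
    fn alloc(&self, layout: Layout) -> NonNull<u8> {
        // SAFETY: callers in this sketch only pass non-zero-sized layouts.
        let p = NonNull::new(unsafe { alloc::alloc(layout) }).expect("out of memory");
        self.live.set(self.live.get() + 1);
        p
    }
    unsafe fn dealloc(&self, p: NonNull<u8>, layout: Layout) {
        unsafe { alloc::dealloc(p.as_ptr(), layout) };
        self.live.set(self.live.get() - 1);
    }
}

/// "Unmanaged": stores no allocator, so every allocating method takes one.
pub struct ListUnmanaged {
    ptr: NonNull<u32>,
    len: usize,
    cap: usize,
}

impl ListUnmanaged {
    pub fn new() -> Self {
        ListUnmanaged { ptr: NonNull::dangling(), len: 0, cap: 0 }
    }

    /// SAFETY: `a` must be the allocator used by all earlier calls on this list.
    pub unsafe fn push(&mut self, a: &CountingAlloc, value: u32) {
        if self.len == self.cap {
            let new_cap = if self.cap == 0 { 4 } else { self.cap * 2 };
            let new_ptr = a.alloc(Layout::array::<u32>(new_cap).unwrap()).cast::<u32>();
            unsafe {
                ptr::copy_nonoverlapping(self.ptr.as_ptr(), new_ptr.as_ptr(), self.len);
                if self.cap != 0 {
                    a.dealloc(self.ptr.cast(), Layout::array::<u32>(self.cap).unwrap());
                }
            }
            self.ptr = new_ptr;
            self.cap = new_cap;
        }
        unsafe { self.ptr.as_ptr().add(self.len).write(value) };
        self.len += 1;
    }

    /// Frees the buffer. There is no `Drop`: forgetting this call leaks.
    /// SAFETY: same requirement on `a` as `push`.
    pub unsafe fn deinit(&mut self, a: &CountingAlloc) {
        if self.cap != 0 {
            unsafe { a.dealloc(self.ptr.cast(), Layout::array::<u32>(self.cap).unwrap()) };
        }
        *self = ListUnmanaged::new();
    }

    pub fn as_slice(&self) -> &[u32] {
        // SAFETY: the first `len` elements are initialized.
        unsafe { std::slice::from_raw_parts(self.ptr.as_ptr(), self.len) }
    }
}

/// "Managed": remembers its allocator, so the API is safe and `Drop` works.
pub struct List<'a> {
    inner: ListUnmanaged,
    a: &'a CountingAlloc,
}

impl<'a> List<'a> {
    pub fn new_in(a: &'a CountingAlloc) -> Self {
        List { inner: ListUnmanaged::new(), a }
    }
    pub fn push(&mut self, value: u32) {
        // SAFETY: the stored allocator is the only one ever used.
        unsafe { self.inner.push(self.a, value) }
    }
    pub fn as_slice(&self) -> &[u32] {
        self.inner.as_slice()
    }
}

impl Drop for List<'_> {
    fn drop(&mut self) {
        // SAFETY: same allocator as every `push`.
        unsafe { self.inner.deinit(self.a) }
    }
}
```

The unmanaged variant saves one pointer per value but pushes the "same allocator every time" obligation onto the caller, which in Rust means `unsafe` on each method. The managed variant pays for the pointer and gets a safe API plus `Drop`.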
The biggest issue in providing custom allocation support in the 'user' ecosystem is that most user crates that want to allocate use Box and Vec, and since those don't support custom allocators, any support for custom allocation would require custom versions of Box and Vec. I imagine that the last thing someone wanting to write regex wants to do is to start by implementing allocator-aware versions of Box and Vec. This means that only environments that can't avoid custom allocation actually invest time in rewriting the most basic types, and everyone else just uses std.
Interesting! I would actually expect crates that want to be flexible with respect to allocation to mostly not use Vec, Box, String and similar individually owned containers.
To take example from the lewton crate, it defines the following type:
If I were writing this as a reusable C library, I would probably (not that I actually have any substantial C experience) try to avoid this nested Vec of Vecs. Instead, I'd allocate the whole header as a single blob of bytes, which stores the strings inline, with some meta information about them at the start of the block.
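A sketch of that single-blob idea in safe Rust, writing into a caller-provided buffer. The byte layout and the names `PackedStrings`, `bytes_needed`, and `pack` are assumptions for illustration, not anything lewton defines.

```rust
/// Hypothetical layout of the blob (all integers little-endian u32):
/// [count][end offset of string 0]...[end offset of string count-1][string bytes]
pub struct PackedStrings<'a> {
    blob: &'a [u8],
}

/// Number of bytes `pack` needs for these strings.
pub fn bytes_needed(strings: &[&str]) -> usize {
    4 + 4 * strings.len() + strings.iter().map(|s| s.len()).sum::<usize>()
}

/// Packs `strings` into `buf`. Returns `None` if `buf` is too small or the
/// total size does not fit in a u32. No heap allocation happens here.
pub fn pack<'a>(strings: &[&str], buf: &'a mut [u8]) -> Option<PackedStrings<'a>> {
    let total = bytes_needed(strings);
    if buf.len() < total || total > u32::MAX as usize {
        return None;
    }
    let (buf, _) = buf.split_at_mut(total);
    buf[..4].copy_from_slice(&(strings.len() as u32).to_le_bytes());
    let data_start = 4 + 4 * strings.len();
    let mut end = 0usize; // running end offset, relative to `data_start`
    for (i, s) in strings.iter().enumerate() {
        buf[data_start + end..data_start + end + s.len()].copy_from_slice(s.as_bytes());
        end += s.len();
        buf[4 + 4 * i..8 + 4 * i].copy_from_slice(&(end as u32).to_le_bytes());
    }
    Some(PackedStrings { blob: buf })
}

impl PackedStrings<'_> {
    fn u32_at(&self, pos: usize) -> usize {
        let b = &self.blob[pos..pos + 4];
        u32::from_le_bytes([b[0], b[1], b[2], b[3]]) as usize
    }

    pub fn len(&self) -> usize {
        self.u32_at(0)
    }

    /// Returns string `i`, or `None` if `i` is out of range.
    pub fn get(&self, i: usize) -> Option<&str> {
        if i >= self.len() {
            return None;
        }
        let data_start = 4 + 4 * self.len();
        // String `i` starts where string `i - 1` ended.
        let start = if i == 0 { 0 } else { self.u32_at(4 * i) };
        let end = self.u32_at(4 + 4 * i);
        std::str::from_utf8(&self.blob[data_start + start..data_start + end]).ok()
    }
}
```

With this shape the library never decides where memory comes from: the caller asks `bytes_needed`, provides a stack array, an arena slice, or a `Vec<u8>`, and gets back a view borrowing it.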