Solving function ecosystem division

As Rust becomes more widely used, it attracts developers with a huge variety of requirements. System developers want to fully control where allocations occur [1], while the casual user wants as little friction as possible and does not want to worry about the unlikely scenario that the operating system cannot supply them with memory.

These goals are at odds with each other. So far, language development has danced between these two worlds and tried to satisfy both equally, but I feel this is going to become increasingly difficult. While adding fallible functions to Vec has worked out better than providing a separate FallibleVec [2], I still think having to use try_push instead of push in an environment where only fallible allocation is allowed (the Linux kernel) is a detriment to ergonomics, especially when we look at worse examples like Box::try_new_uninit. We do not need Box::new_uninit there, as it panics, but we would like to reuse the name...

I have not done thorough enough research to find out whether this has been proposed already, because I have no idea what to search for. I will temporarily name this feature "profile" (yes, it clashes with compilation profiles, so it should get a different name, but I could not come up with a better one). There are probably many things that we could do differently syntactically; please leave the bike-shedding for later.

The solution

What if we solve this issue of having to cater to multiple diverging worlds at the same time? I think we could introduce so called "profiles". They can be used in an impl block like this:

impl<T> Vec<T> in @default {
    pub fn push(&mut self, item: T) {}
}

I have used @ as a marker for profiles. The @default profile is the default profile! It would need to include the whole of std for backwards compatibility. But now we could add some more profiles:

pub profile no_std;

impl<T> Vec<T> in @no_std {
    pub fn push(&mut self, item: T) -> Result<(), T> {}
}

With the magic of profiles, we can have the same function name but a different signature and functionality. While this change is not backwards compatible, I think it could be done over an edition. Also, you are not locked out of using the function from another profile:

#![profile(no_std)] // this sets the default profile

fn main() {
    let mut v = Vec::new();
    v.push(42).expect("ran out of memory");
    <Vec<i32> in @default>::push(&mut v, 42); // this will panic if there is not enough memory
}

This feature would need extremely good documentation. I think every function that is only available in a specific profile should be annotated, similar to how std::os::unix items are. The docs should also offer an easily accessible drop-down menu to select the current profile, regardless of where you are on the page.

Future extensions

There are many ways to extend this feature. For example, adding #![deny_profile(default)], which errors on <Vec<i32> in @default>::push(). Or allowing orthogonal profiles that can be enabled independently. Another, more elaborate extension could be to add a profile @implicit_alloc and allow implicit allocations for the dyn Futures returned by a dyn TraitWithAsyncFn. This could be a bit extreme, but on the other hand it could also be exactly what Rust needs to become more accessible.

Disadvantages

More complexity for API designers:

  • which profile should I choose?
  • when should I create my own profile?

More possibility for bad APIs:

  • why does this API use @default even though it is compatible with #![no_core]?!
  • these two methods from the same profile are not even remotely related!

"Profile soup"

Users might have to enable too many profiles at the same time, so we would end up with:

#![profile(no_std, foo, bar, foo_and_bar)]

Users can still enable profiles that conceptually do not fit together:

#![profile(no_std, implicit_alloc)]

We could of course initially only allow the stdlib to create profiles, preventing the profile soup, and we could lint against combinations that do not fit together. But the first two problems still exist...


Please let me know what you think. For the moment I think we should constrain this feature to implementing inherent methods only.


  1. https://github.com/rust-lang/rfcs/pull/3271

  2. Feedback from adoption of fallible allocations

This seems very similar to Tyler's proposal for capabilities and contexts.

Also, with the full benefit of hindsight: if we had had ? back then as an ergonomic way to propagate failure, I wish we had just made all these methods fallible.

1 Like

What do you mean by similar? That this problem can be solved by capabilities? I am not so sure, as the signature would be entirely different... But I might be misunderstanding you.

Quoting the OP:

environment where one is only allowed to use fallible allocation (the Linux kernel)

But this hurts readability:

data.push(5);

Is it safe in the kernel? Maybe, if there is some config at the top of lib.rs?

3 Likes

I do not really understand your points. Could you elaborate on both of them? Because I think pushing an item into a vec is safe (why would it not be?), and I think in general this feature would improve readability.

What is the main benefit of having this over simply disallowing some functions in certain contexts? Rust doesn't have general function overloading, so I don't really get why we should allow overloading here.

#![profile(no_std)] // this sets the default profile

fn main() {
    let mut v = Vec::new();
    v.try_push(42).expect("ran out of memory");
    <Vec<i32> in @default>::push(&mut v, 42); // this will panic if there is not enough memory
}

If we adopt the try_ convention consistently, it shouldn't increase the workload significantly. Also, there might be use cases where end users want to opt in to handling allocation failure even on other setups, e.g.

let mut uintbuffer: Vec<u32> = Vec::new(); // Some potentially huge vec
let mut special_snowflake_ids: Vec<u32> = Vec::with_capacity(10); // Some normal vec
if uintbuffer.try_reserve(2 << 40).is_err() {
     // Downscale memory usage to avoid using such a huge buffer at the expense of performance
     // ...
}

It is really annoying when we cannot use the normal names for the operations. In the kernel, no allocation may panic; that is just a fact. Why do we have to give up ergonomics for a feature that we are forced to use by our constraints?

My proposal also allows that:

let mut uintbuffer: Vec<u32> = Vec::new(); // Some potentially huge vec
let mut special_snowflake_ids: Vec<u32> = Vec::with_capacity(10); // Some normal vec
if <Vec<_> in @no_std>::reserve(&mut uintbuffer, 2 << 40).is_err() {
     // Downscale memory usage to avoid using such a huge buffer at the expense of performance
     // ...
}

Honestly, it's hard for me to imagine a situation where one would want to use both panicking and fallible methods on Vec. Either you don't care about OOM and the most you do is occasional Vec::with_capacity or Vec::try_reserve, or you care about OOM and all panicking methods are a footgun which shouldn't be used.

This proposal sounds a bit like feature-gating parts of std, except that features are additive while profiles are not. There is also the possibility of using several profiles, but again, I find it hard to imagine why one would do it.

I believe that in the kernel you must use the try_ family of functions, and there should be lints/Clippy checks that catch misuse.

1 Like

The whole point of the proposal is to permit using the same function name but with a different semantic meaning. Why should we litter every call that might allocate with try_ when there are no calls without try_ anyway? I think dropping the prefix when it is obvious results in better readability.

Which is why I think #![deny_profile(default)] is important.

The entire point of this proposal is to provide better ergonomics. I do not see a way to achieve this with features.

This is the part that scares me:

with different semantic meaning.

Sometimes push is push and sometimes it is try_push. I believe this causes more problems than it solves.

15 Likes

I still don't understand why you would want to use both the default and no_std profiles in the same crate.

Nor do I see any other possible profiles, or uses of profiles beyond the alloc crate. If it's just to solve the OOM-panic issue, we need a finer-scoped tool and not a brand new feature with unclear semantics. If it has uses beyond that, it would be nice to include proper motivation.

Features are additive, but there have always been suggestions for something non-additive. Let's call them "alternatives", to distinguish them from usual features. Now, if features and alternatives could be used in sysroot libraries, we could introduce no_oom and handle_oom alternatives for alloc. The functions would be gated by the corresponding alternatives. Downstream crates could introduce similar alternatives if they want to support both fallible and infallible allocations.
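Purely to illustrate the shape of the idea, here is a minimal sketch of such gating. The cfg key alternative is invented for this example (it is not a real rustc or Cargo feature), and the gating is shown on a local type because inherent methods cannot be added to alloc's Vec from outside:

// Hypothetical, non-additive "alternative" selected at build time,
// e.g. via something like --cfg alternative="no_oom" (invented syntax).
pub struct MyVec<T>(Vec<T>);

#[cfg(alternative = "handle_oom")]
impl<T> MyVec<T> {
    // Infallible push: aborts/panics on OOM, as today.
    pub fn push(&mut self, item: T) {
        self.0.push(item);
    }
}

#[cfg(alternative = "no_oom")]
impl<T> MyVec<T> {
    // Fallible push: the caller must handle allocation failure.
    pub fn push(&mut self, item: T) -> Result<(), std::collections::TryReserveError> {
        self.0.try_reserve(1)?;
        self.0.push(item);
        Ok(())
    }
}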

The core issue that all proposals for fallible allocation hit is that fallibility is infectious. Anything that uses Vec::push now becomes fallible. Standard traits like Extend, FromIterator and ToString no longer work, because they don't support fallibility. Modifying the traits would introduce even more downstream problems.
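As a concrete (hypothetical) illustration of the infection problem, consider implementing Extend on top of a fallible push; MyFallibleVec and try_push below are made up for the example:

use std::collections::TryReserveError;

struct MyFallibleVec<T>(Vec<T>);

impl<T> MyFallibleVec<T> {
    fn try_push(&mut self, item: T) -> Result<(), TryReserveError> {
        self.0.try_reserve(1)?; // the only step that can fail
        self.0.push(item);      // cannot reallocate after the reserve
        Ok(())
    }
}

impl<T> Extend<T> for MyFallibleVec<T> {
    fn extend<I: IntoIterator<Item = T>>(&mut self, iter: I) {
        for item in iter {
            // Extend's signature returns (), so there is nowhere to send the
            // error: the impl can only panic or silently drop items.
            self.try_push(item).expect("Extend cannot report allocation failure");
        }
    }
}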

3 Likes

This is a list of traits from the stdlib which would be inapplicable to the standard allocating objects (alloc::*) if fallible allocations were required:

  • All FromIterator impls (and thus collect()). Can't collect if the allocation may fail.
  • All From<T> impls, for various T. For example, Vec has 10 From impls, while Box has 28. A few of them may stay, because the impl requires no reallocation (e.g. impl From<Box<[T]>> for Vec<T>), but most would have to become TryFrom impls.
  • Extend, Clone.
  • For Vec<u8>, Cow<T> and String, the impls of io::Write and fmt::Write would need to go, or at least change significantly.
  • For String, Add<&str> and AddAssign<&str>.

That's just the traits directly implemented for the collections.

4 Likes

This also actually requires reallocation to ensure capacity == len.

Sorry, I meant the reverse. Edited.

On the other hand... no, not really, I don't think ? is sufficient here. It is the right behavior for desktop applications to abort (or at least unwind) on allocation failure by default, because allocation failure is either a programmer mistake (an allocation too large to ever succeed) or the process is going to be OOM-killed by the OS anyway.

Even without that caveat, you would now be forcing every function to carry an annotation for the allocation effect. Again, for a hosted program, allocation is just an ambient thing that you can do. Even Haskell and Koka don't encode allocation as an effect.

Rust is first and foremost a practical language. I definitely think that encoding allocation in the default allocator as a capability is an interesting approach, but I also believe that Rust made the practical choice in making allocation infallible by default.

... This does give me an interesting idea, though. As a HUGE what-if: what if Global had an associated allocation error type of !? Then you could gate the infallible allocation methods on Err = !. This doesn't solve the need for the try_ prefix, but it does offer an answer to enforcing the use of fallible allocation.

(This also makes me realize I haven't considered how fallible allocation and Storage interact properly...)
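A minimal sketch of that gating idea, under assumptions: the allocator trait below is invented (std's Allocator does not have an associated error type like this), and Infallible stands in for the still-unstable never type !:

use core::convert::Infallible;
use core::marker::PhantomData;

// Invented trait: an allocator that advertises its allocation error type.
trait AllocFailure {
    type AllocError;
}

// A stand-in for Global: allocation "cannot fail" because it aborts instead.
struct AbortingGlobal;
impl AllocFailure for AbortingGlobal {
    type AllocError = Infallible;
}

struct GatedVec<T, A: AllocFailure> {
    data: Vec<T>,
    _alloc: PhantomData<A>,
}

impl<T, A: AllocFailure> GatedVec<T, A> {
    // The fallible method is always available.
    fn try_push(&mut self, item: T) -> Result<(), A::AllocError> {
        self.data.push(item); // placeholder for a real fallible allocation
        Ok(())
    }
}

impl<T, A: AllocFailure<AllocError = Infallible>> GatedVec<T, A> {
    // The infallible name only exists when the error type proves
    // that allocation cannot fail.
    fn push(&mut self, item: T) {
        match self.try_push(item) {
            Ok(()) => {}
            Err(never) => match never {},
        }
    }
}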

11 Likes

Haskell and Koka are functional languages. Haskell is based on a garbage collector, while Koka uses reference counting with smart ownership analysis, but still supports arbitrary object graphs. This makes them unable to care about allocations by design, since the allocation process is a hard requirement for the semantics of the program and is hidden from the programmer.

While most application code wouldn't care about OOM, it would still be valuable to have explicit allocation effects. You may not care about OOM with a global allocator, but you may care about OOM with local arenas, you may care that the global allocations can be easily swapped with local ones, and you likely care which functions have the capability to allocate, for performance and correctness reasons (allocations are generally slow, and uncontrolled dynamic allocations can lead to resource exhaustion).

It would be super valuable to have reified allocation effects, just as it's great that we have reified exceptions (Result) and reified asynchrony (Future). It's just that it's really hard to fit in Rust's design, doubly so if you want backwards compatibility.

I proposed that in the recent RFC, although it was off-topic and likely went unnoticed. But I doubt that it can be made to work in current Rust. If we want a seamless transition, then we want fn push(..) -> Result<(), !> to be usable wherever fn push(..) -> () is currently usable, but that is possible only if we introduce a type coercion from Result<(), !> to (). Coercions are a huge load of complexity and hidden issues, so it's unlikely to fly. Also, I don't see how this approach would solve the infectious ecosystem split that I talked about above (i.e. what would you do with the Clone, Extend and FromIterator traits?).

This would be an easier sell if we didn't have to care about backwards compatibility, but still, it requires at least some language features which currently don't exist in Rust.
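To make the coercion point above concrete (again using Infallible in place of the unstable ! type; push_new is a made-up stand-in for the new signature):

use std::convert::Infallible;

// The "new" signature: fallible in the type, but the error can never occur.
fn push_new(_item: u8) -> Result<(), Infallible> {
    Ok(())
}

fn existing_caller() {
    // Existing code was written against `fn push(..) -> ()`:
    // let () = push_new(1);   // type error without a Result<(), !> -> () coercion
    push_new(1).unwrap();      // so every call site would need an edit like this
}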

7 Likes

But this would be enabled on a module/crate level; in the kernel, the current push semantics would not occur.

You already mentioned the only real use case:

I just wanted to present this idea. I think we should solve the problem of splitting the ecosystem. Personally I do not think that plastering try_ methods everywhere is going to be the solution. It is distracting for people who do not use them and annoying for people who do not use the normal versions.

I agree, these traits are not usable in a fallible allocation world. Can we improve this situation? I think we could at some point extend Profiles to trait definitions and implementations.

While adding fallible functions to Vec has worked out better than providing FallibleVec

I don't understand why the linked post doesn't even discuss the mere possibility of having cheap conversions between those types. Some Into/AsMut impls should significantly reduce the friction (granted, not to zero).

What if we solve this issue of having to cater to multiple diverging worlds at the same time? I think we could introduce so called "profiles". They can be used in an impl block like this

This only seems to solve the naming problem, for which types are sufficient imo. It doesn't solve the other problem of detecting panicking code paths in third-party crates.

I think to enforce this in a watertight way we'd need an effect system and to mark functions with a no_oom_panic (or its negation?) effect. Crates/modules/impls could then blanket-apply that effect to all their functions to avoid repetitive annotations.

1 Like