pre-pre-RFC: Execution Context

Summary

Introduce a notion of execution context to Rust.

Motivation

This is primarily motivated by an issue with the Alloc trait that has been discussed in Tracking issue for custom allocators in standard collections (and possibly elsewhere, although that’s all I could find).

In the current Alloc trait, dealloc is a method. This means that if you want to create a type which takes an A: Alloc type parameter, and you want to have some of your code call dealloc, you need to store an instance of A (or a reference to A) internally. For the global allocator, A is a ZST, so this doesn’t matter. For non-global allocators, however, A or a reference to it is usually going to be at least a machine word in size.

For types that store large objects (e.g., Vec), this isn’t a big deal. For types which store small objects, however, this is a huge deal. Box<T, A: Alloc>, for example, at least doubles in size: instead of storing only a pointer to T, it stores a pointer to T and a reference to A. If you’re implementing something like a linked list or a tree, this represents a near-doubling of the size of your data structure.
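To make the size cost concrete, here is a minimal sketch. The names Arena, PlainBox, and BoxIn are hypothetical stand-ins invented for illustration; they are not the real Alloc trait or the real Box, and only the struct sizes matter here.

```rust
use std::mem::size_of;

// Hypothetical stand-in for a non-global allocator (not the real `Alloc` trait).
pub struct Arena {
    _buf: [u8; 64],
}

// A box that relies on an ambient (global) allocator: just one pointer.
pub struct PlainBox<T> {
    _ptr: *mut T,
}

// A box that must remember its allocator so that drop can deallocate:
// one pointer to the value plus one reference to the allocator.
pub struct BoxIn<'a, T> {
    _ptr: *mut T,
    _alloc: &'a Arena,
}

// Returns (size of PlainBox<u64>, size of BoxIn<u64>) in bytes.
pub fn box_sizes() -> (usize, usize) {
    (size_of::<PlainBox<u64>>(), size_of::<BoxIn<'static, u64>>())
}
```

On typical targets the second number is twice the first, since a thin pointer and a reference are each one machine word.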

In my mind, this points to a broader point about Rust: Rust’s story that resource cleanup always happens safely via ownership + drop implicitly relies on ambient authority. For example, Box<T> can only dealloc its contents on drop because the global heap is ambient authority. When resource cleanup needs to rely on non-ambient state, we’re left with no good options. Storing an A: Alloc is such a case.

Guide-level Explanation

My proposal is to introduce a notion of “execution context” to Rust. It would allow ambient authority to be replaced by context-scoped state. An execution context would have the following properties:

  • It is set by a caller
  • It stays in place for a function call (or a scope within a function call) and all of its sub-calls
  • Any code can query for the current execution context
  • Any code can set a sub-context which overrides the existing context for the duration of some smaller scope, similar to variable shadowing

It would be very similar to Go’s context, Scheme’s parameters, or Haskell’s Reader monad. In the case of an allocator, if a custom (non-heap) allocator were used (for example by a particular data structure), code (for example, methods on that data structure) would set the allocator as an element of the execution context. drop implementations - crucially, including Box - would have access to the allocator in order to perform deallocation without needing to store it inside themselves.
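As a rough illustration of those properties, here is a sketch that emulates an execution context at runtime with a thread-local stack and an RAII guard. This is an assumption about what such a feature could feel like, not a proposed implementation: set_context, current_context, ContextGuard, and callee are all hypothetical names, and the context is just a string label rather than a real allocator.

```rust
use std::cell::RefCell;

thread_local! {
    // Stack of active contexts for this thread; the innermost one is last.
    static CONTEXT: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

/// Guard that pops its context again when it goes out of scope.
pub struct ContextGuard(());

impl Drop for ContextGuard {
    fn drop(&mut self) {
        CONTEXT.with(|stack| {
            stack.borrow_mut().pop();
        });
    }
}

/// Install `name` as the current context until the returned guard is dropped.
pub fn set_context(name: &'static str) -> ContextGuard {
    CONTEXT.with(|stack| stack.borrow_mut().push(name));
    ContextGuard(())
}

/// Query the innermost context, if one has been set.
pub fn current_context() -> Option<&'static str> {
    CONTEXT.with(|stack| stack.borrow().last().copied())
}

/// A callee that is never handed the context as an argument.
pub fn callee() -> &'static str {
    current_context().unwrap_or("global heap")
}
```

A caller would write `let _ctx = set_context("arena A");` and every sub-call in that scope would then see "arena A" from callee(), with a nested set_context shadowing it until its guard is dropped. This emulation has obvious gaps: it relies on thread-local storage (so it does not answer the no_std question), a guard can be leaked with mem::forget, and nothing stops a value created under one context from being dropped under another.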

Reference-level Explanation

There is none. I don’t actually know how we’d implement this - I just want to prompt discussion.

Some important questions include:

  • Would this need explicit language support?
  • How can we ensure that this works in no_std?

I’m especially interested to hear feedback on:

  • Would this actually solve these sorts of ambient authority problems, or are there holes in this idea that folks notice?
  • Are there other good examples (besides custom allocators) of Rust code that would benefit from such a system?

How would you make sure a Box gets deallocated using the right allocator?


That's a good question; I'm not sure. I wonder if we could somehow use move semantics to ensure that once an object is created inside of a context with a particular allocator, it could never be used outside of that context.

E.g., you could imagine an AllocWith<T, A: Alloc> wrapper that had an interface such as:

impl<T, A: Alloc> AllocWith<T, A> {
    fn with<O, F: Fn(&T) -> O>(&self, f: F) -> O {
        // set_context is hypothetical: it installs the allocator until _ctx is dropped
        let _ctx = set_context(&self.alloc);
        f(&self.t)
        // _ctx goes out of scope here
    }
}
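For anyone who wants to experiment, here is a self-contained variant of that wrapper that compiles today. It is only a sketch under simplifying assumptions: the allocator is replaced by a plain u32 id (0 meaning the global heap), the context lives in a thread-local, and set_context, current_alloc, and ContextGuard are hypothetical helpers rather than anything that exists in std.

```rust
use std::cell::Cell;

thread_local! {
    // Id of the allocator currently in scope; 0 stands for the global heap.
    static CURRENT_ALLOC: Cell<u32> = Cell::new(0);
}

/// Restores the previously active allocator id when dropped.
pub struct ContextGuard {
    previous: u32,
}

impl Drop for ContextGuard {
    fn drop(&mut self) {
        CURRENT_ALLOC.with(|c| c.set(self.previous));
    }
}

/// Make `id` the current allocator until the returned guard is dropped.
pub fn set_context(id: u32) -> ContextGuard {
    let previous = CURRENT_ALLOC.with(|c| c.replace(id));
    ContextGuard { previous }
}

/// Query the allocator id that is currently in scope.
pub fn current_alloc() -> u32 {
    CURRENT_ALLOC.with(|c| c.get())
}

/// Wrapper that only exposes `t` while its allocator is the current context.
pub struct AllocWith<T> {
    t: T,
    alloc_id: u32,
}

impl<T> AllocWith<T> {
    pub fn new(t: T, alloc_id: u32) -> Self {
        AllocWith { t, alloc_id }
    }

    pub fn with<O, F: Fn(&T) -> O>(&self, f: F) -> O {
        let _ctx = set_context(self.alloc_id);
        f(&self.t)
        // _ctx is dropped here, restoring the previous allocator id
    }
}
```

Inside the closure passed to with, current_alloc() reports the wrapper's allocator id; outside it, the previous value is back. Since T is only reachable through with, code touching it always runs under the right context, but nothing here prevents T from being cloned out of the closure.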

Alternatively, I wonder if there's some way that we could leverage the lifetime system to have lifetimes carry contexts. That way, you could safely have AllocWith implement Deref and DerefMut for T, and ensure that the appropriate allocator was set for the duration of the references returned by deref and deref_mut. I have a feeling that would require language support, although maybe there's some clever way to create an object and use the lifetime system to ensure that it lives as long as the references returned.

(credit to Eli Rosenthal for the AllocWith idea)

You could tie this together with the “Send” trait. I.e., have the context be an isolated world gated by Send.

Currently Send is usually used to isolate threads, but there’s no reason I can see why it should not be used for units smaller than that. Alternatively you could make a new trait similar to “Send” but specific to this purpose.

So is the idea that just as a thread is an execution context, and the compiler knows what operations constitute sending between those two contexts, you could just generalize to any pair of contexts, even within the same thread? So you might have something like unsafe trait SendContext {} and then unsafe trait Send: SendContext {}, and any T: !SendContext couldn't be sent between contexts?

Just don’t use a LinkedList then.

To add to discussion, here’s a method of attack that’s come out of discussion around embedded languages and GC:

Attach a unique lifetime to your Alloc context and give it out in a closure. So, used like:

MyCustomAlloc.with(|alloc| {
    let boxed = Box::with_alloc(5, alloc);
    // etc.
});

where with has a signature along these lines:

impl MyCustomAlloc {
    fn with(&self, f: impl for<'opaque> FnOnce(MyCustomAllocRef<'opaque>));
}

This doesn’t solve how to get ahold of the allocator to deallocate (here alloc is effectively a PhantomData<invariant 'opaque>), but it prevents the closure from handing out any values tied to the lifetime, as the closure has to accept any lifetime.

That is,

let leaked_alloc;
MyCustomAlloc.with(|alloc| leaked_alloc = alloc);

cannot be done without unsafe code to smuggle 'opaque into a different, unrelated lifetime.
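Here is a compilable sketch of that branding technique, in case it helps. AllocRef, Branded, and the boxed method are hypothetical names invented for this example, and the value is simply stored in an ordinary Box on the global heap, so this shows only the lifetime trick and not actual custom allocation.

```rust
use std::marker::PhantomData;

/// Zero-sized token carrying an invariant `'opaque` lifetime.
/// `fn(&'a ()) -> &'a ()` makes the marker invariant in `'a`.
#[derive(Clone, Copy)]
pub struct AllocRef<'opaque> {
    _brand: PhantomData<fn(&'opaque ()) -> &'opaque ()>,
}

/// A value tied to the same brand; it cannot outlive the closure scope.
pub struct Branded<'opaque, T> {
    value: Box<T>, // placeholder: really the global heap in this sketch
    _brand: PhantomData<fn(&'opaque ()) -> &'opaque ()>,
}

impl<'opaque> AllocRef<'opaque> {
    /// Create a value branded with this allocator's lifetime.
    pub fn boxed<T>(self, value: T) -> Branded<'opaque, T> {
        Branded { value: Box::new(value), _brand: PhantomData }
    }
}

impl<'opaque, T> Branded<'opaque, T> {
    pub fn get(&self) -> &T {
        &self.value
    }
}

pub struct MyCustomAlloc;

impl MyCustomAlloc {
    /// The closure must work for every `'opaque`, so neither its return
    /// type `R` nor any captured variable can mention that lifetime.
    pub fn with<R>(&self, f: impl for<'opaque> FnOnce(AllocRef<'opaque>) -> R) -> R {
        f(AllocRef { _brand: PhantomData })
    }
}

// Rejected by the compiler, because `'opaque` would escape the closure:
//
//     let leaked_alloc;
//     MyCustomAlloc.with(|alloc| leaked_alloc = alloc);
```

Plain data can still leave the closure, e.g. `MyCustomAlloc.with(|alloc| *alloc.boxed(5).get() + 1)`, because the result type does not mention 'opaque.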

Perhaps relevant? Start of an effects system RFC (for async etc) - is there any interest in this?

Definitely relevant, although I think it doesn't solve 100% of our use cases. E.g., what happens if you really want to use a particular allocator for some of your allocations? The design described there would effectively redefine what the "heap" means in a particular code segment, which wouldn't allow you to use one allocator for a particular type of allocations and a different allocator for everything else.

EDIT: I have to think about this more. I may have spoken too soon.
