Pre-RFC: Contextual parameters

binarycat · November 10, 2024, 9:51pm

Problem Statement

Within certain codebases, there is certain state that needs to be made available to almost every function. Examples are loggers, async runtimes, and object capabilities.

Frequently, global variables are used for this purpose, but this can cause problems, such as the common mistake of awaiting a future that expects a tokio runtime without first starting such a runtime.

Proposal

Add a new context binding mode. This can appear on any identifier pattern, but has additional meaning when used on a function parameter.

When a function parameter is annotated with context, it is a contextual parameter. Contextual parameters must come after all non-contextual parameters. It is an error for a function to have two contextual parameters of the same type.

When a function with contextual parameters is called, some or all of those parameters may be omitted. If they are, the compiler searches the scope of the calling function for a binding of the correct type that is annotated with context^[1], and uses that binding as the argument^[2].

Example

fn do_thing(context rt: Executor<'_>) {
   // <run whatever async tasks>
}

fn main() {
  // note that `context` comes after all other binding modes
  let mut k#context executor = smol::Executor::new();
  do_thing();
}
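For comparison, here is a rough sketch of what the example above would presumably desugar to in today's Rust, with the argument passed explicitly. The `Executor` type here is a hypothetical stand-in (not smol's real API), made `Copy` as the footnote recommends, and `caller` is an illustrative name.

```rust
// Hypothetical stand-in for an executor handle; this is not smol's real API.
#[derive(Clone, Copy)]
struct Executor<'a> {
    name: &'a str,
}

// In the proposal this parameter would be written `context rt: Executor<'_>`.
fn do_thing(rt: Executor<'_>) -> String {
    format!("ran on {}", rt.name)
}

fn caller() -> (String, String) {
    // In the proposal this binding would carry the `context` binding mode.
    let executor = Executor { name: "local" };
    // The proposal would let both calls be written `do_thing()`;
    // the compiler would supply `executor` as the argument.
    let first = do_thing(executor);
    // `Executor` is `Copy`, so the binding is still usable after the first call.
    let second = do_thing(executor);
    (first, second)
}
```

If `Executor` were not `Copy`, the second call would fail to compile because the first call moved the value.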

This may be a contextual parameter, or any other binding ↩︎
It is recommended that context bindings implement Copy, as otherwise they will be moved ↩︎

jhpratt · November 11, 2024, 1:43am

See Tracking Issue for externally implementable items · Issue #125418 · rust-lang/rust · GitHub, which is the most recent proposal to do this (and more flexible!)

binarycat · November 11, 2024, 2:02am

that's... completely different? that seems to basically be an equivalent to C's weak linkage, where I am proposing something like scala's context parameters.

are you confusing bindings with items?

do you think they're the same because they can both be used to customize logging behavior? that's like saying mut statics are the same thing as the builder pattern.

one is process-global and requires integration with linkers, what i'm suggesting would be local to a specific call stack (thread, task, etc.) and would be purely syntactic sugar.

jrose · November 11, 2024, 2:35am

We do have one of these today, sort of: #[track_caller]. That’s not quite the same, of course, since in practice it’s guaranteed to have a compiler-generated source, but it is an example of an implicit parameter at the ABI level.

jhpratt · November 11, 2024, 4:24am

Ah, you want it for a specific call stack. I (wrongly) assumed your intent given the problem statement.

josh · November 11, 2024, 5:12am

You may want to look at this post about "contexts and capabilities" from a few years ago. There's still an ongoing desire to consider something like that.

DasLixou · November 11, 2024, 5:29pm

It seems to have a similar intent to Pre-pre-RFC/Working Prototype: Borrow-Aware Automated Context Passing - #2 by Radbuglet. Just posting to link those.

mathstuf · November 11, 2024, 9:48pm

Presumably you mean "two contextual variables of the same type in a given scope" as otherwise the automatic selection is ambiguous if there are two local context variables of the same type. There are also issues if the context parameter is something of &dyn ContextTrait or the like and multiple implementers are in scope. Let's say we have a context wrapper that logs or something:

struct LogContext<C>(C);
impl<C> ContextTrait for LogContext<C> where C: ContextTrait { /* impl */ }

fn uses_context_with_logging(context ctx: &dyn ContextTrait) {
    let k#context wrap = LogContext(ctx);
    use_ctx(); // Does this get `ctx` or `wrap` if it wants a `&dyn ContextTrait`?
}
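To make the ambiguity concrete, here is a sketch in today's Rust with explicit arguments. The trait body, the `Base` type, the `describe` method, and the blanket impl for references are all assumptions added so the example compiles; only `LogContext`, `ContextTrait`, `use_ctx`, and `uses_context_with_logging` come from the snippet above.

```rust
trait ContextTrait {
    fn describe(&self) -> String;
}

// Hypothetical base context.
struct Base;
impl ContextTrait for Base {
    fn describe(&self) -> String {
        "base".to_string()
    }
}

// Assumption: references to contexts are contexts too, so `LogContext(ctx)` works.
impl<T: ContextTrait + ?Sized> ContextTrait for &T {
    fn describe(&self) -> String {
        (**self).describe()
    }
}

struct LogContext<C>(C);
impl<C: ContextTrait> ContextTrait for LogContext<C> {
    fn describe(&self) -> String {
        format!("logged({})", self.0.describe())
    }
}

fn use_ctx(ctx: &dyn ContextTrait) -> String {
    ctx.describe()
}

fn uses_context_with_logging(ctx: &dyn ContextTrait) -> (String, String) {
    let wrap = LogContext(ctx);
    // Both bindings are acceptable arguments for `&dyn ContextTrait`,
    // so an implicit lookup by type alone would have two candidates.
    (use_ctx(ctx), use_ctx(&wrap))
}
```

Since both calls type-check, a rule beyond "a binding of the correct type" would be needed to pick one, for example preferring the innermost binding or rejecting the call as ambiguous.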

matthieum · November 15, 2024, 6:06pm

I am not sure that loggers / reporting / telemetry are a good example, here.

I think it's important to make a distinction between:

Dev Ops: logging, reporting, telemetry, ...
Function: async runtime, object capabilities, ...

The difference between the former and the latter being that the former should NOT affect the function of the library/application, and is purely of use for the developers/operators. Or otherwise said, if one were to deactivate all logging, reporting, telemetry, etc... either at compile-time or run-time, the user should not experience any difference (beyond performance fluctuation, perhaps).

Is that such a problem in practice?

If you develop a tokio application, you're likely to have #[tokio::main] on main, and thus all your code will be executed within the context of a runtime.

I think this argument would be better off mentioning the importance of making implicit dependencies explicit, in a way that the compiler can machine-verify them, rather than as just a comment in the documentation.

Weaknesses of context parameters

Which really should be mentioned in the RFC.

Plumbing Pains

One advantage of implicit (global) parameter passing is that the caller of the thingy need not be aware of the capabilities required by the callee.

This is all the more important for logging, for example. Imagine if, each time you wanted to add a log statement, you first had to pass the logger through 4 or 5 callers, and then modify all the call-sites of those callers since they now require the logger argument too.

This is pure pain, and why I strongly believe that using context parameters for logging is a mistake. The ergonomics are just terrible.
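A small sketch of that plumbing in today's Rust, assuming a hypothetical `Logger` type and made-up function names. Only `validate` actually logs, yet every function above it has to carry the parameter in its signature:

```rust
// Hypothetical logger type used only for illustration.
struct Logger {
    lines: Vec<String>,
}

// Neither of these two functions logs anything; they accept the
// logger only so that they can forward it further down.
fn handle_request(log: &mut Logger) -> u32 {
    parse(log)
}

fn parse(log: &mut Logger) -> u32 {
    validate(log)
}

// The only function that wanted a log statement.
fn validate(log: &mut Logger) -> u32 {
    log.lines.push("validate: ok".to_string());
    42
}
```

Contextual parameters would presumably let the call sites omit `log`, but the signatures of `handle_request` and `parse` would still need to change.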

Borrowing Pains

It should be noted that depending on the framework, there may not be any borrowing issue when using a global: if there's no re-entrancy path -- ie if no user-code is executed while borrowing the global -- then the global can be safely borrowed mutably.

This means that the global can pass through agnostic 3rd-party code without issue.

If using context parameters, however, unless the 3rd-party code which invokes the function wishing to use the context parameter actually passes them -- the exact same types, too -- then the only way to access this parameter is to "bundle it" in the callback. Very quickly this'll require reference-counting, which itself will require some cell, ... this leads to both a memory overhead, and possibly a performance overhead, for example for RefCell and Mutex.
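As an illustration of the "bundling" described above, here is a minimal sketch where `third_party_run` stands in for agnostic 3rd-party code that only accepts `'static` callbacks. All names are hypothetical. Because the callbacks cannot borrow a local, the context ends up behind `Rc<RefCell<..>>`:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical context that callbacks want to mutate.
struct Counter {
    hits: u32,
}

// Stand-in for 3rd-party code: it knows nothing about `Counter`
// and will not forward any extra parameters to the callbacks.
fn third_party_run(callbacks: Vec<Box<dyn FnMut()>>) {
    for mut cb in callbacks {
        cb();
    }
}

fn run_with_bundled_context() -> u32 {
    // Shared ownership plus interior mutability, paid for at runtime.
    let ctx = Rc::new(RefCell::new(Counter { hits: 0 }));
    let mut callbacks: Vec<Box<dyn FnMut()>> = Vec::new();
    for _ in 0..3 {
        let ctx = Rc::clone(&ctx);
        callbacks.push(Box::new(move || ctx.borrow_mut().hits += 1));
    }
    third_party_run(callbacks);
    let hits = ctx.borrow().hits;
    hits
}
```

Each callback carries a reference-counted pointer, and every access goes through a runtime borrow check; across threads this would become `Arc<Mutex<..>>` instead.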

In this sense, a global is more ergonomic...

epage · November 15, 2024, 6:46pm

There are a couple of RFCs like [RFC] externally implementable functions by m-ou-se · Pull Request #3632 · rust-lang/rfcs · GitHub that could replace the need for Context parameters for some use cases, like logging, assuming you don't need to supplant one of these within a given scope which is the big benefit for Context.

binarycat · November 15, 2024, 8:48pm

It's not a problem for applications, but it is a problem for libraries, and it is part of the reason why alternative async runtimes struggle so much. just looking at the documentation of reqwest, can you tell me if it requires a tokio runtime? I honestly have no idea what the answer is.

parasyte · November 15, 2024, 11:45pm

Yes.

Not trying to be critical of the concern. It might be ideal to make global contexts apparent in the API surface area. But some of those details can be inferred through other means, like the dependency tree in this case.

I'm sympathetic to some of the example use cases in tmandry's article. Particularly arena allocator, string interner, wasmtime fuel, etc. Some others are less concerning: global logger, global metrics, global async runtime. These global contextual items usually don't have a good reason to change ^[1] at runtime. "Set it and forget it" is good enough.

Or be "context sensitive", as it were. ↩︎

binarycat · November 15, 2024, 11:57pm

Merely depending on tokio at compile time does not mean something depends on a tokio runtime, if you only use tokio for its traits, your code will be portable across runtimes. However, most functions in tokio::task^[1] require a tokio runtime.

which reqwest happens to use ↩︎

parasyte · November 16, 2024, 12:24am

True, but niche. Do you have a better example that illustrates the point?

binarycat · November 16, 2024, 1:07am

how is an issue that affects all of async rust niche?

parasyte · November 16, 2024, 1:12am

Depending on tokio for its traits is niche, and has little to do with contextual parameters. I am asking if you have a better example than reqwest for documentation that doesn't tell you if the API depends on the runtime or not.

2e71828 · November 16, 2024, 5:19am

In one of my hobby projects, I have two async runtimes active concurrently: One is single-threaded to run the display and the other is multi-threaded for long-running background tasks. Any Future that interacts with the graphics card can only run on the former, but as far as I can tell there isn’t currently a way to enforce this at compile-time.

Nemo157 · November 16, 2024, 8:32am

hyper is such a crate, it uses tokio’s io traits and allows you to provide your own executor, timer etc. so it can run on a non-tokio runtime.

josh · November 16, 2024, 9:01am

There's a world of async out there that doesn't use tokio, and it's very easy to end up being surprised by a library that expects a tokio runtime without being a tokio application.

Also, even if you're using tokio, it's still quite possible to have a thread started with std::thread::spawn that isn't running a runtime.

matthieum · November 16, 2024, 11:42am

That's a good one. I always run single-threaded, so it hadn't occurred to me.

I wonder if this should be considered a "hole" in tokio (or any runtime, really). Perhaps there should be a way to tell tokio that any "orphan" thread should automatically join "this" multi-threaded runtime group unless otherwise overridden.

With that said, I do agree that explicitness of which runtime to use on which thread is useful, and that function parameters are a good way to make such requirements explicit.

I do wonder whether context parameters are the best way, though. Especially the requirement to plumb them through all intermediate layers.

An alternative would be:

Allow getting a handle to the current tokio runtime via a runtime::get_handle function, which reads a global/thread-local variable. And its sister API, runtime::get_handle_or, which executes the callback to create the runtime if it doesn't already exist before returning a handle to it.
Pass that handle at API boundaries.

It makes the library/module dependency explicit, allows a library to easily provide support for alternative runtimes (notably by asking for a trait object, with the trait implemented for multiple runtimes), doesn't require plumbing all the way through, and allows the caller to realize that maybe they should spawn a runtime.

No language change necessary; it's a pure library concern.
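A minimal sketch of how such an API could look, using a thread-local slot. `Runtime`, `Handle`, `get_handle`, `get_handle_or`, and `library_entry_point` are all hypothetical here and are not tokio's actual types or functions:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical runtime and handle types, for illustration only.
struct Runtime {
    name: String,
}

#[derive(Clone)]
struct Handle(Rc<Runtime>);

thread_local! {
    // The "current runtime" for this thread, if one has been installed.
    static CURRENT: RefCell<Option<Handle>> = RefCell::new(None);
}

// Returns the current handle, or `None` if this thread has no runtime.
fn get_handle() -> Option<Handle> {
    CURRENT.with(|slot| slot.borrow().clone())
}

// Returns the current handle, creating and installing a runtime first if needed.
fn get_handle_or(make: impl FnOnce() -> Runtime) -> Handle {
    if let Some(handle) = get_handle() {
        return handle;
    }
    // `make` runs outside of any borrow, so it may itself call `get_handle`.
    let handle = Handle(Rc::new(make()));
    CURRENT.with(|slot| *slot.borrow_mut() = Some(handle.clone()));
    handle
}

// A library API boundary that states its dependency explicitly.
fn library_entry_point(rt: &Handle) -> String {
    format!("running on {}", rt.0.name)
}
```

The caller fetches the handle once at the boundary and passes it in, so the requirement appears in the library's signature without being plumbed through every internal layer.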
