Within certain codebases, there is certain state that needs to be made available to almost every function. Examples are loggers, async runtimes, and object capabilities.
Frequently, global variables are used for this purpose, but this can cause problems, such as the common problem of awaiting a future that expects a tokio runtime without first spawning such a runtime.
Proposal
Add a new context binding mode. This can appear on any identifier pattern, but has additional meaning when used on a function parameter.
When a function parameter is annotated with context, it is a contextual parameter. Contextual parameters must come after all non-contextual parameters. It is an error for a function to have two contextual parameters of the same type.
When a function with contextual parameters is called, some or all of those parameters may be omitted. If they are, the compiler searches the scope of the calling function for a binding of the correct type that is annotated with context[1], and uses each such binding as the corresponding argument[2].
Example
fn do_thing(context rt: Executor<'_>) {
    // <run whatever async tasks>
}
fn main() {
    // note that `context` comes after all other binding modes
    let mut k#context executor = smol::Executor::new();
    do_thing();
}
[1] This may be a contextual parameter or any other binding.
[2] It is recommended that context bindings implement Copy, as otherwise they will be moved.
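Since the proposal is meant to be pure syntactic sugar, the example above would correspond roughly to passing the binding by hand in today's Rust. The sketch below is only an illustration under assumptions: Executor is a hypothetical stand-in type (not smol's), and it is passed by shared reference, which is one way to sidestep the move concern from the second footnote.

```rust
/// Hypothetical stand-in for an async executor; not smol's real type.
struct Executor {
    name: &'static str,
}

/// Roughly what `fn do_thing(context rt: Executor<'_>)` looks like today:
/// the contextual parameter is just an ordinary trailing parameter.
fn do_thing(rt: &Executor) -> String {
    format!("running on {}", rt.name)
}

/// Roughly what a bare `do_thing()` call would expand to: the in-scope
/// context binding is supplied as the missing argument.
fn caller() -> String {
    let executor = Executor { name: "local-executor" };
    do_thing(&executor)
}
```

Calling caller() from main runs do_thing with the locally bound executor; nothing is looked up globally.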
that's... completely different? that seems to basically be an equivalent to C's weak linkage, whereas I am proposing something like Scala's context parameters.
are you confusing bindings with items?
do you think they're the same because they can both be used to customize logging behavior? that's like saying mut statics are the same thing as the builder pattern.
one is process-global and requires integration with linkers, what i'm suggesting would be local to a specific call stack (thread, task, etc.) and would be purely syntactic sugar.
We do have one of these today, sort of: #[track_caller]. That’s not quite the same, of course, since in practice it’s guaranteed to have a compiler-generated source, but it is an example of an implicit parameter at the ABI level.
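For reference, a minimal sketch of how #[track_caller] behaves, using only std::panic::Location from the standard library. The function names are made up for illustration.

```rust
use std::panic::Location;

/// `#[track_caller]` makes the compiler pass the call site implicitly;
/// `Location::caller()` reads it back inside the function.
#[track_caller]
fn call_site() -> &'static Location<'static> {
    Location::caller()
}

/// Two calls on different source lines observe different locations,
/// even though neither call passes an argument.
fn two_call_sites() -> (u32, u32) {
    let first = call_site();
    let second = call_site();
    (first.line(), second.line())
}
```

The caller never writes the location argument, and the callee cannot be handed a user-chosen one, which is the main difference from a context parameter.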
Presumably you mean "two contextual variables of the same type in a given scope", as otherwise the automatic selection is ambiguous if there are two local context variables of the same type. There are also issues if the context parameter is something like &dyn ContextTrait and multiple implementers are in scope. Let's say we have a context wrapper that logs or something:
struct LogContext<C>(C);
impl<C> ContextTrait for LogContext<C> where C: ContextTrait { /* impl */ }
fn uses_context_with_logging(context ctx: &dyn ContextTrait) {
    let k#context wrap = LogContext(ctx);
    use_ctx(); // Does this get `ctx` or `wrap` if it wants a `&dyn ContextTrait`?
}
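To make the ambiguity concrete, here is the same shape written with explicit arguments in today's Rust, where the caller has to name the binding it means. This is only a sketch: the trait body, the Base type, and the blanket impl for references are assumptions added so that it compiles.

```rust
/// Hypothetical trait standing in for `ContextTrait` above.
trait ContextTrait {
    fn describe(&self) -> String;
}

/// Assumed blanket impl so that a `&dyn ContextTrait` can itself be wrapped.
impl<T: ContextTrait + ?Sized> ContextTrait for &T {
    fn describe(&self) -> String {
        (**self).describe()
    }
}

/// Hypothetical concrete context.
struct Base;
impl ContextTrait for Base {
    fn describe(&self) -> String {
        "base".to_string()
    }
}

/// Wrapper that decorates another context.
struct LogContext<C>(C);
impl<C: ContextTrait> ContextTrait for LogContext<C> {
    fn describe(&self) -> String {
        format!("logged({})", self.0.describe())
    }
}

fn use_ctx(ctx: &dyn ContextTrait) -> String {
    ctx.describe()
}

/// Both `ctx` and `&wrap` coerce to `&dyn ContextTrait`, so an implicit
/// lookup by type alone would have two candidates here.
fn uses_context_with_logging(ctx: &dyn ContextTrait) -> (String, String) {
    let wrap = LogContext(ctx);
    (use_ctx(ctx), use_ctx(&wrap))
}
```

Both calls type-check, and they produce different results, so whichever rule the compiler used to pick one would be observable.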
I am not sure that loggers / reporting / telemetry are a good example, here.
I think it's important to make a distinction between:
Dev Ops: logging, reporting, telemetry, ...
Function: async runtime, object capabilities, ...
The difference is that the former should NOT affect the function of the library/application, and is purely of use to the developers/operators. Put another way, if one were to deactivate all logging, reporting, telemetry, etc., either at compile-time or run-time, the user should not experience any difference (beyond performance fluctuation, perhaps).
Is that such a problem in practice?
If you develop a tokio application, you're likely to have #[tokio::main] on main, and thus all your code will be executed within the context of a runtime.
I think this argument would be better off mentioning the importance of making implicit dependencies explicit, in a way that the compiler can verify, rather than leaving them as just a comment in the documentation.
Weaknesses of context parameters
Which really should be mentioned in the RFC.
Plumbing Pains
One advantage of implicit (global) parameter passing is that the caller of the thingy need not be aware of the capabilities required by the callee.
This is all the more important for logging for example. Imagine if each time you would want to add a log statement, you would first have to pass the logger through 4 or 5 callers, and then modify all the call-sites of those callers since they now require the logger argument too.
This is pure pain, and why I strongly believe that using context parameters for logging is a mistake. The ergonomics are just terrible.
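A small sketch of the plumbing problem, written with ordinary explicit parameters. The Logger type and function names are hypothetical. As I understand the proposal, the call-site arguments could be omitted, but every intermediate signature would still need to declare the parameter.

```rust
/// Hypothetical logger: a plain list of messages.
type Logger = Vec<String>;

// Only `innermost` actually logs, yet every layer above it must accept
// and forward the logger once it is a parameter.
fn innermost(x: u32, log: &mut Logger) -> u32 {
    log.push(format!("innermost got {x}"));
    x * 2
}

fn middle(x: u32, log: &mut Logger) -> u32 {
    innermost(x + 1, log)
}

fn outer(x: u32, log: &mut Logger) -> u32 {
    middle(x + 1, log)
}

fn plumbed_run() -> (u32, Logger) {
    let mut log = Logger::new();
    let result = outer(1, &mut log);
    (result, log)
}
```

Adding that one log line to innermost is what forced the signature changes in middle and outer, plus every one of their callers.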
Borrowing Pains
It should be noted that depending on the framework, there may not be any borrowing issue when using a global: if there's no re-entrancy path -- i.e. if no user-code is executed while borrowing the global -- then the global can be safely borrowed mutably.
This means that the global can pass through agnostic 3rd-party code without issue.
If using context parameters, however, unless the 3rd-party code which invokes the function wishing to use the context parameter actually passes them -- the exact same types, too -- then the only way to access this parameter is to "bundle it" in the callback. Very quickly this'll require reference-counting, which itself will require some cell, ... this leads to both a memory overhead, and possibly a performance overhead, for example for RefCell and Mutex.
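A sketch of the "bundle it in the callback" situation, using only the standard library. Registry stands in for agnostic 3rd-party code and Stats for the context; both are hypothetical.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical third-party registry that stores callbacks and knows
/// nothing about our context type.
struct Registry {
    callbacks: Vec<Box<dyn FnMut(u32)>>,
}

impl Registry {
    fn new() -> Self {
        Registry { callbacks: Vec::new() }
    }

    fn register(&mut self, callback: impl FnMut(u32) + 'static) {
        self.callbacks.push(Box::new(callback));
    }

    fn fire(&mut self, value: u32) {
        for callback in &mut self.callbacks {
            callback(value);
        }
    }
}

/// Hypothetical context that two callbacks both need to mutate.
#[derive(Default)]
struct Stats {
    total: u32,
    events: u32,
}

fn bundled_run() -> (u32, u32) {
    // The registry cannot forward `&mut Stats`, so the context is bundled
    // into each closure behind `Rc<RefCell<_>>`.
    let stats = Rc::new(RefCell::new(Stats::default()));
    let mut registry = Registry::new();

    let for_total = Rc::clone(&stats);
    registry.register(move |value| for_total.borrow_mut().total += value);

    let for_events = Rc::clone(&stats);
    registry.register(move |_| for_events.borrow_mut().events += 1);

    registry.fire(3);
    registry.fire(4);

    let snapshot = stats.borrow();
    (snapshot.total, snapshot.events)
}
```

The reference count and the RefCell borrow flag are the memory and runtime overhead mentioned above; with threads involved it would become Arc and Mutex instead.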
It's not a problem for applications, but it is a problem for libraries, and it is part of the reason why alternative async runtimes struggle so much. Just looking at the documentation of reqwest, can you tell me if it requires a tokio runtime? I honestly have no idea what the answer is.
Not trying to be critical of the concern. It might be ideal to make global contexts apparent in the API surface area. But some of those details can be inferred through other means, like the dependency tree in this case.
I'm sympathetic to some of the example use cases in tmandry's article. Particularly arena allocator, string interner, wasmtime fuel, etc. Some others are less concerning: global logger, global metrics, global async runtime. These global contextual items usually don't have a good reason to change [1] at runtime. "Set it and forget it" is good enough.
Merely depending on tokio at compile time does not mean something depends on a tokio runtime: if you only use tokio for its traits, your code will be portable across runtimes. However, most functions in tokio::task[1] require a tokio runtime.
Depending on tokio for its traits is niche, and has little to do with contextual parameters. I am asking if you have a better example than reqwest for documentation that doesn't tell you if the API depends on the runtime or not.
In one of my hobby projects, I have two async runtimes active concurrently: One is single-threaded to run the display and the other is multi-threaded for long-running background tasks. Any Future that interacts with the graphics card can only run on the former, but as far as I can tell there isn’t currently a way to enforce this at compile-time.
There's a world of async out there that doesn't use tokio, and it's very easy to end up being surprised by a library that expects a tokio runtime without being a tokio application.
Also, even if you're using tokio, it's still quite possible to have a thread started with std::thread::spawn that isn't running a runtime.
That's a good one. I always run single-threaded, so it hadn't occurred to me.
I wonder if this should be considered a "hole" in tokio (or any runtime, really). Perhaps there should be a way to tell tokio that any "orphan" thread should automatically join "this" multi-threaded runtime group unless otherwise overridden.
With that said, I do agree that explicitness of which runtime to use on which thread is useful, and that function parameters are a good way to make such requirements explicit.
I do wonder whether context parameters are the best way, though. Especially the requirement to plumb them through all intermediate layers.
An alternative would be:
Allow getting a handle to the current tokio runtime via a runtime::get_handle function, which reads a global/thread-local variable. And its sister API, runtime::get_handle_or, which executes the callback to create the runtime if it doesn't already exist before returning a handle to it.
Pass that handle at API boundaries.
It makes the library/module dependency explicit, allows a library to easily provide support for alternative runtimes (notably by asking for a trait object, with the trait implemented for multiple runtimes), doesn't require plumbing all the way through, and allows the caller to realize that maybe they should spawn a runtime.
No language change necessary; it's a pure library concern.
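A rough sketch of what that could look like, using only the standard library. Everything here is hypothetical: get_handle and get_handle_or are the names suggested above, not existing tokio functions, and Runtime, Handle, and MultiThreaded are stand-ins invented for the illustration.

```rust
use std::cell::RefCell;

/// Hypothetical runtime abstraction; a real one would spawn futures.
trait Runtime {
    fn name(&self) -> &'static str;
}

struct MultiThreaded;
impl Runtime for MultiThreaded {
    fn name(&self) -> &'static str {
        "multi-threaded"
    }
}

static MULTI_THREADED: MultiThreaded = MultiThreaded;

/// Cheap, copyable handle to a runtime, backed by a trait object so that
/// several runtimes can implement it.
#[derive(Clone, Copy)]
struct Handle(&'static dyn Runtime);

thread_local! {
    /// The runtime associated with the current thread, if any.
    static CURRENT: RefCell<Option<Handle>> = RefCell::new(None);
}

/// Sketch of `runtime::get_handle`: reads the thread-local, if set.
fn get_handle() -> Option<Handle> {
    CURRENT.with(|current| *current.borrow())
}

/// Sketch of `runtime::get_handle_or`: runs `create` only if no runtime
/// has been registered on this thread yet, then returns the handle.
fn get_handle_or(create: impl FnOnce() -> Handle) -> Handle {
    CURRENT.with(|current| *current.borrow_mut().get_or_insert_with(create))
}

/// Library entry point: the runtime dependency is visible in the signature.
fn library_api(rt: Handle) -> String {
    format!("spawning on {}", rt.0.name())
}

/// Application code fetches (or creates) the handle once, at the boundary.
fn application() -> String {
    let rt = get_handle_or(|| Handle(&MULTI_THREADED));
    library_api(rt)
}
```

Only library_api's signature mentions the handle; code between the application and the library boundary does not have to forward anything.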