Pre-pre-RFC/Working Prototype: Borrow-Aware Automated Context Passing

tema2 · August 4, 2024, 11:56am

I think an IDE can definetly help with that, because it can know at all times precisely which kinds of context a function has.

The default silent way of the feature is in fact global inference of type of that last implicit ctx argument, which you today do by hand.

Radbuglet · August 4, 2024, 11:57am

Bumping Pre-pre-RFC/Working Prototype: Borrow-Aware Automated Context Passing - #58 by Radbuglet since I feel like this is exactly what you’re looking for.

tema2 · August 4, 2024, 12:46pm

Q: if we supply a mutable ref inside of ctx, can we have two futures created with single context? what about spawning here? unfortunate if we cannot do that.

A: my take on that was that we just forbid anything but shared refs as ctx due to this issue. But even then:

Q: case of traits - if we bound impl trait to a context does the type implement trait outside of it? is ctx clause implied by trait bound then? how to deal with bounds on methods vs bounds on traits?

Q: what to do with defered execution stuff (async, generators, closures, more to come?) do we need to capture ctx upon creation which is desirable for allocator for example, or do we want to supply it all on the each resume, like for logger? is closure going to be callable outside of the ctx where it was created? (really a subset of previous question)

A: my take is that we make a distinction based on what is bound: the trait or the method. In the first case type doesn't even implement the trait outside of ctx. in case of the method - it just cannot be called. Are there cases where first is ever desirable?

EDIT: A2: if storing a ref is desirable - with bound constructor it be.

Q: inherent impls?

EDIT: A: we do inherent impls bound with context just act as if all their methods were bound instead.

Q: what to do with values of the ctx blocks? do we even allow that? (related to previous)

EDIT: A: since they may borrow from the ctx (allocator use case for example) I think we don't allow this - i believe this is too little benefit for such borrowck complications;

Q: do closures raw impl Fn type implements its very trait outside of the ctx where it was created? Do they capture the ctx, or receive it on each call?

EDIT: A: they get it on each call, and thus they need a matching ctx to be called.

Q: if we have allocator as a ctx and have box be allocated in one, passed down and there alloc gets shadowed by an arena, how to deallocate that box?

A: Proper solution requires Box::new() to capture the reference from ctx - lifetime of allocator ref?

EDIT: A2: we provide it through the generic:

impl<'a> Box<T> with Allocator(&'a alloc) {
   fn new(val: t) -> Box<T, impl Allocator + 'a> {...}
}

P.S. i reread the old threads - the implementation was never a concern for anything but dyn trait case...

EDIT: also we perhaps must forbid context dependent Drop, Deref and co impls;

Vorpal · August 4, 2024, 7:42pm

Doesn't help when doing code review in azure devops (or gitlab or whatever you happen to use). Nor when looking at diffs in git in the terminal or whatever your git GUI of choise is. You are lucky if even syntax highlighting works on unified diffs...

A programming language shouldn't rely on having an IDE around for you to be able to read it.

Vorpal · August 4, 2024, 7:59pm

Yes, that is the best variant of this I have seen so far. It is something I could find acceptable.

It does have a downside though, in that it still has "spooky action at a distance" for if borrowck passes or not. Since that is only for private functions that is not a semver hazard thankfully.

It is a bit of an annoyance though, I could see that it could cause annoyed swearing during development, and two PRs that pass CI individually to not pass together.

Of course that can already happen for a bunch of different reasons, and you really should be using not rocket science rule just like rustc does. I have floated that idea before at my workplace, but haven't managed to get much traction, for various reasons (builds are already slow, full system integration tests take a long time to run and we still have some flaky integration tests, etc, etc). I cannot imagine that we are unique in that.

So while I like the concept and can see uses for it, I do still have some reservations. And this is the sort of feature that is most useful in a large code base (more than tens of thousands of lines of code). And I don't think I can make a final judgement call on this until we have seen what using it in such a situation would actually be like.

If you want to convince people about this, you should ask yourself how this feature could be abused, how it could go wrong. And then come up with ways to counteract those or explain why they aren't a problem. (Note: I haven't had time to read your pre-RFC yet, so maybe you already have a large section on potential drawbacks.)

farnz · August 5, 2024, 8:23am

I have a couple of details I'd like you to flesh out, because I think they'll improve the final RFC.

Take the following:

realm Subsystem1;

realm Subsystem2;

realm App: Subsystem1, Subsystem2;

ctx SYS_VALUES in Subsystem1 = Vec<u32>;
ctx SYS_FLAG in Subsystem1 = bool;

ctx SYS_VALUES in Subsystem2 = Vec<i64>;
ctx SYS_FLAG in Subsystem2 = FlagType;

First, how do you handle the conflict between the two sets of names in App? Compiler error? Some sort of disambiguation syntax?

Second, in the snippet:

fn quux(v: i32) use Subsystem2 {
    if v == -42 {
        *SYS_2_FLAG = true;
    }
}

fn do_something(v: i32) use Subsystem1 {
    let sys2_values = SYS_VALUES.iter()
                                .map(|v| (*v).into())
                                .collect();
    let sys2_flag = FlagType::from_sys1_bool(SYS_FLAG);
    // What goes here?
    quux(v);
}

How do I "switch" realms around temporarily so that quux is called from a Subsystem2 realm, not a Subsystem1 realm like do_something? How do I extend that to turn my Subsystem1 realm into a App realm?

tema2 · August 5, 2024, 8:25am

just say the realm is namespace also? like module? for the naming conflict cases.

yigal100 · August 5, 2024, 12:23pm

This concept is not novel, but rather a repetition of previous errors, such as AOP. This is a profoundly misguided and fundamentally flawed notion. Rust should not entertain any form of it.

On a more technical level, I find the naming to be quite confusing. Capabilities are a well-established (decades old) term of art that refers to the exact opposite of this notion of implicit contexts and side channels. Capability-based security is the concept behind e.g Wasm whereby the security hazard of ambient authority and the "confused deputy problem" is eliminated by explicitly passing capabilities down the call stack.

Those individuals who mistakenly attempt to find a compromise fail to acknowledge a fundamental truth: introducing additional realms, using statements, or any other means to essentially pass parameters to functions simply creates another syntax to achieve the same old outcome. By doing so, Rust—an already complicated language—moves closer to the infamous write-only Perl, known for its motto, "there's more than one way to do that." Hard, hard pass on this foolishness.

A wise individual avoids situations that require cleverness to escape.

Instead of creating new methods to pass parameters to functions, let's use established patterns like explicit inversion of control (also known as dependency injection).

tema2 · August 5, 2024, 1:57pm

As for abuse of feature i can tell that if we allow only shared refs as ctx then we already rule out all the abuse related to accumulators, weird data recollection and co. (modulo interior mutability)

The other prospect is sharing too much data in single ctx, but that's an issue today already with god objects, just bloated app ctx (on frontend for example), etc. Solved by facets from one of the above posts.

I'd like to know how realistic it is to use the feature for business modelling...

My objection to full explicitness is that it somewhat defeats the purpose:

fn foo() use Relm {
   //uses Relm
}

is essentially the same as

fn foo(Relm: &Relm) {
   //uses Relm
}

modulo that you manifest Relm argument not in call site but one level upper.

also all what relm achieves can be done with older with proposal and one additional struct saying all that declaration of a relm conveys:

Example from above:

struct Subsystem1 {
   SYS_VALUES: Vec<u32>,
   SYS_FLAG: bool
}

struct Subsystem2 {
   SYS_VALUES: Vec<i64>,
   SYS_FLAG: FlagType
}
...
fn quux(v: i32) with (ref mut ss: Subsystem2) {
    if v == -42 {
        ss.SYS_2_FLAG = true; // I believe this should be forbidden, kept for cleverness
    }
}

fn do_something(v: i32) with (ref ss: Subsystem1) {
    let sys2_values = ss.SYS_VALUES.iter()
                                .map(|v| (*v).into())
                                .collect();
    let sys2_flag = FlagType::from_sys1_bool(ss.SYS_FLAG);
    // What goes here? - A: with block
    with (&Subsystem2 { SYS_VALUES: sys2_values, SYS_FLAG: sys2_flag}) {
        quux(v);
    }
}
...

if we force that use bound from this syntax variant, or with bound from older one, wherever this feature is used then feature instantly looses the core point as such.

consider the snippet:

...
fn main() {
   let conf = load_conf()?; //some conf, usual
   with (&conf) {
      let logger = init_logger()?; 
      // for code readability it has been moved to a function
      // if the impl is custom - to its file
      let resources = get_static_resources(); // file IO, mmaps, whatever
      runtime::run(async move {
         httpserver::serve(&conf.url, resources).await?
         // all the handlers can read the config, use the logger, allocator, etc
         // in the more usual setup some router may insert a tracing implementation, or smth. 
      });
   }
}

You'll be right to say that web servers have their own DI facilities and we should use those.

However, why do they have them in the first place? Why does every server carry some hashmap of TypeId to Arc<T> inside?
People can make their own DI, and use it, perhaps even better than the language provided one. Yet it always will be explicit, you have to pull some actix_web::Data<Ctx> in handlers, which perhaps is right to do if pulling DB object, or extract middleware result.\

But you never want that for stuff like allocator, logger, config, static resources (if they are made as object).

I don't think we should sacrifice expressive power due to possible misuse.

Arena's benefit:

#[arena(5 * input.len(), b)] 
fn some_algo(input: Vec<Input>) -> Data {
}

As can be guessed, that makes stack based allocator of configurable size put's ref to it in the ctx, and wraps the body into with block.
Same help for tracing macro, benches, etc.

Also, this can come in hand when we will be making async methods in traits, via the with core::alloc::Allocator bound (the alloc crate should provide platform specific impls).

My only doubt about this is where is really the feature in the cost/benefit space?

Radbuglet · August 5, 2024, 2:29pm

Contextual parameters are treated like statics in terms of name resolution and borrow syntax. You have to import the item to use it. This would compile error since you reused the name in a single module.

That’s intentionally impossible. A realm tells you which context paramaters (or capabilities—no clue what to call them!) you are allowed to access. If you say that you operate solely on subsystem 2, it is impossible to then work on subsystem 1 unless subsystem 2 inherits subsystem 1’s realm. The WIP RFC linked in this comment goes in a lot more detail about the semantics of realms.

bestouff · August 5, 2024, 2:29pm

Ooh I love this ! Context (ha!): I have a quite big and involved parser with 2 modes: case-sensitive or case-insensitive. As I can't pass a parameter to the Hash/HashMap traits, I have to resort to generics which pollute everything in the code. That's quite frankly a mess. I'd love to have such a thing - but usable in prod.

Radbuglet · August 5, 2024, 2:37pm

See the motivation section of the WIP RFC for details on what we actually gain from this feature compared to regular parameter passing or bundles of dependencies.

Radbuglet · August 5, 2024, 2:39pm

That fails to pass the “granular borrow checking” design goal I laid out in the motivation section of my WIP RFC draft.

Wait how? What core point do we miss? With the realm proposal, we still have...

Granular borrows since each contextual parameter can be independently borrowed mutably or immutably.
Bounded refactors since updating the contextual requirements of a function deep in the call chain either requires you to define a new contextual parameter in that function's realm or extend that realm to inherit a sub-realm already containing it.
A checked system since realm borrow checking happens at compile time rather than runtime.

SkiFire13 · August 5, 2024, 2:41pm

If you can't pass a parameter then I doubt this feature will work for your usecase, since it's essentially sugar for implicit arguments.

farnz · August 5, 2024, 3:53pm

I've read the RFC, and I'm still confused. I am in the realm Sys_1; I use bind to add in Sys_2 realm items (which is possible as far as I can tell, since binds can be nested). What then stops me calling functions that need Sys_2, or even realm App: Sys_1, Sys_2?

Radbuglet · August 5, 2024, 4:04pm

Realms purely serve to restrict the potential set of values a function has the permission to access—they don't say anything about which values actually have to be borrowable.

Consider the following example:

realm Sys1;
realm Sys2;
realm App: Sys1, Sys2;

ctx FLAG_1 in Sys1 = bool;
ctx FLAG_2_A in Sys2 = bool;
ctx FLAG_2_B in Sys2 = bool;

fn main() {
    let mut flag_1 = false;
    let mut flag_2 = false;

    bind(FLAG_1 = &mut flag_1, FLAG_2_A = &mut flag_2) {
        demo();
    }
}

fn demo() use App {
    *FLAG_1 = true;
    *FLAG_2_A = true;
}

main can call demo even though Sys2 is missing the FLAG_2_B item. It's only if FLAG_2_B is requested by demo that this code will fail to compile.

If demo calls a function sys_1 which only requests access to Sys1, that function is giving up the right to access anything outside of Sys1—whether it be directly or indirectly.

// (continuing from the previous snippet)

fn demo() use App {
    sys_1();
}

fn sys_1() use Sys1 {
    // We relinquished access to `Sys2` and can therefore no
    // longer access any of its contextual parameters.
    *FLAG_2_A = true;

    // Nice try but this is also denied since `Sys1` does not
    // grant access to `Sys2`.
    covertly_access_sys_2_through_indirection();

    // You'd have to rebind a new value to `FLAG_2_A` to be able
    // to call `covertly_access_sys_2_through_indirection` again.
    bind(FLAG_2_A = &mut false) {
        // This works now but is modifying a *different* value.
        covertly_access_sys_2_through_indirection();
    }

    // If you only rebind `FLAG_2_B`, you won't also be able to
    // access `FLAG_2_A` because this function essentially forgot
    // about all contextual parameters not accessible by `Sys1`.
    //
    // Hence, this will result in a missing context error:
    bind(FLAG_2_B = &mut false) {
        covertly_access_sys_2_through_indirection();
    }
}

fn covertly_access_sys_2_through_indirection() use Sys2 {
    *FLAG_2_A = true;
}

Maybe, instead of "permission to access," we rephrase it as "permission to carry" since you lose both the ability to access the contextual parameter directly and the ability to carry it along to a function that may wish to access it.

farnz · August 5, 2024, 4:47pm

I'm now completely confused; you said that it's impossible for code in sys_1 to switch around the realms so that it can call a function in that needs the Sys2 realm, but in your example code, you have code in sys_1 that calls a function that needs the Sys2 realm, by binding a new instance of the realm together.

I get that if I'm in sys_1(), I don't have any access to the variant of the Sys2 realm that's present in demo(), since I've given that up by saying I'm in the Sys1 realm; but I expected to be able to switch around the realms so that I can call code in the Sys2 realm very deliberately. What's blocked is accidentally calling code in the Sys2 realm, because I need to bind a new context for it.

Radbuglet · August 5, 2024, 5:42pm

Right, I can see how rebinding can be confusing since rebindings don't really have a realm of their own. Sorry!

I think a better way to explain this is that there are two rules:

The context checking rule, which operates on the set of things actually borrowed by functions.
The realm checking rule, which operates on the set of things potentially borrowed by functions.

Only the former context checking rule is required to make this mechanism work but the latter can help clarify diagnostics.

The former context checking rule says that "only context items inside the realm of the current function can be accessed or forwarded to another function."

The latter realm checking rule says that "functions which live in one realm cannot call functions in a realm that isn't inherited by the current realm unless the call is in a bind block."

So, this would fail because of the realm checker:

realm Sys1;
realm Sys2;

fn foo() use Sys1 {
    bar();
}

fn bar() use Sys2 {}

...but this somewhat goofy code would be accepted:

realm Sys1;
realm Sys2;

fn foo() use Sys1 {
    bind () {
        // This realm could be anything. Since `bar` doesn't depend on
        // anything, we just let the realm of this scope resolve to `Sys2`
        // and everything is fine.
        bar();
    }
}

fn bar() use Sys2 {}

And this code would fail because of the context checking rule:

realm Sys1;
realm Sys2;

ctx SYS2_FLAG in Sys2 = bool;

fn main() {
    bind(SYS2_FLAG = &mut false) {
        foo();
    }
}

fn foo() use Sys1 {
    bind () {
        // This realm could be anything so we just let the realm of this scope
        // resolve to `Sys2`. However, we still can't call the function because
        // it depends on `SYS2_FLAG`, which was not forwarded to us because
        // we're operating in the `Sys1` realm which does not forward context
       // parameters in the `Sys2` realm.
        bar();
    }
}

fn bar() use Sys2 {
    *SYS2_FLAG = true;
}

You're right that this is a pretty subtle semantic rule and I'll try to clarify that in the RFC draft.

zackw · August 5, 2024, 8:42pm

Having read through the draft RFC, I have some notes. First off, a bunch of questions about the exact semantics of realms and bind expressions:

If a library crate exposes a realm in its public interface, does that mean that all of the parameters belonging to that realm are also part of the public interface?
If a library crate exposes a realm in its public interface, is it always a breaking change for the library to add a context parameter to that realm? (Obviously it has to be a breaking change at least some of the time.) What about renaming or removing context parameters?
If a library crate exposes functions that declare use of a realm, is it a breaking change for those functions to change which context parameters within the realm they actually use?
What happens if a bind expression doesn't provide values for all of the context parameters that are actually used by the functions that are called within the bind block? (Clearly this should be an error; can we make it always be a compile-time error, or are there situations where it has to be reported at runtime?)
What happens if a bind expression does provide values for all the parameters that are actually used, but not other context parameters associated with the realms declared by the functions that are called? (Before you say "of course this is also a compile time error", think about the implications for your answers to (2) and (3).)
Do the answers to (4) and (5) change if the call tree crosses crate boundaries?

Second, some high level observations in no particular order:

I don't like "realms define which context elements a function is permitted to borrow—not which elements it actually borrows." I think each context-using function, public or private, should be required to declare in its signature which elements it actually borrows.
For similar reasons, I think the use <realm> parameter "annotation" to calls to context-using functions should be mandatory for all callsites.
I found the part of the RFC about generics to be mostly incomprehensible. It might make more sense to someone who's more deeply familiar with the details of Rust generics, but please think hard about how to explain it better.
I think the restrictions that trait functions cannot consume context parameters, and that context-using functions cannot be converted to function pointers, are likely to be troublesome. Please also think about how these restrictions can be lifted.
Relatedly, please think about what it would take to be able to supply context parameters to closures.
To have confidence in a change of this magnitude to something as basic as function calls, we need to be sure it can be implemented. Therefore, the reference section should include at least an outline of how context parameter passing will be implemented at the level of assembly language.

Third, some specific comments on syntax.

```
 realm MyRealm;
 ctx MY_CTX in MyRealm = u32;
```
This is not how types or has-a relationships are written anywhere else in Rust. Suggest instead
```
realm MyRealm {
    ctx MY_CTX: u32;
}
```
```
 realm CompositeRealm1: MyRealm1, MyRealm2;
 realm CompositeRealm2  = MyRealm1, MyRealm2;
```
I do not understand whether there is a semantic difference between these. If there is a semantic difference, the syntactic difference between : and = is too small. If there is none, only one syntax should be accepted. This is also not very Rustish, I would suggest (assuming no semantic difference)
```
realm CompositeRealm1 {
    use MyRealm1;
    use MyRealm2;
    // possibly more `ctx` declarations here
}
```
```
bind ( ... ) { ... }
```
For consistency with the rest of the language, the parentheses should be dropped.

Radbuglet · August 6, 2024, 12:04am

These are some excellent questions!

No. A contextual parameter can be more private than the realm in which it's contained.

Removing a non-public contextual parameter from the realm is a non-breaking change since external crates can't observe the change. That is, of course, unless the borrow sets of public functions also change in tandem.

Removing a public contextual parameter from a realm is a breaking change, however, since users may attempt to access that parameter using that realm.

Adding any contextual parameter (whether public or private) into a realm also isn't a breaking change by itself.

It is a breaking change to add usages on additional contextual parameters. That's why #[borrows_only] exists. It is, however, always fine to remove usages since this can only allow new code to compile that didn't compile before. You don't have to worry about inadvertently weaking these sets, though, since #[borrows_only] also defines to the actually borrowed set.

This is always a compile time error—just as it was in the prototype AuToken.

If parameters which aren't actually used are not supplied, it's not an error.

Yes. Changing actual borrow sets has to be a breaking change anyways so we might as well permit users to provide partial context.

Ah, I see why you asked additions to and removals from existing realms. Yes, my answers to those two questions still hold.

Nope! Even with the AuToken prototype, I tried my hardest to keep behavior consistent between inter-crate and intra-crate scenarios.

The only exception is that you can't define new context parameters in realms defined by an upstream crate since there really isn't a reason to do that. Also, I think I'm going to adopt the new realm definition syntax you proposed to avoid this problem entirely.

Requiring this would break the bounded refactors principle. But, yes, users are encouraged to explicitly declare actual borrow sets for functions in their public interface by a warn-by-default rustc lint.

There's a lint force this. It's not on by default because I've heard opinions that other users who don't want to have to use the annotation. This is probably going to be a source of bike-shedding.

Thanks for the feedback! I was under the impression that the rules about generics would be intuitive since they're mostly defined by what the feature doesn't support. Are there specific things that are confusing or is it just generally difficult to follow?

It isn't supported because it creates a lot of hazards for semantic versioning. See the "semantics of generics" section of the AuToken README and my first design comment in this thread for details on how I got to this rule-set.

To pass context parameters to closures, so long as the closure does not have to live for 'static, you can do:

// Acquires a bundle with a fresh inference set.
let cx = infer_set!();

use_closure(|| {
    // Binds the bundled variables in the current function's scope.
    // The inference set is automatically extended to include everything
    // needed by this function.
    bind cx;

    ...
});

There's no way to make this work if the closure has to live for 'static, regardless of the rule-set chosen, since contextual parameters are all references.

Sorry, I'm going to need to enlist the help of a backend person to help me answer than question.

Good suggestion!

zackw:

 realm CompositeRealm1: MyRealm1, MyRealm2;
 realm CompositeRealm2  = MyRealm1, MyRealm2;
I do not understand whether there is a semantic difference between these. If there is a semantic difference, the syntactic difference between : and = is too small. If there is none, only one syntax should be accepted. This is also not very Rustish, I would suggest (assuming no semantic difference)
realm CompositeRealm1 {
    use MyRealm1;
    use MyRealm2;
    // possibly more `ctx` declarations here
}

These do actually have a pretty significant difference: the former syntax with the : denotes a realm that can contain context items of its own whereas the latter with the = denotes a realm that cannot contain context items of its own. This is important because, in the following example:

// Upstream crate
realm Foo;

realm Bar;

// Downstream crate
fn foo() use Foo {
    bar();
}

fn bar() use Bar {
    foo();
}

It's desirable that Foo and Bar unify by name rather than by their sets of contextual parameters since, otherwise, adding any context becomes a breaking change!

However, sometimes, you do actually just want to refer to the union of realms Foo and Bar without having to write that every single time. That's why aliases exist:

realm Foo;
realm Bar;

realm MyAlias = Foo, Bar;

Realm aliases can't have context parameters of their own since it's unclear what that means.

I think we can keep the = syntax so long as we use the...

realm RealmAlias = Foo, Bar;

realm CompositeRealm1 {
    use MyRealm1;
    use MyRealm2;
    // possibly more `ctx` declarations here
}

...syntax you recommended.

Good suggestion. I think I updated it for bundle binding syntax but forgot to update it for individual context parameter binding.

Topic		Replies	Views
Pre-RFC: typed context injection language design	8	1119	December 17, 2023
[Pre-RFC] Reborrow trait language design	28	1387	June 9, 2025
[RFC] Vec::append_from_within() language design	2	989	September 20, 2019
This week's older RFCs up for discussion	4	1984	March 25, 2019
[Pre-RFC] `std` aware Cargo language design	4	1430	May 16, 2019

Pre-pre-RFC/Working Prototype: Borrow-Aware Automated Context Passing

Related topics